The minimum set of sub-models for simulating stand dynamics on an individual-tree basis consists of tree-level models for diameter increment and survival. An ingrowth model is a necessary third component in uneven-aged management. Developing this type of model set requires data from permanent plots in which all trees have been numbered and measured at regular intervals for diameter and survival. New trees passing the ingrowth limit should also be numbered and measured. Unfortunately, few datasets meet all these requirements. The trees may not have numbers, or the length of the measurement interval may vary. Ingrowth trees may not have been measured, or the number tags may have disappeared, causing errors in tree identification.
This article discussed and demonstrated the use of an optimization-based approach to individual-tree growth modelling, which makes it possible to utilize data sets having one or several of the above deficiencies. The idea is to estimate all parameters of the sub-models of a growth simulator simultaneously in such a way that, when simulation begins from the diameter distribution at the first measurement occasion, it yields an ending diameter distribution similar to that measured at the second measurement occasion. The method was applied to Pinus patula permanent sample plot data from Kenya. In this dataset, trees were correctly numbered and identified but the measurement interval varied from 1 to 13 years. Two simple regression approaches were used and compared to the optimization-based model recovery approach.
The optimization-based approach resulted in far more accurate simulations of stand basal area and number of surviving trees than the equations fitted through regression analysis.
The optimization-based modelling approach can be recommended for growth modelling when the modelling data have been collected at irregular measurement intervals.
The most flexible growth model type in irregular and mixed stands is a set of individual tree models, consisting of separate models for different species or species groups, or using indicator variables for species-specific growth patterns. Individual-tree models may be the best overall tool for predicting the dynamics of tree stands. Stand-level models may be more reliable in even-aged plantations but they may encounter problems in uneven-aged and mixed stands. The minimum set of sub-models for simulating stand development consists of individual-tree models for diameter increment and survival. Tree height, volume and biomass may be calculated with static models. If uneven-aged management is an option, a necessary third component is a model for ingrowth or regeneration.
The development of this type of model set needs data from permanent plots, in which all trees have been numbered and tagged, and measured at regular intervals for diameter and survival. New trees passing the ingrowth limit (i.e., the minimum measured diameter) between two measurement occasions should also be tagged, numbered and measured. Unfortunately, these requirements are not always met. There may be plenty of data, but their use in individual-tree modelling is complicated, for instance, for the following reasons:
Trees have not been numbered. The data may be error-free but growth and survival information is not available for individual trees.
There are many tree-identification errors. This may happen for instance when tree identification is based on a certain measurement order of trees. Mortality and ingrowth may make it difficult to keep the same order in successive measurements, leading to situations in which the sequences of tree-level measurements are not always from the same tree.
Measurement interval varies, making it difficult to develop a growth model for a certain time step (e.g. 1 year or 5 years).
Pukkala et al. () proposed an optimization-based method, which can be used to recover individual-tree models using data in which individual trees are not identified. All parameters of the sub-models of a growth simulator were recovered simultaneously in such a way that the simulated stand development, when started from the initial diameter distribution, yielded the measured ending distribution. The parameters of the diameter increment, survival and ingrowth models for three different species were recovered simultaneously. The data came from the permanent plots of several silvicultural experiments where the tree diameters were measured accurately but the trees were not tagged or numbered. This made it impossible to calculate the growth and survival of individual trees or identify ingrowth trees. The measurement interval was not constant. Despite these problems, Pukkala et al. () were able to develop ingrowth models for different species, as well as individual-tree models for diameter increment and survival. The methodology was tested on another dataset in which the measurement interval was constant, trees had number tags, and ingrowth trees were identified and numbered. In this dataset, the optimization-based recovery approach yielded models that were very similar to ordinary regression models.
The same method was used by de-Miguel et al. () to develop individual-tree diameter increment and survival models for balsa (Ochroma pyramidale) plantations in Bolivia. The data came from permanent plots. The intention was always to measure the trees in the same order, which seemed easy because trees were planted in straight rows. However, the high mortality rate typical of balsa resulted in a data set in which successive diameter recordings did not always represent the same tree. Tree identification errors were numerous, making it almost impossible to derive diameter increment and survival models by using ordinary regression analysis techniques. Yet, by using the optimization-based recovery technique, de-Miguel et al. () were able to develop plausible models for one-year diameter increment and tree survival in balsa plantations. The researchers tested the method using accurate data on another tropical plantation species, tejeyeque (Centrolobium tomentosum) (de-Miguel et al. ). The comparison showed that optimization resulted in models that were very similar to models fitted in regression analysis. In prediction, optimization-based models performed better than marginal and conditional predictions of mixed-effects models, and were almost as good as fixed-effects models.
The impact of irregular measurement interval on modelling has received some attention in earlier research (Cao ; Nord-Larsen ; Crecente-Campo et al. ). Assuming a constant growth between measurements can lead to under- and over-estimation of tree growth when growth dynamics are clearly nonlinear (Clutter ; McDill and Amateis ). McDill and Amateis () suggested the use of correction factors that force the interpolated growths to be consistent with the estimated growth function. Cao () proposed an iterative technique, in which the calculation of the interim values of stand level predictor variables was based on stand-level models. The approach of Nord-Larsen () places tree survival at the end of the growth period. Therefore, the effect of gradual mortality on the interim values of tree- and stand-level competition variables is ignored. Crecente-Campo et al. () used repeated fittings and simulations with the fitted models to calculate the interim values of predictors. Mortality was simulated by assuming that those trees will die that have the predicted survival probability less than an iteratively found threshold value.
Pinus patula is the most intensively utilized conifer in the tropics and subtropics, where it is widely planted as an exotic (Wright ). The species is native to Mexico (Dvorak and Donahue ; Dvorak et al. ). P. patula has been introduced, for instance, to South Africa, Swaziland, Zimbabwe, Madagascar, Malawi, Tanzania, Kenya, Uganda, Angola and Cameroon (Wormald ). The plantations occur over a wide range of sites of varying productive potential and are of major importance in these regions (Record and Hess ; Isango and Nshubemuki ; Mabvurira and Musokonyi ; Muchiri and Muturi ).
P. patula plantations have been used for several purposes both within the area of natural distribution in Mexico and in the countries where it has been planted as exotic. The plantations have been established for commercial timber and pulp as well as wood extracts such as tannins. They also produce fuel-wood and raw material for charcoal. The plantations also play a role in the protection of watershed areas and restoration of degraded land (Palmer and Gibbs ). P. patula is also planted in windbreaks, as a shade tree for coffee and as an ornamental tree.
The current area of P. patula plantations in Kenya is about 100,000 ha (Muchiri and Muturi ). Despite its importance, no individual-tree growth models have been developed for Kenyan P. patula plantations. One reason for this situation is that, although there are permanent plot data in Kenya, the plots have not been measured systematically, which makes the available data difficult to use in modelling.
This study applied the optimization-based parameter recovery technique to P. patula plantations in Kenya. Trees have been carefully numbered and measured repeatedly in several plots during a 30-year period. The dataset is valuable since it covers the whole rotation period. The problem with the data is the irregular measurement interval, ranging from 1 to 13 years. Two other approaches to deal with irregular measurement intervals were tested and compared with the optimization-based approach. The optimization-based method was used for the first time to recover mixed-effects models.
The data used in this study were collected from a P. patula experiment established in 1967 in Londiani Forest Station in Kericho County of Rift Valley province, situated at approximately latitude 0° 05’ S and longitude 35°53’ E at an altitude of 2,300–2,400 m above sea level. The area receives a mean annual rainfall of about 1,200 mm with a bimodal distribution pattern with long rains occurring between March and June, and the short rains between mid September and November (Jaetzold and Schmidt ). The mean annual temperature is 18°C with a maximum of 24°C.
The P. patula permanent sample plots were primarily established to observe the effect of stocking on P. patula growth. The study area consists of four randomized blocks with seven different spacings per block, resulting in 28 plots of 0.059 ha each. The plan was to measure diameter at breast height, diameter at ground level, tree height and survival annually from age 7 to 14 (1974–1980), followed by every 3 years up to the age of 19 years, and afterwards every 5 years up to rotation age of 30 years. However, the plan was not followed rigorously, and the actual interval varied from 1 to 13 years. The number of measurements in the same plot was 7–10, resulting in 6–9 periods per plot. However, since the measurement years were not always the same for all plots, the total number of periods was 13. The last measurement was conducted in 1996. At each measurement, tree diameter at 1.3 m height (dbh) for all trees, and tree heights of a sample of at least 8 trees per plot were recorded. This resulted in 13,483 diameter increment observations and 1,698 height observations (Table 1). Trees sampled for height were not necessarily the same in different measurements. Dead trees were recorded at each measurement.
A model for dominant height development along age was fitted for site index calculations. Dominant height was defined as the average height of the 100 largest (in terms of dbh) trees per hectare. To calculate dominant height, a plot-wise measurement-specific diameter-height model was fitted and used to calculate the heights of trees that were not measured for height.
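The dominant height definition above can be illustrated with a minimal sketch. The function name, the example tree list, and the rule of rounding the per-hectare count to a whole number of plot trees are assumptions made for illustration; the source does not state how a fractional number of trees (5.9 on a 0.059 ha plot) was handled.

```python
def dominant_height(trees, plot_area_ha, per_ha=100):
    """Mean height (m) of the thickest trees, taking `per_ha` trees per hectare.

    trees: list of (dbh_cm, height_m) pairs for one plot at one measurement.
    """
    # Number of plot trees corresponding to `per_ha` trees per hectare.
    # Rounding to a whole tree (and using at least one) is an assumption.
    n = max(1, round(per_ha * plot_area_ha))
    thickest = sorted(trees, key=lambda t: t[0], reverse=True)[:n]
    return sum(h for _, h in thickest) / len(thickest)


# Hypothetical plot data: (dbh in cm, measured or predicted height in m).
plot = [(31.0, 24.5), (28.4, 23.8), (27.9, 24.1), (25.2, 22.0),
        (24.8, 22.6), (22.1, 21.3), (19.5, 19.8), (17.0, 18.2)]
h_dom = dominant_height(plot, plot_area_ha=0.059)
```

With a 0.059 ha plot, the sketch averages the heights of the six thickest trees.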
Several different equation forms (Peschel ; Schumacher ; Lundqvist ; Richards ; Sloboda ; McDill and Amateis ; Diéguez-Aranda et al. , ) derived by using either algebraic differential equations (ADA) or generalized algebraic differential equations (GADA) (Bailey and Clutter ; Cieszewski and Bailey ) were tested as candidate models for modelling dominant height growth. The selected model was used to estimate the site index of each plot using 30 years as index age.
All individual-tree height observations were used to fit an individual-tree height model. Several variants of the model based on Stoffels and van Soest () modified by Tomé () were tested in height-diameter modelling. Tree height was predicted from tree diameter, dominant height and dominant diameter. The model form used guarantees that the simulated height development of individual trees is logically related to the dominant height development of the stand.
Diameter increment and survival modelling
The diameter increment model predicts increment as a function of different variables describing site quality, tree size and competition. Site quality was described by site index. Variables representing tree size were diameter at breast height (dbh), tree height and age. Variables that described competition included basal area of trees larger than the subject tree, stand basal area, and their transformations.
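As an illustration of the competition variables named above, the following sketch computes stand basal area and the basal area of trees larger than each subject tree from a list of diameters. The function names are hypothetical, and counting only strictly larger trees (so that trees of equal dbh do not compete with each other) is an assumption, since the source does not say how ties were treated.

```python
import math


def tree_basal_area_m2(dbh_cm):
    """Cross-sectional area (m2) at breast height of a tree with dbh in cm."""
    return math.pi * (dbh_cm / 200.0) ** 2


def competition_variables(dbh_list_cm, plot_area_ha):
    """Return stand basal area G and a list of BAL values, both in m2/ha."""
    # Per-hectare basal area represented by each measured tree.
    ba = [tree_basal_area_m2(d) / plot_area_ha for d in dbh_list_cm]
    g = sum(ba)
    # BAL: basal area of trees strictly thicker than the subject tree.
    bal = [sum(b for other, b in zip(dbh_list_cm, ba) if other > d)
           for d in dbh_list_cm]
    return g, bal


g, bal = competition_variables([10.0, 20.0, 30.0], plot_area_ha=0.059)
```

The thickest tree gets BAL = 0, and the thinnest tree gets BAL equal to G minus its own basal area.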
Considering the hierarchical structure of our data (multiple measurements taken from the same trees, grouped into several plots within blocks), a mixed-effects modelling approach that allows for explicit description of the between- and within-plot correlations was used. However, fixed-effects models were also fitted because the estimation of the random parameters is often too complicated to be regularly used in forestry practice.
Three different approaches to deal with irregular measurement interval were used in diameter increment and survival modelling. The ‘constant rate approach’ assumes constant annual diameter growth and survival rate during the years between any two measurement occasions (Cao ; Crecente-Campo et al. ). This approach is henceforth referred to as the ‘Regression 1’ approach. The annual diameter increment was obtained by dividing the periodical increment by the length of the period (expressed in years). The survival rate over a period was assumed to equal the annual rate raised to the power of the period length. Tests with a high number of different predictor combinations led to the following model form for diameter increment (subscripts for plot and block not shown):
where ida_ij is the annual diameter increment of tree i during period j, d_ij is the dbh of the same tree at the beginning of the period, BAL_ij is the basal area in larger trees, G_j is the stand basal area (both at the beginning of period j), u_j ~ N(0, σ_u) is a random period factor and e_ij ~ N(0, σ_e) is the residual. “Period” refers to a growth interval having the same starting and ending year. The years of each of the 13 different periods are shown in Tables 2 and 3. Random parameters for plot and block effects were also tested but they were very small and were therefore ignored. Preliminary models were also fitted with random variables included in the regression coefficients of predictors, but the improvements due to these additional parameters were marginal. Site index was not a significant predictor, most probably because the plots and blocks were near each other, resulting in almost similar site conditions in all plots.
The survival model representing the constant survival rate approach was
where s_ij is the probability of surviving for one year, Step_j is the length of period j (years) (s^Step is the probability of surviving for Step years), T is stand age (years), DOM is a variable called dominance, calculated as DOM_ij = 1 – BAL_ij/G_j, and u_j ~ N(0, σ_u) is a random period factor.
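The arithmetic behind the constant rate approach can be sketched as follows. The function names and the numeric values are illustrative assumptions; only the three relationships (periodic increment divided by period length, annual survival raised to the power of period length, and dominance as 1 – BAL/G) come from the text.

```python
def annual_increment(d_start_cm, d_end_cm, step_years):
    """Annual dbh increment (cm/yr) assuming linear growth over the period."""
    return (d_end_cm - d_start_cm) / step_years


def period_survival(annual_survival, step_years):
    """Probability of surviving `step_years` years at a constant annual rate."""
    return annual_survival ** step_years


def dominance(bal, g):
    """Dominance of a tree: 1 - BAL/G (1 for the thickest tree on the plot)."""
    return 1.0 - bal / g


# Made-up example: a tree grows from 18.2 cm to 21.8 cm in a 3-year period.
ida = annual_increment(18.2, 21.8, 3)   # about 1.2 cm per year
p3 = period_survival(0.98, 3)           # about 0.941
dom = dominance(bal=12.0, g=30.0)       # 0.6
```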
In the ‘Regression 2 approach’, the time interval between the measurement occasions was used as model predictor. The diameter increment model corresponding to this approach was:
where idp_ij is the diameter increment of tree i during period j (cm) and Step_j is the length of the period (years). The corresponding survival model was:
In the optimization-based approach the diameter increment and mortality models were fitted simultaneously via nonlinear optimization (Nelder and Mead ), using diameter distributions as modelling data instead of tree-level growth and survival data (Pukkala et al. ). Starting from the initial diameter distribution of each sample plot, the fitting procedure aims at reproducing the diameter distribution at the end of the measured growth interval by minimizing the sum of squared differences between measured and predicted cumulative diameter distributions of tree frequency and stand basal area. The optimized decision variables were the coefficients of the diameter increment and mortality models. Based on Pukkala et al. (), the objective function was defined as:
where θ is the set of coefficients (a_0, …, a_4, b_0, …, b_4, and the period factors of the growth and survival models, see Equations 1 and 2) estimated as arg min z(θ), K is the number of plots, J_k is the number of periods of plot k, I_jk is the number of 3-cm diameter classes in period j of plot k, G_jk^m(d_ijk) and G_jk^s(d_ijk) are, respectively, measured and simulated cumulative basal area (m² ha⁻¹) at diameter d_ijk (upper limit of diameter class i) at the end of period j of plot k, and F_jk^m(d_ijk) and F_jk^s(d_ijk) are, respectively, measured and simulated cumulative number of trees per hectare at diameter d_ijk at the end of period j of plot k. The number of simultaneously estimated parameters was 36 (2 × 13 period factors and 2 × 5 fixed regression coefficients).
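A minimal sketch of the contribution of one plot and one period to such an objective function is given below. It assumes an unweighted sum of squared differences between cumulative distributions, which is one reading of the description; any weighting or scaling between the basal area and frequency terms used in the original study is not recoverable from the text. The function names and the representation of each tree as a (dbh, trees per hectare) pair are hypothetical; the simulated list can carry fractional frequencies when each tree is multiplied by its survival probability.

```python
import math


def cumulative_distributions(trees, class_limits_cm):
    """Cumulative trees/ha (F) and basal area in m2/ha (G) at each class limit.

    trees: list of (dbh_cm, trees_per_ha) pairs.
    """
    f_cum, g_cum = [], []
    for limit in class_limits_cm:
        f_cum.append(sum(n for d, n in trees if d <= limit))
        g_cum.append(sum(n * math.pi * (d / 200.0) ** 2
                         for d, n in trees if d <= limit))
    return f_cum, g_cum


def plot_period_loss(measured, simulated, class_width_cm=3.0):
    """Sum of squared differences in cumulative G and F over diameter classes."""
    top = max(d for d, _ in measured + simulated)
    n_classes = int(math.ceil(top / class_width_cm))
    limits = [class_width_cm * (i + 1) for i in range(n_classes)]
    fm, gm = cumulative_distributions(measured, limits)
    fs, gs = cumulative_distributions(simulated, limits)
    return sum((a - b) ** 2 for a, b in zip(gm, gs)) + \
           sum((a - b) ** 2 for a, b in zip(fm, fs))


# Made-up ending distributions for one plot and period.
measured = [(20.0, 16.95), (25.0, 16.95), (31.0, 16.95)]
simulated = [(20.4, 16.95 * 0.97), (25.6, 16.95 * 0.98), (31.5, 16.95 * 0.99)]
loss = plot_period_loss(measured, simulated)
```

Summing this quantity over all plots and periods gives a function of the model coefficients that a derivative-free optimizer can minimize, for example the Nelder-Mead method cited in the text (available in SciPy as `scipy.optimize.minimize(..., method="Nelder-Mead")`).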
All models were also fitted as fixed-effects models. The models were compared by simulating the stand development of each plot from the beginning to the end of each measurement interval. RMSE (square root of the mean of squared errors) and bias were calculated for the ending basal area, ending number of living trees per hectare, and basal area increment. The ‘Regression 2’ approach was used both with a 1-year time step and by predicting the ending diameter and survival rate directly. When the time step was one year, the possible incomplete year at the end of the growth period was simulated by adding a fraction of annual growth to the diameter (growth in 0.67 years was assumed to be 0.67 times annual growth). The survival rate of the incomplete year was obtained as s^Step, where s is the annual survival rate and Step is the length of the incomplete period in years.
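The treatment of an incomplete final year can be sketched as a simple loop. The function name and the constant growth and survival functions below are placeholders, not the fitted models, and evaluating survival from the diameter at the start of each step is an assumption about the order of operations.

```python
def simulate_tree(d_start_cm, step_years, annual_growth, annual_survival):
    """Simulate one tree in 1-year steps; return (ending dbh, survival prob.).

    annual_growth(d) and annual_survival(d) are callables giving the
    one-year dbh increment (cm) and one-year survival probability.
    """
    d, p = d_start_cm, 1.0
    remaining = step_years
    while remaining > 1e-9:
        fraction = min(1.0, remaining)          # 1 for full years, <1 at the end
        p *= annual_survival(d) ** fraction     # s**Step for the partial year
        d += fraction * annual_growth(d)        # proportional share of growth
        remaining -= fraction
    return d, p


# Placeholder models: constant 1 cm/yr growth and 0.98 annual survival.
d_end, p_end = simulate_tree(20.0, 2.67, lambda d: 1.0, lambda d: 0.98)
```

With these placeholder models, a 2.67-year period gives 2.67 cm of growth and a survival probability of 0.98 raised to the power 2.67.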
Results and discussion
Diameter increment and survival models
The regression coefficients of the fixed predictors of mixed-effects diameter increment and survival models had the same signs in all three approaches (Tables 2 and 3). BAL and stand basal area decreased growth while tree size (dbh) had an increasing-decreasing effect. Increasing dominance improved survival and increasing age decreased it. Tree diameter improved survival, but the negative coefficients of untransformed diameter indicate that survival will start to decrease again at large diameters.
Visual inspection of the models revealed clear differences between modelling approaches (Figure 1). The ‘Regression 1’ approach predicts that diameter increment continues to increase with increasing tree diameter also in very large trees. However, this pattern could not be seen in the modelling data (Figure 2), suggesting that the fixed part of this mixed-effects model gives an erroneous picture of the growth pattern.
The obvious reason for the shape of ‘Regression 1’ model is that the random period factors also describe the influence of tree size. The assumption that the period factors are independent and identically distributed was not met. This can be seen from Figure 3, which shows that the period factor of ‘Regression 1’ model decreases with time. Decreasing period factor decreases the prediction for larger diameters associated with later periods, implying that although the fixed part of the mixed-effects model seems unrealistic, the full model may predict realistic growths. This result verifies the conclusion that it may be unwise to use the fixed part of mixed-effects models in growth simulators (e.g., Temesgen et al. ; Garber et al. ; Pukkala et al. ; Shater et al. ; Groom et al. ; Heiðarsson and Pukkala ; de-Miguel et al. ; de-Miguel et al. ). As discussed by Burkhart and Tomé () it would be better to refit the model without random parameters. Mixed-effects models can be used when the model is calibrated for a particular period. However, since calibration requires measurements from the same period, calibration cannot be done when one is predicting future growth.
The survival models are more similar to each other (Figure 4). In this case, the ‘Regression 2’ approach predicts lower annual survival rates than the other approaches (Figure 4 top). However, the differences between regression approaches 1 and 2 disappeared when survival rates were calculated for 5-year period (Figure 4 bottom). In this particular case, the optimization-based model deviated slightly from the others.
Since it may not be advisable to use non-calibrated mixed-effects models, i.e., their conditional predictions assuming that random parameters are zero, all models were also fitted as fixed-effects models (Tables 4 and 5). The signs of the predictors again suggest similar growth and survival patterns in all models, but visual inspection reveals that the ‘Regression 2’ model for diameter increment differs a lot from the other models (Figure 5). When the projection period is longer than one year, differences in the shape of the relationship between diameter and diameter increment remain, but the magnitude of the difference is smaller for large trees (results not shown).
The fixed-effects survival models show similar patterns to the fixed parts of the mixed-effects models (Figures 4 and 5). Also in this case the difference between the ‘Regression 2’ model and the other models gets smaller for projection periods longer than one year.
Performance of the models
The models were used in simulation software to simulate the growth of all plots for each period. The mixed-effects models were used with and without period factors. The predicted ending stand basal area and number of survivors were compared to their measured values. The RMSE (square root of the mean of squared errors) and bias were also calculated for the basal area increment of the period. Regression models fitted with the ‘Regression 2’ approach were used in two ways: by using one-year steps and by predicting the increment and survival of the whole period directly.
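The two evaluation statistics can be computed as in the sketch below. Defining the error as measured minus predicted is an assumption (the source does not state the sign convention for bias), and the example values are made up.

```python
import math


def rmse_and_bias(measured, predicted):
    """Return (RMSE, bias) with errors defined as measured - predicted."""
    errors = [m - p for m, p in zip(measured, predicted)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse, bias


# Made-up ending basal areas (m2/ha) for three plot-period combinations.
rmse, bias = rmse_and_bias([30.0, 32.0, 35.0], [29.0, 33.0, 33.0])
```

Under this sign convention, a positive bias means that the simulation underestimates the measured value on average.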
Of the full mixed-effects models, the optimization-based approach resulted in the best simulation results according to all criteria (Table 6, Figure 6). ‘Regression 2’ approach used with the true length of the period (instead of simulating in 1-year steps) was the second best in predicting basal area increment and ending basal area. However, using ‘Regression 2’ approach with 1-year time steps resulted in the lowest accuracy and precision.
As expected, the RMSEs and biases were larger (with some exceptions) when only the fixed parts of mixed models were used (Table 7). Ranking of the regression approaches was now less straightforward. The optimization-based approach was the best according to all criteria.
When the statistics were computed for the fixed-effects models, ‘Regression 1’ approach turned out to be better than ‘Regression 2’ approach, and optimization was again clearly better than the other methods (Table 8). All regression approaches resulted in very biased estimates of basal area increment.
Theoretically, full mixed-effects models should provide the most accurate and precise predictions. The second best should be fixed-effects models, and fixed parts of mixed-effects models should be the worst (e.g., Temesgen et al. ; Garber et al. ; Pukkala et al. ; Shater et al. ; Heiðarsson and Pukkala ). Comparison of Tables 6, 7 and 8 reveals that this was not always the case. A probable reason for the deviations from expectations is that the statistics in Tables 6, 7 and 8 were computed for variables other than the response variables of regression analyses.
The best models were the full mixed-effects models recovered by using the optimization-based approach. However, period factors cannot be used when predicting future growth. Therefore, the models that should be used in simulations are the fixed-effects models recovered with the optimization approach.
Two supplementary models were developed for simulating the development of stand and tree height. Tree height may be a predictor in volume, taper and biomass equations and it is therefore useful to have models for height as well. Of the tested dominant height models, the Hossfeld equation had good fitting statistics while being biologically acceptable. Some other functions had slightly better fitting statistics but they gave unrealistic predictions outside the range of variation of the modelling data. The selected dominant height model is:
where H refers to dominant height and T to stand age. The degree of explained variance was 0.916 and the RMSE was 1.984 m. When the model is used to calculate site index, T1 is replaced by measured stand age, H1 by measured dominant height and T2 by index age (30 years). When using the model to predict dominant height development, T1 is replaced by index age, H1 by site index, and T2 by the desired projection age. Figure 7 shows that the model follows fairly well the measured sequences of dominant height in different plots.
The individual-tree height model is:
where h is tree height (m), H is dominant height (m), d is dbh (cm) and D is dominant diameter (cm). For this model, the degree of explained variance was 0.899 and the RMSE was 1.746 m. The model predicts that trees of a certain diameter are taller when the stand develops and its dominant height and dominant diameter increase (Figure 8).
The study showed that regression models based on the assumption of linear growth or constant survival rate between the measurement occasions, or using the length of measurement interval as a predictor, may lead to very biased predictions in growth simulations. Therefore, more sophisticated methods are needed to deal with irregular measurement intervals. The optimization method used in this study overcomes some of the problems related to other methods (Cao ; Nord-Larsen ). In addition, it works also in cases where varying measurement interval is not the only shortcoming in the data (Pukkala et al. , de-Miguel et al. ). The method can be used to calibrate an existing model or to fit a new model, or fit only a part of the models needed in simulation. For example, in the case that there are already models for diameter increment and survival, the method can be used to fit ingrowth models using data in which ingrowth trees cannot be separated from the other trees.
Earlier research has shown that the optimization-based modelling approach is competitive with individual-tree modelling based on regression analysis when there are no irregularities in the data (Pukkala et al. ; de-Miguel et al. ). This study showed that the method resulted in clearly better models than obtained with more simplistic approaches to deal with irregular measurement interval. The benefit would of course be smaller if there was less variation in measurement interval.
The drawback of the method is poorer parameter identifiability. The response variables of the models are not utilized in modelling, and the parameters of all models are estimated simultaneously. This may lead, for instance, to models that overestimate diameter growth, which is compensated for by an underestimated survival rate. However, including both basal area and frequency distributions in the objective function decreases the likelihood of this kind of mutual cancellation of model errors. The problem might have been more serious if an ingrowth model had been estimated simultaneously with the survival and diameter increment models. In that case, overestimated ingrowth may be compensated for by overestimated mortality among the smallest diameter classes.
The optimization-based model recovery method used in this study makes a higher number of different types of observational plots available for individual-tree growth modelling. This saves money and time. There is no need to have the trees numbered or have a constant measurement interval. The method is not sensitive to errors in tree identification.
Bailey RL, Clutter JL: Base-age invariant polymorphic site curves. For Sci 1974, 20: 155–159.
Cao QV: Annual tree growth predictions from periodic measurements. In Proceedings of the 12th biennial southern silvicultural research conference Edited by: Connor KF. Gen. Tech. Rep. SRS-71, USDA Forest Service, Southern Research Station, Asheville, NC; 2004:212–215.
Crecente-Campo F, Soares P, Tomé M, Diéguez-Aranda U: Modelling annual individual-tree growth and mortality of Scots pine with data obtained at irregular measurement intervals and containing missing observations. Forest Ecol Manage 2010, 260: 1965–1974. 10.1016/j.foreco.2010.08.044
de-Miguel S, Mehtätalo L, Shater Z, Kraid B, Pukkala T: Evaluating marginal and conditional predictions of taper models in the absence of calibration data. Can J Forest Res 2012, 42(7):1383–1394. 10.1139/x2012-090
de-Miguel S, Guzmán G, Pukkala T: A comparison of fixed- and mixed-effects modelling in tree growth and yield prediction of an indigenous neotropical species ( Centrolobium tomentosum ) in a plantation system. Forest Ecol Manage 2013, 291: 249–258. 10.1016/j.foreco.2012.11.026
de-Miguel S, Pukkala T, Morales M: Using optimization to solve tree misidentification and uneven measurement interval problems in individual-tree modeling of Balsa stand dynamics. Ecol Eng 2014, 69: 232–236. 10.1016/j.ecoleng.2014.04.008
Dvorak WS, Hodge GR, Kietzka JE, Malan F, Osorio LF, Stanger T: Pinus patula. In Conservation & Testing of Tropical & Subtropical Forest Tree Species by the CAMCORE Cooperative. College of Natural Resources, NCSU, Raleigh, NC. USA; 2000:149–173.
Isango JA, Nshubemuki L: Management of forest plantations in Tanzania with emphasis on planting and growth and yield. In Tree seedling production and management of plantation forests Edited by: Pukkala T, Eerikäinen K. 1998, 25–38.
Muchiri GM, Muturi MN: The state of forest plantations development, management and future focus in Kenya. In Tree seedling production and management of plantation forests Edited by: Pukkala T, Eerikäinen K. 1998, 11–20.
Sloboda B: Zur Darstellung von Wachstumprozessen mit Hilfe von Differentialgleichungen erster Ordung. Mitteillungen der Badenwürttem-bergischen Forstlichen Versuchs und Forschungsanstalt, Freiburg; 1971.
Juma, R., Pukkala, T., de-Miguel, S. et al. Evaluation of different approaches to individual tree growth and survival modelling using data collected at irregular intervals – a case study for Pinus patula in Kenya.
For. Ecosyst. 1, 14 (2014). https://doi.org/10.1186/s40663-014-0014-3