Estimating upper stem diameters and volume of Douglas-fir and western hemlock trees in the Pacific Northwest

Volume and taper equations are essential for obtaining estimates of total and merchantable stem volume. Taper functions have advantages over merchantable volume equations because they estimate diameter inside or outside bark at specific heights on the stem, enabling the estimation of total and merchantable stem volume, the volume of individual logs, and the height at a given diameter. Using data collected from 1218 trees (1093 Douglas-fir (Pseudotsuga menziesii (Mirbel) Franco) and 125 western hemlock (Tsuga heterophylla)), we evaluated the performance of one simple polynomial function and four variable-exponent taper functions in predicting upper stem diameter. Sample trees were collected from different parts of the states of Oregon, Washington, and California. We compared inside-bark volume estimates obtained from the selected taper equation with estimates obtained from a simple logarithmic volume equation fitted to the data obtained in this study and from the equations used by the Forest Inventory and Analysis program in the Pacific Northwest (FIA-PNW) in the state of California and the western half of the states of Oregon and Washington. Variable-exponent taper equations were generally better than the simple polynomial taper equation. The FIA-PNW volume equations performed fairly well, but volume equations with fewer parameters fitted in this study provided comparable results. The RMSEs obtained from taper-based volume estimates were also comparable with the RMSEs of the FIA-PNW volume equations for both Douglas-fir and western hemlock trees. The taper equations fitted in this study provide an added benefit to users over the FIA-PNW volume equations by enabling them to predict diameter at any height, height to a given diameter, and merchantable volume in addition to cubic volume including top and stump (CVTS) of Douglas-fir and western hemlock trees in the Pacific Northwest. The findings of this study also give more confidence to the users of the FIA-PNW volume equations.


Background
Volume and taper equations are essential for obtaining estimates of total and merchantable stem volume. Volume equations relate total or merchantable stem volume to easily measurable variables such as diameter at breast height (DBH; 1.37 m), total tree height, and other variables (e.g. height to crown base or crown ratio) through regression. Estimates of stem volume to certain diameter limits are critical to meet different timber utilization standards.
However, such utilization standards change in response to local market and economic conditions, making the use of a fixed merchantable volume equation less attractive (Czaplewski et al. 1989). Taper functions have advantages over merchantable volume equations because they estimate diameter inside or outside bark (dib or dob) at specific heights on the stem, enabling the estimation of total and merchantable stem volume, the volume of individual logs (Kozak 1988), and the height at a given diameter (Li et al. 2012).
Numerous volume and taper equations have been published and are being used at different scales of forest management. There are also many more unpublished and proprietary equations developed and used by forest companies and agencies. The application of volume equations is not only crucial for the economic valuation of timber resources but is also vital to the assessment of biomass availability and carbon sequestration (Poudel and Temesgen 2016). For example, the official U.S. forest carbon report to the United Nations Framework Convention on Climate Change (UNFCCC) is based on a component ratio method that converts the sound wood volume obtained from regional volume equations to stem biomass using wood density and bark and branch scaling factors.
Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) and western hemlock (Tsuga heterophylla (Raf.) Sarg.) are two major tree species in the Pacific Northwest (PNW; the states of Oregon, Washington, and California) and account for a substantial portion of the live volume and biomass in the region. A variety of approaches to obtain total and merchantable volume in the PNW are in common use. The Forest Inventory and Analysis (FIA) program of the U.S. Forest Service in the PNW (FIA-PNW) estimates Douglas-fir cubic volume including top and stump (CVTS) using the Brackett (1977) equation in western Oregon and western Washington, the Summerfield (1980) equation in eastern Oregon and eastern Washington, and the MacLean and Berger (1976) equation in California. However, for western hemlock, it uses the Chambers and Foltz (1979) volume equation for all three states (OR, WA, and CA). The Oregon Department of Forestry uses taper functions associated with the Forest Projection and Planning System (Arney et al. 2004). The Washington Department of Natural Resources uses taper functions developed by Flewelling and Ernst (1996), Flewelling (1994), Kozak (1994), or the Brackett (1977) volume equation, depending on species and location (east-side or west-side), to estimate volume.
Taper equations have been used in forestry for a long time and can be divided into two major groups. The first group of equations expresses tree form as a single continuous function (Newnham 1988, 1992; Kozak 1988, 2004). The second group of equations (segmented taper equations) uses different models for various parts of the stem and joins these models in such a way that their first derivatives are equal at the point of intersection (Max and Burkhart 1976; Clark et al. 1991).
Differences in stand conditions affect tree form and thus tree volume (Bluhm et al. 2007). Accordingly, different model forms and fitting techniques have been used in developing volume and taper models for different kinds of stands. These models range from simple polynomial to nonlinear and multivariate regression models (Kublin et al. 2008). Traditionally, attempts to improve the predictive ability of taper equations were made by the addition of auxiliary variables such as crown dimensions, stand and site variables, and upper stem diameter measurements. Recent studies, however, have focused on accounting for the observed between-tree variability in stem form (Trincado and Burkhart 2006), including stand density as an explanatory variable (Sharma and Parton 2009), and calibrating taper equations using upper stem diameter measurements (e.g. Cao 2009; Arias-Rodil et al. 2014).
Selecting the best taper equation to predict upper stem diameters, and consequently total tree volume or volume to a specific diameter or height, is crucial for forest managers. Thus, the evaluation of different taper equations is critical. The objectives of this study were to: 1) fit taper equations for Douglas-fir and western hemlock trees; 2) examine the accuracy of these equations in predicting diameter and volume inside bark; 3) develop a simple volume equation based on DBH and tree height; and 4) compare the accuracy of taper-based volume estimates with the volume estimates obtained from the FIA-PNW equations and the simple volume equation fitted in this study.

Data
Data for this study came from three different sources and consisted of 1218 trees (1093 Douglas-fir and 125 western hemlock). The first set of data (DATASET I) consisted of measurements on 716 trees (615 Douglas-fir and 101 western hemlock) sampled in 1993 from the western side of the states of Oregon and Washington. Average DBH of these trees was 37.1 cm (range 8.8-92.5 cm) and 36.5 cm (range 16.3-102.1 cm) for Douglas-fir and western hemlock trees, respectively. Average height of these trees was 30.8 m (range 10.2-61.9 m) and 27.5 m (range 12.7-40.7 m) for Douglas-fir and western hemlock trees, respectively. Diameter inside bark in these trees was measured at 0.3, 0.6, 0.9, and 1.37 m, and at each 1/10th of the height above breast height afterward.
The second set of data (DATASET II) consisted of measurements on 399 Douglas-fir trees collected by the Stand Management Cooperative (SMC) from the western half of the states of Oregon and Washington. Average DBH and height of these trees were 18.2 cm (range 4.7-43.8 cm) and 15.2 m (range 5.1-27.6 m), respectively. Diameter inside bark in these trees was measured at 0.1, 1.0, and 1.37 m, and at every 1 m afterward.
The third set of data (DATASET III) consisted of measurements on 103 trees (79 Douglas-fir and 24 western hemlock) sampled in 2012-2015 from the states of Oregon, Washington, and California as a part of an FIA biomass sampling and estimation project at Oregon State University. Average DBH of these trees was 48.6 cm (range 16.5-114.0 cm) and 41.7 cm (range 18.0-69.9 cm) for Douglas-fir and western hemlock trees, respectively. Average height of these trees was 32.2 m (range 16.5-53.3 m) and 29.5 m (range 14.3-43.7 m) for Douglas-fir and western hemlock trees, respectively. Diameter inside bark in these trees was measured at stump height (approximately 0.3 m) and at every 5.18 m afterward. Numbers of trees by species and diameter class in each dataset are presented in Table 1.

Actual volume computation
Sampling protocols for the three datasets differed so greatly that it was necessary to harmonize the method for actual volume computation across all datasets. We considered two linear interpolation approaches to accomplish this. In the first approach, the dibs at specified intervals were obtained by linear interpolation of measured dibs. In the second approach, the interpolated dibs were obtained by linear interpolation of the fractional error in predicted dib based on the selected taper equation. The second approach was selected based on graphical comparison and its plausibility, particularly at the butt log. The following steps were used to get the actual volume:
1) Fit the Kozak (2004) taper equation using measured DIBs and obtain predicted diameter inside bark (PDIB) at each measurement height;
2) Obtain the fractional error (FE) in predicted DIB (Eq. 1);
3) Set a consistent stump height (the minimum of all stump heights, 0.03 m);
4) Divide tree height into 100 equal parts and obtain FE at each percentile of tree height by linear interpolation,
$$FE_{int} = FE_0 + (FE_1 - FE_0)\,\frac{h - h_0}{h_1 - h_0}$$
where $FE_{int}$ is the linearly interpolated fractional error at height $h$; $h_0$ and $h_1$ are the measurement heights immediately below and above $h$; and $FE_0$ and $FE_1$ are the fractional errors immediately below and above $h$;
5) Obtain interpolated DIBs ($DIB_{FE}$) at each $h$ by back-solving Eq. (1) for DIB; $DIB_{FE}$ is the same as the measured dib at heights where actual measurements were made;
6) Compute the volume below 0.03 m and the volume of the top section as a cylinder and a cone, respectively;
7) Compute the volume of the other sections using Smalian's formula with $DIB_{FE}$, i.e. using numerical integration with a step size equal to 1% of total tree height and diameters at the two ends obtained from interpolation using the Kozak (2004) taper equation.
Observed inside bark cubic volume including top and stump (CVTS) for each tree was then computed by summing volume of all sections, stump, and top.
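The fractional-error interpolation in steps 2, 4, and 5 can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the definition FE = (DIB − PDIB) / PDIB is an assumption, since the exact form of Eq. (1) is not shown in the text, and `predict_dib` and `toy_taper` are hypothetical stand-ins for the fitted taper equation.

```python
import bisect


def fractional_error(dib, pdib):
    """Fractional error of a predicted dib.

    ASSUMPTION: FE = (DIB - PDIB) / PDIB; the exact definition used in the
    study is not shown in the text.
    """
    return (dib - pdib) / pdib


def dib_from_fe(h, meas_heights, meas_dibs, predict_dib):
    """Interpolated dib at height h, between the lowest and highest measurements.

    meas_heights must be sorted ascending and hold at least two values.
    predict_dib is a hypothetical callable standing in for the fitted taper
    equation (height in m -> predicted dib in cm).
    """
    if not meas_heights[0] <= h <= meas_heights[-1]:
        raise ValueError("h must lie within the measured height range")
    fe = [fractional_error(d, predict_dib(x))
          for x, d in zip(meas_heights, meas_dibs)]
    # Index of the first measurement height above h (clamped at the top).
    i = min(bisect.bisect_right(meas_heights, h), len(meas_heights) - 1)
    h0, h1 = meas_heights[i - 1], meas_heights[i]
    # Linear interpolation of the fractional error between h0 and h1.
    fe_int = fe[i - 1] + (fe[i] - fe[i - 1]) * (h - h0) / (h1 - h0)
    # Back-solve the assumed FE definition for dib.
    return predict_dib(h) * (1.0 + fe_int)


def toy_taper(h):
    """Hypothetical conical taper model (40 cm at the base, 30 m tall)."""
    return 40.0 * (1.0 - h / 30.0)


heights = [0.3, 10.0, 20.0]   # measurement heights (m)
dibs = [41.0, 27.5, 13.0]     # made-up measured dibs (cm)
print(dib_from_fe(5.0, heights, dibs, toy_taper))
```

Under this definition the interpolated value coincides with the measured dib at every measurement height, which matches the property stated in step 5.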

Diameter inside bark
Rojo et al. (2005) evaluated 31 different taper functions that belonged to three model groups: simple, segmented, and variable-form taper functions. These models differ in how they describe stem profile as well as in their model forms and number of parameters to be estimated. In this study, we used one simple and four variable-exponent taper models that Rojo et al. (2005) selected for final evaluation. These models differed in how they describe tree taper along the bole. Except for the Cervera (1973) taper equation, which is a simple polynomial equation, all the models are variable-exponent taper models that describe bole shape with a changing exponent. Geometric properties of these taper functions are displayed by plotting predicted diameter inside bark vs. height on the bole of a Douglas-fir tree (Fig. 1). First, leave-one-out validation was carried out to determine the best model for estimating upper stem diameter. Parameters of the five taper models (Table 2) were obtained based on two datasets (e.g. datasets I and II). To obtain evaluation statistics, the fitted models were then applied to the third dataset (e.g. dataset III). At this point, all models were fitted as fixed-effects models, assuming homogeneous error variance. The accuracy of these models was compared using the mean prediction bias and root mean squared error (RMSE) they produced in estimating diameter inside bark along the bole. Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values were also obtained for each of these models.
where $y_{ij}$ and $\hat{y}_{ij}$ are the observed and predicted diameter inside bark of the $j$th observation ($j = 1, 2, \ldots, m_i$) along the bole on the $i$th tree ($i = 1, 2, \ldots, n$).
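The mean prediction bias and RMSE referred to here are conventionally written as below. This is a sketch under two assumptions: residuals are pooled over all observations on all trees (the symbol $N$ is introduced here for that total), and the sign convention is observed minus predicted, which agrees with the interpretation in the results that a positive bias means under-prediction.

```latex
\mathrm{Bias} = \frac{1}{N}\sum_{i=1}^{n}\sum_{j=1}^{m_i}\left(y_{ij}-\hat{y}_{ij}\right),
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{n}\sum_{j=1}^{m_i}\left(y_{ij}-\hat{y}_{ij}\right)^{2}},
\qquad
N = \sum_{i=1}^{n} m_i
```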
The Kozak (2004) taper model was selected for further analysis because it produced the smallest root mean squared error among the five models used to predict upper stem diameter of Douglas-fir trees in all datasets and comparable RMSEs for western hemlock trees. Because the taper data consist of multiple measurements taken on a single tree, there is an inherent autocorrelation among the observations obtained from the same tree, which can be accounted for by adding individual tree random effects and by specifying a correlation structure in the model (Li et al. 2012).
An error model (Eq. 6) was fitted, in which $e$ is the residual (observed dib − predicted dib) obtained from fitting the Kozak (2004) model as a nonlinear fixed effects model and $a_0$, $a_1$, $a_2$, and $a_3$ are regression coefficients. A variance function was specified in the model to account for the heterogeneity of error variance. Here, $\varepsilon_i$ is the model residual, $\sigma^2$ is the residual sum of squares, the weighting variable is defined in terms of $\hat{e}^2$, the predicted value of $e^2$ obtained from fitting Eq. 6, and $t$ is the variance function coefficient.

Volume estimation
Taper-based volume
First, we obtained the parameter estimates of the weighted nonlinear mixed effects model with first-order autoregressive error structure. After investigating different combinations of random effects, parameters $b_4$ and $b_7$ of the selected taper equation (Table 2, M4) were associated with tree-specific random effects. Diameter inside bark along the bole for each tree in the held-out dataset was obtained using the fixed effects parameters of the fitted taper equation. Taper-based CVTS was then obtained using numerical integration with a step size equal to 1% of total tree height. Evaluation statistics (i.e. bias and RMSE) for predicting diameter inside bark and CVTS were also calculated.
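The numerical integration described above can be sketched as a sum of Smalian sections, each 1% of total tree height long. This is a minimal illustration, not the authors' implementation: `predict_dib_cm` is a hypothetical callable standing in for the fitted taper equation, and the sketch applies Smalian's formula to every section, including the tip.

```python
import math


def taper_volume_m3(predict_dib_cm, total_height_m, n_sections=100):
    """Stem volume (m^3) from a taper function, summing Smalian sections.

    predict_dib_cm is a hypothetical callable (height in m -> dib in cm)
    standing in for a fitted taper equation.
    """
    step = total_height_m / n_sections
    volume = 0.0
    for k in range(n_sections):
        d_low = predict_dib_cm(k * step)
        d_high = predict_dib_cm((k + 1) * step)
        # Cross-sectional areas in m^2 (diameters converted from cm to m).
        a_low = math.pi * (d_low / 200.0) ** 2
        a_high = math.pi * (d_high / 200.0) ** 2
        # Smalian's formula: section length times the mean of the end areas.
        volume += step * (a_low + a_high) / 2.0
    return volume


def cone(h):
    """Toy taper model: a cone with 40 cm base dib and 30 m height."""
    return 40.0 * (1.0 - h / 30.0)


approx = taper_volume_m3(cone, 30.0)
exact = math.pi * 0.2 ** 2 * 30.0 / 3.0   # exact cone volume, pi * r^2 * H / 3
print(approx, exact)
```

For the toy cone the Smalian sum slightly overestimates the exact volume, as expected when the cross-sectional area is a convex function of height; with 100 sections the difference is well below 0.1%.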
Local volume equation
Even though taper equations have advantages over direct volume equations in that they offer flexibility in estimating volume to different merchantability standards, it is desirable that the taper equations provide CVTS with the same accuracy as the direct volume equations. We fitted a simple nonlinear volume equation that predicts CVTS as a function of DBH and total tree height. Evaluation statistics for the simple volume equation (Eq. 7) were also obtained using the leave-one-out approach.
where ln(·) is the natural logarithm; $a_0$, $a_1$, and $a_2$ are the parameters to be estimated from data; and the other variables are as defined previously.
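A volume equation of this kind can be illustrated with the following sketch. The form ln(CVTS) = a0 + a1·ln(DBH) + a2·ln(H) is an assumption inferred from the variables and parameters named in the text, since Eq. 7 itself is not shown. The ordinary least squares fit on the log scale and the function names are illustrative choices, the data are synthetic, and no back-transformation correction is applied because the text does not describe one.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_log_volume(dbh_cm, height_m, cvts_m3):
    """Least-squares fit of the ASSUMED form ln(V) = a0 + a1 ln(D) + a2 ln(H)."""
    rows = [(1.0, math.log(d), math.log(h)) for d, h in zip(dbh_cm, height_m)]
    y = [math.log(v) for v in cvts_m3]
    # Normal equations (X'X) a = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve3(xtx, xty)


def predict_cvts(dbh_cm, height_m, coef):
    """Predicted CVTS on the original scale (no bias correction applied)."""
    a0, a1, a2 = coef
    return math.exp(a0 + a1 * math.log(dbh_cm) + a2 * math.log(height_m))


# Synthetic trees generated from made-up coefficients (not from the paper).
true_coef = (-10.0, 1.8, 1.1)
dbh = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
ht = [15.0, 12.0, 30.0, 22.0, 40.0, 28.0]
vol = [predict_cvts(d, h, true_coef) for d, h in zip(dbh, ht)]
coef = fit_log_volume(dbh, ht, vol)
print(coef)
```

Because the synthetic volumes are generated without noise, the fit simply recovers the coefficients used to create them; with field data the estimates would carry sampling error.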

FIA-PNW volume equations
The Pacific Northwest unit of the FIA uses different sets of species-specific volume equations depending upon geographic location (see the Background section for details). The equation forms along with their coefficients for Douglas-fir and western hemlock trees are given in Table 3. All statistical analyses were performed in R 3.3.2 (R Core Team 2016) and the mixed effects models were fitted with the nlme function in the nlme library (Pinheiro et al. 2016). Evaluation statistics for estimating CVTS were obtained as mean prediction bias and RMSE, where $y_i$ and $\hat{y}_i$ are the observed and predicted CVTS of the $i$th ($i = 1, 2, 3, \ldots, n$) tree. Bias and RMSE percentages were obtained by dividing these values by the average observed CVTS of the $n$ trees.
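The CVTS evaluation statistics can be computed as in the sketch below. It assumes the sign convention observed minus predicted, which agrees with the statement in the results that negative bias indicates over-estimation, and it multiplies by 100 to express the ratios as percentages. The function name and the example values are illustrative only.

```python
import math


def cvts_eval_stats(observed, predicted):
    """Bias and RMSE of CVTS predictions, in m^3 and as % of mean observed CVTS."""
    n = len(observed)
    errors = [y - y_hat for y, y_hat in zip(observed, predicted)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    return {
        "bias": bias,
        "rmse": rmse,
        "bias_pct": 100.0 * bias / mean_obs,
        "rmse_pct": 100.0 * rmse / mean_obs,
    }


# Made-up observed and predicted volumes (m^3); every tree is over-predicted.
stats = cvts_eval_stats([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
print(stats)
```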

Diameter inside bark
The parameter estimates and their corresponding standard errors for the five taper equations are given in Table 4. Plots of standardized residuals vs. fitted values obtained from the final model fitted for both species (Fig. 2) did not show any problems with heteroscedasticity. For Douglas-fir, all the parameters in all taper models were statistically significant at the 0.05 level of significance, but for western hemlock some of the parameters, namely $b_2$ and $b_3$ (p-values 0.8 and 0.9, respectively) for the Bi (2000) model and coefficient $b_3$ (p-value = 0.3) for the Kozak (2004) model, were not statistically significant. Mean prediction bias and root mean squared error produced by these models in predicting upper stem inside bark diameters, obtained based on leave-one-out validation, are given in Table 5.
In the taper equations (Table 2), $D$ is the diameter at breast height outside bark (cm); $h$ is the height above ground level (m); $d$ is the diameter inside bark (cm) at height $h$; $H$ is the total tree height (m); $b_1$ to $b_9$ are regression coefficients to be estimated from data; $X = (H - h)/(H - 1.3)$; $t = h/H$; $k = 1.3/H$; and $p$ is the inflection point, which was set to 0.25 and 0.12 for Douglas-fir and western hemlock, respectively.
For Douglas-fir trees, the Cervera (1973), Kozak (1988), and Kozak (2004) models produced positive mean prediction biases for dataset I; i.e., these models under-predicted the diameter inside bark along the bole for dataset I. These models for datasets II and III, and all other models for Douglas-fir, had negative biases; i.e., these models over-predicted the diameter inside bark along the bole. The Kozak (2004) … Kozak (1988) model was best for dataset III (Table 5). Absolute values of mean prediction bias in predicting dib ranged from 0.09 cm to 1.76 cm and RMSE ranged from 1.04 cm to 4.70 cm. Note that the mean prediction bias and root mean squared errors are based on leave-one-dataset-out validation.
In the case of western hemlock trees, all taper models produced negative biases for dataset I and positive biases for dataset III in predicting diameter inside bark. Absolute values of biases ranged from 0.37 to 1.83 cm and RMSE ranged from 3.32 to 4.2 cm. The Arias-Rodil et al. (2014) model had the smallest bias (absolute value) and RMSE values for dataset III, but the Kozak (1988) model had the smallest RMSE for dataset I. Once again, the mean prediction bias and root mean squared errors are based on leave-one-dataset-out validation.
The relative heights at the inflection points, based on the Kozak (1988) model, were set to 0.25 and 0.12 for Douglas-fir and western hemlock, respectively. In general, the Kozak models (1988 and 2004) performed better than the other models in predicting diameter inside bark along the bole for both species. Additionally, the Kozak (2004) model had lower RMSE for Douglas-fir and very similar prediction bias (absolute value) for western hemlock trees compared to the 1988 model. This model also has much lower multicollinearity if it is log-transformed to fit as a linear model (Kozak 2004) and has been successfully used by many researchers (e.g. Rojo et al. 2005; Li et al. 2012).
The selected Kozak (2004) taper model was further improved by fitting it as a weighted nonlinear mixed effects model with a first-order autoregressive correlation structure (CAR(1)), which provided a smaller BIC than the same model with a second-order autoregressive (CAR(2)) structure. Thus, our taper-based volume calculations were based on the diameter inside bark predicted from this model. Li et al. (2012) also found the CAR(1) model-fitting approach best for eleven conifer species in the Acadian region of North America. Rojo et al. (2005), however, found the CAR(2) model to perform better for maritime pine in Galicia, northwestern Spain.
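For readers unfamiliar with the CAR(1) structure, the sketch below builds the within-tree correlation matrix it implies: the correlation between two residuals on the same tree decays as a power of the distance between their positions along the bole. The function name is illustrative, and the choice of position variable (for example measurement height or relative height) is an assumption, since the text does not state which one was used.

```python
def car1_correlation(positions, phi):
    """Continuous first-order autoregressive correlation matrix.

    Entry (i, j) is phi ** |s_i - s_j|, where s_i and s_j are the positions
    of two observations on the same tree and 0 <= phi < 1.
    """
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must satisfy 0 <= phi < 1")
    return [[phi ** abs(si - sj) for sj in positions] for si in positions]


# Three observations at positions 0, 1, and 3 (arbitrary units), phi = 0.5.
corr = car1_correlation([0.0, 1.0, 3.0], 0.5)
for row in corr:
    print(row)
```

Observations one unit apart have correlation 0.5 in this example, while those three units apart have correlation 0.125, so measurements close together on the stem are treated as more alike than distant ones.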
Parameter estimates and their standard errors obtained by fitting the Kozak (2004) model as a weighted nonlinear mixed effects model with first-order autoregressive error structure are given in Table 6. All parameters were statistically significant at the 0.05 level of significance for western hemlock, but for Douglas-fir, parameter $b_3$ was not significant (p-value = 0.4). Note that these parameter estimates were obtained by fitting the model using the entire dataset. We did not see any pattern in bias in predicting upper stem diameter by the Kozak (2004) taper function, but the RMSE increased slightly with increasing DBH for both species (Fig. 3). However, there was no obvious pattern in relative RMSE, calculated as $100 \times \mathrm{RMSE}/\mathrm{DBH}$.

Volume estimation
Leave-one-out validation statistics in predicting CVTS based on all the taper equations fitted as fixed effects models were also obtained (Table 7). There was no model that was consistently better than the others in predicting CVTS for both species and all datasets. For Douglas-fir trees, the Bi (2000), Arias-Rodil et al. (2014), and Kozak (1988) taper models produced the smallest absolute bias for datasets I, II, and III, respectively. However, the Kozak (2004) model produced the smallest RMSE for datasets I and II and the Kozak (1988) model produced the smallest RMSE for dataset III. For western hemlock trees, the Cervera (1973) and Kozak (1988) models produced the smallest bias and RMSE, respectively, for dataset I, and the Arias-Rodil et al. (2014) model produced the smallest bias and RMSE for dataset III.
Table 8 shows the parameter estimates and their standard errors for the simple CVTS model (Eq. 7) fitted to the data obtained in this study. All the coefficients in this model were statistically significant at the 5% level of significance (p-value < 0.05). Evaluation statistics of CVTS (bias and RMSE) obtained from the fitted taper equations, the local volume equation, and the volume equations used by FIA-PNW are given in Table 9. All three methods (local volume equation, FIA-PNW equation, and taper-based volume prediction) over-estimated the logarithmic volume in Douglas-fir trees, as indicated by negative bias (Table 9). The local volume equation for Douglas-fir was marginally less biased than the volume equation used by the FIA-PNW (−0.0103 m³ vs. −0.0185 m³) and the Kozak (2004) taper equation (−0.0103 m³ vs. −0.0656 m³). However, the FIA-PNW equation had smaller RMSE than the local and taper-based volume equations. We saw some increase in absolute bias in CVTS prediction using the Kozak (2004) equation (M4) for trees larger than 65 cm DBH (Fig. 4). This could be due to the smaller number of larger trees available in the model fitting dataset.
Similar to the Douglas-fir trees, M4 and the FIA-PNW equations over-predicted volume for western hemlock trees. The local volume equation, however, under-predicted western hemlock CVTS. M4 produced the smallest bias while the error was highest with the FIA-PNW equation (Table 9). Once again, the FIA-PNW equations produced the smallest root mean squared error.
It is important to note that the simple CVTS equation had smaller bias and comparable RMSE compared with the FIA-PNW equation even though the simple CVTS equation has fewer parameters (Table 9). Bias and RMSE in estimating CVTS based on the taper equation also differed by tree diameter for both species (Fig. 4). For Douglas-fir, bias in taper-based CVTS ranged from −12.66 to 0.84%. For western hemlock, bias ranged from −28.64 to 2.29% and the performance of the taper equation was poor for trees larger than 65 cm DBH, as noted previously. This could be because there were only a few larger trees available for model development.
Predicted stem profiles, i.e. plots of relative diameter (d/D) vs. relative height (h/H), of small, medium, and large trees are shown in Figs. 5 and 6 and show that there is less taper in smaller trees compared with the medium and large sized trees for both species.

Summary and conclusion
We evaluated the performance of five different taper equations in estimating upper stem diameters and cubic volume including top and stump (CVTS) of Douglas-fir and western hemlock trees based on the mean prediction bias and RMSE they produced. Both the Kozak (1988) and Kozak (2004) variable-exponent taper equations performed better, in terms of RMSE, than the simple polynomial taper equation of Cervera (1973) for Douglas-fir trees. However, the Bi (2000) and Arias-Rodil et al. (2014) taper equations, both of which are also variable-exponent taper equations, produced higher RMSE compared to the simple taper equation of Cervera (1973). For western hemlock, all the variable-exponent models performed better than the simple polynomial taper function for dataset I. For dataset III, the Cervera (1973) taper equation produced smaller cross validation RMSE than all but the Arias-Rodil et al. (2014) equation. This finding is consistent with the findings of Rojo et al. (2005), who compared the performance of 31 taper functions in predicting upper stem diameters for maritime pine in northwestern Spain. Among the variable-exponent taper equations, the Kozak (1988) and Kozak (2004) models performed better than the Bi (2000) and Arias-Rodil et al. (2014) taper equations. Kozak (2004) was chosen as the final model because it produced smaller RMSE values for Douglas-fir, comparable RMSEs for western hemlock trees, and has been used in the past in several studies to predict upper stem diameters as well as merchantable and total volume. We report the final model parameters based on the taper and local volume equations fitted using the combined dataset. Therefore, we also obtained separate evaluation statistics for each dataset by applying the fitted model to these datasets. These results are presented in Table 10. We observed that the RMSE values were smaller for the datasets with smaller trees compared with the dataset with larger trees for both species. For example, Douglas-fir trees in datasets I and II were both sampled from the western half of the states of Oregon and Washington, but the RMSE percentages for dataset II, which had a smaller average DBH (18.2 cm vs. 37.1 cm), were lower than those for dataset I for all methods. However, the biases were slightly higher for the second dataset (Table 10) for all methods. The volume equations used by FIA-PNW performed fairly well, but a volume equation with fewer parameters fitted in this study using DBH and height provided similar results.

Fig. 2
Residual plots of fitted models (Kozak 2004) for Douglas-fir and western hemlock trees

Fig. 4
Bias and RMSE percent by diameter class in estimating CVTS using the Kozak (2004) taper equation for Douglas-fir (DF) and western hemlock (WH) trees

Table 1
Number of trees by diameter class in the three datasets used in this study. A total of 1218 trees (1093 Douglas-fir and 125 western hemlock) were used

Table 2
Taper equations evaluated for predicting upper stem diameter in this study

Table 3
Volume equations used by FIA-PNW to estimate cubic volume including top and stump (CVTS)

Table 4
Parameter estimates and their standard errors (in parentheses) for the taper equations used in this study. *Coefficient not significant at α = 0.05

Table 5
Mean prediction bias and root mean squared error (RMSE) in estimating diameter inside bark using different models. Statistics were obtained using the leave-one-out cross validation method, e.g. bias and RMSE for dataset I were obtained by applying the models fitted using datasets II and III

Table 6
Parameter estimates and their standard errors for the Kozak (2004) model fitted for Douglas-fir and western hemlock trees. Parameters $b_4$ and $b_7$ were associated with tree-specific random effects

Table 7
Mean prediction bias and root mean squared error (RMSE) in estimating inside bark cubic volume using different models. Statistics were obtained using the leave-one-out cross validation method, i.e. bias and RMSE for dataset I were obtained by applying the models fitted using datasets II and III

Table 9
Mean bias and RMSE for estimating cubic volume including top and stump obtained from fitted taper equations (M4), volume equations fitted in this study (Local), and the volume equations used by FIA-PNW.Evaluation statistics for each dataset were obtained by using leave-one-out cross validation method i.e.

Table 10
Bias and RMSE for estimating cubic volume including top and stump obtained from fitted taper equations (M4), volume equations fitted in this study (Local), and the volume equations used by FIA-PNW for datasets I, II, and III. Final models were fitted using all datasets and evaluation statistics were obtained by applying them to each individual dataset