Accuracy assessment and error analysis for diameter at breast height measurement of trees obtained using a novel backpack LiDAR system

The LiBackpack is a recently developed backpack light detection and ranging (LiDAR) system that combines the flexibility of human walking with close-range measurement in all directions, providing a novel and efficient approach to LiDAR remote sensing that is especially useful for forest structure inventory. However, the measurement accuracy and error sources have not been systematically explored for this system. In this study, we used the LiBackpack D-50 system to measure the diameter at breast height (DBH) for a Pinus sylvestris tree population in the Saihanba National Forest Park of China, and estimated the accuracy of LiBackpack measurements of DBH based on comparisons with manually measured DBH values in the field. We determined the optimal vertical slice thickness of the point cloud sample for achieving the most stable and accurate LiBackpack measurements of DBH for this tree species, and explored the effects of different factors on the measurement error. The results showed that: 1) a vertical thickness of 30 cm for the point cloud sample slice provided the highest fitting accuracy (adjusted R² = 0.89, root mean squared error (RMSE) = 20.85 mm); 2) the point cloud density had a significant negative, logarithmic relationship with the measurement error of DBH and explained 35.1% of the measurement error; 3) the LiBackpack measurements of DBH were generally smaller than the manually measured values, and the corresponding measurement errors increased for larger trees; and 4) by correcting for the effect of the point cloud density, a transitional model can be fitted to approximate the field-measured DBH from the LiBackpack-scanned value with satisfactory accuracy (adjusted R² = 0.920; RMSE = 14.77 mm), decreasing the prediction error by 29.2%. Our study confirmed the reliability of the novel LiBackpack system for accurate forest inventory, established a useful transitional model between scanning data and traditional manually measured data specifically for P. sylvestris, and suggested that this new approach could be applied to more species with the necessary parameter calibration.


Background
Forest structures are generally characterized using metrics such as diameter at breast height (DBH), tree height, and tree density (Dubayah and Drake 2000). Forest structure features are normally quantified to reflect community dynamics and the effects of disturbances (Dubayah et al. 2010; Filippelli et al. 2019), to estimate community biomass and carbon pools (Fang et al. 2001; Ni-Meister et al. 2010), and to indicate the mechanisms of community assembly (Allié et al. 2015). Forest structure is also critical information for forest management and planning (Wulder et al. 2009).
DBH is one of the most important metrics of forest structure, generally used to indicate age structure or to reflect the radial growth of trees (Muller-Landau et al. 2006). The traditional forestry inventory uses a ruler and rangefinder to measure structural indices such as the DBH and height stem by stem at the forest stand scale (Liu et al. 2018a, 2018b), and predictive models are fitted for regional estimates of forest metrics such as the timber volume or biomass (le Maire et al. 2011). This approach is always limited by the available labor force and operating time. Since the 1980s, vegetation coverage information has been obtained by satellite remote sensing, and vegetation indicators such as the normalized difference vegetation index (NDVI) have been designed for estimating the forest vegetation biomass across space with much higher efficiency (Raynolds et al. 2006). However, the traditional remote sensing approach cannot directly obtain forest structure information, so dynamic global vegetation models (DGVMs) generally apply plant functional types as spatial units in simulations (Sato 2009; Bachelet et al. 2018). A lack of vegetation structure information leads to large uncertainty in the vegetation inversion (Meir et al. 2017), and supplementing forest structure information substantially improves the accuracy of DGVMs when predicting the vegetation productivity and estimating the responses of vegetation to climate change (Zhu et al. 2016; Garcia-Gonzalo et al. 2017). In recent years, the rapid development of light detection and ranging (LiDAR) has improved the spatial resolution of remote sensing data to the centimeter level. Compared with traditional spectral remote sensing technology, LiDAR is better at extracting the three-dimensional (3D) structural characteristics of vegetation, and thus it is increasingly used in forestry inventory and forest ecology research (Lim et al. 2003; Davies and Asner 2014; Alonzo et al. 2015).
According to the loading platform, LiDAR can be divided into three categories: space-borne LiDAR; airborne LiDAR, e.g., unmanned aerial vehicle (UAV)-borne LiDAR and airborne laser scanning (ALS); and ground-based LiDAR, e.g., backpack- or vehicle-based LiDAR and terrestrial laser scanning (TLS). Space-borne and airborne LiDAR are more efficient at measuring the 3D structure of the vegetation canopy at a larger scale, but less effective at obtaining information regarding the understory structure because of the obstructive effect of the forest canopy (Fu et al. 2018). In contrast, ground-based LiDAR is better at providing detailed information about the understory vegetation (Moskal and Zheng 2012). For example, Liu et al. (2016) measured the DBH increases for trees in forest communities using fixed ground-based LiDAR and achieved a tree identification accuracy of about 81% in natural forest stands. Among the various categories of ground-based LiDAR, TLS was developed first and has a high measurement accuracy, but the fixed measurement method limits its spatial flexibility, while the capacity of vehicle-based LiDAR is often limited by complex terrain and available roads (Yu et al. 2015). Backpack LiDAR (e.g., LiBackpack) is a novel type of portable LiDAR for which the surveyor is the loading platform, and thus it has a high capacity in terms of accessibility and route choice. Compared with TLS, backpack LiDAR is generally much lighter and more portable, and it can obtain much higher quality 3D point clouds in forests with different vegetation structures (Su et al. 2018). However, the LiBackpack is carried by a walking surveyor during operation, which may significantly reduce the system stability and increase the uncertainty of the measurements.
Studies have assessed the accuracy of the DBH measurements obtained with backpack LiDAR (Holmgren et al. 2017; Oveland et al. 2017, 2018), but the factors that might influence these measurements and their contributions to the error of backpack LiDAR measurements have not been explored. According to our field experience using backpack LiDAR for measuring forest structures, the uncertainty in the results may have the following sources: 1) the effects of irregular LiDAR movements on the variation of the point cloud density; 2) the effects of the 3D point cloud sampling method on parameter estimates; and 3) the geometric features of the measured objects, such as the size, shape, and dip angle of a tree trunk. In order to quantify the potential impacts of these factors on the uncertainty of the forest measurements obtained with this novel instrument, we investigated a Pinus sylvestris var. mongolica plantation containing trees of different ages in the Saihanba National Forest Park, Hebei Province in northern China, where we focused on the accuracy and uncertainty of the measurements of DBH, the most important forest structure parameter.

Study area
The study was conducted in the Qiancengban Forest District (42.38°-42.48°N, 117.08°-117.43°E, 1431 m a.s.l.) of Saihanba National Forest Farm, Hebei Province, China. This forest district is located in a mountainous area at the southeastern edge of the Inner Mongolia Plateau, and it has a semiarid temperate climate (Xing et al. 2017). According to the observational data acquired by the local meteorological station in this region from 1960 to 2017, the annual average temperature was −1.03 °C and the average annual precipitation was 456.87 mm. The main vegetation types comprise artificial coniferous forests planted from the 1960s to the 1980s. The dominant tree species include Pinus sylvestris var. mongolica, Larix gmelinii, and Picea meyeri, alongside scattered natural secondary deciduous broad-leaved forests of Betula platyphylla and Ulmus pumila woodland. Herbs and shrubs are sparse under the forest canopy.

Data collection
The LiBackpack LiDAR system used in this study was developed in 2018 by the Beijing Green Valley Technology Co. Ltd. The core parts of the LiBackpack system comprise the VLP-16 LiDAR sensor and the LiDAR360 software. The VLP-16 sensor was produced by Velodyne Lidar, Inc., and the technical parameters for the LiBackpack are given in Table S1. A spatial analysis module embedded in the LiDAR360 software provides a set of functions for processing and analyzing LiDAR point cloud data.
Five 30 m × 30 m plots of Pinus sylvestris var. mongolica forest were selected. First, we used lining ropes to enclose a 30 m × 30 m square in the forest, then measured the x and y coordinates of each tree with a DBH > 2.5 cm using a laser range finder (DISTO X3), and manually measured the tree DBH with a measuring tape as the field reference data (DBH_field). Meanwhile, an investigator carried a LiBackpack on foot to scan the trees within the sample plots along the route shown in Fig. S1. Starting from a corner of the plot, the investigator walked along a zigzag line with a row spacing of about 5 m and passed both sides of each tree in the sample plot. During data acquisition, the equipment was carried on the investigator's back with the sensor higher than the top of the investigator's head, and 3D data were scanned automatically. The built-in microcomputer system integrated the relative position and inertial navigation system information to produce 3D point cloud data. Basic information on the plots is listed in Table 1.

LiBackpack data processing

Ground points extraction and tree point cloud normalization
For the scanned point cloud data file of each sampling plot, the first processing step was to clean the data and separate the ground point cloud from that of the trees, since every object that reflects the laser generates groups of points in the 3D point cloud. Ground points were extracted by applying an improved progressive triangulated irregular network densification algorithm (Zhao et al. 2016), and the elevation of the nearest ground point was then subtracted from the elevation of each non-ground point. Thus, the locations of all trees were transformed onto a horizontal plane of the same elevation in a data normalization process. The ground points were also gridded to generate a digital elevation model (DEM) with a resolution of 0.5 m using an inverse distance weighted interpolation method. These processes were implemented using the LiDAR360 V2.2 software (https://greenvalleyintl.com/software/lidar360/).
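The normalization step can be illustrated with a minimal sketch. This is not the LiDAR360 implementation: the function name `normalize_heights`, the brute-force nearest-neighbor search, and the sample coordinates are all assumptions made for illustration, and ground points are taken as already classified.

```python
import math

def normalize_heights(ground_pts, object_pts):
    """Subtract the elevation of the horizontally nearest ground point
    from each non-ground point (hypothetical helper, brute-force search)."""
    normalized = []
    for x, y, z in object_pts:
        # Nearest ground point in the horizontal (x, y) plane.
        gx, gy, gz = min(ground_pts,
                         key=lambda g: math.hypot(g[0] - x, g[1] - y))
        normalized.append((x, y, z - gz))
    return normalized

# Illustrative data only: two ground points and two stem returns (metres).
ground = [(0.0, 0.0, 100.0), (10.0, 0.0, 102.0)]
trees = [(1.0, 0.5, 101.3), (9.0, 0.2, 103.3)]
heights = normalize_heights(ground, trees)
```

After normalization both stem returns sit at about 1.3 m above the local ground, regardless of the 2 m elevation difference between the two locations. A production pipeline would use a spatial index rather than a linear scan.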

Point cloud slicing processing
DBH fitting was implemented at a specific height on the tree stem (Mendez et al. 2014): the normalized point cloud was sliced horizontally with a specific vertical thickness (recorded as ST; Fig. S2), and the point cloud slices with elevations within 1.3 m ± 0.5 H, where H is the slice thickness, were then extracted. To analyze the effect of the slice thickness on the DBH fitting accuracy, 16 H values ranging from 5 to 80 cm were specified at intervals of 5 cm. The slicing process was implemented using the LiDAR360 V2.2 software.
Tree branches can interfere with the effectiveness of the DBH fitting algorithm used for stem recognition (Liu et al. 2018a, 2018b), so it is necessary to remove the points belonging to branches and leaves in order to obtain point cloud slices containing only stems. For this purpose, we designed a mask extraction procedure to slice the point clouds (see Supplementary Methods 1).
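The slicing rule (keep points whose normalized height lies within 1.3 m ± 0.5 H) can be sketched as follows. The function names and the sample points are hypothetical, and the branch mask from the Supplementary Methods is not reproduced here.

```python
def slice_at_breast_height(points, thickness_m, center_m=1.3):
    """Keep points whose normalized height z lies within
    center_m +/- thickness_m / 2 (hypothetical helper)."""
    half = thickness_m / 2.0
    return [p for p in points if center_m - half <= p[2] <= center_m + half]

def candidate_thicknesses_cm():
    """The 16 slice thicknesses tested in the text: 5 to 80 cm in 5 cm steps."""
    return list(range(5, 85, 5))

# Illustrative normalized points (x, y, z) in metres.
cloud = [(0.0, 0.0, 1.10), (0.0, 0.0, 1.20), (0.0, 0.0, 1.30),
         (0.0, 0.0, 1.44), (0.0, 0.0, 1.50)]
slice_30cm = slice_at_breast_height(cloud, thickness_m=0.30)
thicknesses = candidate_thicknesses_cm()
```

With a 30 cm slice the window is 1.15 m to 1.45 m, so only the three middle points are retained.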

DBH fitting
An adaptive cylindrical fitting method was used to calculate the DBH for slices of the point clouds with different thicknesses. First, we used the density-based spatial clustering of applications with noise (DBSCAN) algorithm to automatically divide the sufficiently dense point clouds into clusters and thereby segment single tree stems (Tao et al. 2015). For each cluster, a cylindrical fitting method based on 3D ordinary least squares (OLS) was applied to obtain the LiDAR DBH value (DBH_Li) and the relative coordinates of each tree stem. We then removed erroneous fits by manual visual interpretation. This procedure was conducted using the LiDAR360 V2.2 software. The output was a point shapefile containing the DBH and relative coordinates of each stem. By visually comparing the point patterns of the scanned and field-measured data and ranking the records in the same order, we matched the two DBH records for each stem.
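To make the fitting idea concrete, the sketch below fits a circle to the horizontal projection of one stem slice by algebraic least squares. This is a simplified 2D stand-in, not the adaptive 3D cylinder fit in LiDAR360: it assumes a vertical stem and an already segmented cluster, and the function names are hypothetical.

```python
import math

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_circle(xy):
    """Algebraic least-squares circle fit; returns (cx, cy, radius)."""
    n = len(xy)
    mx = sum(x for x, _ in xy) / n
    my = sum(y for _, y in xy) / n
    # Centre the data first to keep the normal equations well conditioned.
    pts = [(x - mx, y - my) for x, y in xy]
    # Model: x^2 + y^2 + D*x + E*y + F = 0, which is linear in (D, E, F).
    sxx = sum(x * x for x, _ in pts)
    syy = sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    rhs = [-sum(x * (x * x + y * y) for x, y in pts),
           -sum(y * (x * x + y * y) for x, y in pts),
           -sum(x * x + y * y for x, y in pts)]
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    det = _det3(m)
    if abs(det) < 1e-15:
        raise ValueError("points are (nearly) collinear; no unique circle")
    # Cramer's rule: replace one column at a time with the right-hand side.
    sol = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        sol.append(_det3(mc) / det)
    d, e, f = sol
    cx, cy = -d / 2.0, -e / 2.0
    radius = math.sqrt(cx * cx + cy * cy - f)
    return cx + mx, cy + my, radius

# Synthetic, noise-free stem outline: centre (2, 3) m, radius 0.085 m.
stem = [(2.0 + 0.085 * math.cos(math.radians(a)),
         3.0 + 0.085 * math.sin(math.radians(a))) for a in range(0, 360, 30)]
cx, cy, r = fit_circle(stem)
dbh_mm = 2.0 * r * 1000.0
```

For this synthetic outline the fitted diameter is 170 mm. With real scans the points carry noise and often cover only part of the circumference, which is one reason a robust or adaptive method is preferable in practice.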

Factor extraction
According to the measurement process, we defined four possible sources of error in the LiBackpack measurements: two characteristics of the measured objects, comprising 1) DBH_field and 2) the tangent of the stem angle (TSA); 3) an environmental factor, the topographic slope (TS); and 4) a data factor, the point density (PD). Based on the DEM, we obtained TS using the surface slope tool in ArcGIS 10.2. OLS circle fitting (Thomas and Chan 1989) was used to locate the centers of the top and bottom surfaces of the stem slices, from which the axes of the stem cylinders were derived to calculate TSA. PD was calculated as the number of points per unit horizontal projection area using Eq. (1), where N denotes the number of points, i is the index of the individual tree, ST is the slice thickness, and S is the horizontal projection area of the stem point cloud.
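One plausible reading of the TSA computation is sketched below. It assumes the stem angle is measured from the vertical and that the two fitted circle centers are separated vertically by the slice thickness; the source does not state either convention explicitly, and the function name is hypothetical.

```python
import math

def tangent_of_stem_angle(bottom_center, top_center, slice_thickness_m):
    """Tangent of the stem lean angle from vertical (assumed convention):
    horizontal offset between the two circle centres divided by the
    vertical distance between them."""
    dx = top_center[0] - bottom_center[0]
    dy = top_center[1] - bottom_center[1]
    return math.hypot(dx, dy) / slice_thickness_m

# Illustrative centres (metres): a 5 cm horizontal offset over a 30 cm slice.
tsa = tangent_of_stem_angle((0.0, 0.0), (0.03, 0.04), 0.30)
lean_deg = math.degrees(math.atan(tsa))
```

Here the tangent is about 0.167, corresponding to a lean of roughly 9.5 degrees from vertical.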

Data analysis
The DBH_Li values were compared with the DBH_field values. The accuracies of the measurements at different slice thicknesses were evaluated based on the adjusted R², root mean squared error (RMSE), relative root mean squared error (rRMSE; Eq. (2)), and relative accuracy (rA; Eq. (3)), where H is the slice thickness and j denotes the DBH class. We divided all of the trees into five size classes according to the DBH_field values, ensuring that the sample size in each class was roughly equal, and compared rA in each class among slice thicknesses using analysis of variance (ANOVA). Following these steps, we defined the optimum slice thickness H_0 as the one that obtained the highest rA for each size class, and we analyzed the causes of the measurement errors (ΔDBH) based on H_0, with ΔDBH as the dependent variable in Eq. (4) and the four factors as independent variables.
We conducted partial correlation analysis to determine the factors that had significant correlations with ΔDBH, and used these factors to build a multivariate prediction model for correcting the accuracy of the LiBackpack measurements.
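The error metrics can be sketched as below. Eq. (2) is not reproduced in this text, so normalizing the RMSE by the mean field DBH is an assumption; it is at least consistent with the values reported later (14.77 mm / 0.087 and 20.85 mm / 0.123 both imply a mean close to the reported 169.7 mm). The function names and sample values are illustrative.

```python
import math

def rmse(field_mm, lidar_mm):
    """Root mean squared error between paired DBH values (mm)."""
    n = len(field_mm)
    return math.sqrt(sum((l - f) ** 2 for f, l in zip(field_mm, lidar_mm)) / n)

def rrmse(field_mm, lidar_mm):
    """RMSE divided by the mean field DBH (assumed form of Eq. (2))."""
    return rmse(field_mm, lidar_mm) / (sum(field_mm) / len(field_mm))

# Illustrative pairs only, not data from the study.
field = [100.0, 200.0]
lidar = [90.0, 190.0]
err = rmse(field, lidar)
rel_err = rrmse(field, lidar)
```

For these two pairs the RMSE is 10 mm and the rRMSE is 10 / 150, about 0.067.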

Measured versus scanned DBH values for different slice thickness classes
We obtained the DBH_field and DBH_Li values for 158 Pinus sylvestris var. mongolica trees. In general, the DBH_Li values (157.8 ± 51.0 mm) were significantly smaller than the DBH_field values (169.7 ± 51.2 mm) (t = −8.3949, p < 0.001). According to the linear models in Fig. 1, the adjusted R², RMSE, and fitted slope values varied among the slice thickness classes for the point cloud, with mean values of 0.85, 2.31 mm, and 0.95, respectively. The maximum adjusted R² and minimum RMSE values were obtained at a slice thickness of 30 cm (Fig. 2). Thus, we selected 30 cm as the optimal slice thickness for DBH_Li fitting in the following analyses.

rA values of DBH_Li in different tree size classes
For all of the sampled trees pooled into five diameter classes, the rA values for DBH_Li varied with the slice thickness. According to the standard deviation of DBH_Li for the samples in each diameter class (Fig. 3a), rA was more variable in the smaller than in the larger DBH_field classes. Thus, for larger trees, the DBH_Li values were more stable with respect to the variation in the slice thickness. ANOVA indicated that rA differed significantly among the DBH classes (p < 0.001), and the mean rA increased as the DBH_field class increased (Fig. 3b).

Factor analysis
For the point cloud samples with a slice thickness of 30 cm, the absolute error ΔDBH was negatively correlated with DBH_field (r = −0.178, p = 0.025), the point cloud density (PD) (r = −0.496, p < 0.001), and the tangent of the stem angle (TSA) (r = −0.189, p = 0.017). Among the independent variables, PD had a significant positive correlation with DBH_field (r = 0.188, p = 0.029) and TSA (r = 0.349, p < 0.001), and there were no significant correlations among the other variables. A log-transformed linear model best fitted the relationship between ΔDBH and PD, and it predicted a positive ΔDBH only when PD < 4 points·cm⁻² (p < 0.0001). DBH_Li was generally smaller than DBH_field at higher PD values (Fig. 4). In addition, a linear decreasing trend was fitted between ΔDBH and DBH_field, indicating that the LiBackpack tended to underestimate the DBH to a greater extent for larger trees. However, partial correlation tests only found a significant correlation between ΔDBH and PD (r = −0.466, p < 0.001). Therefore, based on the point cloud samples with a slice thickness of 30 cm, OLS transition models were fitted between the DBH_field and DBH_Li values without and with the point cloud density as a covariate (Eqs. (5) and (6), respectively). The goodness-of-fit and prediction accuracy were better for model (6), i.e., adjusted R² = 0.920, RMSE = 14.77 mm, and rRMSE = 0.087, than for model (5), i.e., adjusted R² = 0.890, RMSE = 20.85 mm, and rRMSE = 0.123. The AIC of model (6) (1297.7) was also much lower than that of model (5) (1346.4). In particular, the RMSE was reduced by 29.2%, which suggested that the model's predictive capacity could be improved greatly by considering the effect of the point cloud density.
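The sketch below applies the coefficients reported for Eq. (6) in this paper (0.969, −16.845, 28.635), with PD in points·cm⁻² and DBH in mm. The algebraic inversion used to estimate the field DBH from a scanned value is an illustrative assumption, not a regression fitted in the study, and the function names are hypothetical. The same block checks the reported 29.2% RMSE reduction and the PD value at which the predicted error changes sign for a tree of roughly average size (170 mm, chosen here for illustration).

```python
import math

# Coefficients of Eq. (6) as reported in the paper.
A, B, C = 0.969, -16.845, 28.635

def predict_lidar_dbh(dbh_field_mm, pd):
    """Eq. (6) without the error term: expected scanned DBH (mm)."""
    return A * dbh_field_mm + B * math.log(pd) + C

def estimate_field_dbh(dbh_li_mm, pd):
    """Algebraic inversion of Eq. (6) (illustrative assumption)."""
    return (dbh_li_mm - B * math.log(pd) - C) / A

# Relative RMSE reduction between the two reported models.
rmse_without_pd, rmse_with_pd = 20.85, 14.77
reduction = (rmse_without_pd - rmse_with_pd) / rmse_without_pd

# PD at which the predicted error (DBH_Li - DBH_field) is zero for a 170 mm tree.
pd_zero = math.exp(((A - 1.0) * 170.0 + C) / -B)

round_trip = estimate_field_dbh(predict_lidar_dbh(170.0, 6.0), 6.0)
```

The reduction evaluates to about 29.2%, and the sign change falls near 4 points·cm⁻², both in line with the values stated in the text.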

Comparison of DBH measurement among different LiDAR systems
Compared with traditional 3D measurement systems based on Global Navigation Satellite System and Inertial Navigation System (GNSS+INS) technology, the LiBackpack costs less. Moreover, the LiBackpack can scan accurately while moving and integrate data in real time, thereby providing the greatest flexibility and high data acquisition efficiency compared with other forms of LiDAR systems, such as airborne LiDAR, UAV LiDAR, TLS, and vehicle-based LiDAR (Anderson et al. 2018; Herrero-Huerta et al. 2018; Paris and Bruzzone 2019; Polewski et al. 2019). However, the stability of the LiBackpack platform is probably the lowest among these types of LiDAR system, so some data acquisition accuracy is sacrificed for flexibility, and this trade-off is critical when selecting platforms in practical applications.
In our study, the best prediction model for the LiBackpack measurements obtained results of R² = 0.920, RMSE = 14.77 mm, and rRMSE = 0.087. Holmgren et al. (2017) also used backpack LiDAR to measure the DBH for trees in four plots with sample sizes ranging from 50 to 90 individuals, where the average RMSE = 18.5 mm and rRMSE = 0.06. Oveland et al. (2017, 2018) used backpack LiDAR to measure the DBH for 18 and 199 trees, where the RMSE values were 22 and 15 mm, respectively, and the rRMSE values were 0.075 and 0.091. We also collected more than 50 previous reports of LiDAR-based measurement experiments from the past 5 years (Table 2), and found that the RMSE values were significantly higher for ALS measurements than for TLS measurements (p < 0.0001), but there were no significant differences between the measurements obtained using different TLS scanning modes (p = 0.58) (Fig. 5).
Moreover, the results obtained with different DBH fitting methods did not differ significantly (p = 0.07). The average RMSE values for single and multi-station TLS measurements were 20.1 and 15.5 mm, respectively, with average rRMSE values of 0.078 and 0.082. The RMSE values were larger for TLS measurements than LiBackpack measurements in the present study, but the rRMSE values of TLS measurements were smaller. LiBackpack measurements were clearly better than ALS measurements (RMSE mean = 6.4 mm, rRMSE mean = 0.251). Thus, there is an obvious trade-off between the accuracy and efficiency of LiBackpack DBH measurements, where moving the LiBackpack under the forest canopy ensures that more complete and uniform point clouds are scanned for stems by the equipment, but simplifying the equipment hardware to improve portability tends to yield lower rA values compared with the TLS measurements (Su et al. 2018).
$$\mathrm{DBH}_{\mathrm{LiDAR}} = 0.969 \times \mathrm{DBH}_{\mathrm{field}} - 16.845 \times \ln(\mathrm{PD}) + 28.635 + \mu \qquad (6)$$
Fig. 4 Scatter plots and fitted models for (a) absolute error (ΔDBH) vs. point cloud density (PD), and (b) absolute error (ΔDBH) vs. DBH_field

Optimal thickness of point cloud slice for DBH estimation
The DBH was estimated for trees based on the scanned point cloud using the adaptive cylindrical fitting method, so it was essential to determine the optimal thickness for the sample slice in the point cloud. Two main factors could lead to uncertainty in the estimates. First, in order to correct for the effect of the stem dip angle, a sample slice should be sufficiently thick to achieve satisfactory accuracy. Second, the uneven stem surface, especially bumps, cracks, and branches, could introduce substantial noise in the estimate, so a sufficiently thick stem sample is necessary to smooth out the unevenness of the trunk. However, the sample could be more variable when the slice is thicker. Therefore, the solution involves finding a balance between the two sources of uncertainty. In our experiment on the P. sylvestris var. mongolica population, a thickness of 30 cm was confirmed as the optimal sample slice by multiple criteria. However, the generalizability of this value requires further validation in other working contexts or on other species, and parameter calibration of this simple model remains critical in its application.

Sources of uncertainty in LiBackpack DBH measurements
The estimated DBH_Li values were clearly correlated with the DBH_field values, but our experiment indicated that the LiBackpack-scanned DBH values were generally smaller than the manually measured values, and the magnitude of the absolute error ΔDBH increased with the tree size (p = 0.025, Fig. 4b). Moreover, ΔDBH was negatively correlated with the log-transformed point cloud density (p < 0.001), and DBH_Li was larger than DBH_field only when the tree corresponded to a sparse point cloud (PD < 4 points·cm⁻²). In addition, the dip angle of a tree and the topographic slope had weak but significant associations with ΔDBH, mainly due to their effects on the scanned point cloud density. Liu et al. (2018a, 2018b) also obtained the maximum rA when using low density point clouds. However, why did the DBH tend to be underestimated to a greater extent for a larger tree, or for a tree estimated using a denser scanned point cloud? What is the relationship between tree size and the point cloud density?
The variations in the PD could be attributed to the scanning distance, scanning angle, and scanning frequency of the sensors (Anderson et al. 2018). In general, a larger tree has a rougher surface and an irregular cross-sectional shape, with deeper grooves and uneven bark. Manual measurement involves wrapping a tape around the outermost surface of the tree to determine the maximum girth, whereas the LiBackpack scanner may emit lasers into the grooves and return a point cloud ring with a particular thickness (Johnson 2009), which is wider for a larger tree. A fitting algorithm based on measurements of either a circle or a cylinder determines the DBH based on a circle passing through the locations with the highest density in the stem point cloud (Liu et al. 2018a, 2018b), which are generally in the middle of the ring. In contrast, smaller trees usually have more branches, which tend to obstruct the laser, thereby resulting in a thinner point cloud ring. Therefore, the differences in the surface structure of larger and smaller trees may explain the negative correlations of ΔDBH with tree size and with PD. These negative correlations can also be due to the shading and overlap among trees in a community. When scanning a forest of trees with different DBHs, the walking route should be designed to move through the plot as uniformly as possible (Fig. S1), but a larger tree will always be more exposed to the lasers, whereas a smaller tree is more likely to be partially or completely blocked by neighboring trees. Thus, the average laser exposure time will be lower for a smaller tree than for a larger tree. Moreover, the tangent of the stem dip was positively correlated with the point cloud density, although the correlation was weak (p = 0.05), and thus a larger dip angle for a tree might lead to more laser returns per unit area.
In this study, we used DBH measurements obtained with a measuring tape as the field reference data, as such data have been used in most forestry surveys and plant community studies (Srinivasan et al. 2015; Polewski et al. 2019; Yun et al. 2019). In order to integrate new LiDAR-scanned forest structure measurements with the huge amounts of existing historical data, it would be useful to develop a transition model to link these two data sources. In the present study, we quantified and examined the effects of point cloud features on the quality of the data transition, and proposed a suitable multivariate model with substantially improved accuracy. Further experiments will be needed to explore the uncertainty related to trees of different shapes (or species), different measuring environments, and different types of LiDAR scanners. Keeping the PD at a low and stable level is critical for obtaining more accurate measurements, so the experimental design needs to consider factors such as the movement path and speed, the scanning time, and the effect of the understory structure on the quality of the scanned data. The optimal slice thickness, found here to be 30 cm for P. sylvestris var. mongolica, should likewise be re-tested under different measuring situations.

Conclusions
The application of LiDAR in forestry investigations is expected to substantially improve their efficiency, but assessing the accuracy and sources of uncertainty is essential for LiDAR data. In this study, we measured the DBH for 158 Pinus sylvestris var. mongolica trees using the manual method and a backpack LiDAR system called LiBackpack. Compared with samples from all other thickness classes, a slice of the stem point cloud with a vertical thickness of 30 cm obtained the optimal match between DBH_Li and DBH_field. The DBH_Li values were generally smaller than the DBH_field values, and the difference was primarily determined by the point cloud density used to estimate DBH_Li. The branches of small trees and the rough surfaces of large trees were the major sources of the uncertainty in the PD. After correction based on PD, the accuracy of the DBH estimates obtained using LiBackpack measurements was similar to that of the TLS measurements. The reasonable data accuracy and high access capacity make LiBackpack an efficient approach for mapping and estimating the structure of forests and woodlands at broad scales.