On the potential to predetermine dominant tree species based on sparse-density airborne laser scanning data for improving subsequent predictions of species-specific timber volumes

Räty, Janne; Vauhkonen, Jari; Maltamo, Matti; Tokola, Timo

doi:10.1186/s40663-016-0060-0

Research
Open access
Published: 30 January 2016

Forest Ecosystems volume 3, Article number: 1 (2016)

Abstract

Background

Tree species recognition is the main bottleneck in remote sensing based inventories aiming to produce an input for species-specific growth and yield models. We hypothesized that a stratification of the target data according to the dominant species could improve the subsequent predictions of species-specific attributes in particular in study areas strongly dominated by certain species.

Methods

We tested this hypothesis and an operational potential to improve the predictions of timber volumes, stratified to Scots pine, Norway spruce and deciduous trees, in a conifer forest dominated by the pine species. We derived predictor features from airborne laser scanning (ALS) data and used Most Similar Neighbor (MSN) and Seemingly Unrelated Regression (SUR) as examples of non-parametric and parametric prediction methods, respectively.

Results

The relationships between the ALS features and the volumes of the aforementioned species were considerably different depending on the dominant species. Incorporating the observed dominant species in the predictions improved the root mean squared errors by 13.3–16.4 % and 12.6–28.9 % based on MSN and SUR, respectively, depending on the species. Predicting the dominant species based on a linear discriminant analysis had an overall accuracy of only 76 % at best, which degraded the accuracies of the predicted volumes. Consequently, the predictions that did not consider the dominant species were more accurate than those refined with the predicted species. The MSN method gave slightly better results than models fitted with SUR.

Conclusions

According to our results, incorporating information on the dominant species has a clear potential to improve the subsequent predictions of species-specific forest attributes. Determining the dominant species based solely on ALS data is deemed challenging, but important in particular in areas where the species composition is otherwise seemingly homogeneous except being dominated by certain species.

Background

Forest ecosystem modelling requires inventory estimates, which are traditionally acquired using stand-level (compartmentwise) forest inventories based on field assessments or visual interpretation of aerial images (e.g. Eid et al. 2004; Koivuniemi and Korhonen 2006; Ståhl et al. 2011). Because growth and yield models are species-specific, the inventories are required to provide species-specific predictions (e.g. Maltamo et al. 2011). The conventional inventories providing stand-level estimates are currently being replaced, in Scandinavia in particular, by discrete-return Light Detection and Ranging (LiDAR) data recorded by small-footprint airborne laser scanning (ALS; for an overview, see Maltamo et al. 2014) combined with spectral data from aerial (Packalén and Maltamo 2006, 2007, 2008) or satellite images (Wallerman and Holmgren 2007) for species recognition. Extracting species information has also been tested in North America (Hudak et al. 2008; van Ewijk et al. 2014) and Central Europe (Latifi et al. 2010; Heinzel and Koch 2012; Torabzadeh et al. 2014), and a detailed review on the topic can be found in Vauhkonen et al. (2014a).

High species recognition accuracy is crucial for forest management planning systems that involve different treatment schedules depending on species and also important towards accurate growth and yield estimates. According to the simulations Korpela and Tokola (2006) carried out in forest conditions closely corresponding to our study area, predictions of the total stand volume based on tree-level, species-specific allometric dependencies had Root Mean Squared Errors (RMSEs) of 30 % and about 15 %, when the species of the individual trees were recognized at accuracies of 75 % and 80–90 %, respectively, and the other measurements were error-free. A similar result is reported by Tompalski et al. (2014) in Canada, who nevertheless found predictions based on species-specific equations more accurate than generic ones.

Using ALS data, high species recognition rates are generally based on detecting individual trees (e.g., Holmgren and Persson 2004; Kim et al. 2009; Ørka et al. 2009; Suratno et al. 2009), which requires acquiring data in a higher density than what is currently feasible from an operational viewpoint (e.g., Maltamo and Packalen 2014; Næsset 2014). However, several studies have reported successful predictions of the total (Woods et al. 2011; Nord-Larsen and Schumacher 2012; Villikka et al. 2012) and even species-specific forest attributes (Vauhkonen et al. 2012; Ørka et al. 2013) based on ALS data with pulse densities < 1 m⁻² and other scanning parameters not permitting individual tree detection.

The ALS inventories employing the sparse-density data are most often implemented using so-called area-based approaches (Næsset 2002), in which (i) models to predict the forest attributes of interest for the individual areas-of-interest (AOIs) are fit based on a set of training field plots; and (ii) the resulting models are applied to all the AOIs of the entire inventory area to produce wall-to-wall predictions. Operational implementations are elaborated by White et al. (2013), Maltamo and Packalen (2014), and Næsset (2014). In particular the modeling of a multivariate response such as the species-specific attributes is generally built upon non-parametric nearest neighbor (NN) approaches, in which the predictions of the considered forest attributes are simultaneously obtained as (weighted) averages of the k most similar reference observations in terms of the considered distance metric applied in the predictor space.
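The NN prediction step described above can be illustrated with a minimal sketch. It assumes a plain Euclidean distance and inverse-distance weights, which are simplifications chosen for illustration and not the MSN distance metric used in this study. The reference plots, the two ALS features (a height metric and a density ratio) and all values are hypothetical.

```python
import math

# Hypothetical reference plots: (ALS features, volumes by species in m3/ha).
# Feature values and volumes are made up for illustration only.
REFERENCE_PLOTS = [
    ((12.0, 0.55), (90.0, 10.0, 5.0)),
    ((18.0, 0.70), (160.0, 25.0, 8.0)),
    ((22.0, 0.80), (150.0, 80.0, 12.0)),
    ((9.0, 0.40), (60.0, 2.0, 3.0)),
]


def knn_predict(target, reference_plots, k=3, eps=1e-9):
    """Predict all responses at once as the inverse-distance-weighted
    mean of the k nearest reference plots in the feature space."""
    # Rank the reference plots by Euclidean distance to the target AOI.
    ranked = sorted(
        ((math.dist(target, features), volumes)
         for features, volumes in reference_plots),
        key=lambda pair: pair[0],
    )
    nearest = ranked[:k]
    # Closer neighbors get larger weights; eps avoids division by zero.
    weights = [1.0 / (distance + eps) for distance, _ in nearest]
    weight_sum = sum(weights)
    n_responses = len(nearest[0][1])
    return tuple(
        sum(w * volumes[j] for w, (_, volumes) in zip(weights, nearest))
        / weight_sum
        for j in range(n_responses)
    )


# Simultaneous prediction of pine, spruce and deciduous volumes for one AOI.
pine, spruce, deciduous = knn_predict((15.0, 0.60), REFERENCE_PLOTS, k=2)
```

Because the same neighbors and weights are used for every response, the species-specific predictions stay mutually consistent. In practice the features would be standardized or weighted first, since an unscaled Euclidean distance is dominated by the feature with the largest numeric range.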

NN predictions require a considerably large database of reference observations (see Maltamo et al. 2009a), although some studies have indicated that accurate species-specific forest attribute estimates may be provided with a limited number of plots (Kotamaa et al. 2010; Villikka et al. 2012; Pippuri et al. 2013). Further, reference data that are adequately representative of the species and size distribution of the area may be difficult to obtain using systematic sampling designs (Maltamo et al. 2009b). The predictions could be improved by a complementary inventory targeting the deficiencies of the initial estimation, as demonstrated by Vauhkonen et al. (2012) complementing the data of Maltamo et al. (2009b).

Because of the practical difficulty of obtaining adequately extensive and representative field reference data for the NN predictions, parametric models such as those constructed by Seemingly Unrelated Regression (SUR) approaches could be seen as alternative methods (e.g., Lindberg et al. 2010; see also Maltamo et al. 2009c, 2012). Even if fitted with similarly limited data, the ability to linearly interpolate between the observations could be a practical benefit compared to the NN predictions, which are, to some degree, always based on the discrete data points. Beyond ALS studies, the SUR and other methods for fitting regression models based on systems of equations are presented by Siipilehto et al. (2007).

From a practical point of view, it is well-reasoned to seek alternative implementations for ALS inventories relying on the availability of both the ALS and image data. Even though aerial images are usually available for the purpose of visual forest stand delineation, using them as additional data complicates the inventory system due to the required co-registrations and calibrations of the radiometric differences of multiple images. Plot-level species-specific predictions based solely on ALS data have also been tested (Ørka et al. 2013; Vauhkonen et al. 2012, 2014b). The predictions related to the dominant species in particular have been accurate based on ALS data (Ørka et al. 2013), but the availability of the spectral data has generally improved the predictions (Vauhkonen et al. 2012; Ørka et al. 2013).

Even if the main tree species were estimated correctly, large errors may be related to the predictions of other forest attributes, especially those of the non-dominant tree species (e.g., Maltamo et al. 2009b). For example, Packalén et al. (2009) proposed excluding species representing <10 % of the total volume from the accuracy measures due to the insignificance of such species in the compartmentwise inventory. Yet, even such “near to zero” predictions may distort species proportions and cause further problems in inventory areas with an unbalanced species distribution such as strongly pine-dominated areas typical of the boreal region (e.g., Maltamo et al. 2009b; Vauhkonen et al. 2012, 2014b). However, if it were known beforehand that a subject stand was dominated by a certain species with a proportion of, say, >75 % or >95 %, the maximum error level expected for the predictions of the minor species could be confined. Based on this reasoning and the encouraging results of successfully predicting the dominant species based on ALS data alone (Ørka et al. 2013; Vauhkonen et al. 2014b) and improving the results of NN methods by pre-classifying the inventory area (Maltamo et al. 2015), a test of using dominant species information for the species-specific predictions was motivated.
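The reasoning about confining the error of the minor species can be made concrete with a small arithmetic sketch. It assumes that the total volume of the stand is known and that the dominant share is a lower bound. The function names and the example total of 200 m³/ha are hypothetical and do not come from the study.

```python
def minor_species_volume_cap(total_volume, dominant_share):
    """Upper bound for the combined volume of all non-dominant species
    when the dominant species holds at least `dominant_share` of the total."""
    if not 0.0 <= dominant_share <= 1.0:
        raise ValueError("dominant_share must be a proportion in [0, 1]")
    return (1.0 - dominant_share) * total_volume


def clip_minor_prediction(predicted, total_volume, dominant_share):
    """Confine a minor-species prediction to the feasible interval [0, cap]."""
    cap = minor_species_volume_cap(total_volume, dominant_share)
    return min(max(predicted, 0.0), cap)


# Hypothetical stand with a total volume of 200 m3/ha.
cap_75 = minor_species_volume_cap(200.0, 0.75)  # dominance above 75 %
cap_95 = minor_species_volume_cap(200.0, 0.95)  # dominance above 95 %
```

With these assumed numbers, the minor species together cannot exceed 50 m³/ha under 75 % dominance and about 10 m³/ha under 95 % dominance, so a prediction confined to that interval cannot miss the true minor-species volume by more than the cap.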

The purpose of the study is thus to predict dominant species and species-specific timber volumes in a strongly pine-dominated test area. Predictions of the dominant species based on ALS features are evaluated. Prediction models based on NN and SUR are formulated and compared with respect to accounting for the a priori information on the dominant species.

Methods

Study area and field data

The data studied were originally collected for crown base height assessments (Korhonen 2012). Two test areas within a geographical distance of 30 km were established in Kuhmo, northeastern Finland. The area is very homogeneous and strongly dominated by Scots pine (Pinus sylvestris L.) trees. The other species to be distinguished are Norway spruce (Picea abies [L.] H. Karst.) and a group of deciduous trees consisting mainly of birches (Betula spp. L.) and aspen (Populus tremula L.), which form minor proportions and typically occur below the dominant canopy. Altogether 265 field sample plots with co-located ALS and field data were studied.

Circular sample plots with radii of 9 m were used in the field data collection. Every tree with a diameter at breast height (D_bh) >5 cm was measured for the D_bh and crown base height (CBH). Trees with a D_bh corresponding to the basal area-weighted median tree of each species occurring on a plot were determined in the field and measured for tree height. The D_bh and height of these trees were used as the median tree diameter and height (D_gM and H_gM, respectively) of the corresponding species per plot, and the maxima of the values were used as the D_gM and H_gM of the entire plot. Plot basal area (G) was calculated by summing from the tree-level basal areas, determined as $\frac{\pi}{4} D_{bh}^2$. The missing tree heights were predicted by calibrating the prediction models for the parameters of Näslund’s (1936) height curve presented by Siipilehto (1999) using the species-specific D_gM and H_gM estimates. The volumes of the individual trees were predicted by models of Laasasenaho (1982), employing the D_bh, height, and tree species as predictors. The models for birch were used for all deciduous trees. Central characteristics of the field measurements aggregated for the field plots are shown in Table 1.
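The basal area computation can be sketched as follows. The tree-level formula, the 9 m plot radius and the 5 cm diameter threshold come from the text above. The conversion of diameters from centimetres to metres and the scaling of the plot sum to a per-hectare value are assumptions made for this illustration, as are the function names and the example diameters.

```python
import math

PLOT_RADIUS_M = 9.0   # plot radius given in the text
MIN_DBH_CM = 5.0      # trees with a larger diameter were measured


def tree_basal_area_m2(dbh_cm):
    """Cross-sectional stem area at breast height: (pi / 4) * D_bh^2."""
    dbh_m = dbh_cm / 100.0  # assumed unit conversion, cm to m
    return math.pi / 4.0 * dbh_m ** 2


def plot_basal_area_m2_per_ha(dbh_list_cm, radius_m=PLOT_RADIUS_M):
    """Sum the tree-level basal areas and scale by the plot area.
    The per-hectare scaling is an assumption, not stated in the text."""
    plot_area_ha = math.pi * radius_m ** 2 / 10_000.0
    total_m2 = sum(
        tree_basal_area_m2(d) for d in dbh_list_cm if d > MIN_DBH_CM
    )
    return total_m2 / plot_area_ha


# Hypothetical diameters (cm) of the trees tallied on one plot.
G = plot_basal_area_m2_per_ha([12.5, 18.0, 22.3, 4.0, 30.1])
```

On a 9 m plot (about 0.025 ha), each measured tree represents roughly 39 trees per hectare, so a single 20 cm tree contributes about 1.23 m²/ha under these assumptions.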

Table 1 Species-specific volume characteristics of the 265 sample plots. Min: minimum, Max: maximum, Sd: standard deviation
