Mapping forest age using National Forest Inventory, airborne laser scanning, and Sentinel-2 data

The age of forest stands is critical information for many aspects of forest management and conservation but area-wide information about forest stand age often does not exist. In this study, we developed regression models for large-scale area-wide prediction of age in Norwegian forests. For model development we used more than 4800 plots of the Norwegian National Forest Inventory (NFI) distributed over Norway between 58{\deg} and 65{\deg} northern latitude in a 181,773 km2 study area. Predictor variables were based on airborne laser scanning (ALS), Sentinel-2, and existing public map data. We performed model validation on an independent data set consisting of 63 spruce stands with known age. The best modelling strategy was to fit independent linear regression models to each observed site index (SI) level and using a SI prediction map in the application of the models. The most important predictor variable was an upper percentile of the ALS heights, and root-mean-squared-errors (RMSE) ranged between 3 and 31 years (6% to 26%) for SI-specific models, and 21 years (25%) on average. Mean deviance (MD) ranged between -1 and 3 years. The models improved with increasing SI and the RMSE were largest for low SI stands older than 100 years. Using a mapped SI, which is required for practical applications, RMSE and MD on plot-level ranged from 19 to 56 years (29% to 53%), and 5 to 37 years (5% to 31%), respectively. For the validation stands, the RMSE and MD were 12 (22%) and 2 years (3%). Tree height estimated from airborne laser scanning and predicted site index were the most important variables in the models describing age. Overall, we obtained good results, especially for stands with high SI, that could be considered for practical applications but see considerable potential for improvements, if better SI maps were available.


Introduction
Forest stand age is a key parameter in designing both forest management and forest conservation strategies. Determining forest age is not a trivial task and can be difficult in complex forest structure but simpler in more even-aged forest. A description and characterization of the different types of forest age can be found in Chirici et al. (2011). Forest age can be determined based on stem cores taken at individual reference trees (Grissino-Mayer 2003). Counting the number of year rings in the stem core extracted close to the ground leads to a good estimate of tree age. However, this is a tedious and time-consuming method. Alternative technologies are required to estimate forest stand age for larger areas to be of practical use for forest management and conservation strategies. For example, in Norway the age of stands is often unknown, there are no public maps with reliable estimates of forest age, and even in areas with forest management inventories age is often one of the most uncertain parameters.
Growing processes over time result in specific tree dimensions and forest structure and are determined by a combination of historic management and abiotic factors. Management include tree species, genetic material, establishment history, stand density, and past harvesting activities. Abiotic factors are environmental conditions including topography, soil type, and macro and micro climatic variables. These characteristics determine the growth and production potential of a site for a given tree species and result in trees of greatly varying dimensions given the same age. Site index is typically used to describe this productivity and is tree-species specific. Site index is commonly determined by measuring the height and age of dominant trees in even-aged stands found on that site (Skovsgaard and Vanclay 2008). Other methods for estimating site index make use of climate (Nothdurft et al. 2012;Sharma et al. 2012) or remotely sensed data (Socha et al. 2017;Kandare et al. 2017).
Forest stand age and site characteristics described by site index determine the status of forest height, structure and density. Therefore, stand height and structure together with site characteristics can be used to estimate stand age. Proxies for forest stand height and structure can be estimated from remotely sensed data, such as airborne laser scanning (ALS) (Naesset 1997;Nord-Larsen and Riis-Nielsen 2010;Hudak et al. 2014;Mura et al. 2015;Guo et al. 2017) or optical data (Kayitakire et al. 2006;Mora et al. 2013;Lang et al. 2019), which often exist for large areas. In Norway, the site characteristics climate, and water and light availability are largely related to geographical location, height above sea level (asl), distance from coast, and terrain slope, which determine temperature, precipitation, water and nutrient availability, and length of growing season (Antón-Fernández et al. 2016).
Previous research demonstrated that forest stand age can be modelled using spectral data (Jensen et al. 1999;Reese et al. 2003;Buddenbaum et al. 2005;Kayitakire et al. 2006;Dye et al. 2012), 3D data from ALS (Maltamo et al. 2009;Racine et al. 2014;Zhang et al. 2014), and a combination of both (Straub and Koch 2011). Kayitakire et al. (2006) used image texture from IKONOS-2 satellite images with 1 x 1 m ground resolution and linear modelling to explain forest stand age and other structurerelated variables. Their study site was in Belgium and included 29 sample plots in even aged Norway spruce stands with age between 27 and 110 years. Age was best explained by the correlation texture calculated within a 15 x 15 m window, resulting in R 2 of 0.81 with a mean absolute error of 10 years. Racine et al. (2014) used ALS data and the k-nearest neighbour (kNN) approach to estimate forest stand age based on 158 forest plots in managed boreal forest in central Quebec, Canada. The mean plot age ranged from 11 to 94 years. The best model combining ALS based forest structure variables and ALS based variables describing site characteristics resulted in R 2 of 0.83 and root-mean-squarederror (RMSE) of 8.8 years. Maltamo et al. (2009) used ALS data and the k-most similar neighbour approach on 335 NFI plots in a 22,000 ha study site in Finland and reported RMSE of 23.5, 18.8, and 18.7 years for age of pine, spruce, and deciduous plots, respectively. Straub & Koch (2011) used both airborne ALS and multispectral variables to model forest stand age in a small study area (9.24 km 2 , 108 forest stands, 300 inventory plots) in south-west Germany using linear regression, reporting RMSE and RMSE% of 19.7 years and 28.8%, respectively. Buddenbaum et al. (2005) used hyperspectral HyMap data from the spectral angle mapper to classify Norway spruce and Douglas fir age classes on a located within one HyMap scene covering 2.5 km x 10 km in south-west Germany. Their best result of 81% overall accuracy was achieved with the maximum likelihood classifier for the four age categories 10-30, 30-50, 50-80, and >80 years. Reese et al. (2003) performed estimation of forest stand age using Landsat 7 satellite data, field data from the Swedish NFI, and the kNN approach in south-western Sweden. In this area, field data for 89 Norway spruce dominated stands ranging from 6 to 106 years were available. RMSE of predicted age for these stands was 12 years or 23%. Maltamo et al. (2019, pers. communication) found that modelling age of stands using ALS for forests older than 100 years was infeasible. For forest stands below 100 years, their models resulted in RMSE of 9 and 10 years when using ALS data and site index, and ALS data alone, respectively.
In other studies outside the temperate and boreal zone, Dye et al. (2012) used spectral and texture information from high resolution satellite QuickBird images and random forest to predict the age of pine forests in the west of South Afrika. They used 142 sample stands, and age ranged from 4 to 24 years. Their normalized out-of-bag errors ranged from 28% to 34%. Jensen et al. (1999) modelled coniferous forest age of a small study site in Brazil using Landsat TM satellite data and regression and artificial neural network approaches. Main tree species was loblolly pine and tree age ranged from 1 to 40 years. Percentage of stands with absolute age errors below 2 years were up to 83% of all stands for multiple regression modelling and up to 98% of all stands for the best artificial neural network. In a study in central Italy comprising 128.402 ha forest in various growing conditions, Frate et al. (2015) used multispectral satellite imagery and 304 field plots to first model timber volume using the kNN approach, and subsequently, used inverted yield models to predict forest age. On 305 independent validation stands covering 3,137 ha and stand age from 1 to 127 years with mean of 52 years, they obtained forest age estimates with RMSE of 16 years (30%). Zhang et al. (2014) used forest height from spaceborne ALS and non-linear regression modelling to predict forest age in China at 1 km resolution. The authors fitted first biomass-height and then age-biomass models using field observations from 3543 forest plots, and reported R 2 of 0.6 and 0.7, respectively. However, the spatial resolution was 1 km, which is too coarse for operational forest management. To the best of our knowledge there are no large-scale studies predicting forest stand age that can be used for practical forest management.
The objectives of the present study were to (i) model and map forest stand age of Norway spruce (Picea abies (L.) H.Karst.), Scots pine (Pinus sylvestris L.), and broadleaved (mostly downy birch (Betula pubescens Ehrh.)) dominated forest using National Forest Inventory (NFI) sample plots with ALS and optical satellite variables; and (ii) to validate the result on an independent data set consisting of forest stands with known age. Even though we conducted all analyses for all three tree species, we will focus on spruce here and present the results for pine and birch in the appendix. We believe this to be the first attempt to model and predict forest age at a regional scale using national laser scanning campaigns.

Study area
The 181,773 km 2 study area is located in Norway between 58° and 65.3° northern latitude and was determined by the availability of airborne ALS coverage ( Figure 1). Growing conditions vary considerably with latitude and elevation. The natural tree line is at around 1100 m above sea level (asl) in southern Norway and around 130 m asl in the north. Depending on these factors, climate zones range roughly from subarctic in the north and east, oceanic at the coast, and continental in the south-east. Tree species of main economic interest are spruce and pine, making up for the majority of biomass and timber volume. Birch-dominated forests are most abundant in terms of forest area and mainly occur as an early succession species after harvests or in high elevation forests.

National Forest Inventory data
We used the permanent sample plots of the Norwegian NFI as reference data (Tomter et al. 2010). In the study area, the NFI is based on a systematic grid of 3 km x 3 km in the low-land region and 3 km x 9 km in the low-productive, birch-dominated mountain region. For trees with a diameter at breast height >= 5 cm (dbh, 1.3 m above ground), parameters are measured on circular plots with a size of 250 m 2 . Tree heights of approximately 10 trees are measured using a Vertex hypsometer (Haglöf 2007). The heights of the remaining trees are predicted using locally calibrated height-diameter functions, and arithmetic mean height for each plot is calculated.
Stand parameters such as age and site index are determined on circular sample plots of 1000 m 2 . Each plot center is permanently marked with a metal pole buried in the ground with known coordinates determined by a global navigation satellite system (GNSS) device. The Norwegian NFI completes one full cycle every five years, i.e. each year one fifth of all plots are visited and forest variables measured. Relevant variables for the present study are stand age, mean height, and site index (SI).
Stand age is determined for each plot at one or more representative trees just outside the 250 m 2plot boundary by taking one or more stem cores using an increment corer. The biological age, rather than chronological age, is recorded that corrects years of suppression below canopy after germination. Alternatively, the number of whirls is counted in young forest where this is possible. In forests that consist of either one or more than two layers, age is the basal-area weighted age of all trees. In two-layered forests, age is the basal-area weighted age of all trees in the dominant layer.
SI is determined in classes of of 6,8,11,14,17,20,23, and 26 which describe the height (m) of the top 100 trees per ha at age 40. To this end, height and age of a tree representative of the 10 largest trees on the 1000 m 2 plot is measured and SI is determined based on SI curves (Tveite 1977).
We used NFI plots located in stands dominated by spruce, pine, and broadleaved species (defined as plots with >= 75% timber volume of each tree species, respectively). The major tree species in broadleaved dominated forests is downy birch (Betula pubescens Ehrh.), and in the following we only refer to birch when addressing broadleaved dominated forest. From these plots, we only selected NFI plots in productive forest (yearly volume increment > 0.1 m 3 / ha), and removed plots with canopy emergent seed-trees, resulting in 2121 spruce, 1779 pine, and 929 birch dominated plots that were used for modelling. Age for these plots ranged from 3 to 270, 4 to 287, and 3 to 223 years, respectively (Table 1).

Independent validation data
We used stand-level observations of forest age from 63 forest stands covering a total area of 170 ha as an independent validation data set. The age of these stands was quality-assured by the local forest administration. The stands were located in central Norway in Trøndelag county ( Figure 1) and their age ranged from 11 to 89 years ( Table 1). The SI of the stands was reported in another system than for NFI plots and consisted of the three categories "Low", "Medium", and "High". The SI of the stands was, however, not used for predicting stand age. 62 stands were dominated by spruce, and one stand was dominated by pine according to a tree species map produced independently from this study (see Section 1.1.2). The spruce, pine, and birch proportions of the stands according to the species map ranged from 48% to 100%, from 1% to 52%, and from 1% to 39%, respectively.

Auxiliary data
Variables extracted from ALS data, a mosaic of atmospherically and topographically corrected Sentinel-2 images, and a raster of the distance to the closest coast line were used for developing models and for mapping age by applying the models. Furthermore, a site index map (pSI), and a tree species map were only used for mapping age by applying the developed models.
Airborne laser scanning data ALS data were collected in several campaigns for the study area, except for high mountain ranges above the tree line. Data were collected between 2010 and 2018 with a density of 2 to 5 pulses per m 2 , resulting in first return densities ranging between 0.5 and 36 and a mean of 8 first returns per m 2 . A fine-resolution digital terrain model (DTM, 1 m x 1 m pixel size) was produced from the last return data by the Norwegian Mapping Authority (Kartverket 2019). The ALS point cloud was heightnormalized by subtracting the DTM elevation from corresponding point cloud elevation using bidirectional interpolation. The height-normalized point cloud was used to calculate various descriptive metrics for each NFI plot based on first returns, first returns above 2 m height above ground, and last returns. The metrics included mean, variance, coefficient of variations, kurtosis and skewness of ALS return heights, 10 th , 25 th , 50 th , 75 th , 90 th , and 95 th height percentiles, and ALS return density metrics for 10 height slices (d0 -d9). Crown coverage metrics were calculated as percentage of first returns above 2, 5 and 10 m, respectively. The DTM was resampled to 16 m x 16 m, such that the cell size corresponded approximately to the area covered by an NFI plot (250 m 2 ). From the DTM, terrain slope was computed as a raster with a cell size of 16 m x 16 m. All ALS variables, DTM elevation and slope were extracted for each NFI plot and rasters for those variables with a cell size of 16 m x 16 m were created for prediction purposes. Furthermore, the time difference between the ALS and the NFI data acquisitions was calculated for each NFI plot.

Sentinel-2 satellite images
The two Sentinel-2 (S2) satellites are equipped with multispectral sensors, which detect a broad electromagnetic spectrum (443 nm to 2202 nm) in 13 bands. Three of these bands (1: coastal aerosol, 9: water vapor, and 10: short-wave infrared (SWIR) cirrus) are measuring atmospheric properties and were not used in this study. S2 bottom of atmosphere (BOA) reflectance images acquired between 30 June and 31 July 2018 were mosaiced using the bands B2, B3, B4, B5, B6, B7, B8, B8A, B11, and B12. The bands in 20 m resolution were resampled to 10 m with the nearest neighbor resampling method. Cloud and shadow areas were masked out by using the classification map created by the atmospheric correction software (Sen2Cor, Louis et al. 2016). The remaining datasets were mosaiced, and color balancing was performed using the PCI Geomatics software (see Puliti et al. (2020) for more information).
S2 variables for each NFI plot were derived by extracting the area-weighted means of the pixel values intersecting with the sample plot polygons. The normalized difference vegetation index (NDVI) was calculated as band 8 minus band 4 divided by band 8 plus band 4.

Predicted site index
The site index layer of the Norwegian Forest Resource Map SR16 (Astrup et al. 2019) with a 16 x 16 m pixel size was used to apply our age models. The (predicted) site index (pSI) was mapped using climate and terrain variables in a boosted regression model utilizing the SI observed on NFI plots as the response (Astrup et al. 2019). Independence of age as input to site index prediction is crucial for age modelling that utilizes site index as a predictor variable. This was the case for the pSI map which was only based on climate and terrain variables. The pSI map has a resolution of 16 m x 16 m and is freely available (Norwegian Institute of Bioeconomy Research). Weighted means of the pSI pixels intersecting with the NFI plots were calculated. These weighted means were mapped to the closest SI level ( Figure 2). The RMSE and MD of the pSI were 3.9 (29.7%) and -2.3, respectively for spruce, 5.5 (54.8%) and -4.5, respectively for pine, and 5.1 (45.5%) and -3.5, respectively for birch. Clear regression towards the mean effects were visible as the lowest (6) and highest (26) SI levels never occurred in the pSI.

Tree species map
The tree species layer ) of the Norwegian Forest Resource Map SR16 (Astrup et al. 2019) with a 16 x 16 m pixel size was used to apply our age models for spruce, pine, and birch pixels. The tree species map was based on multitemporal S2 data and the Random Forest classifier using NFI plots as a reference. Overall accuracies of this map were 75% on plot level and 90% on stand level ).

Age modelling
An area-based approach (Naesset 2002) was utilized to model age observed at NFI plots using remotely sensed variables as predictors. Independent linear regression models for each SI were fit with the structure where y = g(Age) is the n-vector of observed age with n = number of NFI plots and g as a link function, X = design matrix for predictor variables including an intercept, ̂ = estimated parameters, ε = independently and normally distributed residuals, and 2 = the residual variance. For each SIspecific model, we started with a model including only the 95 th percentile of first ALS returns (h95_first) as a proxy for mean height. Final models were fit by forward and backward selection based on Akaike's Information Criterion (AIC) as stopping rule and further selection based on pvalues (p < 0.05). We tested models with the identity (untransformed response variable), squareroot, and natural logarithm as the link functions. Based on an initial analysis, the log transformation showed the best results and was chosen for further model development. We corrected for backtransformation bias by adding half the residual variance to the predictions before the backtransformation. Furthermore, we tested if adding predictors such as squared terms, and interactions between predictors improved the model.
We evaluated the models based on coefficient of determination (R 2 ), root-mean-squared-error (RMSE), and mean deviance (MD) according to (3) with = 1. Relative error statistics such as RMSE% and MD% were obtained by division by the mean of the observed values and multiplying by 100.

Validation with independent data
We evaluated our final model with the independent validation data. We mapped the stand age by applying our regression models according to the mapped tree species and pSI to the grid cells with predictor variables on a 16 m x 16 m raster. Synthetic estimates of stand age were obtained by calculating the mean predicted age for each forest stand. Finally, we compared the estimated stand age with the known age by calculating weighted versions of RMSE, RMSE%, and MD according to Equations 2 and 3, where the weights wi corresponded to the stand area proportion of the ith stand that sum up to 1 and n=63. All calculations were performed in R version 3.6.1 (R Core Team 2019).

Age modelling
In the following, we will focus on results for spruce. Corresponding results for pine and birch can be found in the Appendix. The age-height relationship got stronger with increasing SI for all tree species.
More variation was observed for NFI plots in forest stands above 100 years of age, especially for SI levels below 14.
The strength of the relationship between observed and predicted forest age for the eight SI-specific models increased with increasing SI (Figure 3 for spruce; for pine and birch see Appendix Figure 8 and Figure 9). A larger variation in the predictions of the models for SI 6 and 8 was clearly visible. For spruce, the adjusted R 2 ranged from 0.46 to 0.96, RMSE from 2.9 to 31.2 years, RMSE% from 6.4% to 25.8%, MD from -1.0 to 2.6 years, and MD% from -2.2% to 2.2% (Table 2). Average RMSE, RMSE%, MD, and MD% for all plots classes were 21.2 years, 25.1%, 1.0 years, and 1.2%, respectively ( Table 2). The results for pine and birch models were worse than for spruce, with R 2 ranging from 0.33 to 0.84 for pine and from 0.32 to 0.87 for birch (Appendix Table 6). For all models, the standard errors were much smaller than their corresponding parameter estimates, and most parameter estimates had pvalues < 0.001. The model details for the spruce models can be found in Table 9 (for pine and birch models Table 10 and Table 11, respectively). All models for spruce contained the 95th percentile of the ALS first returns (h95_first) as a predictor variable. The predictor variable h95_first squared (h95_first2) was included in all models except the one for SI 26, and one of the predictor variables latitude or longitude were included in all models except the models for SI 23 and 26. The models for SI 6, 8, 14, and 17 included at least one of the spectral S2-based variables such as NDVI, S2 band 8A (s2_8A) or 11 (s2_11). The models for SI 11, 14, and 17 contained the largest number of variables with 8, 10, and 8, respectively (Appendix Table 9). Besides the already mentioned predictors, they also included crown coverage in a height of 2 m and 10 m above ground (cc2, cc10), DTM elevation, distance to the coast (distC), terrain slope (slope), and the time difference between field and ALS data acquisition (diffT). The variable h95_first was the most important predictor in all models. This was assessed by re-fitting the models with standardized predictors. The predictors were centred around their mean, and then scaled by dividing by their standard deviation. The parameter estimates of the centred and scaled version of h95_first was the largest in all models.
We assessed the model behaviour by applying the models to data where the variable h95_first ranged from 0 to the maximum observed value of this variable and all other variables were set to their mean values (Figure 4, for pine and birch see Figure 10 and Figure 11). Below a h95_first of 10 m, all models behaved similar. Above 10 m, we observed similar models for SI 6, 8, and 11, and for SI 14, 17, 20, and 23. Given h95_first, the model for SI 23 appeared more similar to the models for SI 14 and 17 than to the models for SI 20 and 26. The models, however, contain more predictors whose influence is not considered in Figure 4. We applied the SI-specific models to the NFI data using the SI map (pSI) (see Section 1.1.2) to get accuracy metrics, which are more realistic for a practical application where only pSI is available. Because pSI did not contain predictions of 6 and 26, models for these SI categories were never used. RMSE ranged from 18.6 to 56.0 years (29.4% to 53.2%), and MD from 5.4 to 37.2 years (5.3% to 31.2%). Average RMSE and MD for all pSI categories were 41.1 years (48.8%) and 20.6 years (24.5%), respectively (Table 3, for pine and birch see Table 7). Except for pSI 8, RMSE and MD decreased with increasing pSI (Table 3). Age predictions for sample plots with SI corresponding to pSI follow the 1:1 line well, whereas plots with disagreement between observed SI and pSI showed systematic lack-offit ( Figure 5). This trend was most obvious for pSI 11 to 20. If pSI was larger than SI, the predicted age was too small (positive residual), whereas the opposite was observed if pSI was larger than SI. In the same way as for pSI, we used mapped tree species (see Section 1.1.2) for applying the SIspecific models to the NFI data to get accuracy metrics, which are more realistic for a practical application where only a tree species map is available. To isolate the effect of predicted tree species from the pSI effect, we used observed SI for this analysis. RMSE ranged from 3 to 32 years (7% to 25%), and MD from -4 to 2 years (-4% to 2%). Average RMSE and MD for all SI categories were 21 years (25%) and 0 years, respectively (Table 4, for pine and birch see Appendix Table 8). RMSE decreased with increasing SI, however, there was no clear trend for RMSE% and MD%. Age predictions for sample plots with predicted tree species corresponding to the tree species specific model followed the 1:1 line well, whereas plots with disagreement between observed and predicted tree species often showed lack-of-fit. This trend was most obvious for pine and birch. Overall, less variability was introduced by the tree species predictions compared to pSI.

Validation with independent field data
The estimated stand age of the validation stands resulted in area-weighted RMSE of 11.5 years (21.6%), and MD of 1.6 years (3.0%). The RMSE decreased with increasing pSI (Table 5). MD was large and negative for pSI 11, and large and positive for pSI 20. This is also reflected in the graphical representation of the results (Figure 6). For estimated pSI 11, stand ages were estimated with too large values, especially one observation with observed SI "Medium" was heavily overestimated. For estimated pSI 20, all stands with observed SI "Medium" were underestimated.   Figure 7 provides an impression of the age map in comparison to an aerial image for two validation stands. Observed stand ages were 52 and 72 years, and site index of the stands was "Medium". Estimated stand age was 50 and 82 years and the estimated site index was 14 and 17 for the two stands, respectively.

Discussion
We mapped forest stand age using a combination of ALS and Sentinel-2 based variables over a large study area in Norway encompassing various growing conditions. We used more than 4800 NFI sample plots with stand ages ranging from 3 to over 280 years and fit SI-specific models that were validated on stand-level using independent data. We found that forest age can be predicted with relatively high accuracy especially for forest younger than 100 years and that tree height estimated from airborne laser scanning and predicted site index were the most important variables in predicting age. While the results presented in the main manuscript only describe the results for spruce, the appendix provides similar results for pine and broadleaved (birch) dominated forests. In order to apply the proposed modelling approach, site index and tree species maps are required as input variables. Errors in these maps introduced errors in the prediction of age. With improved maps of tree species and site index, our results will improve as well, especially for forest stands above 100 years of age.
We obtained the best results with SI-specific models, which showed that SI was a critical variable for describing age. We observed that the age predictions corresponded well with the reference age where observed and predicted site index agree. However, plots with observed SI smaller than pSI were underpredicted, whereas plots with observed SI larger than pSI were overpredicted. Besides uncertainties in the models, this can be explained by the fact that trees on higher SI grow faster and therefore are taller at a given age. In the opposite case, if the actual SI is smaller than the predicted SI, age at a given height is underpredicted because trees grow slower than expected. While using the mapped instead of the observed tree species hardly changed the RMSE, a 100% increase in RMSE was observed when using the mapped instead of the observed SI. This underlines the importance of an accurate site index map for using the proposed modelling approach to obtain an age map with relatively high accuracy. Nonetheless, the quality of the SI map was sufficient for obtaining reasonably good results (RMSE of 11.5 years) of stand age estimates for validation stands although errors in the SI map were noticeable also on stand level.
Other studies in boreal and temperate forests modelling forest age obtained comparable results, even though they were conducted on smaller study sites. Racine et al. (2014) used a kNN approach and reported RMSE from leave-one-out cross-validation of 8.8 years (19%) for a 66 km 2 study site with 158 sample plots in Canada, where mean reference age ranged from 11 to 94 years. Maltamo et al. (2009) also used a kNN approach and reported RMSE of 18.8 years (87.9%) for spruce, 23.5 years (50.7%) for pine, and 18.7 years (101.7%) for deciduous dominated sample plots in Finland, where reference age ranged from 0 to 150 years. In total they used 335 NFI sample plots in a 22,000 ha study site for modelling. A study in temperate forests was conducted by Straub & Koch (2011) who used both airborne ALS and multispectral variables to model forest stand age in a small study area covering 9.2 km 2 , with 108 forest stands and 300 inventory sample plots in south-west Germany using linear regression. Forest age ranged from 0 to 153 years, and the forest area was composed of various deciduous and coniferous tree species. They reported an R 2 of 0.63, and RMSE of 19.7 years (28.8%). We obtained for more than 4800 NFI sample plots with age ranging from 3 to over 280 years RMSE between 21 and 25 years (23% to 29%). However, our study represents with almost 182,000 km 2 a much larger area, encompassing a wide range of growing conditions and forest structures. Our models can, therefore, be applied for practical forest management throughout the study area corresponding to the majority of the productive forests in Norway.
In the study by Maltamo et al. (2009), 69 independent validation stands with an average area of 1 ha and age ranging from 0 to 126 years were used, resulting in stand level RMSE of 18.3 years (36.3%) for spruce. In a Mediterranean to temperate climate in central Italy, Frate et al. (2015) obtained RMSE of 16 years (30%) using 305 independent validation stands with stand age ranging from 1 to 127 years and mean of 52 years. Our results on 63 independent validation stands were comparable with RMSE of 11.5 (21.6%). The smaller errors in our study might be related to younger stands ranging from 11 to 89 years, and the better performance of our models in younger forest stands compared to older ones. Maltamo et al. (2019, pers. communication) reported for forest stands below 100 years RMSE of 9 years. Our overall results for spruce with age up to 270 years was RMSE of 21 years. However, in an initial analysis, a model fitted with data using only NFI plots in spruce stands younger than 100 years (model not presented) resulted in smaller errors with RMSE and MD of 12.7 and 1.6 years, respectively, which fits to results by Maltamo et al. (2019, pers. communication). As results for the independent reference stands showed, our models performed well for stands below 100 years of age ( Figure 6). However, 36% of the NFI plots in the productive managed forest are above 100 years, and predictions for those stands would have resulted in severe under-estimations. It was also not possible to find a satisfying model classifying older than 100 years forest from younger forest.

Conclusions
The age of spruce, pine and broadleaved (birch) dominated forest stands was mapped on a fine scale (16m x 16 m) for a large study site using variables from remotely sensed data and SI-specific models. Tree height estimated from airborne laser scanning and predicted site index were the most important variables in the models describing age. Errors decreased with increasing site index. Above 100 years of age, the model predictions had more variation and higher uncertainty. Improved site index maps would be the single most important measure to improve the age prediction.

Acknowledgements
We would like to thank the Fylkesmann in Trøndelag, represented by Arne Rannem, for providing the forest stands used for validation.

Funding
This study was supported by the Norwegian Institute of Bioeconomy Research.