
© Sarah Yoga Bengbate, 2018

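Improving lidar-based estimation of the gross merchantable volume of balsam fir stands: analysis of factors influencing model accuracy and precision

(Original title: Amélioration de l'estimation du volume marchand brut des sapinières par le lidar : analyse de facteurs influençant l'exactitude et la précision des modèles)

Thesis

Sarah Yoga Bengbate

Doctorate in forest sciences

Philosophiæ doctor (Ph. D.)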

Supervised by:

Jean Bégin, research supervisor

Benoît St-Onge, research co-supervisor

Résumé

In Quebec, forests cover 761,100 km². Ensuring the sustainability of this natural resource requires sustainable, concerted management and detailed knowledge of the resource. The Ministère des Forêts, de la Faune et des Parcs (MFFP) therefore conducts periodic forest inventories to characterize ecoforest stands. During these inventories, forest attributes such as the gross merchantable volume (MV) are calculated. Remote sensing tools are also used to improve the quality of the inventories. They provide high-resolution data that are acquired objectively, repeatably and over large areas, and from which forest attributes can be estimated. The MFFP has undertaken a province-wide acquisition of lidar data to improve the characterization of forest land. Finding more accurate and precise models to predict forest attributes from lidar is a current concern for forest stakeholders.

The objective of this thesis is to improve lidar-based estimation of the MV of balsam fir stands in Quebec. Its originality lies in analysing the effects of four factors on MV modelling: the lidar sensor settings, the mortality present in the sample plots, the variability of growing conditions among study sites, and the time lag between the field inventory and the lidar survey. The best explanatory variables were first selected. A methodology was then developed to include the effects of these factors in the MV models. The results indicated that:

1) for a given study site, the mean height of first returns, the percentage of returns below 2 m and the canopy surface roughness index are the best explanatory variables of the model (pseudo-R² = 0.91; root mean square error, RMSE = 27.8 m³ ha⁻¹). At a larger scale (e.g., landscape), the optimal combination of explanatory variables comprises the mean height of first returns, the percentage of returns above 2 m and the standard deviation of the height of first returns above 2 m (pseudo-R² = 0.75, RMSE = 33.0 m³ ha⁻¹).

2) the mean height and the spatial distribution of first returns in the point cloud affect model precision. Including these parameters in a variance function makes it possible to model the heteroscedasticity of the residuals and to better estimate uncertainty (RMSE = 3.7 m³ ha⁻¹ multiplied by the variance function versus 28.0 m³ ha⁻¹ without the function).

3) the presence of dead trees in the sample plots increases the prediction error of the models. Crowns delineated on a canopy height model can be classified by tree status (dead/live) using variables describing the intensity distribution of the returns. Removing the returns associated with dead trees before the modelling step improves the predictions. For example, where the observed mortality level is below 33%, the RMSE decreased from 31.3 m³ ha⁻¹ to 27.7 m³ ha⁻¹, while the pseudo-R² increased from 0.88 to 0.90.

4) tree growing conditions can vary along a bioclimatic gradient. This variability, together with the time lag between the field inventory and the lidar survey, affects model accuracy. Applying a model specific to one study site to another study site can lead to biased predictions. Including a random site-level height adjustment coefficient (an indicator of between-site variability) and a growth adjustment function (to account for the time lag) improves the predictions (pseudo-R² = 0.86 versus 0.75, RMSE = 24.1 m³ ha⁻¹ versus 33.1 m³ ha⁻¹). Replacing the random coefficient with a bioclimatic variable gives similar predictions (pseudo-R² = 0.86, RMSE ≤ 24.3 m³ ha⁻¹). The bioclimatic gradient could affect stand structure, notably by modifying the height-diameter ratio of the trees.

This thesis confirms the ability of lidar to characterize the gross merchantable volume of balsam fir stands in Quebec. Analysing the factors that influence MV modelling improved the accuracy and precision of the models. In particular, this makes it possible to produce more reliable MV maps for forest stakeholders.

Summary

In Quebec, forests cover an area of 761,100 km2. To ensure the sustainability of this natural resource, improved knowledge and informed management are important. The Quebec Ministry of Forests, Wildlife and Parks (MFFP) conducts forest inventories periodically to characterize the structure of forest stands. During these inventories, forest attributes such as timber merchantable volume (MV) are calculated. Remote sensing tools are also used to enhance the quality of the inventories. They offer the advantage of providing large scale, sequential high-resolution data, from which forest attributes can be estimated. The MFFP also started the provincial acquisition of lidar data to improve forest characterization. There is a need to develop more accurate and precise models to predict forest attributes using lidar data for forest users.

The aim of this research is to improve lidar-based models to predict the MV of balsam firs in Quebec. The originality of this investigation lies in the analysis of the effects of four factors on the MV modelling. These are the lidar parameterization, the mortality in the sample plots, the variability of growth conditions and the temporal discrepancy between the field inventory period and the lidar survey. A search of the best explanatory variables was first made. A methodology was then developed to include the effects of the factors in the MV models. The results have shown that:

1) for a given study site, the average height of first returns, the percentage of first returns below 2 m and the canopy surface roughness index are the best explanatory variables of the model (pseudo-R² = 0.91; root mean squared error, RMSE = 27.8 m³ ha⁻¹). At a larger scale (e.g., landscape), the best subset of explanatory variables is the average height of first returns above 2 m, the percentage of first returns above 2 m and the standard deviation of the height of first returns above 2 m (pseudo-R² = 0.75, RMSE = 33.0 m³ ha⁻¹).

2) the average height and the spatial distribution of first returns have an effect on the models' precision. Including these parameters in a variance function makes it possible to model the residual heteroscedasticity and, consequently, to better estimate uncertainty (RMSE = 3.7 m³ ha⁻¹ multiplied by the variance function versus 28.0 m³ ha⁻¹ without the function).


3) the presence of dead trees in plots increases the models' prediction errors. Tree crowns delineated on a canopy height model can be classified according to tree status (live/dead) using variables describing the intensity distribution of returns. Removing the returns associated with dead trees prior to the modelling phase improves the predictions. For example, in a context where the observed mortality is less than 33%, the overall RMSE decreased from 31.3 m³ ha⁻¹ to 27.7 m³ ha⁻¹ while the pseudo-R² increased from 0.88 to 0.90.

4) the growing conditions of trees can vary along a bioclimatic gradient. This variability, along with the temporal discrepancy between the field inventory and the lidar survey, has an effect on model accuracy. Applying a site-specific model from one study site to another can lead to biased predictions. Adding a canopy height random coefficient (describing the site variability) and a growth function (accounting for the growth during the temporal discrepancy) improves the predictions (pseudo-R² = 0.86 versus 0.75, RMSE = 24.1 m³ ha⁻¹ versus 33.1 m³ ha⁻¹). Replacing the random coefficient with a bioclimatic variable provides similar predictions (pseudo-R² = 0.86, RMSE ≤ 24.3 m³ ha⁻¹). The bioclimatic gradient could therefore affect the structure of stands, particularly by modifying the height-diameter ratio of trees.

This thesis confirms the capacity of lidar to characterize the timber merchantable volume of balsam firs in Quebec. The analysis of factors influencing the modelling improved the accuracy and the precision of the models. This makes it possible to build more reliable MV maps for forest users.

Table of contents

Résumé
Summary
Table of contents
List of tables
List of figures
Acknowledgements
Foreword
Introduction
Chapter 1: Modelling the effect of the spatial pattern of airborne lidar returns on the prediction and the uncertainty of timber merchantable volume
    Résumé
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Conclusion
Chapter 2: Lidar and multispectral imagery classifications of balsam fir tree status for accurate predictions of merchantable volume
    Résumé
    Abstract
    Introduction
    Materials and Methods
    Results
    Discussion
    Conclusion
Chapter 3: A generalized lidar-based model for predicting the merchantable volume of balsam fir of sites located along a bioclimatic gradient in Quebec, Canada
    Résumé
    Abstract
    Introduction
    Materials and methods
    Results
    Discussion
    Conclusion
General conclusion

List of tables

Table 1.1 Summary of field plot measurements for the Montmorency Forest study site (119 plots)
Table 1.2 Summary of the lidar data for the Montmorency Forest study site (119 plots)
Table 1.3 Parameters of the merchantable volume models before (see Equation (6)) and after the addition of the variance function
Table 1.4 Comparison of four predictive merchantable volume (MV) models before (basic model) and after addition of a spatial distribution variable
Table 1.5 Comparison of the merchantable volume (MV) models before (basic model) and after addition of a variance covariate
Table 1.6 Effects of Mean_h1st and Skew_area variables on the residual standard deviation (RSD) of the predicted merchantable volume
Table 2.1 Summary of the field plot measurements for the Montmorency Forest study site (61 plots)
Table 2.2 Summary of the lidar data for the Montmorency Forest study site (61 plots)
Table 2.3 Confusion matrices of two random forest classifications using lidar classifiers only (upper) or combined with multispectral imagery classifiers (lower)
Table 2.4 Random forest predictions of tree status and mortality ratios after a lidar-based classification or after a combined lidar and multispectral imagery-based classification
Table 2.5 Comparison of merchantable volume (MV) models built before or after a point cloud filtering
Table 2.6 Root mean square errors (RMSEs) of the merchantable volume (MV) models presented by class of mortality
Table 3.1 Summary of the field data
Table 3.2 Summary of the lidar data
Table 3.3 Anova comparisons amongst candidate merchantable volume (MV) models
Table 3.4 Average biases of the merchantable volume (MV) models per study site
Table 3.5 Fixed coefficients of candidate merchantable volume models
Table 3.6 Canopy height random coefficients of candidate merchantable volume models

List of figures

Figure 1.1 Locations of the Montmorency Research Forest study site and the field plots over a lidar DTM
Figure 1.2 Spatial distribution pattern of returns for the field plot #28 surveyed by three lidar strips (#8, #9, #10)
Figure 1.3 Area distribution for the triangulated returns of the field plot #28 surveyed by three lidar strips (#8, #9, #10)
Figure 1.4 Scatter plot of predicted versus field merchantable volume
Figure 1.5 Scatter plots of standardised (Std) residuals versus predicted merchantable volume (MV), average height of first returns (Mean_h1st) and skewness of the area distribution for triangulated returns (Skew_area)
Figure 2.1 Location of the Montmorency Forest study site and the field plots over a lidar digital terrain model
Figure 2.2 Workflow diagram describing the steps from the lidar and multispectral imagery data processing to the classification of individual tree crowns and the prediction of the merchantable volume
Figure 2.3 Normalized intensity distributions of lidar first returns (above 2 m) of eight field plots having different mortality ratios
Figure 2.4 Tree crowns delineation and classification for the field plot #124
Figure 2.5 Variable importance plot for the lidar-based classification (top) and the combined lidar and multispectral imagery-based classification (bottom) of live and dead tree crowns
Figure 3.1 Location of the study sites
Figure 3.2 Scatter plots of the field merchantable volume (MV) against the explanatory variables
Figure 3.3 Scatter plots of Model 4 canopy height (CH) random coefficients plotted against the height-diameter ratio (H-D ratio), the elevation, and the growing degree days variables presented by study site

Acknowledgements

First and foremost, I thank God, who gave me the perseverance to begin and to complete my studies. The doctoral years were full of challenges, but the joy of the Lord was my strength.

My sincere thanks to Jean Bégin, my research supervisor, for welcoming me into his team. Thank you for your supervision and your patience throughout my doctoral studies. Many thanks for your prompt corrections and for the financial support.

I thank Benoît St-Onge, my co-supervisor, for his valuable advice. Thank you for agreeing to supervise my studies.

Many thanks to the co-authors Martin Riopel (MFFP), Demetrios Gatziolis (US Forest Service) and Gaétan Daigle (Université Laval). Your expertise in data analysis, remote sensing and statistical analysis greatly improved the quality of my work.

I would like to thank everyone who assisted me with geospatial data processing, in particular Pierre Racine (Centre d'étude de la forêt) and Stéphane Biondo (Université Laval).

Thank you to Vernon Singhroy (Canada Centre for Mapping and Earth Observation) for your availability and your attentive ear.

Many thanks to my family, the super Yogas, for encouraging me throughout all these years of study, from Little Woods kindergarten to the doctorate. Thank you for your support: I love you!

I thank the financial partners of this project: the Fonds de recherche du Québec - Nature et technologies (FRQNT) and the Ministère des Forêts, de la Faune et des Parcs (MFFP) of Quebec, Canada. Many thanks to the MFFP and to the Montmorency Forest for access to the data used in this project.

Many thanks to the examiners of this thesis for their critiques.

As I cannot name everyone individually, many thanks to all the collaborators and friends who contributed to this thesis. Thank you!

Foreword

This doctorate is written as a thesis by articles. It comprises five parts: a general introduction, the body of the thesis consisting of three manuscripts (Chapters 1 to 3), and a general conclusion.

Chapters 1, 2 and 3 have been published in peer-reviewed journals.

Chapter 1: Yoga, S., Bégin, J., St-Onge, B., & Riopel, M. (2017). Modelling the effect of the spatial pattern of airborne lidar returns on the prediction and the uncertainty of timber merchantable volume. Remote Sensing, 9(8), 808.

Chapter 2: Yoga, S., Bégin, J., St-Onge, B., & Gatziolis, D. (2017). Lidar and multispectral imagery classifications of balsam fir tree status for accurate predictions of merchantable volume. Forests, 8(7), 253.

Chapter 3: Yoga, S., Bégin, J., Daigle, G., Riopel, M., & St-Onge, B. (2018). A generalized lidar-based model for predicting the merchantable volume of balsam fir of sites located along a bioclimatic gradient in Quebec, Canada. Forests, 9(4), 166.

I am the first author of the three manuscripts. I carried out the literature review, the lidar data processing, the statistical analyses and the writing of the manuscripts. Jean Bégin, Benoît St-Onge and I defined the theoretical framework of the manuscripts. Jean Bégin and Gaétan Daigle provided statistical expertise. Benoît St-Onge and Demetrios Gatziolis provided remote sensing expertise. Martin Riopel processed the inventory data.

Introduction

In Quebec, forests cover 761,100 km². An important economic driver, the forest provides more than 60,000 jobs (MFFP, 2018). Sustainable, concerted management of this natural resource and detailed knowledge of it are essential to ensure its long-term survival. As a living ecosystem, the forest has a structure that varies in time and space (Pretzsch, 2010). This structure can be characterized by dendrometric data collected during forest inventories, such as the gross merchantable volume (MV), a forest attribute that summarizes stand structure. Because MV is difficult to measure on standing trees, it is traditionally modelled from other dendrometric variables (e.g., tree height, diameter, form) or, in recent years, from remote sensing tools.

Remote sensing tools have been widely used to improve knowledge of ecoforest stands (Hyyppä et al., 2000; Lefsky et al., 2002; Reutebuch et al., 2005; Wulder, 1998). They make it possible to characterize stand structure remotely and to obtain information on it over time. Data can thus be collected objectively, repeatably and over large areas (White et al., 2016; Wulder et al., 2012). Some tools, such as light detection and ranging (lidar), also provide high-resolution data. Lidar is an active sensor that emits pulses of laser light towards a target object on the ground and then records the signal reflected by the object. Depending on the characteristics of the target, the pulses can be reflected at different heights; these are referred to as multiple returns. The time interval between the emitted and the reflected signal is then converted into a distance. Knowing the distance between the sensor (of known altitude) and the target, the altitude of the target can be calculated. The lidar output is a point cloud built from the returns. The structure of the cloud depends on the sensor settings (e.g., scan angle, pulse frequency), the properties of the target (e.g., shape, reflectance), environmental factors (e.g., cloud cover) and the interaction among these components (Roncat et al., 2014). It can be described using descriptive variables computed from the cloud (e.g., height, density). These variables can also be used to characterize the attributes of the target, for example the MV of a stand (Bouvier et al., 2015; Kotivuori et al., 2016; Næsset, 2004c; Treitz et al., 2012). Lidar is useful for the systematic estimation of forest attributes over large areas (e.g., landscape, province, country), all the more so as this is difficult to achieve with a traditional forest inventory (Maltamo et al., 2005; Næsset, 2007; Nilsson et al., 2017; Nord-Larsen & Schumacher, 2012).
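To illustrate the kind of descriptive variables computed from a point cloud, the minimal Python sketch below derives a few plot-level height metrics from a list of returns. The input layout (tuples of ground-normalized height and return number), the function name `plot_metrics` and the metric names are assumptions made for this example only; it is not the processing chain used in the thesis.

```python
from statistics import mean, stdev


def plot_metrics(returns, height_break=2.0):
    """Summarize the point cloud of one plot.

    `returns` is a list of (height_m, return_number) tuples, with heights
    already normalized to ground level (hypothetical input layout).
    At least two first returns above `height_break` are needed for stdev.
    """
    all_heights = [h for h, _ in returns]
    first = [h for h, n in returns if n == 1]
    first_above = [h for h in first if h >= height_break]
    n_below = sum(1 for h in all_heights if h < height_break)
    n_total = len(all_heights)
    return {
        "mean_h_first": mean(first),                 # mean height of first returns
        "pct_below": 100.0 * n_below / n_total,      # % of returns below the break
        "pct_above": 100.0 * (n_total - n_below) / n_total,
        "sd_h_first_above": stdev(first_above),      # spread of upper first returns
    }


# Toy plot with five returns (made-up values).
example = [(12.0, 1), (10.0, 1), (0.5, 2), (1.0, 1), (8.0, 2)]
metrics = plot_metrics(example)
```

Metrics of this kind would then serve as candidate explanatory variables in a volume model; which ones are retained depends on the model selection step.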

Selecting the best explanatory variables and an optimal statistical model, and properly describing the factors that affect the modelling of forest attributes, remain recurring issues in forestry. In Quebec, these issues are particularly topical since the Ministère des Forêts, de la Faune et des Parcs undertook a province-wide acquisition of lidar data to improve the characterization of forest land (MFFP, 2016). The general objective of this thesis is therefore to improve lidar-based estimation of the MV of balsam fir stands in Quebec. The specific objectives are:

• to analyse four factors influencing MV modelling, namely the lidar sensor settings, the mortality present in the sample plots, the variability of growing conditions among the study sites, and the time lag between the field inventory and the lidar survey;

• to describe the effects of these factors on MV estimation; and

• to improve the accuracy and precision of the models.

The lidar sensor settings affect the spatial distribution of returns in the point cloud and can therefore influence the characterization of the target object. Roussel et al. (2018) observed a correlation between the incidence angle of the sensor and the mean height of a digital canopy model. Morsdorf et al. (2008) and Næsset (2004b) observed that increasing the flight altitude leads to an underestimation of the mean height of individual trees. Similarly, irregular spacing of returns in the cloud results in alternating dense clumps of returns and areas without returns (Balsa-Barreiro et al., 2012). This can lead to over-representation of the target object or, conversely, to a lack of spatial information. The alternation of over-represented and under-represented areas should increase the variability of the descriptors and matter more in heterogeneous stands. The distribution of returns would therefore affect the modelling of forest attributes by causing heteroscedasticity in the residuals. If so, modelling this heteroscedasticity is important because imprecise predictions of forest attributes directly affect the economic valuation of the forest (Holopainen et al., 2010). Treitz et al. (2012) showed that decimating the point cloud (down to 0.5 pulses m⁻²) has no effect on the accuracy of forest attribute models. However, few studies have addressed the uncertainty of lidar-based predictions of forest attributes (e.g., Ehlert & Heisig, 2013; Morsdorf et al., 2008; Næsset, 2004b). Chapter 1 of this document aims to characterize the spatial distribution of returns and to analyse the effect of the lidar settings on the accuracy and precision of MV estimates.

Point clouds sometimes contain data that hinder the assessment of certain attributes. For example, returns associated with dead trees act as "noise" when modelling the merchantable volume of live trees. The modelled attributes are then overestimated or underestimated because of the noisy data. Filtering the point cloud to keep only relevant information would be one way to improve the models. Some studies have shown that lidar and multispectral imagery correctly discriminate tree status (live/dead) (Gates et al., 1965; Kim et al., 2009; Näsi et al., 2015; Wing et al., 2015). Kim et al. (2009), for example, observed a difference in lidar intensity between live and dead trees. Wing et al. (2015) showed that a filtering algorithm can specifically discriminate the returns associated with dead trees in the cloud. These studies were conducted at sites where high mortality was observed following a disturbance. To our knowledge, no study has examined the use of remote sensing tools to improve predictions of forest attributes in a context of low mortality. Yet estimating these attributes well remains important for optimal management of the resource and for determining the market value of the timber. Chapter 2 of this thesis aims to verify whether the relationship between the MV of balsam fir stands and the descriptive variables of the point cloud can be improved by removing the returns associated with dead trees. Individual tree crowns were first classified using variables derived from lidar signal intensity and digital imagery.
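The filtering idea can be sketched in a few lines of Python. The example assumes that crowns have already been delineated and labelled live or dead, and it approximates each crown by a circle; the data layout, the field names and the function `drop_dead_tree_returns` are hypothetical and serve only to show the principle of discarding returns that fall inside dead crowns before computing point cloud metrics.

```python
from math import hypot


def drop_dead_tree_returns(returns, crowns):
    """Keep only the returns that do not fall inside a crown labelled 'dead'.

    `returns`: list of (x, y, z, intensity) tuples.
    `crowns`: list of dicts with keys x, y, radius, status (circular crowns
    are a simplifying assumption; real crowns are irregular polygons).
    """
    dead = [c for c in crowns if c["status"] == "dead"]

    def in_dead_crown(x, y):
        return any(hypot(x - c["x"], y - c["y"]) <= c["radius"] for c in dead)

    return [r for r in returns if not in_dead_crown(r[0], r[1])]


# Made-up crowns and returns for illustration.
crowns = [
    {"x": 0.0, "y": 0.0, "radius": 1.0, "status": "dead"},
    {"x": 5.0, "y": 5.0, "radius": 1.0, "status": "live"},
]
returns = [
    (0.5, 0.0, 10.0, 30),   # inside the dead crown, removed
    (5.0, 5.2, 12.0, 80),   # inside the live crown, kept
    (3.0, 3.0, 0.2, 50),    # outside any crown, kept
]
kept = drop_dead_tree_returns(returns, crowns)
```

Plot-level metrics computed on `kept` rather than on `returns` would then describe the live trees only.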

Another recurring issue in forest modelling is the prediction of forest attributes over large areas (e.g., landscape). When growing conditions vary along a bioclimatic gradient, transferring models from one study site to another or establishing a general predictive model can be complex. Bouvier et al. (2015), Kotivuori et al. (2016) and Nord-Larsen and Schumacher (2012), for example, observed an effect of between-site variability on volume modelling. They obtained larger prediction errors in forests of greater structural complexity (e.g., mixed forests, high-latitude forests). In addition, a time lag between the field inventory and the lidar survey affects the correlation between the data. Bright et al. (2013) suggested that a time lag could have a negative effect on the modelling of forest attributes such as basal area. Some studies have therefore examined how to optimize the modelling of forest attributes over large areas (e.g., landscape, province). Hansen et al. (2014), for example, included variables describing climate, topography and soil in their model to estimate the height and canopy cover density of study sites located along a 4000 km transect. Næsset and Gobakken (2008) and Nilsson et al. (2017) first calibrated individual forest attribute models in each of their study sites and then combined the models to characterize these attributes over a large area. Several questions nevertheless remain to be resolved to improve predictions of forest attributes. For example: 1) does combining lidar data from different study sites increase the prediction biases for individual sites? 2) can including the effect of between-site variability in the model reduce these biases? 3) how can forest attributes be predicted in unstudied sites? 4) what is the effect of the time lag on the models?

Chapter 3 therefore aims to analyse the effects of between-site variability and of the time lag on MV modelling for balsam fir stands located along a temperature and elevation gradient.

Chapter 1: Modelling the effect of the spatial pattern of airborne lidar returns on the prediction and the uncertainty of timber merchantable volume

Résumé

Lidar data are regularly used to characterize forest structures. In this study, we analyse the effects of three lidar attributes (return density, return spacing, scan angle) on the prediction and the uncertainty of the gross merchantable volume of balsam fir stands. Predictor variables derived from the point cloud were combined in a nonlinear regression model. The analysis showed a good correlation between observed and predicted volumes (pseudo-R² = 0.91). However, the residuals were heteroscedastic. The uncertainty associated with the predictions was then better characterized by adding to the model a variance function describing the variability caused by the mean height and the spatial distribution of returns. The residual standard deviation was better estimated (3.7 m³ ha⁻¹ multiplied by the variance function versus 28.0 m³ ha⁻¹ without the variance function). We found no effect of return density on the predictions (p-value = 0.74).

Keywords: lidar; modelling; gross merchantable volume; balsam fir stand; spatial distribution of returns; return density; return spacing; scan angle; residual variance.

Abstract

Lidar data are regularly used to characterize forest structures. In this study, we determine the effects of three lidar attributes (return density, return spacing, scanning angle) on the accuracy and the uncertainty of timber merchantable volume estimates of balsam fir stands (Abies balsamea (L.) Mill.) in eastern Canada. We used lidar point clouds to compute predictor variables of the merchantable volume in a nonlinear model. The best model included the average height of first returns, the percentage of first returns below 2 m and the canopy surface roughness index. Our analysis shows a high correlation between the lidar and the field data of 119 plots (pseudo-R² = 0.91). However, residuals were heteroscedastic. More precise parameter estimates were obtained by adding to the model a variance function of variables describing the average height of returns and the skewness of the area distribution of triangulated lidar returns. The residual standard deviation was better estimated (3.7 m³ ha⁻¹ multiplied by the variance function versus 28.0 m³ ha⁻¹ without the variance function). We found no effect of return density on the predictions (p-value = 0.74). This suggests that the height and the spatial pattern of returns, rather than the return density, should be considered when assessing the uncertainty of the merchantable volume.

Keywords: lidar-based model; timber merchantable volume; balsam fir; spatial distribution; return density; return spacing; scanning angle; residual variance.

Introduction

Airborne laser scanning (ALS or lidar) has proven to be a useful 3D tool for the measurement of forest attributes. It generates a point cloud, which is a three-dimensional representation of the volumetric interaction between the incident laser pulses and the illuminated objects. The spatial distribution of returns within the point cloud depends on the characteristics of the lidar system and of the target object (Gatziolis & Andersen, 2008; Korpela et al., 2010). The scan mechanisms (e.g., oscillating mirror, rotating mirror, nutating mirror) mounted on lidar scanners produce different scanning patterns of returns on the ground (Gatziolis & Andersen, 2008). For example, the bi-directional scan mechanism of oscillating mirrors produces a see-saw scanning pattern where the incident pulses tend to be more homogeneously distributed along the center of the flight line than at the borders. Consequently, an alternation of local clumps of returns and local gaps will be observed in the point cloud (Balsa-Barreiro et al., 2012). Scanning angle is another setting that influences the spatial distribution of returns (Roussel et al., 2018). High scanning angles increase the distance traveled by the incident pulse. Objects located further from the sensor are likely to be occluded or undetected (Ehlert & Heisig, 2013), thus producing an irregular spatial distribution. The effects of the surface characteristics of scanned objects on the scanning pattern are not controlled by the user. Gatziolis and Andersen (2008), for example, demonstrated that higher return densities are recorded in forest stands than in green pastures under similar lidar settings. Korpela et al. (2010) noted that leaf size and orientation and foliage density affect the intensity of lidar returns. Balsa-Barreiro and Lerma (2014) noted that topography and land cover influence the density and the spacing of returns.

Variables describing the point clouds (e.g., height percentiles, percent canopy cover) are often used to estimate forest attributes such as volume, and high correlations between field and predicted volumes have been achieved in many studies (Bouvier et al., 2015; Maltamo et al., 2006; Næsset, 2004a; Treitz et al., 2012). Studies on lidar modelling focus on identifying the explanatory variables that best explain the variability of the forest attributes. The relationship between the explanatory and dependent variables is obtained by means of a regression model. Least squares procedures are commonly used to estimate the parameters of the model (Beal & Sheiner, 1988). These procedures minimize the sum of squared residuals. Heteroscedasticity occurs in a model when the residual variance is not constant (Beal & Sheiner, 1988; Wolter, 2007). Possible causes of heteroscedasticity include measurement errors in the data, autocorrelation and misspecification of the model. Heteroscedasticity can lead to a significant loss of model precision: the estimated parameters are inefficient, although unbiased, and the standard errors and confidence intervals are unreliable. In other words, the forest attribute can still be predicted, but the predictions are uncertain. In an operational context, uncertain predictions of a forest attribute such as the timber merchantable volume have a direct effect on the evaluation of its economic value (Holopainen et al., 2010). Heteroscedasticity may be accounted for by using different methods (Beal & Sheiner, 1988; Pinheiro & Bates, 2000; Worrall et al., 2008). For example, variable transformation is a method in which the measurement scale of variables is changed to decrease heteroscedasticity (Beal & Sheiner, 1988; Ruppert, 2014). Variance modelling identifies the sources of heteroscedasticity in a model and attempts to better estimate the resulting uncertainty through a variance function (Pinheiro & Bates, 2000).

Few studies have examined the effects of lidar on the uncertainty of predictive models (e.g., Ehlert & Heisig, 2013; Morsdorf et al., 2008; Næsset, 2004b). Yet, with the increasing use of lidar in forestry, reliable estimates of forest attributes derived from lidar remain necessary. There is, therefore, a need to investigate whether factors such as the spatial pattern of returns can also influence the accuracy or the uncertainty of modeled forest attributes. This study aims at analyzing the effects of three lidar attributes (return density, return spacing, scanning angle) on the predictions and the uncertainty of timber merchantable volume estimates.

Materials and Methods

Study area

The study was conducted at the Montmorency Forest, the teaching and research forest of Université Laval. The forest is located in central Quebec, Canada (47.3°N, 71.1°W), about 70 km north of Quebec City (Figure 1.1). The study site covered an area of 66 km². Elevation at the site ranges from approximately 460 to 1040 m above sea level. The average annual precipitation is 1589 mm, and the average annual temperature is 0.3°C (Environnement Canada, 2016). The site lies on the Laurentian Plateau and is part of the balsam fir/white birch domain. Balsam fir (Abies balsamea [L.] Miller), black spruce (Picea mariana [Miller] BSP) and white spruce (Picea glauca [Moench] Voss) are the dominant conifers found in the forest. White birch (Betula papyrifera Marshall) and trembling aspen (Populus tremuloides) are also common (Bélanger, 2001).

Figure 1.1 Locations of the Montmorency Research Forest study site and the field plots over a lidar DTM

Field plot data

The Montmorency Forest has a network of circular permanent plots (radius = 11.28 m, area = 400 m²). The permanent plots are re-measured on a five-year cycle. The plot center positions were georeferenced using a dual-frequency antenna GPS (Trimble Yuma, accuracy ~ 30 cm) and were not post-processed. A sample of 119 plots was selected according to the following criteria: 1) the dominant species was balsam fir, 2) the last cut was before 1996 and 3) the dominant height was at least 7 m. The merchantable trees with a diameter at breast height (dbh) above 9 cm were measured using a diameter tape. Tree height was measured on eight sample trees per plot using a Vertex hypsometer. A height-diameter relationship was fitted for each plot (nonlinear mixed effects model) and then used to calculate the height of the other trees. The volume of each merchantable tree (MV) was computed using height- and diameter-based equations as described in Fortin et al. (2007). The total MV was computed as the sum of the individual MVs. We used field data that were collected between 2007 and 2014. Twenty-four field plots had been initially inventoried in 2011, the same year as the lidar survey. Attributes of the other plots were estimated by interpolating the field measurements between two inventory dates. Table 1.1 presents a summary of the field data.

Table 1.1 Summary of field plot measurements for the Montmorency Forest study site (119 plots)

| Field attribute | Average | Standard deviation | Range |
|---|---|---|---|
| Diameter at breast height (cm) | 21.9 | 5.2 | 13.3 – 38.0 |
| Dominant height (m) | 15.1 | 4.2 | 9.2 – 26.6 |
| Density (trees ha⁻¹) | 1433.0 | 536.0 | 117.0 – 3078.0 |
| Basal area (m² ha⁻¹) | 25.8 | 10.9 | 3.7 – 49.5 |
| Merchantable volume (m³ ha⁻¹) | 144.1 | 90.2 | 21.0 – 411.9 |

Lidar data

The airborne laser data were acquired in August 2011 with an Optech ALTM 3100 discrete return sensor. The average flying altitude was 1000 m above ground. The pulse repetition rate was 100 kHz, with a 50% strip overlap. The scanning angle ranged from 0 to 24° relative to the nadir. The sensor could record up to four returns per pulse. Each field plot was covered by two to four lidar strips. The average above-ground return density was 6.4 points m⁻². Information for each return was recorded in .las files (version 1.0). The returns classified as ground (by the vendor) were interpolated to construct a digital terrain model using the LAStools software (Isenburg, 2016). Non-ground returns were normalized by computing their elevation difference with the underlying terrain model.

Data analysis

The data analysis is organized in six sections: section 1) describes the nonlinear generalized least squares regression, section 2) provides an analysis of the spatial distribution of returns using an example field plot, section 3) calculates the lidar variables from the point clouds, section 4) describes the merchantable volume modelling, section 5) analyses the residual variance, and section 6) provides the model validation.

1) Generalized least squares in nonlinear regression

Heteroscedasticity occurs in a nonlinear regression (NLIN-GLS) model when the error variance is not constant over all the observations (Pinheiro & Bates, 2000):

$$y_j = f(x_j; \beta) + \varepsilon_j \quad \text{with} \quad \mathrm{var}(\varepsilon_j) = \sigma^2 \, g^2(z_j; \delta) \tag{1}$$

where $y_j$ is the value of the response variable measured in the jth experimental unit, expressed as a known function f(.) of some explanatory variables ($x_j$) and parameters ($\beta$) plus a random error term ($\varepsilon_j$). In this model, heteroscedasticity is accounted for by letting the variance of the error term vary among observations: it is the product of a constant $\sigma^2$ and the square of a known function g(.) of some covariates $z_j$ and parameters $\delta$.

Equation (1) was used to predict the merchantable volume of our study site. The variance functions g (.) that we have considered had one of the following expressions (Pinheiro & Bates, 2000):


$$g(z_j; \delta) = \begin{cases} |z_{1j}|^{\delta} \\ \exp(\delta z_{1j}) \\ |z_{1j}|^{\delta_1} \, |z_{2j}|^{\delta_2} \\ \exp(\delta_1 z_{1j}) \, \exp(\delta_2 z_{2j}) \\ |z_{1j}|^{\delta_1} \, \exp(\delta_2 z_{2j}) \end{cases} \tag{2}$$

The first variance function is a power function of the absolute value of the covariate $z_{1j}$, the second one is an exponential function of this covariate, and the others are various combinations of the previous ones.
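As an illustration only, the five candidate forms of Equation (2) can be written as small Python functions. This is a minimal sketch, not the code used in the study (the models were fitted in R with the nlme package); the function names and the helper `error_sd` are hypothetical.

```python
import math

# Hypothetical helper names; each function returns g(z; delta) from Equation (2).

def power_var(z1, delta):
    """Power of the absolute value of one covariate: |z1|^delta."""
    return abs(z1) ** delta

def exp_var(z1, delta):
    """Exponential of one covariate: exp(delta * z1)."""
    return math.exp(delta * z1)

def power_power_var(z1, z2, d1, d2):
    """Product of two power functions: |z1|^d1 * |z2|^d2."""
    return abs(z1) ** d1 * abs(z2) ** d2

def exp_exp_var(z1, z2, d1, d2):
    """Product of two exponential functions: exp(d1*z1) * exp(d2*z2)."""
    return math.exp(d1 * z1) * math.exp(d2 * z2)

def power_exp_var(z1, z2, d1, d2):
    """Power function of z1 times exponential function of z2."""
    return abs(z1) ** d1 * math.exp(d2 * z2)

def error_sd(sigma, g_value):
    """Standard deviation of the error term under Equation (1):
    sqrt(sigma^2 * g^2) = sigma * g for positive sigma and g."""
    return sigma * g_value
```

Under Equation (1), the standard deviation of the error for a given observation is simply `sigma` multiplied by the value returned by the chosen function.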

The covariates describing the spatial distribution pattern of returns (SP), which are described later, were included in this model either in the fixed part, through the f(.) function, or in the random part, through the g(.) function.

2) Analysis of the spatial distribution of returns

The return density, return spacing and scanning angle play an important role in the spatial distribution of returns. To provide a visual example of this, we extracted a sample of the 2d distribution of all the returns of a given field plot. The plot had been covered by three lidar strips (#8, #9 and #10) during the survey (Figure 1.2). The strips had an average return density of 2.15, 2.19 and 2.04 points m⁻², respectively, an average return spacing of 6.8, 6.8 and 7.0 dm, and an average scanning angle of 16, 1.6 and 9.4°. The returns were plotted in a 2d geographic grid and a line segment was drawn between consecutive returns to show the scanning sequence. Strips #8 and #10 had a heterogeneous spatial distribution of returns: local clumps of returns alternated with local gaps. Conversely, strip #9 had a more homogeneous spatial distribution of returns throughout the plot. When the three strips were pooled together at the plot level, the average return spacing decreased markedly (4.0 dm). However, local gaps were still observable within the plot.
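The density and spacing values reported for this plot are consistent with the common approximation in which the nominal spacing is the inverse square root of the density. The text does not state how spacing was computed, so the sketch below is an assumption, as is the idea that the pooled density is the sum of the three strip densities; the function name is hypothetical.

```python
import math

def nominal_spacing_dm(density_per_m2):
    """Nominal return spacing in decimetres, assuming spacing = 1 / sqrt(density)."""
    if density_per_m2 <= 0:
        raise ValueError("density must be positive")
    return 10.0 / math.sqrt(density_per_m2)  # 1 m = 10 dm

# Average return densities (points per square metre) reported for the example plot.
strip_densities = {"#8": 2.15, "#9": 2.19, "#10": 2.04}

for strip, density in strip_densities.items():
    print(strip, round(nominal_spacing_dm(density), 1))

# Assumption: all three strips cover the whole plot, so pooled density is their sum.
pooled_density = sum(strip_densities.values())
print("pooled", round(nominal_spacing_dm(pooled_density), 1))
```

Under these assumptions the approximation reproduces the reported spacings of 6.8, 6.8, 7.0 and 4.0 dm, but it says nothing about the local clumps and gaps, which is why the area distribution of triangulated returns is examined next.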


Figure 1.2 Spatial distribution pattern of returns for the field plot # 28 surveyed by three lidar strips (#8, #9, #10)

Thick areas depict clumps of returns and white areas depict gaps where there are no returns.

We triangulated neighboring returns into a 3d network to analyze the spacing distribution. This was done using the Delaunay triangulation algorithm of the ArcGIS software. The area of each triangle was calculated. Figure 1.3 shows the area distribution of triangulated returns for the previous example field plot. Strips #8 and #10, which had a heterogeneous pattern (see Figure 1.2), produced an asymmetric right-skewed area distribution. The areas ranged from 0.2 to 211.44 dm² and from 0.68 to 217.29 dm², respectively. Conversely, strip #9, which had a homogeneous pattern, produced a symmetric leptokurtic area distribution. The areas ranged from 0.14 to 67.12 dm². The heterogeneous pattern observed when pooling all strips together induced an asymmetric right-skewed mesokurtic area distribution. The number of small triangles increased. The areas ranged from 0.06 to 53.14 dm². This example shows that an irregular distribution of returns persisted locally even with three overlapping strips. Strips were always pooled together in the following analyses.
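The area statistics described above can be sketched as follows, assuming the triangulation has already been produced (the study used ArcGIS for that step). The coordinates and triangle indices are hypothetical, and the moment-based definitions of skewness and kurtosis are an assumption, since the text does not say which estimator was used.

```python
import math

def triangle_area(p, q, r):
    """Area of a triangle with 3D vertices: half the norm of the cross product."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def central_moment(values, k):
    """k-th central moment (population form)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5 (positive means right-skewed)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (positive means leptokurtic)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical returns (x, y, z in metres) and a hypothetical triangulation
# given as index triplets; a real one would come from a Delaunay algorithm.
points = [(0, 0, 10), (1, 0, 10), (0, 1, 10), (1, 1, 10), (3, 0.5, 10)]
triangles = [(0, 1, 2), (1, 3, 2), (1, 4, 3)]

# Convert square metres to square decimetres (1 m2 = 100 dm2).
areas_dm2 = [100.0 * triangle_area(points[a], points[b], points[c])
             for a, b, c in triangles]
print(areas_dm2, skewness(areas_dm2))
```

In this toy case the isolated return at x = 3 m creates one large triangle, which pulls the area distribution to the right; this is the kind of pattern that Skew_area is meant to capture.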


Figure 1.3 Area distribution for the triangulated returns of the field plot # 28 surveyed by three lidar strips (#8, #9, #10)

3) Lidar variable generation

Lidar returns were clipped to the spatial boundary of each plot to extract the corresponding point cloud. Lidar variables were computed at both the strip and plot levels using the Fusion software (McGaughey, 2014). They were classified into three standard groups of explanatory variables: canopy height variables (CH: height percentiles, average, mode), canopy density variables (CD: percentage of first returns above or below a 2 m, 5 m or 7 m threshold) and canopy structural heterogeneity variables (CSH: standard deviation, variance, coefficient of variation, canopy surface roughness index). The canopy surface roughness index, hereafter called rumple index (Ri), was computed as the ratio of the canopy outer surface area to the ground surface area, as measured by the lidar-derived canopy surface and digital terrain models in a 1 m × 1 m grid (Kane et al., 2010).
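To make the three retained variables concrete, the sketch below computes an average first-return height, a percentage of first returns below a threshold, and a rumple index from a gridded canopy surface. It is not the Fusion implementation: the function names are hypothetical, the canopy surface is approximated by two triangles per grid cell, and the ground surface is taken as the planimetric area, which are all simplifying assumptions.

```python
import math

def triangle_area_3d(p, q, r):
    """Area of a 3D triangle: half the norm of the cross product."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def mean_height(first_return_heights):
    """Average normalized height of first returns (a CH variable)."""
    return sum(first_return_heights) / len(first_return_heights)

def percent_below(first_return_heights, threshold=2.0):
    """Percentage of first returns below a height threshold (a CD variable)."""
    n_below = sum(1 for h in first_return_heights if h < threshold)
    return 100.0 * n_below / len(first_return_heights)

def rumple_index(chm, cell=1.0):
    """Canopy surface area divided by planimetric ground area (a CSH variable).

    chm is a 2D list of canopy heights at grid nodes spaced `cell` metres apart.
    Each grid cell is split into two triangles whose 3D areas are summed.
    """
    rows, cols = len(chm), len(chm[0])
    surface = 0.0
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = (j * cell, i * cell, chm[i][j])
            b = ((j + 1) * cell, i * cell, chm[i][j + 1])
            c = (j * cell, (i + 1) * cell, chm[i + 1][j])
            d = ((j + 1) * cell, (i + 1) * cell, chm[i + 1][j + 1])
            surface += triangle_area_3d(a, b, c) + triangle_area_3d(b, d, c)
    ground = (rows - 1) * (cols - 1) * cell * cell
    return surface / ground
```

With this definition a perfectly flat canopy gives a rumple index of 1, and the index grows as the canopy surface becomes rougher.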

A fourth group of explanatory variables was computed to describe the spatial distribution pattern of returns (SP). This group included a return density variable (computed as the number of above-ground returns of a given plot divided by its area), area distribution variables (average, standard deviation, minimum, maximum, skewness and kurtosis of the area distribution of triangulated returns) and scanning angle variables (mean, mode, standard deviation, minimum and maximum of the scanning angle distribution). Table 1.2 presents a summary of the lidar data for the study site.

Table 1.2 Summary of the lidar data for the Montmorency Forest study site (119 plots)

| Lidar attribute | Average | Standard deviation | Range |
|---|---|---|---|
| Mean_h1st (m) | 5.9 | 2.2 | 1.7 – .5 |
| Above-ground return density (points m⁻²) | 6.4 | 1.8 | 3.2 – 12.5 |
| Above-ground return spacing (dm) | 4 | 1 | 3 – 6 |
| Scanning angle (°) | 9.1 | 4.7 | 0.0 – 24.0 |
| Skew_area | 1.4 | 0.5 | 0.4 – 3.9 |
| Per_1st_2m (%) | 15.3 | 13.9 | 0.2 – 63.5 |
| Ri | 2.7 | 0.7 | 1.6 – 5.2 |

Mean_h1st = average height of first returns, Skew_area = skewness of the area distribution for triangulated returns, Per_1st_2m = percentage of first returns below 2 m, Ri = rumple index

4) Merchantable volume modelling

The basic merchantable volume model was built at the plot level using one variable per standard group iteratively. Nonlinear relationships were observed between the field MV and the canopy density and canopy heterogeneity variables. Hence, each subset of variables was combined into the NLIN-GLS model described in Equation (1):

$$MV_j = \exp(\beta_0) \cdot CH_j^{\beta_1} \cdot CD_j^{\beta_2} \cdot CSH_j^{\beta_3} + \varepsilon_j \tag{3}$$

where $MV_j$ is the merchantable volume of the jth plot, $CH_j$ the canopy height variable, $CD_j$ the canopy density variable, $CSH_j$ the canopy structural heterogeneity variable and $\varepsilon_j$ the residual term.

The SP variables were then added to Equation (3):

$$MV_j = \exp(\beta_0) \cdot CH_j^{\beta_1} \cdot CD_j^{\beta_2} \cdot CSH_j^{\beta_3} \cdot SP_j^{\beta_4} + \varepsilon_j \tag{4}$$


The best subsets of variables were selected based on the corrected Akaike Information Criterion (AICc), the residual standard deviation (RSD), the significance level of the model (p-value) and the coefficient of determination (pseudo-R²). The statistical analysis was done with the R software (R Core Team, 2017). We used the “nlme” package (Pinheiro et al., 2016) for model fitting.

5) Residual variance analysis

The absolute residuals were regressed against the predicted MVs to assess the model heteroscedasticity. A significant p-value of the MV parameter was an indication of heteroscedasticity. The model uncertainty was then assessed by iteratively adding an SP variable to the error variance function (random part of the model) as shown in Equation (2):

$$MV_j = \exp(\beta_0) \cdot CH_j^{\beta_1} \cdot CD_j^{\beta_2} \cdot CSH_j^{\beta_3} + \varepsilon_j \quad \text{with} \quad \mathrm{var}(\varepsilon_j) = \sigma^2 \, g^2(SP_j; \delta) \tag{5}$$

The regression parameters were estimated using an NLIN-GLS regression. The AICc, RSD, p-value and the 95% confidence intervals (95% CI) were computed to compare the models.

Following Beal and Sheiner (1988), we also tested and compared variable transformation to account for heteroscedasticity.
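The heteroscedasticity check described at the start of this section (regressing absolute residuals on predicted volumes) can be sketched with an ordinary least squares slope and its t statistic. The function name and the small data set are hypothetical, and obtaining a p-value from the t statistic (with n − 2 degrees of freedom) is left to a statistics package.

```python
import math

def abs_residual_trend(predicted, residuals):
    """Regress |residual| on predicted value; return (slope, t statistic)."""
    n = len(predicted)
    y = [abs(r) for r in residuals]
    mean_x = sum(predicted) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(predicted, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares of this auxiliary regression.
    sse = sum((v - (intercept + slope * x)) ** 2 for x, v in zip(predicted, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else math.inf
    return slope, t_stat

# Hypothetical predicted volumes (m3/ha) and residuals that widen with volume.
predicted = [50.0, 100.0, 150.0, 200.0, 250.0]
residuals = [2.0, -5.0, 6.0, -9.0, 11.0]
slope, t_stat = abs_residual_trend(predicted, residuals)
print(slope, t_stat)
```

A clearly positive slope with a large t statistic corresponds to the outward-opening funnel pattern that signals heteroscedasticity.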

6) Model validation

The leave-one-out cross-validation was used to assess the accuracy of the merchantable volume models. A new dataset was created by removing one field plot iteratively. The MV model was fitted to the new dataset (training data) and used to predict the MV of the removed field plot (test data). The procedure was repeated until predicted values were obtained for all the field plots.
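The leave-one-out procedure can be sketched generically as below. The model used here is a stand-in (a one-variable power law fitted by least squares on the log scale), not the NLIN-GLS model of the study, and summarizing the held-out errors by a root mean square error is an assumption; all function names are hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Stand-in model: fit y = a * x^b by least squares on log-transformed data."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = math.exp(my - b * mx)
    return a, b

def predict_power_law(model, x):
    a, b = model
    return a * x ** b

def loocv_rmse(xs, ys, fit, predict):
    """Remove each observation in turn, refit, and predict the removed one."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # training data without plot i
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        squared_errors.append((ys[i] - predict(model, xs[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical plot-level data: mean height (m) and merchantable volume (m3/ha).
heights = [3.0, 4.5, 6.0, 7.5, 9.0, 10.5]
volumes = [40.0, 75.0, 120.0, 150.0, 210.0, 250.0]
print(loocv_rmse(heights, volumes, fit_power_law, predict_power_law))
```

Passing `fit` and `predict` as arguments keeps the cross-validation loop independent of the model, so the same loop could wrap any fitting routine.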

Results

We found a very good correlation between the predicted and the field merchantable volumes even before the introduction of the SP variables (pseudo-R² = 0.91, Figure 1.4 and Table 1.3). The best predictors were the average height of first returns (Mean_h1st), the percentage of first returns below 2 m (Per_1st_2m) and the rumple index (Ri). Their importance in the model was 87%, 2% and 1%, respectively. The RSD was 28.0 m³ ha⁻¹ for the regression model. It was estimated at 29.0 m³ ha⁻¹ after cross-validation. The best equation was:

$$MV = \exp(2.04) \cdot \mathrm{Mean\_h1st}^{1.33} \cdot \mathrm{Per\_1st\_2m}^{-0.08} \cdot \mathrm{Ri}^{0.64} + \varepsilon \tag{6}$$
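As a worked illustration, the deterministic part of Equation (6) can be evaluated directly. The function name is hypothetical, and the input values are the site averages from Table 1.2, used only as an example; because the model is nonlinear, the result is not expected to equal the average field volume.

```python
import math

def predict_mv(mean_h1st, per_1st_2m, ri):
    """Predicted merchantable volume (m3/ha) from the fixed part of Equation (6)."""
    return (math.exp(2.04)
            * mean_h1st ** 1.33
            * per_1st_2m ** -0.08
            * ri ** 0.64)

# Site averages from Table 1.2 (Mean_h1st in m, Per_1st_2m in %, Ri unitless).
mv_at_averages = predict_mv(5.9, 15.3, 2.7)
print(round(mv_at_averages, 1))
```

The signs of the exponents determine the direction of each effect: taller canopies and rougher canopy surfaces increase the predicted volume, whereas a larger share of first returns below 2 m decreases it.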


Table 1.3 Parameters of the merchantable volume models before (see Equation (6)) and after the addition of the variance function

| Parameter | Variable | Before: Value | Before: Std. Err. | Before: p-value | Before: 95% CI | After: Value | After: Std. Err. | After: p-value | After: 95% CI |
|---|---|---|---|---|---|---|---|---|---|
| Fixed part of the model | | | | | | | | | |
| β0 | Intercept | 2.04 | 0.17 | <0.001 | [1.77 – 2.42] | 1.97 | 0.16 | <0.001 | [1.76 – 2.34] |
| β1 | Mean_h1st | 1.33 | 0.10 | <0.001 | [1.14 – 1.52] | 1.33 | 0.11 | <0.001 | [1.12 – 1.55] |
| β2 | Per_1st_2m | −0.08 | 0.03 | 0.020 | [−0.13 – −0.02] | −0.07 | 0.03 | 0.020 | [−0.07 – −0.02] |
| β3 | Ri | 0.64 | 0.12 | <0.001 | [0.40 – 0.88] | 0.71 | 0.13 | <0.001 | [0.64 – 0.87] |
| Variance function | | | | | | | | | |
| δ1 | Mean_h1st | --- | --- | --- | --- | 0.85 | | <0.001 | [0.50 – 1.20] |
| δ2 | Skew_area | --- | --- | --- | --- | 0.31 | | <0.001 | [0.05 – 0.57] |

Before = no variance function, After = with variance function

Variance function = Mean_h1st^δ1 * exp(δ2 * Skew_area)

Mean_h1st = average height of first returns, Per_1st_2m = percentage of first returns below 2 m, Ri = rumple index, Skew_area = skewness of the area distribution for triangulated returns, Std. Err. = standard error


Figure 1.5 shows the residual plots of Equation (6). Residuals had an outward-opening funnel form: they were more variable when the predicted MVs increased. The MV parameter was significant when regressing the absolute residuals against the predicted MVs (p = 0.02). This indicated the presence of heteroscedasticity, which needed to be taken into account in the model.

Figure 1.5 Scatter plots of standardised (Std) residuals versus predicted merchantable volume (MV), average height of first returns (Mean_h1st) and skewness of the area distribution for triangulated returns (Skew_area)

The scatter plots on the left show the relationship before addition of the variance function to the model (see Equation (3)) and on the right, after the addition (see Equation (5)).

Adding a spatial distribution variable to the fixed part of the model barely improved the predicted MV. Table 1.4 compares the basic MV model with three MV models built with an additional SP variable:


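$$MV = \exp(2.01) \cdot \mathrm{Mean\_h1st}^{1.34} \cdot \mathrm{Per\_1st\_2m}^{-0.08} \cdot \mathrm{Ri}^{0.68} \cdot \mathrm{Skew\_area}^{-0.06} + \varepsilon \tag{7}$$

$$MV = \exp(2.25) \cdot \mathrm{Mean\_h1st}^{1.32} \cdot \mathrm{Per\_1st\_2m}^{-0.08} \cdot \mathrm{Ri}^{0.66} \cdot \mathrm{Mean\_angle}^{-0.09} + \varepsilon \tag{8}$$

$$MV = \exp(2.00) \cdot \mathrm{Mean\_h1st}^{1.34} \cdot \mathrm{Per\_1st\_2m}^{-0.08} \cdot \mathrm{Ri}^{0.63} \cdot \mathrm{Density}^{0.02} + \varepsilon \tag{9}$$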

The addition of the density variable (Equation (9)) was the least significant (p = 0.74). The RSDs were however similar for all models.

Table 1.4 Comparison of four predictive merchantable volume (MV) models before (basic model) and after addition of a spatial distribution variable

| MV model | AICc | p-value | Pseudo-R² | RSD (m³ ha⁻¹) |
|---|---|---|---|---|
| Basic model | 1137.23 | <0.001 | 0.91 | 28.0 |
| Basic model + Density | 1139.32 | 0.74 a | 0.91 | 28.0 |
| Basic model + Mean_angle | 1137.79 | 0.21 a | 0.91 | 28.0 |
| Basic model + Skew_area | 1137.48 | 0.17 a | 0.91 | 27.9 |

Basic model = exp(2.04) * Mean_h1st^1.33 * Per_1st_2m^−0.08 * Ri^0.64 + ε

Mean_h1st = average height of first returns, Per_1st_2m = percentage of first returns below 2 m, Ri = rumple index, Density = return density, Mean_angle = average scanning angle of returns, Skew_area = skewness of the area distribution for triangulated returns, RSD = residual standard deviation

a: the values represent the level of significance of the additional variable

Table 1.5 shows a comparison between MV models before (basic model) and after addition of a variance function (corrected models). Adding SP variables to the random part of the MV model improved the precision. The corrected MV models had lower AICc values (1110.32, 1123.58 and 1106.72 versus 1137.23 for the basic model). Mean_h1st and Skew_area had the greatest effect on the unexplained variance: residuals were more variable when these variables increased (see the left side of Figure 1.5 and Table 1.6). The model uncertainty thus increased for high values of height and for an irregular spatial distribution of lidar returns. The uncertainty was better assessed by adding the following variance function, with the parameter estimates of Table 1.3, to Equation (6):

$$g(\mathrm{Mean\_h1st}, \mathrm{Skew\_area}) = \mathrm{Mean\_h1st}^{0.85} \cdot \exp(0.31 \cdot \mathrm{Skew\_area}) \tag{10}$$

The residual standard deviation, initially estimated at 28.0 m³ ha⁻¹, became 3.7 m³ ha⁻¹ multiplied by the variance function once the model heteroscedasticity was accounted for. The residual distribution became homogeneous (see right side of Figure 1.5).

Table 1.5 Comparison of the merchantable volume (MV) models before (basic model) and after addition of a variance covariate

| MV model | AICc | p-value | RSD (m³ ha⁻¹) |
|---|---|---|---|
| Basic model | 1137.23 | <0.001 | 28.0 |
| Basic model + Mean_h1st variance covariate | 1110.32 | <0.001 a | 9.0 |
| Basic model + Skew_area variance covariate | 1123.58 | <0.001 a | 13.5 |
| Basic model + (Mean_h1st + Skew_area) variance covariates | 1106.72 | <0.001 a | 3.7 |

Basic model = exp(2.04) * Mean_h1st^1.33 * Per_1st_2m^−0.08 * Ri^0.64 + ε

Mean_h1st = average height of first returns, Per_1st_2m = percentage of first returns below 2 m, Ri = rumple index, Skew_area = skewness of the area distribution for triangulated returns, RSD = residual standard deviation

a: the values represent the level of significance of the additional variance function.

Table 1.6 Effects of Mean_h1st and Skew_area variables on the residual standard deviation (RSD, m³ ha⁻¹) of the predicted merchantable volume

| Skew_area | Mean_h1st = 2.0 m | 4.0 m | 6.0 m | 8.0 m | 10.0 m | 12.0 m |
|---|---|---|---|---|---|---|
| 0.5 | 7.8 | 14.0 | 19.8 | 25.3 | 30.6 | 35.7 |
| 1.0 | 9.1 | 16.4 | 23.1 | 29.5 | 35.7 | 41.7 |
| 1.5 | 10.6 | 19.1 | 27.0 | 34.5 | 41.7 | 48.7 |
| 2.0 | 12.4 | 22.3 | 31.5 | 40.3 | 48.7 | 56.9 |
| 2.5 | 14.5 | 26.1 | 36.8 | 47.0 | 56.9 | 66.4 |
| 3.0 | 16.9 | 30.5 | 43.0 | 54.9 | 66.4 | 77.5 |
| 3.5 | 19.7 | 35.6 | 50.2 | 64.1 | 77.5 | 90.5 |
| 4.0 | 23.0 | 41.5 | 58.6 | 74.9 | 90.5 | 105.7 |
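The values of Table 1.6 appear to follow from multiplying the base residual standard deviation (3.7 m³ ha⁻¹) by the variance function of Table 1.3 with δ1 = 0.85 and δ2 = 0.31. The sketch below recomputes the table under that reading; the function name is hypothetical, and small differences can arise because the published parameters are rounded.

```python
import math

SIGMA = 3.7     # base residual standard deviation (m3/ha)
DELTA_1 = 0.85  # power parameter for Mean_h1st (Table 1.3)
DELTA_2 = 0.31  # exponential parameter for Skew_area (Table 1.3)

def rsd(mean_h1st, skew_area):
    """Residual standard deviation: sigma * Mean_h1st^d1 * exp(d2 * Skew_area)."""
    return SIGMA * mean_h1st ** DELTA_1 * math.exp(DELTA_2 * skew_area)

heights = (2.0, 4.0, 6.0, 8.0, 10.0, 12.0)
skews = (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)

# Print one row per Skew_area value, one column per Mean_h1st value.
for skew in skews:
    row = [round(rsd(h, skew), 1) for h in heights]
    print(skew, row)
```

Read this way, the table shows that doubling the average first-return height multiplies the residual standard deviation by about 1.8 (2 to the power 0.85), whatever the skewness.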

We obtained results similar to the previous analyses when testing the robustness of the corrected MV model for a low return density scenario, as encountered in some operational contexts. The RSD was initially 27.4 m³ ha⁻¹. It was then estimated at 3.7 m³ ha⁻¹ multiplied by the variance function when accounting for the heteroscedasticity. Mean_h1st and Skew_area remained the best combination of covariates for the variance function.


Variable transformation was also used to account for the heteroscedasticity. We tested a square root, a cube root and a logarithmic transformation, given that the response and explanatory variables of the MV model were strictly positive (see Tables 1.1 and 1.2) and that three variables (MV, Per_1st_2m and Ri) had a right-skewed distribution. The heteroscedasticity was best reduced with the logarithmic transformation. The p-value for the predicted MV parameter was improved; however, it remained significant (0.01). The model had the following form:

$$\log(MV) = 1.99 + 1.18 \cdot \log(\mathrm{Mean\_h1st}) - 0.08 \cdot \log(\mathrm{Per\_1st\_2m}) + 0.95 \cdot \log(\mathrm{Ri}) + \varepsilon \tag{11}$$

Discussion

This study aimed at developing an optimal model to predict the timber merchantable volume of balsam fir stands from lidar and at better assessing the model uncertainty. When heteroscedasticity occurs in a model, the confidence intervals of the parameters are unreliable (Beal & Sheiner, 1988; Robinson & Hamann, 2011). Uncertain predictions may induce errors in the estimated volume of stands and consequently in the evaluation of their economic value (Holopainen et al., 2010). The merchantable volume uncertainty should, therefore, be better characterized.

A highly significant model for predicting the MV could be developed using standard lidar variables as expected (Table 1.4: p-value <0.001). This is consistent with other studies indicating that lidar variables can accurately predict forest structure attributes in the eastern Canadian boreal forest (Luther et al., 2013; Treitz et al., 2012; Woods et al., 2011). The best predictors were Mean_h1st, Per_1st_2m, and Ri. Mean_h1st had a positive effect on the predicted MV as shown in Table 1.3. This can be explained by the fact that lidar can accurately determine the vertical structure of stands (Magnussen et al., 1999; Maltamo et al., 2005; White et al., 2016). Other studies have also confirmed that average height is a good predictor of volume (Bouvier et al., 2015; Treitz et al., 2012). Per_1st_2m can be correlated with the fraction of canopy gaps. Canopy gaps are small openings of stands where there are smaller or no trees (St-Onge et al., 2014). They, therefore, reduce the overall volume of stands. The variable had a negative effect on the predicted MV. Ri had a positive effect on the predicted MV. Ri has also been defined as a 3d measure of canopy heterogeneity (Kane et al., 2010). The positive effect can be explained by the fact that our study site had a heterogeneous canopy structure (see Table 1.1).

Figure 1.2 shows that the spatial distribution of returns can remain irregular despite a high return density or the presence of overlapping strips. The consequences of an irregular spacing pattern are directly evident when building, for example, canopy surface models or digital elevation models (Balsa-Barreiro et al., 2012; Puetz et al., 2009; Vaze et al., 2010). In our case study, the irregular pattern induced an alternation of local clumps (overscanned areas) and local gaps (areas without data) within the point cloud. This pattern did not affect the model accuracy, as adding an SP variable to the fixed part did not improve the predictions. However, the model residuals were affected by the pattern. This shows that an irregular spatial distribution of returns can increase the uncertainty of predictive models.

The variance function helped to identify the sources of heteroscedasticity in the model. Residuals were more heteroscedastic with an increase in the average height of returns (Mean_h1st) and in the skewness of the area distribution of triangulated returns (Skew_area); see Figure 1.5 and Table 1.6. This suggests a higher MV uncertainty for high stands (e.g., mature forest stands) and for irregular lidar scanning patterns. Natural high stands such as balsam fir stands tend to have a more heterogeneous canopy height structure than low stands (Spies & Franklin, 1991). High stands are also preferentially cut during forest operations. An irregular scanning pattern over these stands could substantially over- or under-characterize the height distribution. Their MV, therefore, needs to be precisely predicted. Adding a variance function to the model made it possible to assess how the model uncertainty varied with the predicted MVs (see Table 1.6). Conversely, low stands tend to have a more homogeneous canopy height structure. The over- or under-characterization of the height distribution due to the irregular scanning pattern would be less substantial.

Pooling lidar strips together at the plot level increased the return density (6.4 points m⁻² on average). However, our analysis shows that return density did not influence the merchantable volume predictions at the Montmorency Forest study site (p = 0.74). This result agrees with other studies (e.g., Næsset, 2004a; Thomas et al., 2006; Treitz et al., 2012), including studies done in a low return density context (≤ 2 points m⁻²).

Acquiring lidar data with a small scanning angle would also have been an ideal solution to obtain more homogeneous data, but the acquisition costs would have greatly increased. Other authors have chosen to exclude from the point clouds either incident pulses transmitted at high scanning angles (Næsset, 2004b, 2005) or irregularly spaced sectors within strips (Balsa-Barreiro et al., 2012). However, these changes in the point cloud would substantially alter its structure. A statistical method describing the effects of the return distribution on the prediction of forest attributes has the advantage of alleviating these problems while entailing no additional costs. Roussel et al. (2017) have also applied a probabilistic model to account for the effect of pulse density and footprint when estimating canopy height. We recommend the use of an additional spatial distribution variable when characterising the MV uncertainty.

A practical application of our study is the establishment of reliable uncertainty maps of predicted merchantable volumes. Users can then confidently assess the reliability of the estimates and consequently better plan their harvesting.

Conclusion

While lidar return density and scanning angle have little impact on the timber merchantable volume parameter estimates, an irregular return spacing influences the model uncertainty. Residuals of a standard merchantable volume predictive model were found to be correlated with the average height of returns and the skewness of the area distribution of triangulated lidar returns. We included these variables in a variance function to better characterize the model heteroscedasticity. The residual standard deviation was better estimated (3.7 m³ ha⁻¹ multiplied by a variance function versus 28.0 m³ ha⁻¹ without the variance function). Therefore, the average height and the spatial distribution of returns should be considered for reliable estimates of forest attributes.

Chapter 2: Lidar and multispectral imagery classifications of balsam fir tree status for accurate predictions of merchantable volume.

Summary

In this study, we analyze the ability of lidar and multispectral imagery to classify tree status (live/dead) and to predict the merchantable volume (MV) in 61 balsam fir sample plots. We delineated tree crowns on a canopy height model. For each crown, variables were computed from the image intensity values and then used as classification variables. The classification accuracy was 89% and the kappa coefficient was 0.78. The sample plots were then classified according to their mortality rate (low/high). Lidar returns associated with dead trees were removed. Our study suggests that lidar and multispectral imagery effectively discriminate tree status and that removing the returns associated with dead trees reduces the MV prediction error by up to 7.9% in low-mortality sample plots and by up to 17.2% in high-mortality sample plots.

Keywords: lidar; modelling; merchantable volume; balsam fir stand; disturbance; tree mortality; lidar signal intensity; multispectral imagery; classification; point cloud filtering.

Abstract

Recent increases in forest diseases have produced significant mortality in the boreal forests. These disturbances influence the timber merchantable volume predictions as they affect the distribution of live and dead trees. In this study, we assess the use of lidar, alone or combined with multispectral imagery, to classify trees and predict the merchantable volumes (MV) of 61 balsam fir plots in a boreal forest in eastern Canada. We delineated single tree crowns on a canopy height model. The number of detected tree crowns represented 92% of field trees. Using lidar intensity and image pixel variables, trees were classified as live or dead with an overall accuracy of 89% and a kappa coefficient of 0.78. Plots were then classified according to their level of mortality (low/high) using a 10.5% mortality ratio threshold. Lidar returns associated with dead trees were clipped. Before clipping, the root mean square errors of the MV models were 22.7 m³ ha⁻¹ in the low mortality plots and 39 m³ ha⁻¹ in the high mortality plots. After clipping, they decreased to 20.9 m³ ha⁻¹ and 32.3 m³ ha⁻¹, respectively. Our study suggests that lidar and multispectral imagery can be used to accurately filter dead balsam fir trees and decrease the merchantable volume prediction error by 17.2% in high mortality plots and by 7.9% in low mortality plots.
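The two percentages quoted at the end of the abstract can be recovered from the root mean square errors reported before and after clipping, assuming they are relative reductions with respect to the error before clipping. The function name below is hypothetical.

```python
def relative_reduction(before, after):
    """Relative decrease (in %) of an error value after a treatment."""
    return 100.0 * (before - after) / before

# Root mean square errors (m3/ha) before and after clipping dead-tree returns.
low_mortality = relative_reduction(22.7, 20.9)
high_mortality = relative_reduction(39.0, 32.3)
print(round(low_mortality, 1), round(high_mortality, 1))
```

Under this reading, the reductions round to 7.9% for the low mortality plots and 17.2% for the high mortality plots.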

Keywords: lidar-based model; timber merchantable volume; balsam fir; forest disturbance; tree mortality; lidar intensity; multispectral imagery; classification; point cloud filtering.

Introduction

Remote sensing has greatly improved the quality of forest inventories. Light detection and ranging (lidar) is an active remote sensing tool which uses incident laser pulses to measure the distance to a target and to record the strength of the light backscattered from it. Lidar generates point clouds, which are a three-dimensional representation of the volumetric interaction between the laser pulses and the illuminated objects. Point clouds can be used to model forest attributes (e.g., height, diameter at breast height (dbh), biomass, basal area, volume) at the tree (individual tree detection approach) or the stand (area-based approach) level (Bouvier et al., 2015; Maltamo et al., 2006; Næsset, 2002; Sheridan et al., 2015; Treitz et al., 2012).

Lidar intensity, defined as the quantification of the strength of the pulse backscattering received from the object, has been proven useful in feature extraction (Hu et al., 2004), species identification (Fassnacht et al., 2016), land-cover classification (Song et al., 2002), forest attribute modelling (Bright et al., 2013; Kim et al., 2009), snag detection (Martinuzzi et al., 2009), and point cloud filtering (Wing et al., 2015). Bright et al. (2013), for example, used lidar intensity to estimate the dead basal area of beetle-affected pine forests. Kim et al. (2009) observed that live and dead trees have different intensity distribution modes and that a segmentation method can predict the standing tree biomass of burned mixed coniferous forests. Casas et al. (2016) used lidar intensity to detect dead trees. Lidar has also been combined with other remote sensing tools such as optical imagery to improve dead tree identification (Näsi et al., 2015; Polewski et al., 2015; Vogeler et al., 2016).

Many of these previous studies have been carried out on sites where insect or fire disturbances have caused significant damage to the forests (Bright et al., 2013; Kim et al., 2009; Martinuzzi et al., 2009; Wing et al., 2015). However, we did not find any study demonstrating the ability of lidar sensors to detect dead trees on sites with low mortality where, for example, only natural mortality is observed. Yet, accurate information on live trees alone (e.g., live merchantable volume (MV), live biomass) is mandatory in an operational context, as it is directly linked to the economic value of timber. Improved estimates of tree mortality are also fundamental for sustainable forest management.
