• Aucun résultat trouvé

Title: Joint Selection of Wavenumber Regions for MidIR and Raman Spectra and Variables in PLS Regression using Genetic Algorithms Authors & affiliations: Lidwine Grosmaire

N/A
N/A
Protected

Academic year: 2021

Partager "Title: Joint Selection of Wavenumber Regions for MidIR and Raman Spectra and Variables in PLS Regression using Genetic Algorithms Authors & affiliations: Lidwine Grosmaire"

Copied!
2
0
0

Texte intégral

(1)

Title:

Joint Selection of Wavenumber Regions for MidIR and Raman Spectra and Variables in PLS Regression using Genetic Algorithms

Authors & affiliations:

Lidwine Grosmaire1, Pedro Maldonado1, Christelle Reynès2, Robert Sabatier2, Dominique Dufour3, Thierry Tran4 and Jean-Louis Delarbre1

1 Laboratoire de Physique Moléculaire et Structurale – UMR Qualisud – Université Montpellier 1 - France

2 Laboratoire de Physique Industrielle et Traitement de l’Information – EA 2415 - Université Montpellier 1 – France

3 CIRAD – Dpt Persyst - UMR Qualisud – CIAT, Cali, Colombia

4 CIRAD - Dpt Persyst - UMR Qualisud - CSTRU - Kasetsart University – Bangkok - Thailand

Abstract: (Your abstract must use Normal style and must fit in this box. Your abstract should be no longer than 300 words. The box will ‘expand’ over 2 pages as you add text/diagrams into it.)

This work fits into the context of cassava processing. Production and consumption of this product is steadily increasing worldwide and especially in tropical regions where, after harvest, cassava starch is extracted according to an empirical process: natural fermentation and sun-drying, which gives to this product an interesting breadmaking capacity despite of the absence of gluten.

The objective of this work is to try to explain the breadmaking ability from different parameters (physicochemical and spectroscopic data) using a statistical regression method while selecting variables of different types: individual and intervals.

In chemometrics, the choice of explanatory variables is a problem often discussed, but, when it comes to select intervals, methodologies are rarer and more complex (Höskuldsson 2001). Among the specific methods developed (Norgaard et al 2004), Genetic Algorithms (GA) were chosen and combined with the PLS method to select intervals (Leardi 2000, Leardi and Norgaard 2004).

In this case, the explanatory variables are organized in a multitable in which intervals and individual variables are selected in order to predict one variable of interest: the breadmaking capacity. To this end, we will use and adapt a GA developed in a context of discrimination (usual LDA), jointly with the PLS1 method (Reynes et al 2006), this method is called AGvPLSm.

Figure 1 : Final GA populations characteristics: selected variables are indicated by black points The variables selected by this method are shown in Figure 1 which confirms the global convergence of algorithms. AGvPLSm compared with other methods (cf table 1) shows that the resulting model is more efficient (r ² ≈ 99%) and leads to a reasonable number of selected variables in the multitable (4% of the initial variables).

(2)

PLS 7926 7 0.7836 0.6605

PLS + VIP 4 3 0.7210 0.6650

AGvPLSm 311 12 0.9936 0.8273

Table 1 : Comparison of different method results (number of selected variables, number of retained PLS components, R² and cross-validation R²).

In terms of interpretation, this method allowed to highlight the importance of some physicochemical variables and select a small number of spectral intervals that significantly complete the model.

Höskuldsson, A. (2001) Variable and subset selection in PLS regression. Chemometrics and Intelligent

Laboratory System, 55, 23-38.

Leardi, R. (2001) Genetic algorithms in chemometrics and chemistry: a review. Journal of Chemometrics, 15, 559-569.

Leardi, R. Norgaard, L. (2004) Sequential application of backward interval spartial least squares and genetic algorithms for the selection of relevant spectral regions. Chemometrics and Intelligent Laboratory

System, 18, 486-497.

Nogaard, L., Saudland, A., Wagner, J., Nielsen, J.P., Munck, L., Engelsen, S.B. (2000) Interval partial least-squares regression (iPLS): A comparitive chemometric study with an example from near-infrared spectroscopy. Applied Spectroscopy, 54, 3, 413-419.

Références

Documents relatifs

The first method (see [5]) compares empirical risks associated with different bandwidths by using ICI (Intersection of Confidence Intervals) rule whereas the second one (see

RMSE calculated from logistic model, nonparametric and semiparametric regression estimators using discrete symmetric triangular associated kernels on mortality

On adaptive wavelet estimation of the regression function and its derivatives in an errors-in-variables model... in an

The present work thus addresses the influence of particle concentration, particle size and frequency on the intrinsic damping properties of STF composed of highly concen-

PLS1 Y IR 2 X Ra 3 X 1 52 1 52 1 4562 -0.45 -0.40 -0.35 -0.30 -0.25 -0.20 -0.15 -0.10 -0.05 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 Int ens it é (c oup s) 500 1 000 1 500

Domain disruption and mutation of the bZIP transcription factor, MAF, associated with cataract, ocular anterior segment dysgenesis and coloboma Robyn V.. Jamieson1,2,*, Rahat

L’énonciateur est un être discursif qui correspond au(x) contenu(s) exprimé(s) dans un énoncé. Ce n’est pas une personne mais un être abstrait qui s’exprime

Keywords: Asymptotic normality, consistency, deconvolution kernel estimator, errors-in-variables model, M-estimators, ordinary smooth and super-smooth functions, rates of