Comparaison des deux méthodes proposées - Résultats et expériences

Chapitre 5: Nouvelle méthode de sélection d'attributs basée sur les réseaux de neurones et sur

5.4 Résultats et expériences

5.4.8 Comparaison des deux méthodes proposées

Dans cette partie, nous sommes intéressés à faire une petite comparaison entre les deux méthodes de sélection d'attributs proposées dans cette thèse : EN-Relief et la méthode basée sur l'idée des réseaux de neurones. Dans ce but, nous avons appliqué les deux méthodes proposées sur les mêmes données. En effet, on génère deux classes de signaux en utilisant le filtrage d'un bruit blanc par deux filtres Butterworth différents. Chaque classe comporte 500 signaux. Les signaux de classe 1 sont générés par un filtre butterworth avec une bande de fréquence [0.1 0.3] et les signaux de classe 2 sont générés par un filtre butterworth avec une bande de fréquence [0.3 0.5]. On extrait de ces signaux cinq attributs : Cd, fmed, MPF, pf et Ca. Dans ce cas l'attribut MPF est l'attribut le plus important qui distingue les deux classes.

On applique la méthode proposée EN-ReliefF sur ces données. Elle choisit l'attribut important MPF. On obtient le même résultat, MPF est choisi, si on applique la méthode proposée basée sur l'idée de réseaux de neurones (Tableau 5.4).

114

5.5 Conclusion

Dans ce chapitre, nous avons fait une présentation des réseaux de neurones et de l'apprentissage automatique. Ensuite, nous avons décrit en détails une méthode basée sur l'idée de réseaux de neurones, qu'on propose pour effectuer une sélection d'attributs. Cette méthode affecte aux attributs importants des poids élevés et aux attributs non importants des poids proches de zéro. Puis nous avons présenté une interprétation de l'effet de deux paramètres λ et C sur la méthode proposée. Les résultats de l'application de la méthode proposée sur des données simulées sont ensuite présentés. Ces résultats montrent que cette méthode de sélection des attributs est efficace comme elle choisit les attributs importants en leur affectant des coefficients élevés et en affectant aux autres attributs des poids nuls.

A noter que le travail sur cette méthode proposée n'est pas totalement terminé. Comme perspectives, nous proposons une étude plus détaillée de la performance de la méthode proposée dans la sélection des attributs ; par exemple on pourra étudier la sensibilité de la méthode par rapport aux paramètres λ et C , appliquer une méthode de détermination des paramètres λ et C...

115

Conclusion générale et perspectives

Cette thèse traite le problème de la réduction de dimension qui est un thème important en classification-identification et compréhension des données. Il existe deux approches générales pour la réduction de la dimension : l'extraction des attributs et la sélection des attributs. L'extraction des attributs consiste à réduire la dimension de l'espace par une transformation des attributs existants. Tandis que la sélection des attributs consiste à trouver un sous-ensemble d'attributs optimal à partir des attributs existants sans aucune transformation.

Dans cette thèse, nous sommes intéressés à la sélection des attributs. A partir de l'ensemble total de tous les attributs, nous cherchons à trouver un sous-ensemble optimal de ces attributs de plus faible dimension possible et aboutissant à une bonne classification.

Notre objectif a donc principalement porté sur les points suivants :

 Nous avons présenté une étude complète et détaillée de la sélection des attributs. Ces principaux avantages sont expliqués ainsi que les différentes étapes du processus de sélection des attributs et les différentes stratégies de recherches et les différents critères d'évaluation qui peuvent être utilisées dans les algorithmes de sélection des attributs. Une présentation détaillée des différentes méthodes de sélection des attributs déjà existantes est effectuée. Les algorithmes de ces méthodes ainsi que leurs principaux avantages et limitations sont aussi présentés.

 Nous avons proposé une nouvelle méthode pour la sélection des attributs "EN-ReliefF". Des expérimentations sur des signaux réels et non réels ont été effectuées. Tous les résultats montrent que c'est une méthode de sélection d'attributs très efficace. Elle enlève les attributs redondants et choisit un sous-ensemble optimal d'attributs. L'étude de la performance de cette méthode a montré qu'elle aboutit à une très bonne erreur de classification. En plus, plusieurs simulations faites ont montré que cette méthode proposée présente une grande stabilité dans le processus de sélection.

116

 Nous avons aussi proposé une nouvelle méthode pour la sélection des attributs basée sur l'idée de réseau de neurones. Cette méthode consiste à optimiser un prédicteur en pondérant les attributs de sorte que les attributs importants aient des poids élevés et les attributs non importants des poids proches de zéro. L'application de la méthode proposée sur des données simulées a été faite. Tous les résultats ont montré que cette méthode est très efficace, elle choisit les attributs importants.

Nous avons conscience que notre travail ne constitue qu'un pas à travers le long chemin du domaine de la sélection d’attributs. Ainsi nous présentons ici plusieurs améliorations et perspectives qui peuvent prolonger les travaux de cette thèse :

 Développer le travail sur la méthode de réseau de neurones pour trouver automatiquement des poids élevés aux attributs les plus pertinents.

 Développer théoriquement cette méthode basée sur les réseaux de neurones et trouver les dérivées de l’erreur par rapport aux poids des attributs.

 Comparer notre méthode aux méthodes basées sur le BPSO (Binary Particle Swarm Optimization).

 Appliquer les méthodes proposées sur des signaux réels d'origine et de nature différentes de celles des signaux réels utilisés dans notre travail.

 Comparer les méthodes proposées entre elles et à d’autres méthodes pour des données de grande dimension.

117

Références

[1] A. El Hannani, K. Refassi, A. El Maiche, M. Bouamama, "Maintenance predictive et preventive basee sur l'analyse vibratoire des rotors," Laboratoire de Mécanique des Solides et des Structures, Université de Sidi Bel Abbes., 2015.

[2] O. DJEBILI, "Contribution à la maintenance prédictive par analyse vibratoire des composants mécaniques tournants. Application aux butées à billes soumises à la fatigue de contact de roulement," UNIVERSITE DE REIMS CHAMPAGNE ARDENNE, phd thesis 2013.

[3] E. Alpaydin, Introduction to Machine Learning.: London: The MIT Press, 2010.

[4] Z. M. Hira, D. F. Gillies, "A review of feature selection and feature extraction methods applied on microarray data," Advances in Bioinformatics, vol. 2015, 2015.

[5] Y. Saeys, I. Inza, P. Larrañaga, "A review of feature selection techniques inbioinformatics,"

bioinformatics, vol. 23, no. 19, pp. 2507–2517, 2007.

[6] M. Dash, H. Liu, "Feature Selection for Classification," J. Intelligent Data Analysis, vol. 1, pp. 131- 156, 1997.

[7] K. Kira, L.A.Rendell, "A pratical approach to feature selection," In Proceedings of ninth international

workshop on machine learning, pp. 249-256, 1992.

[8] L. Meier, S. van de Geer, P. Bühlmann, "The group LASSO for logistic regression," J. R. Statist. Soc.

B, vol. 70, Part 1, pp. 53-71, 2008.

[9] R.Tibshirani, "Optimal Reinsertion:Regression shrinkage and selection via the LASSO ," J.R.Statist.

Soc. B, 1996.

[10] Y.Zhou, R.Jin, S.C.H. Hoi, "Exclusive LASSO for Multi-task Feature Selection," Journal of Machine

Learning Research - JMLR, 2010.

[11] H.Zou, T.Hastie, "Regularization and variable selection via the elastic net," J. R. Statist. Soc. B, 2005.

[12] P.Pudil, J.Novovicova, J.Kittler, "Floating Search Methods in Feature Selection," Pattern

Recognition Letters , vol. 15, pp. 1119--1125, 1994.

[13] I.Guyon, A.Elisseeﬀ, "An Introduction to Variable and Feature Selection," Journal of Machine

118

[14] A. El Akadi, "Contribution à la sélection de variables pertinentes en classification supervisée : Application à la sélection des gènes pour les puces à ADN et des caractéristiques faciales," UNIVERSITÉ MOHAMMED V – AGDAL, Rabat, phd dissertation 2012.

[15] W. Jitkrittum, L. Sigal, E.P. Xing, M. Sugiyama M.Yamada, "High-Dimensional Feature Selection by Feature-Wise Kernelized LASSO," Yahoo! Labs, USA,Tokyo Institute of Technology, Japan, Disney Research Pittsburgh, Carnegie Mellon University, Pittsburgh, 2013.

[16] Z.Lipton, J.Berkowitz, C.Elkan, "A Critical Review of Recurrent Neural Networks for Sequence Learning," arXiv:1506.00019v4 [cs.LG], 2015.

[17] R. Gutierrez-Osuna, "Dimensionality reduction," Wright State University, lecture 10 2000. [18] S. El Ferchichi, "Sélection et Extraction d’attributs pour les problèmes de classification,"

Université Lille1 des Sciences et Technologies Université de Tunis El Manar Ecole Nationale d’Ingénieurs de Tunis, phD thesis 2013.

[19] M. Reza, F. Derakhshi, M. Ghaemi, "Classifying different Feature Selection Algorithms based on the Search Strategies," in International Conference on Machine Learning, Electrical and Mechanical

Engineering (ICMLEME), Dubai, 2014, pp. 17-21.

[20] R. Gutierrez-Osuna, "Introduction to Pattern Analysis," Texas A&M University, lecture 11 2012. [21] S. Alelyani, Z. Zhao, H. Liu, "A Dilemma in Assessing Stability of Feature Selection Algorithms," in

IEEE International Conference on High Performance Computing and Communications, 2011, pp.

701-707.

[22] J. Sun, X. Zhang, D. Liao, V. Chang, "Efficient Method for Feature Selection in Text Classification," in

International Conference on Engineering and Technology (ICET), 2017, pp. 1 - 6.

[23] H. A. Elnemr, "Feature Selection for Texture-Based Plant Leaves Classification," in 2017 Intl Conf on

Advanced Control Circuits Systems (ACCS) Systems & 2017 Intl Conf on New Paradigms in Electronics & Information Technology (PEIT), 2017, pp. 91 - 97.

[24] S.l Kumbhar, S. Patil, "An Efficient & Effective Feature Subset Selection for High Dimensional Data ," in International Conference on Trends in Electronics and Informatics (ICEI), 2017, pp. 486-491. [25] H. Motoda, H. Liu, "Feature selection, extraction and construction," Commnication of Institute of

Information and Computing Machinery, vol. 5, pp. 67–72, 2002.

[26] S. Khalid; T. Khalil; S. Nasreen, "A Survey of Feature Selection and Feature Extraction Techniques in Machine Learning," in 2014 Science and Information Conference, 2014, pp. 372 - 378.

119

[27] D. Zongker, A. Jain, "Algorithms for Feature Selection: An Evaluation," in Proceedings of 13th

International Conference on Pattern Recognition, 1996, pp. 18-22 vol.2.

[28] G. Wang, Q. Song, H. Sun, X. Zhang, B. Xu, Y. Zhou, "A Feature Subset Selection Algorithm Automatic Recommendation Method," Journal of Artificial Intelligence Research, vol. 47, 2013. [29] F. TAN, "IMPROVING FEATURE SELECTION TECHNIQUES FOR," Georgia State University, phd

dissertation 2007.

[30] D.Alamedine, "Selection of EHG parameter characteristics for the classification of uterine contractions," Université de Technologie de Compiègne, 2015.

[31] N. Zhang, "Feature selection based segmentation of multi-source images : application to brain tumor segmentation in multi-sequence MRI," INSA de Lyon, 2011.

[32] H. Liu, L. Yu, "Toward integrating feature selection algorithms for classification and clustering,"

Knowl. Data Eng. IEEE Trans. On, vol. 17, no. 4, pp. 491–502, 2005.

[33] P. Langley, "Selection of relevant features in machine learning," in In: Proceedings of the AAAI Fall

Symposium on Relevance, 1994, pp. 1–5.

[34] H. Chouaib, "Sélection de caractéristiques: méthodes et applications," université Paris Descartes, Ph.D. dissertation 2011.

[35] R. Kohavi, G. H. John, "Wrappers for feature subset selection," Artif. Intell., vol. 97, no. 1, pp. 273– 324, 1997.

[36] S. F. Pratama, A. K. Muda, Y.-H. Choo, N. A. Muda, "A Comparative Study of Feature Selection Methods for Authorship Invarianceness in Writer Identification," Int. J. Comput. Inf. Syst. Ind.

Manag. Appl., vol. 4, pp. 467–476, 2012.

[37] S. Srivastava1, N. Joshi, M. Gaur, "A Review Paper on Feature Selection Methodologies and Their Applications," International Journal of Engineering Research and Development, vol. 7 , no. 6, pp. 57-61, 2013.

[38] A. L. Blum, P. Langley, "Selection of relevant features and examples in machine learning," Artif.

Intell, vol. 97, no. 1, pp. 245-271, 1997.

[39] M. L. Raymer, W. F. Punch, E. D. Goodman, L. A. Kuhn, A. K. Jain, "Dimensionality Reduction Using Genetic Algorithms," IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, vol. 4, no. 2, pp. 164-171, 2000.

120

[40] T. Marill, D. M. Green, "On the effectiveness of receptors in recognition systems," Inf. Theory IEEE

Trans. On, vol. 9, no. 1, pp. 11–17, 1963.

[41] L. Ladha, T. Deepa, "Feature selection methods and algorithms," Int. J. Comput. Sci.Eng., vol. 3, no. 5, pp. 1787–1797, 2011.

[42] A. W. Whitney, "A direct method of nonparametric measurement selection," Comput.IEEE Trans.

On, vol. 100, no. 9, pp. 1100–1103, 1971.

[43] P.Somol, J. Novovicova, P.Pudil, "Efficient Feature Subset Selection and Subset Size Optimization ," Institute of Information Theory and Automation of the Academy of Sciences of the Czech Republic, 2010.

[44] S.F.Pratama, A.K.Muda, Y.Choo, N.A.Muda, "Computationally Inexpensive Sequential Forward Floating Selection for Acquiring Significant Features for Authorship Invarianceness in Writer Identification," International Journal on New Computer Architectures and Their Applications

(IJNCAA), vol. 1, no. 3, pp. 581-598, 2011.

[45] A. R. Webb, "Statistical Pattern Recognition. John Wiley & Sons, ," 2002.

[46] P. M. Narendra, K. Fukunaga, "A branch and bound algorithm for feature subset selection," IEEE

Transactions on Computers, vol. C-26, no. 9, pp. 917–922, 1977.

[47] P. Somol ; P. Pudil ; J. Kittler, "Fast branch & bound algorithms for optimal feature selection," IEEE

TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 26, no. 7, pp. 900 - 912,

2004.

[48] X. Chen, "An improved branch and bound algorithm for feature selection," Pattern Recognition

Letters, vol. 24, no. 12, pp. 1925-1933, 2003.

[49] S.Nakariyakul, "A Review of Suboptimal Branch and Bound Algorithms," IPCSIT, vol. 2, 2011. [50] B. Yu and B. YUAN, "A more efficient branch and bound algorithm for feature selection," Pattern

Recognition, vol. 26, no. 6, pp. 883-889, 1993.

[51] S. Nakariyakul, D. Casasent, "Adaptive branch and bound algorithm for selecting optimal features,"

Pattern Recognition Letters, vol. 28, pp. 1415-1427, 2007.

[52] T. Cover, J. Thomas, Elements of Information Theory.: John Wiley & Sons, Inc., 2006. [53] E. Grall-Maës, P. Beauseroy, "Mutual Information-Based Feature Extraction on the Time–

121

[54] H. Liu, Y. Mo, J. Wang, J. Zhao, "A new feature selection method based on clustering," in 2011

Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), vol. 2, 2011,

pp. 965–969.

[55] M. A. Hall, "Correlation-based Feature Selection for Machine Learning," The university of Waikato, Hamilton, New Zealand, 1999.

[56] M. A. Hall, L. A. Smith, "Feature subset selection: a correlation based filter approach," in

International Conference on Neural Information Processing and Intelligent Information Systems,

1997, pp. 855-858.

[57] J. C. H. Hernandez, B. Duval, J. Hao, "A Genetic Embedded Approach for Gene Selection and Classification of Microarray Data," in EvoBIO 2007, 2007, pp. 90–101.

[58] S. Dudoit, J. Fridlyand, T. P. Speed., "Comparison of discrimination methods for the classification of tumors using gene expression data," Journal of the American Statistical Association, vol. 97, no. 457, pp. 77–87, 2002.

[59] F.Pérez-Cruz, "Kullback-Leibler Divergence Estimation of Continuous Distributions," in Information

Theory, 2008. ISIT 2008. IEEE International Symposium on, 2008.

[60] M.Osborne, F.Keller, "Entropy and Kullback-Leibler Divergence," School of Informatics, University of Edinburgh, 2008.

[61] B.Bigi, "Using Kullback-Leibler Distance for Text Categorization," in ECIR 2003, 2003, pp. 305–319. [62] M. Eltabach, T Vervaeke, S. Sieg-Zieba, E. Padioleau, S. Berlingen, "Features extraction using

vibration signals for condition monitoring of lifting cranes," Centre Technique des Industries Mécaniques CETIM., 2010.

[63] H. Wang, P. Chen, "A Feature Extraction Method Based on Information Theory for Fault Diagnosis of Reciprocating Machinery," in Sensors 2009, vol. 9, 2009, pp. 2415-2436.

[64] M. Zani, "Les roulements, des composants à surveiller de près," MESURES , vol. 754, pp. 40-43, Avril 2013.

[65] M. KHALIL, J. AL HAGE, K. KHALIL, "Feature Selection Algorithm Used to Classify Faults in Turbine Bearings," International Journal of Computer Science and Application, vol. 4, no. 1, pp. 1-8, 2015. [66] N. Jung, "Le LASSO dans le cadre des microarrays ," Institut de Recherche en Mathématiques

122

[67] J.Ogutu, T.Schulz-Streeck, H. Piepho, "Genomic selection using regularized linear regression models: ridge regression, LASSO, elastic net and their extensions," in 15th European workshop on

QTL mapping and mark assisted selection (QTLMAS), Rennes, France, 2011.

[68] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, "LEAST ANGLE REGRESSION," The Annals of

statistics, vol. 32, no. 2, pp. 407–499, 2004.

[69] H.Fawzi, W.Krichene, "La régularisation elastic net pour la classification de cancer via l’expression des gènes," Rapport mini projet 2013.

[70] W. Megchelenbrink, "Relief-based feature selection in bioinformatics: detecting functional specicity residues from multiple sequence alignments," Radboud University Nijmegen, Master thesis in information science 2010.

[71] S. F. Rosario, K. Thangadurai , "RELIEF: Feature Selection Approach ," International journal of

innovative research & development, vol. 4, no. 11, pp. 218-224, 2015.

[72] M.Robnik-ˇSikonja, I. Kononenko, "An adaptation of Relief for attribute estimation in regression," University of Ljubljana, 2000.

[73] I.Kononenko, M.Robnik-ˇSikonja, U.Pompe, "ReliefF for estimation and discretization of attributes in classification, regression, and ILP problems," University of Ljubljana, 2000.

[74] M. Robnik-Šikonja and I. Kononenko, "Comprehensible interpretation of relief’s estimates," in in

Machine Learning: Proceedings of the Eighteenth International Conference on Machine Learning (ICML’2001), 2001, pp. 433–440.

[75] I. Kononenko, "Estimating attributes: Analysis and extension of Relief," in European Conference on

Machine Learning, 1994, pp. 171-182.

[76] A.Smola, S.V.N. Vishwanthan, "Introduction to Machine Learning ," Cambridge university press, 2008.

[77] N. J. Nilsson, "Introduction to Machine Learning," Robotics Laboratory Department of Computer Science Stanford University, , 1998.

[78] S. Shalev-Shwartz, S. Ben-David, "Understanding Machine Learning: From Theory to Algorithms ," Cambridge university press, 2014.

[79] K. P. Murphy, Machine Learning A Probabilistic Perspective. Massachusetts London, England: The MIT Press, 2012.

123

[81] C. Peterson, T. Rognvaldsson, "An introduction to artificial neural networks," University of Lund, Sweden., 1992.

[82] V.Cheung, K.Cannons, "An Introduction to Neural Networks," Signal & Data Compression Laboratory, University of Manitoba, Canada, 2002.

[83] S. Haykin, Neural networks : A comprehensive foundation. New Jersey: Prentice-Hall, 1999. [84] G. Huang, Q.Zhu, C.Siew, "Extreme learning machine: Theory and applications," Neurocomputing,

vol. 70, pp. 489–501, 2006.

[85] T. Yuan, W. Deng, J. Hu, "Supervised Hashing with Extreme Learning Machine," in IEEE Visual

124

Liste des publications

Publication dans des journaux internationaux :

N. CHALLITA, M. KHALIL, P. BEAUSEROY, "Efficiency and stability of EN-ReliefF, a new method for feature selection", International Journal Of Computer Aided Engineering and Technology, Vol. 10, No. 3, 2018.

Publications dans des congrès internationaux :

N. CHALLITA, M. KHALIL, P. BEAUSEROY, "New technique for feature Selection: combination between Elastic Net and Relief", The Third International Conference on Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE), Page(s):262 - 267, 2015.

N. CHALLITA, M. KHALIL, P. BEAUSEROY, "New feature Selection method based on neural network and machine learning", IEEE International MultidisciplinaryConference on Engineering Technology (IMCET), Page(s):81- 85, 2016.

125

Annexe A : Algorithmes de quelques méthodes de sélection

de paramètres

A.1: Algorithme de SBS

1. Commencer par un ensemble qui contient tous les paramètres S0=A; k = 0; 2. a− = argmax

a−∈S_k [J(S_k− {x−})] 3. Tant que J(S_k− {a−}) > 𝐽(S_k)

a. Mettre à jour S_k+1= S_k− {a⁻}; k = k + 1 b. Enlever le plus mauvais paramètre:

a⁻ = argmax

a−∈S_k

[J(S_k− {a⁻})]

4. Fin

A.2: Algorithme de SFBS

1. Commencer par un ensemble qui contient tous les paramètres S₀ = A; k = 0; 2. a⁻ = argmax a−∈S_k [J(S_k− {a⁻})] 3. Tant que J(S_k− {a⁻}) > J(S_k) a. Mettre à jour S_k+1= S_k− {a⁻}; k = k + 1; b. a⁺ = argmax a+∉S_k [J(S_k∪ {a⁺})] c. Tant que J(S_k∪ {a⁺}) > J(S_k) Mettre à jour S_k+1 = S_k∪ {x⁺}; k = k+1; a⁺ = argmax a+∉S_k [J(S_k∪ {a⁺})]

126 d. a⁻ = argmax

a−∈Sk

[J(S_k− {a⁻})]

4. Fin

A.3: Algorithme de LRS pour L < R

5. Si L < R

Commencer par un ensemble qui contient tous les paramètres S₀ = A; k = 0; 6. Tant que |S_k| > 𝑚 Répéter R fois a⁻ = argmax a−∈S_k [J(S_k− {a⁻})] S_k+1 = S_k− {a⁻}; k = k + 1 Répéter L fois a⁺ = argmax a+∉S_k [J(S_k∪ {a⁺})] S_k+1 = S_k∪ {a⁺}; k = k + 1

Dans le document Contributions à la sélection des attributs de signaux non stationnaires pour la classification (Page 116-130)