Per-class accuracy measures - Massive data analysis

For a given class, a classifier, and an example, four cases can arise:

1. The example belongs to the class and the classifier is not mistaken: this is a true positive.

2. The example belongs to the class but the classifier is mistaken: this is a false negative.

3. The example does not belong to the class, but the classifier assigns the class to it anyway: this is a false positive.

4. The example does not belong to the class, and the classifier does not place it in this class either: this is a true negative.
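The four cases above can be sketched in Python as follows. This is a minimal illustration, not code from the source: the function names `outcome` and `count_outcomes` and the toy labels are assumptions made for the example.

```python
def outcome(true_label, predicted_label, target_class):
    """Return which of the four cases applies to one example for one class."""
    is_member = true_label == target_class         # the example belongs to the class
    is_assigned = predicted_label == target_class  # the classifier assigns the class
    if is_member and is_assigned:
        return "TP"  # true positive
    if is_member and not is_assigned:
        return "FN"  # false negative
    if not is_member and is_assigned:
        return "FP"  # false positive
    return "TN"      # true negative


def count_outcomes(true_labels, predicted_labels, target_class):
    """Count the four cases over a whole set of examples."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for t, p in zip(true_labels, predicted_labels):
        counts[outcome(t, p, target_class)] += 1
    return counts


# Hypothetical toy data: true labels and the classifier's predictions.
y_true = ["a", "a", "b", "b", "c", "a"]
y_pred = ["a", "b", "a", "b", "c", "a"]
counts_a = count_outcomes(y_true, y_pred, "a")
print(counts_a)
```

Note that the counts are always relative to one target class: the same prediction can be a true negative for one class and a false positive for another.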

A.1.20.1 TP Rate

True positive rate (ratio). It is defined as:

$$\text{TP Rate} = \frac{\text{number of true positives}}{\text{number of true positives} + \text{number of false negatives}} = \frac{\text{number of true positives}}{\text{number of examples of this class}}$$

It is therefore the ratio of the number of correctly classified examples of the class to the total number of examples that should have been assigned to it.

A.1.20.2 FP Rate

False positive rate (ratio). It is defined as:

$$\text{FP Rate} = \frac{\text{number of false positives}}{\text{number of false positives} + \text{number of true negatives}} = \frac{\text{number of false positives}}{\text{number of examples not of this class}}$$
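As a rough sketch, both ratios can be computed directly from the four counts. The function names and the counts below are hypothetical, and returning `None` when a denominator is zero is a choice made for this example, not something the source prescribes.

```python
def tp_rate(tp, fn):
    """True positives divided by the number of examples of the class."""
    positives = tp + fn
    if positives == 0:
        return None  # undefined: no example belongs to the class
    return tp / positives


def fp_rate(fp, tn):
    """False positives divided by the number of examples not of the class."""
    negatives = fp + tn
    if negatives == 0:
        return None  # undefined: every example belongs to the class
    return fp / negatives


# Hypothetical counts for one class.
tp, fn, fp, tn = 40, 10, 5, 45
print(tp_rate(tp, fn))  # 0.8
print(fp_rate(fp, tn))  # 0.1
```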

Given the TP Rate and the FP Rate, together with the number of examples that belong to the class and the number that do not, the confusion matrix for that class can be reconstructed.
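The reconstruction can be illustrated with a short sketch. It assumes that the number of examples of the class (`n_positive`) and the number of examples outside it (`n_negative`) are known, since the rates alone only give proportions. The function name and the values are hypothetical.

```python
def confusion_from_rates(tp_rate, fp_rate, n_positive, n_negative):
    """Rebuild the four counts of a per-class confusion matrix.

    Assumes n_positive and n_negative (the class sizes) are known.
    Rounding absorbs floating-point error in the products.
    """
    tp = round(tp_rate * n_positive)
    fp = round(fp_rate * n_negative)
    return {
        "TP": tp,
        "FN": n_positive - tp,  # examples of the class that were missed
        "FP": fp,
        "TN": n_negative - fp,  # examples outside the class, correctly left out
    }


# Hypothetical rates and class sizes.
matrix = confusion_from_rates(0.8, 0.1, n_positive=50, n_negative=50)
print(matrix)
```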

