Data Analytics - Juan J. Cuadrado-Gallego, Yuri Demchenko

Data Analytics

A Theoretical and Practical View from the EDISON Project
Book | Softcover
XIII, 477 pages
2024
Springer International Publishing (publisher)
978-3-031-39131-6 (ISBN)
181,89 incl. VAT

Building upon the knowledge introduced in The Data Science Framework, this book provides a comprehensive and detailed examination of each aspect of Data Analytics, both from a theoretical and practical standpoint. The book explains representative algorithms associated with different techniques, from their theoretical foundations to their implementation and use with software tools.

Designed as a textbook for a Data Analytics Fundamentals course, it is divided into seven chapters to correspond with 16 weeks of lessons, including both theoretical and practical exercises. Each chapter is dedicated to a lesson, allowing readers to dive deep into each topic with detailed explanations and examples. Readers will learn the theoretical concepts and then immediately apply them to practical exercises to reinforce their knowledge. In the lab sessions, readers will learn the ins and outs of the R environment and data science methodology to solve exercises with the R language. With detailed solutions provided for all examples and exercises, readers can use this book to study and master data analytics on their own. Whether you're a student, professional, or simply curious about data analytics, this book is a must-have for anyone looking to expand their knowledge in this exciting field.

The following chapters have contributions by:

  • Chapter 4, "Anomaly Detection" - Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez, and Abdelhamid Tayebi
  • Chapter 5, "Unsupervised Classification" - Juan J. Cuadrado-Gallego, Yuri Demchenko, and Abdelhamid Tayebi
  • Chapter 6, "Supervised Classification" - Juan J. Cuadrado-Gallego, Yuri Demchenko, and Josefa Gómez


Dr. Juan José Cuadrado-Gallego is an Associate Professor in the Department of Computer Science at the University of Alcalá, in the area of Computer Science and Artificial Intelligence, and an Affiliate Associate Professor in the Department of Computer Science and Software Engineering of the Faculty of Engineering and Computer Science at Concordia University, in Montreal, Canada. Previously, he was a professor at three Spanish universities: the Universitat Oberta de Catalunya, in Barcelona, from 2004 to 2016; the University of Valladolid, in Segovia, in 2004; and the Universidad Carlos III de Madrid, in Madrid, between 1997 and 2004. He was a Visiting Associate Professor in the Department of Software and IT Engineering of the École de Technologie Supérieure, at the Université du Québec à Montréal, in Montreal, Canada, from 2009 to 2015, and a Visiting Professor in the Postgraduate and Research section of the Faculty of Administration and Management of the National Polytechnic Institute, in Mexico City, Mexico, from 2009 to 2014. He was also a researcher in the Department of Astrophysics and Atmospheric Sciences of the Faculty of Physical Sciences of the Complutense University of Madrid, in Madrid, Spain, from 1994 to 1997.

Juan José received a degree in Physical Sciences from the Complutense University of Madrid in 1994; the Recognition of Research Sufficiency from the Faculty of Physical Sciences of the Complutense University of Madrid in 1997; and the Doctorate in Computer Engineering from the Carlos III University of Madrid in 2001, with the grade of A "cum laude", awarded unanimously by the examining committee. He currently holds 4 six-year periods and 3 five-year periods. In 2010, he obtained the Outstanding Research Pathway certification from the National Agency for Evaluation and Prospective (ANEP) of the Secretary of State for Universities and Research of the Ministry of Science and Innovation, within the I3 Program, Incentive for the Incorporation and Intensification of Research Activity.

Juan José has carried out research stays at the following universities: the University of Amsterdam, in Amsterdam, Holland, at the Informatics Institute of the Faculty of Science, in 2018, funded by a mobility grant from the University of Alcalá; the Otto-von-Guericke-University, in Magdeburg, Germany, at the Institut für Verteilte Systeme of the Fakultät für Informatik, in 2013, funded by a mobility grant from the University of Alcalá, in 2012, within a sabbatical year granted by the University of Alcalá, and in 2009, funded by a "José Castillejo" grant for further studies and research, from the University of Alcalá; the Université du Québec à Montréal, in Montreal, Canada, in the Department of Software and IT Engineering of the École de Technologie Supérieure, in 2006 and 2005; the University of Reading, in Reading, United Kingdom, in the Computer Science Department, in 2005 and 2004; and the Università Roma Tre, in Rome, Italy, in the Dipartimento di Informatica e Automazione, in 2004 and 2003.

Juan José is currently researching in the fields of Artificial Intelligence and Data Science. He has authored more than 200 scientific publications, many of them in journals indexed in the JCR Science Edition. He has also participated, as principal investigator or researcher, in numerous research projects, financed both with public funding (European, national, regional, or university) and with private financing, through contracts made under Article 83 of the University Law. He has also directed nine doctoral theses, all of which received the highest grade, and has served on numerous doctoral examination committees in Spain, Germany, and Mexico. He has also been an External Evaluator of projects in Computer Science for the Natural Sciences and Engineering Research Council of Canada since 2014, and an Evaluator of the National Agency for Evaluation and Prospective, of the General Directorate of Scientific and Technical Res

Contents

Chapter 1. Introduction to data science and data analytics
  1.1 About Data Science
  1.2 About the EDISON Project and Data Science Framework
    1.2.1 The EDISON project
    1.2.2 The EDISON Data Science Framework
  1.3 About Data Analytics
    1.3.1 Data Analytics Competences
    1.3.2 Data Analytics Body of Knowledge
    1.3.3 Data Analytics Model Curriculum Approach
    1.3.4 Data Analytics Professional Profiles
  1.4 About this Book

Chapter 2. Data
  A. Theory
    2.1 Introduction
    2.2 Characteristic
      2.2.1 Definition of characteristic
      2.2.2 Types of characteristics
    2.3 Data
      2.3.1 Definition of Data
      2.3.2 Types of data from their nature
      2.3.3 Types of data from their storage
    2.4 Available Data
      2.4.1 Experiment
      2.4.2 Data population
      2.4.3 Data Sample
      2.4.4 Data Quality
    2.5 Frequency
      2.5.1 Definition of frequency
      2.5.2 Types of frequency
      2.5.3 Frequency of grouped Data
      2.5.4 Mode
    2.6 Mean
      2.6.1 Definition of Mean
      2.6.2 Arithmetic Mean
      2.6.3 Variance and Standard Deviation
    2.7 Median
      2.7.1 Range
      2.7.2 Median
      2.7.3 Quantiles
      2.7.4 Quantiles range
  B. Computer Based Solving
    2.8 R project
    2.9 R graphical user interface
    2.10 Data exercises solved with R
  C. Data Exercises solved
    2.11 Handmade exercises
    2.12 Exercises solved in R
  Annex. Data Extended Concepts
    2.A.1 Frequency
    2.A.2 Mean

Chapter 3. Probability
  A. Theory
    3.1 Introduction
    3.2 Event
    3.3 Set theory actions and operations
    3.4 Laplace or classical probability
    3.5 Bayesian Probability
    3.6 Probability distribution of random variables
      3.6.1 Random Variable
      3.6.2 Probability distribution
      3.6.3 Discrete probability distributions
        3.6.3.1 Bernoulli Probability distribution
        3.6.3.2 Binomial Probability distribution
        3.6.3.3 Geometric Probability distribution
        3.6.3.4 Poisson Probability distribution
      3.6.4 Continuous probability distribution
        3.6.4.1 Normal Distribution
        3.6.4.2 Pearson chi square distribution
        3.6.4.3 Student's t distribution
        3.6.4.4 Fisher's F distribution
  B. Computer Based Solving
  C. Probability exercises solved
    3.7 Handmade exercises
    3.8 Exercises solved in R
  Annex. Probability extended concepts

Chapter 4. Anomaly Detection
Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez, Abdelhamid Tayebi
  A. Theory
    4.1 Introduction
    4.2 Anomaly detection based on statistics
      4.2.1 Anomaly detection based on the mean and the standard deviation
      4.2.2 Anomaly detection based on the quartiles
      4.2.3 Anomaly detection based on errors of the residuals
    4.3 Anomaly detection based on proximity: K nearest neighbor algorithm
    4.4 Anomaly detection based on density: simplified local outlier factor algorithm
  B. Computer based solving
    4.5 R packages
    4.6 Anomaly detection exercises solved with R
  C. Anomaly detection exercises solved
    4.7 Handmade exercises
    4.8 Exercises solved in R

Chapter 5. Unsupervised Classification
Juan J. Cuadrado-Gallego, Yuri Demchenko, Abdelhamid Tayebi
  A. Theory
    5.1 Introduction
    5.2 Unsupervised classification based on distances: K-means algorithm
    5.3 Agglomerative hierarchical clustering
  B. Computer based solving
    5.4 RStudio
    5.5 Unsupervised classification exercises solved with R
  C. Unsupervised classification exercises solved
    5.6 Handmade exercises
    5.7 Exercises solved in R

Chapter 6. Supervised Classification
Juan J. Cuadrado-Gallego, Yuri Demchenko, Josefa Gómez
  A. Theory
    6.1 Introduction
    6.2 Decision tree
      6.2.1 Optimizing the construction of a decision tree: ID3 Algorithm
      6.2.2 Optimizing the construction of a decision tree: CART Algorithm
      6.2.3 Optimizing the construction of a decision tree: Error Algorithm
    6.3 Neural Network
    6.4 Naïve Bayes
    6.5 Regression functions
      6.5.1 Linear regression of polynomial events
      6.5.2 Linear regression of polynomial for three events
      6.5.3 Linear regression of polynomial for K events
      6.5.4 Nonlinear regression of polynomial for two events
      6.5.5 Nonlinear regression of not polynomial for two events
      6.5.6 Linear regression validity analysis
  B. Computer based solving
  C. Supervised classification analysis exercises solved
    6.6 Handmade Exercises
    6.7 Exercises solved in R

Chapter 7. Association
  A. Theory
    7.1 Introduction
    7.2 Analysis of association of events composed by a single elementary event
      7.2.1 Support
      7.2.2 Confidence
      7.2.3 Contingency
      7.2.4 Correlation
    7.3 Analysis of association of events composed by more than one elementary event: Apriori algorithm
  B. Computer based solving
  C. Association analysis exercises solved
    7.4 Handmade Exercises
    7.5 Exercises solved in R

Co-authors: Josefa Gómez Pérez, Abdelhamid Tayebi Tayebi
Additional info: XIII, 477 p. 107 illus., 43 illus. in color.
Place of publication: Cham
Language: English
Dimensions: 155 x 235 mm
Subject area: Mathematics / Computer Science > Computer Science > Databases
Computer Science > Theory / Studies > Artificial Intelligence / Robotics
Keywords: Big Data • data analytics • Data Science • EDISON Data Science Framework (EDSF) • EDISON Project • machine learning
ISBN-10: 3-031-39131-4 / 3031391314
ISBN-13: 978-3-031-39131-6 / 9783031391316
Condition: New