Analysis of Distributional Data

Paula Brito, Sónia Dias (Editors)

Book | Softcover
376 pages
2024
Chapman & Hall/CRC (Publisher)
978-1-032-25571-2 (ISBN)
56,10 incl. VAT
In the era of "Big Data," distributional data is becoming more prevalent. This book presents a synthesis of research in this area over the last twenty years. It has been carefully edited to ensure consistency of style, level, and notation. Each chapter includes examples that illustrate the topics and, where appropriate, the relevant software.
In a time when increasingly large and complex data collections are being produced, it is clear that new and adaptive forms of data representation and analysis have to be conceived and implemented. Distributional data, i.e., data where a distribution rather than a single value is recorded for each descriptor on each unit, fits into this framework. Distributional data may result from the aggregation of large amounts of open, collected, or generated data, or it may be directly available in a structured or unstructured form, describing the variability of some features. This book provides models and methods for the representation, analysis, interpretation, and organization of distributional data that take its specific nature into account and do not rely on a reduction to single values in order to conform to classical paradigms.
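
As a rough illustration of the aggregation case described above, the following minimal Python sketch turns individual records into one relative-frequency histogram per unit. The region names, income values, bin edges, and the helper `to_histograms` are all hypothetical and chosen only for illustration; they are not taken from the book.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical microdata: (unit, value) records. Names and numbers are illustrative only.
records = [
    ("North", 12.0), ("North", 18.5), ("North", 25.0), ("North", 31.0),
    ("South", 9.0), ("South", 11.5), ("South", 14.0), ("South", 40.0),
]

# Shared bin edges (an assumption) so that histograms of different units are comparable.
bin_edges = [0.0, 10.0, 20.0, 30.0, 50.0]


def to_histograms(records, bin_edges):
    """Aggregate (unit, value) records into one relative-frequency histogram per unit."""
    n_bins = len(bin_edges) - 1
    counts = defaultdict(lambda: [0] * n_bins)
    for unit, value in records:
        if not bin_edges[0] <= value <= bin_edges[-1]:
            raise ValueError(f"value {value} lies outside the bin range")
        # Bins are half-open [a, b); the clamp puts the final edge into the last bin.
        idx = min(bisect_right(bin_edges, value) - 1, n_bins - 1)
        counts[unit][idx] += 1
    # Normalise counts so that each unit is described by a distribution summing to 1.
    return {unit: [c / sum(row) for c in row] for unit, row in counts.items()}


histograms = to_histograms(records, bin_edges)
for unit, freqs in histograms.items():
    print(unit, freqs)
```

Each unit is then described by a whole vector of bin frequencies rather than by a single summary such as the mean, which is the sense in which the data is distributional.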

Conceived as an edited volume gathering contributions from multiple authors, the book presents alternative representations and analysis methods for distributional data of different types, in particular:
-Uni- and bivariate descriptive statistics for distributional data
-Clustering and classification methodologies
-Methods for the representation in low-dimensional spaces
-Regression models and forecasting approaches for distribution-valued variables

Furthermore, the different chapters
-Feature applications to show how the proposed methods work in practice, and how results are to be interpreted,
-Often provide information about available software.

The methodologies presented in this book constitute cutting-edge developments for stakeholders from all domains who produce and analyse large amounts of complex data that can be represented in the form of distributions. The book is hence of interest to companies operating not only in data analytics, but also in logistics, energy, and finance. It also concerns national statistical institutes and other institutions at the European and international level, where microdata is aggregated to preserve confidentiality and allow for analysis at the appropriate regional level. Academics will find in the analysis of distributional data a challenging, up-to-date field of research.

Paula Brito is a Professor at the Faculty of Economics of the University of Porto, and a member of the Artificial Intelligence and Decision Support Research Group (LIAAD) of INESC TEC, Portugal. She holds a doctorate in Applied Mathematics from the University Paris Dauphine, and a Habilitation in Applied Mathematics from the University of Porto. Her current research focuses on the analysis of multidimensional complex data, known as symbolic data, for which she develops statistical approaches and multivariate analysis methodologies. In this context, she has been involved in two European research projects. Paula Brito was president of the International Association for Statistical Computing (IASC-ISI) in 2013–2015, and of the Portuguese Association for Classification and Data Analysis for the term 2021–2023. She has been an invited speaker at several international conferences, and is a regular member of international program committees. Paula Brito was chair of COMPSTAT 2008 and will co-chair the IFCS 2022 conference.

Sónia Dias is a Professor in the area of Mathematics at the School of Technology and Management of the Polytechnic Institute of Viana do Castelo, and a member of the Laboratory in Artificial Intelligence and Decision Support (LIAAD) of INESC TEC, Portugal. She holds a PhD in Applied Mathematics from the University of Porto (2014). Her main areas of research are Data Analysis, Symbolic Data Analysis (analysis of multidimensional complex data), and Statistical/Mathematical Applications. In this context, she has participated in several conferences and published articles in international journals and proceedings. She was a member of the organizing committee of the international Symbolic Data Analysis Workshop - SDA2018 and is a member of the organizing committee of the IFCS 2022 conference.

I Data Representation and Exploratory Analysis
1. Fundamental Concepts about Distributional Data
2. Descriptive Statistics based on Frequency Distributions
3. Descriptive Statistics for Numeric Distributional Data
4. The Quantile Methods to Analyze Distributional Data

II Clustering and Classification
5. Partitive and Hierarchical Clustering of Distributional Data using the Wasserstein Distance
6. Divisive Clustering of Histogram Data
7. Clustering of Modal Valued Data
8. Mixture Models for Distributional Data
9. Classification of Continuous Distributional Data Using the Logratio Approach

III Dimension Reduction
10. Principal Component Analysis of Distributional Data
11. Principal Component Analysis of Numeric Distributional Data
12. Multidimensional Scaling of Distributional Data

IV Regression and Forecasting
13. Regression Analysis with the Distribution and Symmetric Distribution Model
14. Regression Analysis of Distributional Data Based on a Two-Component Model
15. Forecasting Distributional Time Series

Publication date
Additional information 71 Tables, black and white; 30 Line drawings, color; 80 Line drawings, black and white; 30 Illustrations, color; 80 Illustrations, black and white
Language English
Dimensions 156 x 234 mm
Weight 453 g
Subject area Humanities > Psychology > General Psychology
Computer Science > Databases > Data Warehouse / Data Mining
ISBN-10 1-032-25571-4 / 1032255714
ISBN-13 978-1-032-25571-2 / 9781032255712
Condition New