Graphics of Large Datasets (eBook)

Visualizing a Million
eBook Download: PDF
2007 | 2006
XIII, 271 pages
Springer New York (publisher)
978-0-387-37977-7 (ISBN)

Graphics of Large Datasets - Antony Unwin, Martin Theus, Heike Hofmann
74,89 incl. VAT
  • Download available immediately

This book shows how to look at ways of visualizing large datasets, whether large in numbers of cases, or large in numbers of variables, or large in both. All ideas are illustrated with displays from analyses of real datasets and the importance of interpreting displays effectively is emphasized. Graphics should be drawn to convey information and the book includes many insightful examples. New approaches to graphics are needed to visualize the information in large datasets and most of the innovations described in this book are developments of standard graphics. The book is accessible to readers with some experience of drawing statistical graphics.


Graphics are great for exploring data, but how can they be used for looking at the large datasets that are commonplace today? This book shows how to look at ways of visualizing large datasets, whether large in numbers of cases or large in numbers of variables or large in both. Data visualization is useful for data cleaning, exploring data, identifying trends and clusters, spotting local patterns, evaluating modeling output, and presenting results. It is essential for exploratory data analysis and data mining. Data analysts, statisticians, computer scientists, indeed anyone who has to explore a large dataset of their own, should benefit from reading this book.

New approaches to graphics are needed to visualize the information in large datasets and most of the innovations described in this book are developments of standard graphics. There are considerable advantages in extending displays which are well-known and well-tried, both in understanding how best to make use of them in your work and in presenting results to others. It should also make the book readily accessible for readers who already have a little experience of drawing statistical graphics. All ideas are illustrated with displays from analyses of real datasets and the authors emphasize the importance of interpreting displays effectively. Graphics should be drawn to convey information and the book includes many insightful examples.

From the reviews:

"Anyone interested in modern techniques for visualizing data will be well rewarded by reading this book. There is a wealth of important plotting types and techniques." Paul Murrell for the Journal of Statistical Software, December 2006

"This fascinating book looks at the question of visualizing large datasets from many different perspectives. Different authors are responsible for different chapters and this approach works well in giving the reader alternative viewpoints of the same problem. Interestingly the authors have cleverly chosen a definition of 'large dataset'. Essentially they focus on datasets with the order of a million cases. As the authors point out there are now many examples of much larger datasets but by limiting to ones that can be loaded in their entirety in standard statistical software they end up with a book that has great utility to the practitioner rather than just the theorist. Another very attractive feature of the book is the many colour plates, showing clearly what can now routinely be seen on the computer screen. The interactive nature of data analysis with large datasets is hard to reproduce in a book but the authors make an excellent attempt to do just this." P. Marriott for the Short Book Reviews of the ISI

Preface
Contents
1 Introduction
1.1 Introduction
1.2 Data Visualization
1.3 Research Literature
1.4 How Large Is a Large Dataset?
1.5 The Effects of Largeness
1.6 What Is in This Book
1.7 Software
1.8 What Is on the Website
1.9 Contributing Authors
Basics
2 Statistical Graphics
2.1 Introduction
2.2 Plots for Categorical Data
2.3 Plots for Continuous Data
2.4 Data on Mixed Scales
2.5 Maps
2.6 Contour Plots and Image Maps
2.7 Time Series Plots
2.8 Structure Plots
3 Scaling Up Graphics
3.1 Introduction
3.2 Upscaling as a General Problem in Statistics
3.3 Area Plots
3.4 Point Plots
3.5 From Areas to Points and Back
3.6 Modifying Plots
3.7 Summary
4 Interacting with Graphics
4.1 Introduction
4.2 Interaction
4.3 Interaction and Data Displays
4.4 Interaction and Large Datasets
4.5 New Interactive Tasks
4.6 Summary and Future Directions
Applications
5 Multivariate Categorical Data - Mosaic Plots
5.1 Introduction
5.2 Area-based Displays
5.3 Displays and Techniques in One Dimension
5.4 Mosaic Plots
5.5 Summary
6 Rotating Plots
6.1 Introduction
6.2 Beginning to Work with a Million Cases
6.3 Software System
6.4 Application
6.5 Current and Future Developments
7 Multivariate Continuous Data - Parallel Coordinates
7.1 Introduction
7.2 Interpolations and Inner Products
7.3 Generalized Parallel Coordinate Geometry
7.4 A New Family of Smooth Plots
7.5 Examples
7.6 Detecting Second-Order Structures
7.7 Summary
8 Networks
8.1 Introduction
8.2 Layout Algorithms
8.3 Interactivity
8.4 NicheWorks
8.5 Example: International Calling Fraud
8.6 Languages for Description and Layouts
8.7 Summary
9 Trees
9.1 Introduction
9.2 Growing Trees for Large Datasets
9.3 Visualization of Large Trees
9.4 Forests for Large Datasets
9.5 Summary
10 Transactions
10.1 Introduction and Background
10.2 Mice and Elephant Plots and Random Sampling
10.3 Biased Sampling
10.4 Quantile Window Sampling
10.5 Commonality of Flow Rates
11 Graphics of a Large Dataset
11.1 Introduction
11.2 QuickStart Guide Data Visualization for Large Datasets
11.3 Visualizing the InfoVis 2005 Contest Dataset
References
Authors
Index

Publication date (per publisher): 12 June 2007
Series: Statistics and Computing
Additional info: XIII, 271 p.
Place of publication: New York
Language: English
Subject areas: Mathematics / Computer Science, Computer Science, Databases
Mathematics / Computer Science, Computer Science, Graphics / Design
Mathematics / Computer Science, Mathematics, Computer Programs / Computer Algebra
Mathematics / Computer Science, Mathematics, Statistics
Mathematics / Computer Science, Mathematics, Probability / Combinatorics
Technology
Economics, Business Administration / Management, Planning / Organization
Keywords: Computer • Data Analysis • Excel • Modeling • Statistica • statistical software • Visualization
ISBN-10 0-387-37977-0 / 0387379770
ISBN-13 978-0-387-37977-7 / 9780387379777
PDF (watermark)
Size: 31.5 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.
