Modeling with Data - Ben Klemens

Modeling with Data

Tools and Techniques for Scientific Computing

Ben Klemens (Author)

Book | Hardcover
472 pages
2008
Princeton University Press (Publisher)
978-0-691-13314-0 (ISBN)
114,70 incl. VAT
Explains how to execute computationally intensive analysis on very large data sets. This book shows readers how to determine some of the best methods for solving a variety of different problems, how to create and debug statistical models, and how to run an analysis and evaluate the results.
Modeling with Data fully explains how to execute computationally intensive analyses on very large data sets, showing readers how to determine the best methods for solving a variety of different problems, how to create and debug statistical models, and how to run an analysis and evaluate the results. Ben Klemens introduces a set of open and unlimited tools, and uses them to demonstrate data management, analysis, and simulation techniques essential for dealing with large data sets and computationally intensive procedures. He then demonstrates how to easily apply these tools to the many threads of statistical technique, including classical, Bayesian, maximum likelihood, and Monte Carlo methods. Klemens's accessible survey describes these models in a unified and nontraditional manner, providing alternative ways of looking at statistical concepts that often befuddle students. The book includes nearly one hundred sample programs of all kinds. Links to these programs will be available on this page at a later date.
Modeling with Data will interest anyone looking for a comprehensive guide to these powerful statistical tools, including researchers and graduate students in the social sciences, biology, engineering, economics, and applied mathematics.

Ben Klemens is a senior statistician at the National Institute of Mental Health. He is also a guest scholar at the Center on Social and Economic Dynamics at the Brookings Institution.

Preface
Chapter 1. Statistics in the modern day

PART I COMPUTING

Chapter 2. C
  2.1 Lines
  2.2 Variables and their declarations
  2.3 Functions
  2.4 The debugger
  2.5 Compiling and running
  2.6 Pointers
  2.7 Arrays and other pointer tricks
  2.8 Strings
  2.9 *Errors

Chapter 3. Databases
  3.1 Basic queries
  3.2 *Doing more with queries
  3.3 Joins and subqueries
  3.4 On database design
  3.5 Folding queries into C code
  3.6 Maddening details
  3.7 Some examples

Chapter 4. Matrices and models
  4.1 The GSL's matrices and vectors
  4.2 apop_data
  4.3 Shunting data
  4.4 Linear algebra
  4.5 Numbers
  4.6 *gsl_matrix and gsl_vector internals
  4.7 Models

Chapter 5. Graphics
  5.1 plot
  5.2 *Some common settings
  5.3 From arrays to plots
  5.4 A sampling of special plots
  5.5 Animation
  5.6 On producing good plots
  5.7 *Graphs--nodes and flowcharts
  5.8 Printing and LaTeX

Chapter 6. *More coding tools
  6.1 Function pointers
  6.2 Data structures
  6.3 Parameters
  6.4 *Syntactic sugar
  6.5 More tools

PART II STATISTICS

Chapter 7. Distributions for description
  7.1 Moments
  7.2 Sample distributions
  7.3 Using the sample distributions
  7.4 Non-parametric description

Chapter 8. Linear projections
  8.1 *Principal component analysis
  8.2 OLS and friends
  8.3 Discrete variables
  8.4 Multilevel modeling

Chapter 9. Hypothesis testing with the CLT
  9.1 The Central Limit Theorem
  9.2 Meet the Gaussian family
  9.3 Testing a hypothesis
  9.4 ANOVA
  9.5 Regression
  9.6 Goodness of fit

Chapter 10. Maximum likelihood estimation
  10.1 Log likelihood and friends
  10.2 Description: Maximum likelihood estimators
  10.3 Missing data
  10.4 Testing with likelihoods

Chapter 11. Monte Carlo
  11.1 Random number generation
  11.2 Description: Finding statistics for a distribution
  11.3 Inference: Finding statistics for a parameter
  11.4 Drawing a distribution
  11.5 Non-parametric testing

Appendix A: Environments and makefiles
  A.1 Environment variables
  A.2 Paths
  A.3 Make

Appendix B: Text processing
  B.1 Shell scripts
  B.2 Some tools for scripting
  B.3 Regular expressions
  B.4 Adding and deleting
  B.5 More examples

Appendix C: Glossary
Bibliography
Index

Publication date (per publisher) 26.10.2008
Additional info 35 line illus. 16 tables.
Place of publication New Jersey
Language English
Dimensions 178 x 254 mm
Weight 1021 g
Subject areas Computer Science > Databases > Data Warehouse / Data Mining
Mathematics / Computer Science > Computer Science > Software Development
Mathematics / Computer Science > Mathematics > Computer Programs / Computer Algebra
ISBN-10 0-691-13314-X / 069113314X
ISBN-13 978-0-691-13314-0 / 9780691133140
Condition New