Statistical Methods in the Atmospheric Sciences - Daniel S. Wilks

Statistical Methods in the Atmospheric Sciences (eBook)

Daniel S. Wilks (Author)

eBook Download: EPUB

2005 | 2nd edition
648 pages
Elsevier Science (publisher)
978-0-08-045622-5 (ISBN)

Statistical Methods in the Atmospheric Sciences, Second Edition, explains the latest statistical methods used to describe, analyze, test, and forecast atmospheric data. This revised and expanded text is intended to help students understand and communicate what their data sets have to say, or to make sense of the scientific literature in meteorology, climatology, and related disciplines.

In this new edition, what was a single chapter on multivariate statistics has been expanded to a full six chapters on this important topic. Other chapters have also been revised and cover exploratory data analysis, probability distributions, hypothesis testing, statistical weather forecasting, forecast verification, and time series analysis. There is now an expanded treatment of resampling tests and key analysis techniques, an updated discussion on ensemble forecasting, and a detailed chapter on forecast verification. In addition, the book includes new sections on maximum likelihood and on statistical simulation and contains current references to original research. Students will benefit from pedagogical features including worked examples, end-of-chapter exercises with separate solutions, and numerous illustrations and equations.

This book will be of interest to researchers and students in the atmospheric sciences, including meteorology, climatology, and other geophysical disciplines.

* Presents and explains techniques used in atmospheric data summarization, analysis, testing, and forecasting
* Features numerous worked examples and exercises
* Covers Model Output Statistics (MOS) with an introduction to the Kalman filter, an approach that tolerates frequent model changes
* Includes a detailed section on forecast verification
New in this Edition:
* Expanded treatment of resampling tests and coverage of key analysis techniques
* Updated treatment of ensemble forecasting
* Edits and revisions throughout the text plus updated references

Daniel S. Wilks has been a member of the Atmospheric Sciences faculty at Cornell University since 1987, and is the author of Statistical Methods in the Atmospheric Sciences (2011, Academic Press), which is in its third edition and has been continuously in print since 1995. His research areas include statistical forecasting, forecast postprocessing, and forecast evaluation.


Front Cover
Statistical Methods in the Atmospheric Sciences
Copyright Page
Contents
Preface to the First Edition
Preface to the Second Edition
PART I: Preliminaries
CHAPTER 1. Introduction
1.1 What Is Statistics?
1.2 Descriptive and Inferential Statistics
1.3 Uncertainty about the Atmosphere
CHAPTER 2. Review of Probability
2.1 Background
2.2 The Elements of Probability
2.3 The Meaning of Probability
2.4 Some Properties of Probability
2.5 Exercises
PART II: Univariate Statistics
CHAPTER 3. Empirical Distributions and Exploratory Data Analysis
3.1 Background
3.2 Numerical Summary Measures
3.3 Graphical Summary Techniques
3.4 Reexpression
3.5 Exploratory Techniques for Paired Data
3.6 Exploratory Techniques for Higher-Dimensional Data
3.7 Exercises
CHAPTER 4. Parametric Probability Distributions
4.1 Background
4.2 Discrete Distributions
4.3 Statistical Expectations
4.4 Continuous Distributions
4.5 Qualitative Assessments of the Goodness of Fit
4.6 Parameter Fitting Using Maximum Likelihood
4.7 Statistical Simulation
4.8 Exercises
CHAPTER 5. Hypothesis Testing
5.1 Background
5.2 Some Parametric Tests
5.3 Nonparametric Tests
5.4 Field Significance and Multiplicity
5.5 Exercises
CHAPTER 6. Statistical Forecasting
6.1 Background
6.2 Linear Regression
6.3 Nonlinear Regression
6.4 Predictor Selection
6.5 Objective Forecasts Using Traditional Statistical Methods
6.6 Ensemble Forecasting
6.7 Subjective Probability Forecasts
6.8 Exercises
CHAPTER 7. Forecast Verification
7.1 Background
7.2 Nonprobabilistic Forecasts of Discrete Predictands
7.3 Nonprobabilistic Forecasts of Continuous Predictands
7.4 Probability Forecasts of Discrete Predictands
7.5 Probability Forecasts for Continuous Predictands
7.6 Nonprobabilistic Forecasts of Fields
7.7 Verification of Ensemble Forecasts
7.8 Verification Based on Economic Value
7.9 Sampling and Inference for Verification Statistics
7.10 Exercises
CHAPTER 8. Time Series
8.1 Background
8.2 Time Domain—I. Discrete Data
8.3 Time Domain—II. Continuous Data
8.4 Frequency Domain—I. Harmonic Analysis
8.5 Frequency Domain—II. Spectral Analysis
8.6 Exercises
PART III: Multivariate Statistics
CHAPTER 9. Matrix Algebra and Random Matrices
9.1 Background to Multivariate Statistics
9.2 Multivariate Distance
9.3 Matrix Algebra Review
9.4 Random Vectors and Matrices
9.5 Exercises
CHAPTER 10. The Multivariate Normal (MVN) Distribution
10.1 Definition of the MVN
10.2 Four Handy Properties of the MVN
10.3 Assessing Multinormality
10.4 Simulation from the Multivariate Normal Distribution
10.5 Inferences about a Multinormal Mean Vector
10.6 Exercises
CHAPTER 11. Principal Component (EOF) Analysis
11.1 Basics of Principal Component Analysis
11.2 Application of PCA to Geophysical Fields
11.3 Truncation of the Principal Components
11.4 Sampling Properties of the Eigenvalues and Eigenvectors
11.5 Rotation of the Eigenvectors
11.6 Computational Considerations
11.7 Some Additional Uses of PCA
11.8 Exercises
CHAPTER 12. Canonical Correlation Analysis (CCA)
12.1 Basics of CCA
12.2 CCA Applied to Fields
12.3 Computational Considerations
12.4 Maximum Covariance Analysis
12.5 Exercises
CHAPTER 13. Discrimination and Classification
13.1 Discrimination vs. Classification
13.2 Separating Two Populations
13.3 Multiple Discriminant Analysis (MDA)
13.4 Forecasting with Discriminant Analysis
13.5 Alternatives to Classical Discriminant Analysis
13.6 Exercises
CHAPTER 14. Cluster Analysis
14.1 Background
14.2 Hierarchical Clustering
14.3 Nonhierarchical Clustering
14.4 Exercises
APPENDIX A. Example Data Sets
Table A.1. Daily precipitation and temperature data for Ithaca and Canandaigua, New York, for January 1987
Table A.2. January precipitation data for Ithaca, New York, 1933–1982
Table A.3. June climate data for Guayaquil, Ecuador, 1951–1970
APPENDIX B. Probability Tables
Table B.1. Cumulative Probabilities for the Standard Gaussian Distribution
Table B.2. Quantiles of the Standard Gamma Distribution
Table B.3. Right-tail quantiles of the Chi-square distribution
APPENDIX C. Answers to Exercises
References
Index

CHAPTER 1

Introduction

1.1 What Is Statistics?

This text is concerned with the use of statistical methods in the atmospheric sciences, specifically in the various specialties within meteorology and climatology. Students (and others) often resist statistics, and the subject is perceived by many to be the epitome of dullness. Before the advent of cheap and widely available computers, this negative view had some basis, at least with respect to applications of statistics involving the analysis of data. Performing hand calculations, even with the aid of a scientific pocket calculator, was indeed tedious, mind-numbing, and time-consuming. The capacity of ordinary personal computers on today’s desktops is well above that of the fastest mainframe computers of 40 years ago, but some people seem not to have noticed that the age of computational drudgery in statistics has long passed. In fact, some important and powerful statistical techniques were not even practical before the abundant availability of fast computing. Even when liberated from hand calculations, statistics is sometimes seen as dull by people who do not appreciate its relevance to scientific problems. Hopefully, this text will help provide that appreciation, at least with respect to the atmospheric sciences.

Fundamentally, statistics is concerned with uncertainty. Evaluating and quantifying uncertainty, as well as making inferences and forecasts in the face of uncertainty, are all parts of statistics. It should not be surprising, then, that statistics has many roles to play in the atmospheric sciences, since it is the uncertainty in atmospheric behavior that makes the atmosphere interesting. For example, many people are fascinated by weather forecasting, which remains interesting precisely because of the uncertainty that is intrinsic to the problem. If it were possible to make perfect forecasts even one day into the future (i.e., if there were no uncertainty involved), the practice of meteorology would be very dull, and similar in many ways to the calculation of tide tables.

1.2 Descriptive and Inferential Statistics

It is convenient, although somewhat arbitrary, to divide statistics into two broad areas: descriptive statistics and inferential statistics. Both are relevant to the atmospheric sciences.

Descriptive statistics relates to the organization and summarization of data. The atmospheric sciences are awash with data. Worldwide, operational surface and upper-air observations are routinely taken at thousands of locations in support of weather forecasting activities. These are supplemented with aircraft, radar, profiler, and satellite data. Observations of the atmosphere specifically for research purposes are less widespread, but often involve very dense sampling in time and space. In addition, models of the atmosphere consisting of numerical integration of the equations describing atmospheric dynamics produce yet more numerical output for both operational and research purposes.

As a consequence of these activities, we are often confronted with extremely large batches of numbers that, we hope, contain information about natural phenomena of interest. It can be a nontrivial task just to make some preliminary sense of such data sets. It is typically necessary to organize the raw data, and to choose and implement appropriate summary representations. When the individual data values are too numerous to be grasped individually, a summary that nevertheless portrays important aspects of their variations—a statistical model—can be invaluable in understanding the data. It is worth emphasizing that it is not the purpose of descriptive data analyses to play with numbers. Rather, these analyses are undertaken because it is known, suspected, or hoped that the data contain information about a natural phenomenon of interest, which can be exposed or better understood through the statistical analysis.

Inferential statistics is traditionally understood as consisting of methods and procedures used to draw conclusions regarding underlying processes that generate the data. Thiébaux and Pedder (1987) express this point somewhat poetically when they state that statistics is “the art of persuading the world to yield information about itself.” There is a kernel of truth here: Our physical understanding of atmospheric phenomena comes in part through statistical manipulation and analysis of data. In the context of the atmospheric sciences it is probably sensible to interpret inferential statistics a bit more broadly as well, and include statistical weather forecasting. By now this important field has a long tradition, and is an integral part of operational weather forecasting at meteorological centers throughout the world.

1.3 Uncertainty about the Atmosphere

Underlying both descriptive and inferential statistics is the notion of uncertainty. If atmospheric processes were constant, or strictly periodic, describing them mathematically would be easy. Weather forecasting would also be easy, and meteorology would be boring. Of course, the atmosphere exhibits variations and fluctuations that are irregular. This uncertainty is the driving force behind the collection and analysis of the large data sets referred to in the previous section. It also implies that weather forecasts are inescapably uncertain. The weather forecaster predicting a particular temperature on the following day is not at all surprised (and perhaps is even pleased) if the subsequently observed temperature is different by a degree or two. In order to deal quantitatively with uncertainty it is necessary to employ the tools of probability, which is the mathematical language of uncertainty.

Before reviewing the basics of probability, it is worthwhile to examine why there is uncertainty about the atmosphere. After all, we have large, sophisticated computer models that represent the physics of the atmosphere, and such models are used routinely for forecasting its future evolution. In their usual forms these models are deterministic: they do not represent uncertainty. Once supplied with a particular initial atmospheric state (winds, temperatures, humidities, etc., comprehensively through the depth of the atmosphere and around the planet) and boundary forcings (notably solar radiation, and sea-surface and land conditions) each will produce a single particular result. Rerunning the model with the same inputs will not change that result.

In principle these atmospheric models could provide forecasts with no uncertainty, but do not, for two reasons. First, even though the models can be very impressive and give quite good approximations to atmospheric behavior, they are not complete and true representations of the governing physics. An important and essentially unavoidable cause of this problem is that some relevant physical processes operate on scales too small to be represented explicitly by these models, and their effects on the larger scales must be approximated in some way using only the large-scale information.

Even if all the relevant physics could somehow be included in atmospheric models, however, we still could not escape the uncertainty because of what has come to be known as dynamical chaos. This phenomenon was discovered by an atmospheric scientist (Lorenz, 1963), who also has provided a very readable introduction to the subject (Lorenz, 1993). Simply and roughly put, the time evolution of a nonlinear, deterministic dynamical system (e.g., the equations of atmospheric motion, or the atmosphere itself) depends very sensitively on the initial conditions of the system. If two realizations of such a system are started from two only very slightly different initial conditions, the two solutions will eventually diverge markedly. For the case of atmospheric simulation, imagine that one of these systems is the real atmosphere, and the other is a perfect mathematical model of the physics governing the atmosphere. Since the atmosphere is always incompletely observed, it will never be possible to start the mathematical model in exactly the same state as the real system. So even if the model is perfect, it will still be impossible to calculate what the atmosphere will do indefinitely far into the future. Therefore, deterministic forecasts of future atmospheric behavior will always be uncertain, and probabilistic methods will always be needed to describe adequately that behavior.
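The sensitivity to initial conditions described above can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes the three-variable system of Lorenz (1963) with its commonly used parameter values (sigma = 10, rho = 28, beta = 8/3), a hand-written fourth-order Runge-Kutta integrator, and an arbitrary initial-state error of 1e-8. The names `truth`, `rerun`, and `model` are illustrative only.

```python
# Assumed setup: Lorenz (1963) three-variable system with the commonly used
# parameter values. These values are not stated in the excerpt.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz_rhs(state):
    """Time derivatives (dx/dt, dy/dt, dz/dt) of the Lorenz system."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step of length dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, n_steps, dt=0.01):
    """Return the list of states visited, including the initial state."""
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, dt)
        trajectory.append(state)
    return trajectory


def separation(a, b):
    """Euclidean distance between two states."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


N_STEPS = 4000
truth = integrate((1.0, 1.0, 1.0), N_STEPS)         # stands in for the "real" system
rerun = integrate((1.0, 1.0, 1.0), N_STEPS)         # same inputs, same model
model = integrate((1.0 + 1e-8, 1.0, 1.0), N_STEPS)  # tiny initial-state error

print("identical rerun reproduces the result:", truth == rerun)
for step in (0, 1000, 2000, 3000, 4000):
    print(f"t = {step * 0.01:5.1f}  separation = "
          f"{separation(truth[step], model[step]):.3e}")
```

Running the sketch shows the two points made in the text: repeating the integration from identical inputs reproduces the result exactly, while the run started from the slightly perturbed state drifts away from the reference run until the two are as far apart as two unrelated states of the system.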

Whether or not the atmosphere is fundamentally a random system, for many practical purposes it might as well be. The realization that the atmosphere exhibits chaotic dynamics has ended the dream of perfect (uncertainty-free) weather forecasts that formed the philosophical basis for much of twentieth-century meteorology (an account of this history and scientific culture is provided by Friedman, 1989). “Just as relativity eliminated the Newtonian illusion of absolute space and time, and as quantum theory eliminated the Newtonian and Einsteinian dream of a controllable measurement process, chaos eliminates the Laplacian fantasy of long-term deterministic predictability” (Zeng et al., 1993). Jointly, chaotic dynamics and the unavoidable errors in mathematical representations of the atmosphere imply that “all meteorological prediction problems, from weather forecasting to climate-change projection, are essentially probabilistic” (Palmer, 2001).

Finally, it is worth noting that randomness is not a state of “unpredictability,” or “no information,” as is sometimes thought. Rather, random means “not precisely predictable or determinable.” For example, the amount of precipitation that will occur tomorrow where you live is a random quantity,...

Publication date (per publisher)	12 December 2005
Language	English
Subject areas	Mathematics / Computer Science ► Mathematics ► Statistics
	Natural Sciences ► Earth Sciences ► Geology
	Natural Sciences ► Earth Sciences ► Geophysics
	Natural Sciences ► Earth Sciences ► Meteorology / Climatology
	Natural Sciences ► Physics / Astronomy ► Applied Physics
	Technology ► Civil Engineering
ISBN-10	0-08-045622-7 / 0080456227
ISBN-13	978-0-08-045622-5 / 9780080456225

EPUB (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, which also makes EPUB a good fit for mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the Adobe Digital Editions software (free). We advise against using the OverDrive Media Console, as experience shows that problems with Adobe DRM occur frequently with it.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

69,80 €