Small Area Estimation

J. N. K. Rao, Isabel Molina (Authors)

Book | Hardcover

480 pages

2015 | 2nd edition
John Wiley & Sons Inc (Publisher)
978-1-118-73578-7 (ISBN)

Praise for the First Edition

"This pioneering work, in which Rao provides a comprehensive and up-to-date treatment of small area estimation, will become a classic...I believe that it has the potential to turn small area estimation...into a larger area of importance to both researchers and practitioners."
—Journal of the American Statistical Association

Written by two experts in the field, Small Area Estimation, Second Edition provides a comprehensive and up-to-date account of the methods and theory of small area estimation (SAE), particularly indirect estimation based on explicit small area linking models. The model-based approach to small area estimation offers several advantages including increased precision, the derivation of "optimal" estimates and associated measures of variability under an assumed model, and the validation of models from the sample data.

Emphasizing real data throughout, the Second Edition maintains a self-contained account of crucial theoretical and methodological developments in the field of SAE. The new edition provides extensive accounts of new and updated research, which often involves complex theory to handle model misspecifications and other complexities. Including information on survey design issues and traditional methods employing indirect estimates based on implicit linking models, Small Area Estimation, Second Edition also features:

Additional sections describing R code and data sets that readers can use to replicate applications
Numerous examples of SAE applications throughout each chapter, including recent applications in U.S. Federal programs
New topical coverage on extended design issues, synthetic estimation, further refinements and solutions to the Fay-Herriot area level model, basic unit level models, and spatial and time series models
A discussion of the advantages and limitations of various SAE methods for model selection from data as well as comparisons of estimates derived from models to reliable values obtained from external sources, such as previous census or administrative data

Small Area Estimation, Second Edition is an excellent reference for practicing statisticians and survey methodologists as well as practitioners interested in learning SAE methods. The Second Edition is also an ideal textbook for graduate-level courses in SAE and reliable small area statistics.

J. N. K. Rao, PhD, is Professor Emeritus and Distinguished Research Professor in the School of Mathematics and Statistics, Carleton University, Ottawa, Canada. He is an editorial advisor for the Wiley Series in Survey Methodology. Isabel Molina, PhD, is Associate Professor of Statistics at Universidad Carlos III de Madrid, Spain.

List of Figures xv

List of Tables xvii

Foreword to the First Edition xix

Preface to the Second Edition xxiii

Preface to the First Edition xxvii

1 *Introduction 1

1.1 What is a Small Area? 1

1.2 Demand for Small Area Statistics 3

1.3 Traditional Indirect Estimators 4

1.4 Small Area Models 4

1.5 Model-Based Estimation 5

1.6 Some Examples 6

1.6.1 Health 6

1.6.2 Agriculture 7

1.6.3 Income for Small Places 8

1.6.4 Poverty Counts 8

1.6.5 Median Income of Four-Person Families 8

1.6.6 Poverty Mapping 8

2 Direct Domain Estimation 9

2.1 Introduction 9

2.2 Design-Based Approach 10

2.3 Estimation of Totals 11

2.3.1 Design-Unbiased Estimator 11

2.3.2 Generalized Regression Estimator 13

2.4 Domain Estimation 16

2.4.1 Case of No Auxiliary Information 16

2.4.2 GREG Domain Estimation 17

2.4.3 Domain-Specific Auxiliary Information 18

2.5 Modified GREG Estimator 21

2.6 Design Issues 23

2.6.1 Minimization of Clustering 24

2.6.2 Stratification 24

2.6.3 Sample Allocation 24

2.6.4 Integration of Surveys 25

2.6.5 Dual-Frame Surveys 25

2.6.6 Repeated Surveys 26

2.7 *Optimal Sample Allocation for Planned Domains 26

2.7.1 Case (i) 26

2.7.2 Case (ii) 29

2.7.3 Two-Way Stratification: Balanced Sampling 31

2.8 Proofs 32

2.8.1 Proof of $\hat{Y}_{GR}(\mathbf{x}) = \mathbf{X}$ 32

2.8.2 Derivation of Calibration Weights $w_j^*$ 32

2.8.3 Proof of $Y = \hat{X}^T\hat{\mathbf{B}}$ when $c_j = \boldsymbol{\nu}^T\mathbf{X}_j$ 32

3 Indirect Domain Estimation 35

3.1 Introduction 35

3.2 Synthetic Estimation 36

3.2.1 No Auxiliary Information 36

3.2.2 *Area Level Auxiliary Information 36

3.2.3 *Unit Level Auxiliary Information 37

3.2.4 Regression-Adjusted Synthetic Estimator 42

3.2.5 Estimation of MSE 43

3.2.6 Structure Preserving Estimation 45

3.2.7 *Generalized SPREE 49

3.2.8 *Weight-Sharing Methods 53

3.3 Composite Estimation 57

3.3.1 Optimal Estimator 57

3.3.2 Sample-Size-Dependent Estimators 59

3.4 James–Stein Method 63

3.4.1 Common Weight 63

3.4.2 Equal Variances $\psi_i = \psi$ 64

3.4.3 Estimation of Component MSE 68

3.4.4 Unequal Variances $\psi_i$ 70

3.4.5 Extensions 71

3.5 Proofs 71

4 Small Area Models 75

4.1 Introduction 75

4.2 Basic Area Level Model 76

4.3 Basic Unit Level Model 78

4.4 Extensions: Area Level Models 81

4.4.1 Multivariate Fay–Herriot Model 81

4.4.2 Model with Correlated Sampling Errors 82

4.4.3 Time Series and Cross-Sectional Models 83

4.4.4 *Spatial Models 86

4.4.5 Two-Fold Subarea Level Models 88

4.5 Extensions: Unit Level Models 88

4.5.1 Multivariate Nested Error Regression Model 88

4.5.2 Two-Fold Nested Error Regression Model 89

4.5.3 Two-Level Model 90

4.5.4 General Linear Mixed Model 91

4.6 Generalized Linear Mixed Models 92

4.6.1 Logistic Mixed Models 92

4.6.2 *Models for Multinomial Counts 93

4.6.3 Models for Mortality and Disease Rates 93

4.6.4 Natural Exponential Family Models 94

4.6.5 *Semi-parametric Mixed Models 95

5 Empirical Best Linear Unbiased Prediction (EBLUP): Theory 97

5.1 Introduction 97

5.2 General Linear Mixed Model 98

5.2.1 BLUP Estimator 98

5.2.2 MSE of BLUP 100

5.2.3 EBLUP Estimator 101

5.2.4 ML and REML Estimators 102

5.2.5 MSE of EBLUP 105

5.2.6 Estimation of MSE of EBLUP 106

5.3 Block Diagonal Covariance Structure 108

5.3.1 EBLUP Estimator 108

5.3.2 Estimation of MSE 109

5.3.3 Extension to Multidimensional Area Parameters 110

5.4 *Model Identification and Checking 111

5.4.1 Variable Selection 111

5.4.2 Model Diagnostics 114

5.5 *Software 118

5.6 Proofs 119

5.6.1 Derivation of BLUP 119

5.6.2 Equivalence of BLUP and Best Predictor $E(\mathbf{m}^T\mathbf{v} \mid \mathbf{A}^T\mathbf{y})$ 120

5.6.3 Derivation of MSE Decomposition (5.2.29) 121

6 Empirical Best Linear Unbiased Prediction (EBLUP): Basic Area Level Model 123

6.1 EBLUP Estimation 123

6.1.1 BLUP Estimator 124

6.1.2 Estimation of $\sigma_v^2$ 126

6.1.3 Relative Efficiency of Estimators of $\sigma_v^2$ 128

6.1.4 *Applications 129

6.2 MSE Estimation 136

6.2.1 Unconditional MSE of EBLUP 136

6.2.2 MSE for Nonsampled Areas 139

6.2.3 *MSE Estimation for Small Area Means 140

6.2.4 *Bootstrap MSE Estimation 141

6.2.5 *MSE of a Weighted Estimator 143

6.2.6 Mean Cross Product Error of Two Estimators 144

6.2.7 *Conditional MSE 144

6.3 *Robust Estimation in the Presence of Outliers 146

6.4 *Practical Issues 148

6.4.1 Unknown Sampling Error Variances 148

6.4.2 Strictly Positive Estimators of $\sigma_v^2$ 151

6.4.3 Preliminary Test Estimation 154

6.4.4 Covariates Subject to Sampling Errors 156

6.4.5 Big Data Covariates 159

6.4.6 Benchmarking Methods 159

6.4.7 Misspecified Linking Model 165

6.5 *Software 169

7 Basic Unit Level Model 173

7.1 EBLUP Estimation 173

7.1.1 BLUP Estimator 174

7.1.2 Estimation of $\sigma_v^2$ and $\sigma_e^2$ 177

7.1.3 *Nonnegligible Sampling Fractions 178

7.2 MSE Estimation 179

7.2.1 Unconditional MSE of EBLUP 179

7.2.2 Unconditional MSE Estimators 181

7.2.3 *MSE Estimation: Nonnegligible Sampling Fractions 182

7.2.4 *Bootstrap MSE Estimation 183

7.3 *Applications 186

7.4 *Outlier Robust EBLUP Estimation 193

7.4.1 Estimation of Area Means 193

7.4.2 MSE Estimation 198

7.4.3 Simulation Results 199

7.5 *M-Quantile Regression 200

7.6 *Practical Issues 205

7.6.1 Unknown Heteroscedastic Error Variances 205

7.6.2 Pseudo-EBLUP Estimation 206

7.6.3 Informative Sampling 211

7.6.4 Measurement Error in Area-Level Covariate 216

7.6.5 Model Misspecification 218

7.6.6 Semi-parametric Nested Error Model: EBLUP 220

7.6.7 Semi-parametric Nested Error Model: REBLUP 224

7.7 *Software 227

7.8 *Proofs 231

7.8.1 Derivation of (7.6.17) 231

7.8.2 Proof of (7.6.20) 232

8 EBLUP: Extensions 235

8.1 *Multivariate Fay–Herriot Model 235

8.2 Correlated Sampling Errors 237

8.3 Time Series and Cross-Sectional Models 240

8.3.1 *Rao–Yu Model 240

8.3.2 State-Space Models 243

8.4 *Spatial Models 248

8.5 *Two-Fold Subarea Level Models 251

8.6 *Multivariate Nested Error Regression Model 253

8.7 Two-Fold Nested Error Regression Model 254

8.8 *Two-Level Model 259

8.9 *Models for Multinomial Counts 261

8.10 *EBLUP for Vectors of Area Proportions 262

8.11 *Software 264

9 Empirical Bayes (EB) Method 269

9.1 Introduction 269

9.2 Basic Area Level Model 270

9.2.1 EB Estimator 271

9.2.2 MSE Estimation 273

9.2.3 Approximation to Posterior Variance 275

9.2.4 *EB Confidence Intervals 281

9.3 Linear Mixed Models 287

9.3.1 EB Estimation of $\mu_i = \mathbf{l}_i^T\boldsymbol{\beta} + \mathbf{m}_i^T\mathbf{v}_i$ 287

9.3.2 MSE Estimation 288

9.3.3 Approximations to the Posterior Variance 288

9.4 *EB Estimation of General Finite Population Parameters 289

9.4.1 BP Estimator Under a Finite Population 290

9.4.2 EB Estimation Under the Basic Unit Level Model 290

9.4.3 FGT Poverty Measures 293

9.4.4 Parametric Bootstrap for MSE Estimation 294

9.4.5 ELL Estimation 295

9.4.6 Simulation Experiments 296

9.5 Binary Data 298

9.5.1 *Case of No Covariates 299

9.5.2 Models with Covariates 304

9.6 Disease Mapping 308

9.6.1 Poisson–Gamma Model 309

9.6.2 Log-Normal Models 310

9.6.3 Extensions 312

9.7 *Design-Weighted EB Estimation: Exponential Family Models 313

9.8 Triple-Goal Estimation 315

9.8.1 Constrained EB 316

9.8.2 Histogram 318

9.8.3 Ranks 318

9.9 Empirical Linear Bayes 319

9.9.1 LB Estimation 319

9.9.2 Posterior Linearity 322

9.10 Constrained LB 324

9.11 *Software 325

9.12 Proofs 330

9.12.1 Proof of (9.2.11) 330

9.12.2 Proof of (9.2.30) 330

9.12.3 Proof of (9.8.6) 331

9.12.4 Proof of (9.9.1) 331

10 Hierarchical Bayes (HB) Method 333

10.1 Introduction 333

10.2 MCMC Methods 335

10.2.1 Markov Chain 335

10.2.2 Gibbs Sampler 336

10.2.3 M–H Within Gibbs 336

10.2.4 Posterior Quantities 337

10.2.5 Practical Issues 339

10.2.6 Model Determination 342

10.3 Basic Area Level Model 347

10.3.1 Known $\sigma_v^2$ 347

10.3.2 *Unknown $\sigma_v^2$: Numerical Integration 348

10.3.3 Unknown $\sigma_v^2$: Gibbs Sampling 351

10.3.4 *Unknown Sampling Variances $\psi_i$ 354

10.3.5 *Spatial Model 355

10.4 *Unmatched Sampling and Linking Area Level Models 356

10.5 Basic Unit Level Model 362

10.5.1 Known $\sigma_v^2$ and $\sigma_e^2$ 362

10.5.2 Unknown $\sigma_v^2$ and $\sigma_e^2$: Numerical Integration 363

10.5.3 Unknown $\sigma_v^2$ and $\sigma_e^2$: Gibbs Sampling 364

10.5.4 Pseudo-HB Estimation 365

10.6 General ANOVA Model 368

10.7 *HB Estimation of General Finite Population Parameters 369

10.7.1 HB Estimator under a Finite Population 370

10.7.2 Reparameterized Basic Unit Level Model 370

10.7.3 HB Estimator of a General Area Parameter 372

10.8 Two-Level Models 374

10.9 Time Series and Cross-Sectional Models 377

10.10 Multivariate Models 381

10.10.1 Area Level Model 381

10.10.2 Unit Level Model 382

10.11 Disease Mapping Models 383

10.11.1 Poisson-Gamma Model 383

10.11.2 Log-Normal Model 384

10.11.3 Two-Level Models 386

10.12 *Two-Part Nested Error Model 388

10.13 Binary Data 389

10.13.1 Beta-Binomial Model 389

10.13.2 Logit-Normal Model 390

10.13.3 Logistic Linear Mixed Models 393

10.14 *Missing Binary Data 397

10.15 Natural Exponential Family Models 398

10.16 Constrained HB 399

10.17 *Approximate HB Inference and Data Cloning 400

10.18 Proofs 402

10.18.1 Proof of (10.2.26) 402

10.18.2 Proof of (10.2.32) 402

10.18.3 Proof of (10.3.13)–(10.3.15) 402

References 405

Author Index 431

Subject Index 437

Series	Wiley Series in Survey Methodology
Place of publication	New York
Language	English
Dimensions	165 x 241 mm
Weight	789 g
Subject areas	Humanities ► Psychology
	Mathematics / Computer Science ► Mathematics ► Applied Mathematics
	Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics
	Social Sciences ► Sociology
ISBN-10	1-118-73578-1 / 1118735781
ISBN-13	978-1-118-73578-7 / 9781118735787
Condition	New