Synthetic Datasets for Statistical Disclosure Control
Theory and Implementation
2011
Springer-Verlag New York Inc.
978-1-4614-0325-8 (ISBN)
Gives the reader a detailed introduction to the different approaches to generating multiply imputed synthetic datasets. This book provides a brief history of synthetic datasets, and also gives useful hints on how to deal with real data problems like nonresponse, skip patterns, or logical constraints.
The aim of this book is to give the reader a detailed introduction to the different approaches to generating multiply imputed synthetic datasets. It describes all approaches that have been developed so far, provides a brief history of synthetic datasets, and gives useful hints on how to deal with real data problems like nonresponse, skip patterns, or logical constraints.
Each chapter is dedicated to one approach: it first describes the general concept, then works through a detailed application to a real dataset, with practical guidelines on how to implement the theory.
The discussed multiple imputation approaches include imputation for nonresponse, generating fully synthetic datasets, generating partially synthetic datasets, generating synthetic datasets when the original data is subject to nonresponse, and a two-stage imputation approach that helps to better address the omnipresent trade-off between analytical validity and the risk of disclosure.
The book concludes with a glimpse into the future of synthetic datasets, discussing the potential benefits and possible obstacles of the approach and ways to address the concerns of data users and their understandable discomfort with using data that doesn’t consist only of the originally collected values.
The book is intended for researchers and practitioners alike. Researchers will find the state of the art in synthetic data summarized in one volume, with full references to all relevant papers on the topic. Practitioners at statistical agencies who are considering the synthetic data approach for future data dissemination will find it a useful way to become familiar with the topic.
Jörg Drechsler is a Research Scientist at the German Institute for Employment Research, Department for Statistical Methods. His main areas of research are statistical disclosure control and imputation, with papers published in JASA, Statistica Sinica, JOS, and Survey Methodology.
Introduction.- Background on Multiply Imputed Synthetic Datasets.- Background on Multiple Imputation.- The IAB Establishment Panel.- Multiple Imputation for Nonresponse.- Fully Synthetic Datasets.- Partially Synthetic Datasets.- Multiple Imputation for Nonresponse and Statistical Disclosure Control.- A Two-Stage Imputation Procedure to Balance the Risk-Utility Trade-Off.- Chances and Obstacles for Multiply Imputed Synthetic Datasets.
Series | Lecture Notes in Statistics ; 201 |
---|---|
Additional info | 19 illustrations, black and white; XX, 138 p. |
Place of publication | New York, NY |
Language | English |
Dimensions | 155 x 235 mm |
Subject area | Mathematics / Computer Science ► Mathematics ► Statistics; Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics; Social Sciences ► Sociology ► Empirical Social Research; Economics |
Keywords | confidentiality • Disclosure • multiple imputation • Synthetic |
ISBN-10 | 1-4614-0325-1 / 1461403251 |
ISBN-13 | 978-1-4614-0325-8 / 9781461403258 |
Condition | New |