Statistical Analysis Techniques in Particle Physics - Ilya Narsky, Frank C. Porter


Statistical Analysis Techniques in Particle Physics (eBook)

Fits, Density Estimation and Supervised Learning

Ilya Narsky, Frank C. Porter (Authors)

eBook Download: EPUB

2013 | 1st edition
459 pages
Wiley-VCH (Publisher)
978-3-527-67729-0 (ISBN)


The first book on analysis techniques in particle physics written specifically with physicists in mind, with an emphasis on machine learning techniques.
Based on lectures given by the authors at Stanford and Caltech, this practical approach uses analysis examples to show how observables are extracted from data, how signal and background are estimated, and how accurate error estimates are obtained using uni- and multivariate analysis techniques such as non-parametric density estimation, likelihood fits, neural networks, support vector machines, decision trees, and ensembles of classifiers. It includes simple code snippets that run on popular software suites such as ROOT and MATLAB; the snippets either include the code for generating data or use publicly available data that can be downloaded from the Web.
Primarily aimed at master's students and very advanced undergraduates, this text is also intended for study and research.

The authors are experts in the use of statistics in particle physics data analysis. Frank C. Porter is Professor of Physics at the California Institute of Technology and has lectured extensively at Caltech, the SLAC Laboratory at Stanford, and elsewhere. Ilya Narsky is Senior MATLAB Developer at The MathWorks, a leading developer of technical computing software for engineers and scientists, and the initiator of StatPatternRecognition, a C++ package for statistical analysis of HEP data. Together, they have taught courses for graduate students and postdocs.

1 Why We Wrote This Book and How You Should Read It
2 Parametric Likelihood Fits
2.1 Preliminaries
2.2 Parametric Likelihood Fits
2.3 Fits for Small Statistics
2.4 Results Near the Boundary of a Physical Region
2.5 Likelihood Ratio Test for Presence of Signal
2.6 sPlots
2.7 Exercises
3 Goodness of Fit
3.1 Binned Goodness of Fit Tests
3.2 Statistics Converging to Chi-Square
3.3 Univariate Unbinned Goodness of Fit Tests
3.4 Multivariate Tests
3.5 Exercises
4 Resampling Techniques
4.1 Permutation Sampling
4.2 Bootstrap
4.3 Jackknife
4.4 BCa Confidence Intervals
4.5 Cross-Validation
4.6 Resampling Weighted Observations
4.7 Exercises
5 Density Estimation
5.1 Empirical Density Estimate
5.2 Histograms
5.3 Kernel Estimation
5.4 Ideogram
5.5 Parametric vs. Nonparametric Density Estimation
5.6 Optimization
5.7 Estimating Errors
5.8 The Curse of Dimensionality
5.9 Adaptive Kernel Estimation
5.10 Naive Bayes Classification
5.11 Multivariate Kernel Estimation
5.12 Estimation Using Orthogonal Series
5.13 Using Monte Carlo Models
5.14 Unfolding
5.14.1 Unfolding: Regularization
6 Basic Concepts and Definitions of Machine Learning
6.1 Supervised, Unsupervised, and Semi-Supervised
6.2 Tall and Wide Data
6.3 Batch and Online Learning
6.4 Parallel Learning
6.5 Classification and Regression
7 Data Preprocessing
7.1 Categorical Variables
7.2 Missing Values
7.3 Outliers
7.4 Exercises
8 Linear Transformations and Dimensionality Reduction
8.1 Centering, Scaling, Reflection and Rotation
8.2 Rotation and Dimensionality Reduction
8.3 Principal Component Analysis (PCA)
of Components
8.4 Independent Component Analysis (ICA)
8.4.1 Theory
8.5 Exercises
9 Introduction to Classification
9.1 Loss Functions: Hard Labels and Soft Scores
9.2 Bias, Variance, and Noise
9.3 Training, Validating and Testing: The Optimal Splitting Rule
9.4 Resampling Techniques: Cross-Validation and Bootstrap
9.5 Data with Unbalanced Classes
9.6 Learning with Cost
9.7 Exercises
10 Assessing Classifier Performance
10.1 Classification Error and Other Measures of Predictive Power
10.2 Receiver Operating Characteristic (ROC) and Other Curves
10.3 Testing Equivalence of Two Classification Models
10.4 Comparing Several Classifiers
10.5 Exercises
11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression
11.1 Discriminant Analysis
11.2 Logistic Regression
11.3 Classification by Linear Regression
11.4 Partial Least Squares Regression
11.5 Example: Linear Models for MAGIC Telescope Data
11.6 Choosing a Linear Classifier for Your Analysis
11.7 Exercises
12 Neural Networks
12.1 Perceptrons
12.2 The Feed-Forward Neural Network
12.3 Backpropagation
12.4 Bayes Neural Networks
12.5 Genetic Algorithms
12.6 Exercises
13 Local Learning and Kernel Expansion
13.1 From Input Variables to the Feature Space
13.2 Regularization
13.3 Making and Choosing Kernels
13.4 Radial Basis Functions
13.5 Support Vector Machines (SVM)
13.6 Empirical Local Methods
13.7 Kernel Methods: The Good, the Bad and the Curse of Dimensionality
13.8 Exercises
14 Decision Trees
14.1 Growing Trees
14.2 Predicting by Decision Trees
14.3 Stopping Rules
14.4 Pruning Trees
14.5 Trees for Multiple Classes
14.6 Splits on Categorical Variables
14.7 Surrogate Splits
14.8 Missing Values
14.9 Variable Importance
14.10 Why Are Decision Trees Good (or Bad)?
14.11 Exercises
15 Ensemble Learning
15.1 Boosting
15.2 Diversifying the Weak Learner: Bagging, Random Subspace and Random Forest
15.3 Choosing an Ensemble for Your Analysis
15.4 Exercises
16 Reducing Multiclass to Binary
16.1 Encoding
16.2 Decoding
16.3 Summary: Choosing the Right Design
17 How to Choose the Right Classifier for Your Analysis and Apply It Correctly
17.1 Predictive Performance and Interpretability
17.2 Matching Classifiers and Variables
17.3 Using Classifier Predictions
17.4 Optimizing Accuracy
17.5 CPU and Memory Requirements
18 Methods for Variable Ranking and Selection
18.1 Definitions
18.2 Variable Ranking
Elimination (SBE), and

Publication date (per publisher)	24 October 2013
Language	English
Subject area	Natural Sciences ► Physics / Astronomy ► Atomic / Nuclear / Molecular Physics
Subject area	Engineering
Keywords	Applied Mathematics in Science • Data Analysis • Mathematics • Nuclear & High Energy Physics • Particle Physics • Physics • Statistical Analysis • Statistics
ISBN-10	3-527-67729-1 / 3527677291
ISBN-13	978-3-527-67729-0 / 9783527677290


EPUB (Adobe DRM)
Size: 10.5 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read it only on devices that are also registered to your Adobe ID.

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, which also makes EPUB a good fit for mobile reading devices.

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console, which in our experience frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.