Financial Data Analytics with Machine Learning, Optimization and Statistics

Sam Chen, Ka Chun Cheung, Phillip Yam (Authors)

Book | Hardcover

816 pages

2024
John Wiley & Sons Inc (Publisher)
978-1-119-86337-3 (ISBN)

An essential introduction to data analytics and machine learning techniques in the business sector

In Financial Data Analytics with Machine Learning, Optimization and Statistics, a team consisting of a distinguished applied mathematician and statistician, experienced actuarial professionals, and working data analysts delivers a balanced combination of traditional financial statistics, effective machine learning tools, and mathematics. The book focuses on contemporary data analytics techniques used in the financial sector and the insurance industry, emphasizes mathematical understanding and statistical principles, and connects them with common, practical financial problems. Each chapter includes derivations and proofs, especially of key results, along with several realistic examples drawn from common financial contexts. The algorithms in the book are implemented in Python and R, two of the most widely used programming languages in applied science, academia, and industry, so that readers can implement the relevant models and run the programs themselves.

This book can help readers become well-equipped with the following skills:

To evaluate financial and insurance data quality, and use the distilled knowledge obtained from the data after applying data analytic tools to make timely financial decisions
To apply effective data dimension reduction tools to enhance supervised learning
To describe and select suitable data analytic tools, as introduced above, for a given dataset, depending on whether the prediction task is classification or regression

The book covers the competencies tested by several professional examinations, such as the Predictive Analytics Exam offered by the Society of Actuaries, and the Institute and Faculty of Actuaries' Actuarial Statistics Exam.

Besides being an indispensable resource for senior undergraduate and graduate students taking courses in financial engineering, statistics, quantitative finance, risk management, actuarial science, data science, and mathematics for AI, Financial Data Analytics with Machine Learning, Optimization and Statistics also belongs in the libraries of aspiring and practicing quantitative analysts working in commercial and investment banking.

YONGZHAO CHEN (SAM) [BSC(ACTUARSC) & PHD (HKU)] is currently an Assistant Professor at the Department of Mathematics, Statistics and Insurance, The Hang Seng University of Hong Kong. His research interests include actuarial science, especially credibility theory, and data analytics.

KA CHUN CHEUNG [BSC(ACTUARSC) & PHD (HKU), ASA (SOA)] was the Director of the Actuarial Science Programme, and is currently Head and full Professor at the Department of Statistics and Actuarial Science in the School of Computing and Data Science, The University of Hong Kong. His current research interests span various topics in actuarial science, including optimal reinsurance, stochastic orders, dependence structures, and extreme value theory.

PHILLIP YAM [BSC(ACTUARSC) & MPHIL (HKU), MAST (CANTAB), DPHIL (OXON)] is currently Director of the QFRM programme and a full Professor at the Department of Statistics of The Chinese University of Hong Kong. He is also Assistant Dean (Education) of the CUHK Faculty of Science and a Visiting Professor at Columbia University and UTD Business School. He has published more than 100 articles in top journals in actuarial science, applied mathematics, data analytics, engineering, financial mathematics, operations management, and statistics. His research project CIBer won a Silver Medal at the 48th International Exhibition of Inventions Geneva in 2023.

About the Authors xvii

Foreword xix

Preface xxi

Acknowledgements xxv

Introduction 1

Development of Financial Data Analytics 1

Organization of the Book 5

References 7

Part One Data Cleansing and Analytical Models

Chapter 1 Mathematical and Statistical Preliminaries 11

1.1 Random Vector 12

1.2 Matrix Theory 16

1.3 Vectors and Matrix Norms 23

1.4 Common Probability Distributions 24

1.5 Introductory Bayesian Statistics 30

References 40

Chapter 2 Introduction to Python and R 41

2.1 What is Python? 41

2.2 What is R? 42

2.3 Package Management in Python and R 42

2.4 Basic Operations in Python and R 44

2.5 One-Way ANOVA and Tukey’s HSD for Stock Market Indices 49

References 64

Chapter 3 Statistical Diagnostics of Financial Data 67

3.1 Normality Assumption for Relative Stock Price Changes 67

3.2 Student’s t_ν-distribution for Stock Price Changes 76

3.3 Testing for Multivariate Normality 81

3.4 Sample Correlation Matrix 84

3.5 Empirical Properties of Stock Prices 86

3.A Appendix 93

References 97

Chapter 4 Financial Forensics 99

4.1 Benford’s Law 99

4.2 Scaling Invariance and Benford’s Law 101

4.3 Benford’s Law in Business Reports 104

4.4 Benford’s Law in Growth Figures 117

4.5 Zipf’s Law 125

4.6 Zipf’s Law and COVID-19 Figures 127

4.A Appendix 132

References 136

Chapter 5 Numerical Finance 139

5.1 Fundamentals of Simulation 139

5.2 Variance Reduction Technique 146

5.3 A Review of Financial Calculus and Derivative Pricing 158

*5.4 Greeks and their Approximations 179

References 199

Chapter 6 Approximation for Model Inference 201

6.1 EM Algorithm 201

6.2 MM Algorithm 216

*6.3 A Short Course on the Theory of Markov Chains 222

*6.4 Markov Chain Monte Carlo 236

*6.A Appendix 261

References 268

Chapter 7 Time-Varying Volatility Matrix and Kelly Fraction 271

7.1 Fluctuation of Volatilities 271

7.2 Exponentially Weighted Moving Average 275

7.3 ARIMA Time Series Model 277

7.4 ARCH and GARCH Models 291

*7.5 Kelly Fraction 317

7.6 Calendar Effects 330

*7.A Appendix 335

References 343

Chapter 8 Risk Measures, Extreme Values, and Copulae 345

8.1 Value-at-Risk and Expected Shortfall 345

8.2 Basel Accords and Risk Measures 348

8.3 Historical Simulation (Bootstrapping) 350

8.4 Statistical Model Building Approach 354

8.5 Use of Extreme Value Theory 356

8.6 Backtesting 359

8.7 Estimates of Expected Shortfall 364

8.8 Dependence Modelling via Copulae 369

*8.A Appendix 402

References 404

Part Two Linear Models

Chapter 9 Principal Component Analysis and Recommender Systems 409

9.1 US Zero-Coupon Rates 409

9.2 PCA Algorithm 411

9.3 Financial Interpretation of PCs for US Zero-Coupon Rates 417

9.4 PCA as an Eigenvalue Problem 421

9.5 Factor Models via PCA 422

9.6 Value-at-Risk via PCA 424

9.7 Portfolio Immunization 427

9.8 Facial Recognition via PCA 430

9.9 Non-Life Insurance via PCA 439

9.10 Investment Strategies using PCA 442

*9.11 Recommender System 447

*9.A Appendix 456

References 465

Chapter 10 Regression Learning 467

10.1 Simple and Multiple Linear Regression Models and Beyond 467

10.2 Polynomial Regression 473

10.3 Generalized Linear Models 478

10.4 Logistic Regression 484

10.5 Poisson Regression 497

10.6 Model Evaluation and Considerations in Practice 501

*10.7 Principal Component Regression 510

*10.A Appendix 518

References 522

Chapter 11 Linear Classifiers 525

11.1 Perceptron 526

11.2 Support Vector Machine 533

*11.A Appendix 545

References 567

Part Three Nonlinear Models

Chapter 12 Bayesian Learning 571

12.1 Simple Credibility Theory 571

*12.2 Bayesian Asymptotic Inference 573

12.3 Revisiting Polynomial Regression 575

12.4 Bayesian Classifiers 578

12.5 Comonotone-Independence Bayes Classifier (CIBer) 580

12.A Appendix 609

References 612

Chapter 13 Classification and Regression Trees, and Random Forests 613

13.1 Classification (Decision) Trees 613

*13.2 Concepts of Entropies 615

13.3 Information Gain 623

13.4 Other Impurity Measures for Information 626

13.5 Splitting Against Continuous Attributes 629

13.6 Overfitting in Classification Tree 630

13.7 Classification Trees in Python and R 633

13.8 Regression Trees 641

13.9 Random Forest 649

13.A Appendix 654

References 659

Chapter 14 Cluster Analysis 661

14.1 K-Means Clustering 661

14.2 K-Nearest Neighbour 694

*14.3 Kernel Regression 703

*14.A Appendix 714

References 725

Chapter 15 Applications of Deep Learning in Finance 727

15.1 Human Brains and Artificial Neurons 727

15.2 Feedforward Network 729

15.3 ANN with Linear Outputs 730

15.4 ANN with Logistic Outputs 737

15.5 Adaptive Learning Rate 740

15.6 Training Neural Networks via Backpropagation 742

15.7 Multilayer Perceptron 746

15.8 Universal Approximation Theorem 752

15.9 Long Short-Term Memory (LSTM) 754

References 764

Postlude 767

Index 769

Publication date	29 October 2024
Series	Wiley Finance
Place of publication	New York
Language	English
Dimensions	180 x 246 mm
Weight	1202 g
Subject area	Mathematics / Computer Science ► Mathematics
Subject area	Economics ► Business Administration / Management
ISBN-10	1-119-86337-6 / 1119863376
ISBN-13	978-1-119-86337-3 / 9781119863373
Condition	New