COMPSTAT 2006 - Proceedings in Computational Statistics (eBook)
XXV, 537 pages
Physica (publisher)
978-3-7908-1709-6 (ISBN)
Preface 5
Contents 8
Part I Classification and Clustering 25
Issues of robustness and high dimensionality in cluster analysis 26
1 Introduction 26
2 Multivariate t Distribution 29
3 ML Estimation of Mixtures of t Components 30
4 Factor Analysis Model for Dimension Reduction 31
5 Mixtures of Normal Factor Analyzers 32
6 Mixtures of t Factor Analyzers 34
7 Discussion 36
References 36
Fuzzy K-medoids clustering models for fuzzy multivariate time trajectories 39
1 Introduction 39
2 Fuzzy data time arrays, fuzzy multivariate time trajectories and dissimilarity measures 40
3 Fuzzy K-means clustering models for fuzzy multivariate time trajectories [CD03] 43
4 Fuzzy K-medoids clustering for fuzzy multivariate time trajectories 45
5 Application 47
References 50
Bootstrap methods for measuring classification uncertainty in latent class analysis 52
1 Introduction 52
2 Measures of classification uncertainty 54
3 The bootstrap method 55
4 Bootstrapping LC models 56
5 Applications 57
6 Discussion 60
References 61
A robust linear grouping algorithm 63
1 Introduction 63
2 Linear Grouping Algorithm 64
3 Robust Linear Grouping Algorithm 65
4 Examples 67
5 Discussion 70
References 72
Computing and using the deviance with classification trees 74
1 Introduction 74
2 Tree induction principle: an illustrative example 75
3 Validating the tree descriptive ability 77
4 Computational aspects 82
5 Conclusion 84
References 84
Estimation procedures for the false discovery rate: a systematic comparison for microarray data 86
1 Introduction 86
2 The testing problem 87
3 The false discovery rate 88
4 Estimation procedures 89
5 The data sets 92
6 Outline of the comparative study 95
7 Results and conclusions 96
Acknowledgment 98
References 98
A unifying model for biclustering 99
1 Illustrative Example 99
2 Biclustering 100
3 A Unifying Biclustering Model 101
4 Data Analysis 103
5 Concluding Remarks 104
References 105
Part II Image Analysis and Signal Processing 107
Non-rigid image registration using mutual information 108
1 Introduction 108
2 Non-rigid registration 109
3 The mutual information criterion 112
4 Non-rigid registration using mutual information 113
5 Validation 116
References 117
Musical audio analysis using sparse representations 121
1 Introduction 121
2 Finding Sparse Representations 122
3 Sparse Representations for Music Transcription 125
4 Source Separation 128
5 Conclusions 130
Acknowledgements 130
References 131
Robust correspondence recognition for computer vision 134
1 Introduction 134
2 Stability and Digraph Kernels 138
3 Properties of Strict Sub-Kernels 142
4 A Simple Algorithm for Interval Orientations 144
5 Discussion 144
References 145
Blind superresolution 147
1 Introduction 147
2 Mathematical Model 150
3 Blind Superresolution 152
4 Experiments 155
5 Conclusions 156
Acknowledgment 157
References 157
Analysis of Music Time Series 160
1 Introduction 160
2 Model building 161
3 Applied models 164
4 Studies 166
5 Conclusion 171
References 172
Part III Data Visualization 173
Tying up the loose ends in simple, multiple, joint correspondence analysis 174
1 Introduction 174
2 Basic CA theory 175
3 Multiple and joint correspondence analysis 177
4 Data sets used as illustrations 177
5 Measuring variance and comparing different tables 178
6 The myth of the influential outlier 179
7 The scaling problem in CA 180
8 To rotate or not to rotate 186
9 Statistical significance of results 189
10 Loose ends in MCA and JCA 191
Acknowledgments 194
References 194
3 dimensional parallel coordinates plot and its use for variable selection 197
1 Introduction 197
2 Parallel coordinates plot and interactive operations 198
3 3 dimensional parallel coordinates plot 199
4 Implementation of 3D PCP software 203
5 Concluding remarks 204
References 204
Geospatial distribution of alcohol-related violence in Northern Virginia 206
1 Introduction 206
2 Overview of the Model 207
3 The Data 211
4 Estimating the Probabilities 212
5 Geospatial Visualization of Acute Outcomes 213
6 Conclusions 214
Acknowledgements 215
References 216
Visualization in comparative music research 217
1 Introduction 217
2 Music representations 218
3 Musical databases 219
4 Musical feature extraction 220
5 Data mining 220
6 Examples of visualization of musical collections 222
7 Conclusion 225
References 226
Exploratory modelling analysis: visualizing the value of variables 228
1 Introduction 228
2 Example — Florida 2004 229
3 Selection — More than just Variable Selection 231
4 Graphics for Variable Selection 233
5 Small or LARGE Datasets 236
6 Summary and Outlook 236
References 237
Density estimation from streaming data using wavelets 238
1 Introduction 238
2 Recursive Formulation 242
3 Discounting Old Data 243
4 A Case Study: Internet Header Traffic Data 245
References 249
Part IV Multivariate Analysis 250
Reducing conservatism of exact small-sample methods of inference for discrete data 251
1 Introduction 251
2 Small-Sample Inference for Discrete Distributions 253
3 Ways of Reducing Conservatism 255
4 Fuzzy Inference Using Discrete Data 259
5 The Mid-P Quasi-Exact Approach 260
Acknowledgement 264
References 265
Symbolic data analysis: what is it? 267
1 Symbolic Data 267
2 Structure 270
3 Analysis: Symbolic vis-a-vis Classical Approach 272
4 Conclusion 273
References 274
A dimensional reduction method for ordinal three-way contingency table 276
1 Introduction 276
2 Decomposing a Non Symmetric Index 277
3 The Partition of a Predictability Measure 279
4 Ordinal Three-Way Non Symmetrical Correspondence Analysis 280
5 Example 284
References 287
Operator related to a data matrix: a survey 289
1 The initial choices 289
2 Joint analysis of several data matrices (the STATIS method) 293
3 Principal component analysis with respect to instrumental variables 295
4 Conclusions 298
Acknowledgements 299
References 299
Factor interval data analysis and its application 302
1 Introduction 302
2 Methodology of Interval Data and Its Possible Limitations 303
3 Methodology of Factor Interval Data and Its Advantages 307
4 Application in Chinese Stock Markets 309
5 Conclusion 315
References 315
Identifying excessively rounded or truncated data 316
1 Data 316
2 Density Models 317
3 Asymptotic Behavior 322
4 Conclusion 325
Acknowledgements 325
References 326
Statistical inference and data mining: false discoveries control 327
Introduction 327
1 Data Mining Specificities and Statistical Inference 328
2 Validation of Interesting Features 329
3 Controlling UAFWER Using the BS FD Algorithm 332
4 Experimentation 335
Conclusion and Perspectives 337
References 337
Is ‘Which model ...?’ the right question? 339
1 Introduction 339
2 Preliminaries 340
3 From choice to synthesis 342
4 Example 347
5 Conclusion 350
References 350
Use of latent class regression models with a random intercept to remove the effects of the overall response rating level 352
1 Introduction 352
2 Description of the cracker case study 353
3 The LC ordinal regression model with a random intercept 354
4 Results obtained with the cracker data set 356
5 General discussion 357
References 360
Discrete functional data analysis 362
1 Introduction 362
2 Functional Data 363
3 Difference Operators 363
4 Detection of Relations among Differences 365
5 Concluding Remarks 369
References 369
Self organizing MAPS: understanding, measuring and reducing variability 371
1 Introduction 372
2 Several Approaches Concerning the Preservation of the Topology 373
3 Understanding Variability of SOM’s Neighbourhood Structure Visualizing Distances between All Classes 375
4 The R-map Method to Increase SOM Reliability 376
5 Application: Validating the Number of Units for a SOM Network 379
6 Conclusion 381
References 382
Parameterization and estimation of path models for categorical data 383
1 Introduction 383
2 Log-linear, graphical and DAG models 384
3 DAG models as marginal models 386
4 Parameterization of DAG models 386
5 Path models 387
6 Maximum likelihood estimation 388
7 An example 390
References 394
Latent class model with two latent variables for analysis of count data 395
1 Introduction 395
2 Model 396
3 Analysis of retail market data 397
References 399
Part V Web Based Teaching 400
Challenges concerning web data mining 401
1 Motivation 401
2 Challenges Concerning Algorithmic Aspects 405
3 Conclusions and Further Research 412
References 412
e-Learning statistics – a selective review 415
1 Introduction 415
2 Modern e-Learning Materials 416
3 Evaluation 423
4 Conclusion 424
References 425
Quality assurance of web based e-Learning for statistical education 427
1 Introduction 427
2 Important Features of the e-StatEdu System 429
3 Quality Assurance 432
4 Discussion 435
Acknowledgement 435
References 435
Part VI Algorithms 437
Genetic algorithms for building double threshold generalized autoregressive conditional heteroscedastic models of time series 438
1 Introduction 439
2 The DTGARCH Model 441
3 A Genetic Algorithm for DTGARCH Model Building 442
4 Application to Financial Data 444
5 Conclusions 447
References 448
Nonparametric evaluation of matching noise 450
1 Introduction and preliminaries 450
2 Statistical framework for matching noise 451
3 Matching noise for KNN distance hot-deck 453
4 An important special case: distance hot-deck 454
5 d0-Kernel hot-deck 455
6 A comparison among different techniques 456
References 457
Subset selection algorithm based on mutual information 458
1 Introduction 458
2 Estimation of mutual information using normal mixture 460
3 Algorithm for subset selection 461
4 Numerical investigation with real data set 465
References 465
Visiting near-optimal solutions using local search algorithms 468
1 Background and motivation 468
2 Definitions and notation 469
3 The β-acceptable solution probability 471
4 Visiting a β-acceptable solution 473
5 Computational results 474
6 Conclusions 477
References 478
The convergence of optimization based GARCH estimators: theory and application 479
1 Introduction 479
2 Convergence of Optimization Based Estimators 480
3 Application to GARCH Model 483
4 Results 484
5 Conclusions 488
References 489
The stochastics of threshold accepting: analysis of an application to the uniform design problem 491
1 Introduction 491
2 Formal Framework 492
3 Results for Uniform Design Implementation 493
4 Conclusions and Outlook 498
References 498
Part VII Robustness 500
Robust classification with categorical variables 501
1 Introduction 501
2 Cluster detection through diagnostic monitoring 502
3 Performance of the method 505
4 E-government data 509
Acknowledgement 512
References 512
Multiple group linear discriminant analysis: robustness and error rate 514
1 Introduction 514
2 Estimation and Robustness 516
3 Optimal Error Rate for Three Groups 518
4 Simulations 520
5 Conclusions 524
References 524
Author Index 526
| Publication date (per publisher) | 3 December 2007 |
|---|---|
| Additional info | XXV, 537 p. |
| Place of publication | Heidelberg |
| Language | English |
| Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
| | Mathematics / Computer Science ► Computer Science ► Theory / Study |
| | Mathematics / Computer Science ► Mathematics ► Applied Mathematics |
| | Mathematics / Computer Science ► Mathematics ► Computer Programs / Computer Algebra |
| | Mathematics / Computer Science ► Mathematics ► Statistics |
| | Mathematics / Computer Science ► Mathematics ► Probability / Combinatorics |
| | Technology |
| Keywords | classification • cluster analysis • Clustering • Computational Statistics • Data Analysis • Data Mining • Estimator • Image Analysis • Latent Class Analysis • linear discriminant analysis • parametric statistics • resampling • Statistical Data Analysis • Statistical Multivariate Methods • statistical software • Statistics • Time Series • Truncated Data |
| ISBN-10 | 3-7908-1709-0 / 3790817090 |
| ISBN-13 | 978-3-7908-1709-6 / 9783790817096 |
Size: 12.4 MB
DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices but is of only limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer for this, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer for this, e.g. the free Adobe Digital Editions app.