Introduction to Data Mining - Pang-Ning Tan, Michael Steinbach, Vipin Kumar

Book | Hardcover
792 pages
2005
Pearson (publisher)
978-0-321-32136-7 (ISBN)
143.30 incl. VAT
A newer edition of this title exists.
Presents fundamental concepts and algorithms for those learning data mining for the first time. Each major topic is covered in two chapters: the first introduces the basic concepts that provide the background needed to understand each data mining technique, and the second covers more advanced concepts and algorithms.

 

1 Introduction

1.1 What is Data Mining?

1.2 Motivating Challenges

1.3 The Origins of Data Mining

1.4 Data Mining Tasks

1.5 Scope and Organization of the Book 

1.6 Bibliographic Notes

1.7 Exercises

 

2 Data

2.1 Types of Data

2.2 Data Quality

2.3 Data Preprocessing

2.4 Measures of Similarity and Dissimilarity

2.5 Bibliographic Notes

2.6 Exercises

 

3 Exploring Data

3.1 The Iris Data Set 

3.2 Summary Statistics

3.3 Visualization

3.4 OLAP and Multidimensional Data Analysis

3.5 Bibliographic Notes

3.6 Exercises

 

4 Classification: Basic Concepts, Decision Trees, and Model Evaluation

4.1 Preliminaries

4.2 General Approach to Solving a Classification Problem

4.3 Decision Tree Induction

4.4 Model Overfitting

4.5 Evaluating the Performance of a Classifier

4.6 Methods for Comparing Classifiers

4.7 Bibliographic Notes

4.8 Exercises

 

5 Classification: Alternative Techniques

5.1 Rule-Based Classifier

5.2 Nearest-Neighbor Classifiers

5.3 Bayesian Classifiers

5.4 Artificial Neural Network (ANN)

5.5 Support Vector Machine (SVM)

5.6 Ensemble Methods

5.7 Class Imbalance Problem

5.8 Multiclass Problem

5.9 Bibliographic Notes

5.10 Exercises

 

6 Association Analysis: Basic Concepts and Algorithms

6.1 Problem Definition

6.2 Frequent Itemset Generation

6.3 Rule Generation

6.4 Compact Representation of Frequent Itemsets

6.5 Alternative Methods for Generating Frequent Itemsets

6.6 FP-Growth Algorithm

6.7 Evaluation of Association Patterns

6.8 Effect of Skewed Support Distribution

6.9 Bibliographic Notes

6.10 Exercises

 

7 Association Analysis: Advanced Concepts  

7.1 Handling Categorical Attributes

7.2 Handling Continuous Attributes

7.3 Handling a Concept Hierarchy

7.4 Sequential Patterns

7.5 Subgraph Patterns

7.6 Infrequent Patterns

7.7 Bibliographic Notes

7.8 Exercises

 

8 Cluster Analysis: Basic Concepts and Algorithms

8.1 Overview

8.2 K-means

8.3 Agglomerative Hierarchical Clustering

8.4 DBSCAN

8.5 Cluster Evaluation

8.6 Bibliographic Notes

8.7 Exercises

 

9 Cluster Analysis: Additional Issues and Algorithms

9.1 Characteristics of Data, Clusters, and Clustering Algorithms

9.2 Prototype-Based Clustering

9.3 Density-Based Clustering

9.4 Graph-Based Clustering

9.5 Scalable Clustering Algorithms

9.6 Which Clustering Algorithm?

9.7 Bibliographic Notes

9.8 Exercises

 

10 Anomaly Detection

10.1 Preliminaries

10.2 Statistical Approaches

10.3 Proximity-Based Outlier Detection

10.4 Density-Based Outlier Detection

10.5 Clustering-Based Techniques

10.6 Bibliographic Notes

10.7 Exercises

 

Appendix A Linear Algebra

Appendix B Dimensionality Reduction

Appendix C Probability and Statistics

Appendix D Regression

Appendix E Optimization

 

Author Index

Subject Index

Publication date (per publisher) 7 July 2005
Language English
Dimensions 199 x 238 mm
Weight 1360 g
Subject area Mathematics / Computer Science > Computer Science > Databases
ISBN-10 0-321-32136-7 / 0321321367
ISBN-13 978-0-321-32136-7 / 9780321321367
Condition New