Mathematical Tools for Data Mining (eBook)

Set Theory, Partial Orders, Combinatorics
eBook Download: PDF
2008
XII, 615 pages
Springer London (publisher)
978-1-84800-201-2 (ISBN)

Mathematical Tools for Data Mining - Dan A. Simovici, Chaabane Djeraba
149.79 incl. VAT
  • Download available immediately
This volume was born from the experience of the authors as researchers and educators, which suggests that many students of data mining are handicapped in their research by the lack of a formal, systematic education in its mathematics. The data mining literature contains many excellent titles that address the needs of users with a variety of interests ranging from decision making to pattern investigation in biological data. However, these books do not deal with the mathematical tools that are currently needed by data mining researchers and doctoral students. We felt it timely to produce a book that integrates the mathematics of data mining with its applications. We emphasize that this book is about mathematical tools for data mining and not about data mining itself; despite this, a substantial amount of applications of mathematical concepts in data mining are presented. The book is intended as a reference for the working data miner.

In our opinion, three areas of mathematics are vital for data mining: set theory, including partially ordered sets and combinatorics; linear algebra, with its many applications in principal component analysis and neural networks; and probability theory, which plays a foundational role in statistics, machine learning and data mining. This volume is dedicated to the study of set-theoretical foundations of data mining. Two further volumes are contemplated that will cover linear algebra and probability theory.

The first part of this book, dedicated to set theory, begins with a study of functions and relations. Applications of these fundamental concepts to such issues as equivalences and partitions are discussed. Also, we prepare the ground for the following volumes by discussing indicator functions, fields and σ-fields, and other concepts.

Preface 5
Contents 7
Part I Set Theory 14
1 Sets, Relations, and Functions 15
1.1 Introduction 15
1.2 Sets and Collections 15
1.3 Relations and Functions 21
1.4 The Axiom of Choice 46
1.5 Countable Sets 47
1.6 Elementary Combinatorics 50
1.7 Multisets 56
1.8 Relational Databases 58
Exercises and Supplements 61
Bibliographical Comments 67
2 Algebras 69
2.1 Introduction 69
2.2 Operations and Algebras 69
2.3 Morphisms, Congruences, and Subalgebras 73
2.4 Linear Spaces 76
2.5 Matrices 80
Exercises and Supplements 86
Bibliographical Comments 89
3 Graphs and Hypergraphs 91
3.1 Introduction 91
3.2 Basic Notions of Graph Theory 91
3.3 Trees 104
3.4 Flows in Digraphs 123
3.5 Hypergraphs 130
Exercises and Supplements 133
Bibliographical Comments 136
Part II Partial Orders 139
4 Partially Ordered Sets 141
4.1 Introduction 141
4.2 Partial Orders 141
4.3 Special Elements of Partially Ordered Sets 145
4.4 The Poset of Real Numbers 149
4.5 Closure and Interior Systems 151
4.6 The Poset of Partitions of a Set 156
4.7 Chains and Antichains 160
4.8 Poset Product 167
4.9 Functions and Posets 170
4.10 Posets and the Axiom of Choice 172
4.11 Locally Finite Posets and Möbius Functions 174
Exercises and Supplements 180
Bibliographical Comments 184
5 Lattices and Boolean Algebras 185
5.1 Introduction 185
5.2 Lattices as Partially Ordered Sets and Algebras 185
5.3 Special Classes of Lattices 192
5.4 Complete Lattices 200
5.5 Boolean Algebras and Boolean Functions 204
5.6 Logical Data Analysis 223
Exercises and Supplements 231
Bibliographical Comments 236
6 Topologies and Measures 237
6.1 Introduction 237
6.2 Topologies 237
6.3 Closure and Interior Operators in Topological Spaces 238
6.4 Bases 247
6.5 Compactness 251
6.6 Continuous Functions 253
6.7 Connected Topological Spaces 256
6.8 Separation Hierarchy of Topological Spaces 259
6.9 Products of Topological Spaces 261
6.10 Fields of Sets 263
6.11 Measures 268
Exercises and Supplements 277
Bibliographical Comments 284
7 Frequent Item Sets and Association Rules 285
7.1 Introduction 285
7.2 Frequent Item Sets 285
7.3 Borders of Collections of Sets 291
7.4 Association Rules 293
7.5 Levelwise Algorithms and Posets 295
7.6 Lattices and Frequent Item Sets 300
Exercises and Supplements 302
Bibliographical Comments 304
8 Applications to Databases and Data Mining 307
8.1 Introduction 307
8.2 Tables and Indiscernibility Relations 307
8.3 Partitions and Functional Dependencies 310
8.4 Partition Entropy 317
8.5 Generalized Measures and Data Mining 333
8.6 Differential Constraints 337
Exercises and Supplements 342
Bibliographical Comments 344
9 Rough Sets 345
9.1 Introduction 345
9.2 Approximation Spaces 345
9.3 Decision Systems and Decision Trees 349
9.4 Closure Operators and Rough Sets 357
Exercises and Supplements 359
Bibliographical Comments 360
Part III Metric Spaces 361
10 Dissimilarities, Metrics, and Ultrametrics 363
10.1 Introduction 363
10.2 Classes of Dissimilarities 363
10.3 Tree Metrics 369
10.4 Ultrametric Spaces 378
10.5 Metrics on R^n 389
10.6 Metrics on Collections of Sets 400
10.7 Metrics on Partitions 406
10.8 Metrics on Sequences 410
10.9 Searches in Metric Spaces 414
Exercises and Supplements 423
Bibliographical Comments 433
11 Topologies and Measures on Metric Spaces 435
11.1 Introduction 435
11.2 Metric Space Topologies 435
11.3 Continuous Functions in Metric Spaces 438
11.4 Separation Properties of Metric Spaces 439
11.5 Sequences in Metric Spaces 447
11.6 Completeness of Metric Spaces 451
11.7 Contractions and Fixed Points 457
11.8 Measures in Metric Spaces 461
11.9 Embeddings of Metric Spaces 464
Exercises and Supplements 466
Bibliographical Comments 470
12 Dimensions of Metric Spaces 471
12.1 Introduction 471
12.2 The Dimensionality Curse 471
12.3 Inductive Dimensions of Topological Metric Spaces 474
12.4 The Covering Dimension 484
12.5 The Cantor Set 487
12.6 The Box-Counting Dimension 491
12.7 The Hausdorff-Besicovitch Dimension 494
12.8 Similarity Dimension 498
Exercises and Supplements 502
Bibliographical Comments 505
13 Clustering 507
13.1 Introduction 507
13.2 Hierarchical Clustering 508
13.3 The k-Means Algorithm 524
13.4 The PAM Algorithm 526
13.5 Limitations of Clustering 528
13.6 Clustering Quality 532
Exercises and Supplements 535
Bibliographical Comments 537
Part IV Combinatorics 539
14 Combinatorics 541
14.1 Introduction 541
14.2 The Inclusion-Exclusion Principle 541
14.3 Ramsey’s Theorem 545
14.4 Combinatorics of Partitions 548
14.5 Combinatorics of Collections of Sets 551
Exercises and Supplements 556
Bibliographical Comments 561
15 The Vapnik-Chervonenkis Dimension 563
15.1 Introduction 563
15.2 The Vapnik-Chervonenkis Dimension 563
15.3 Perceptrons 575
Exercises and Supplements 577
Bibliographical Comments 579
Part V Appendices 581
A Asymptotics 583
B Convex Sets and Functions 585
C Useful Integrals and Formulas 595
C.1 Euler’s Integrals 595
C.2 Wallis’s Formula 599
C.3 Stirling’s Formula 600
C.4 The Volume of an n-Dimensional Sphere 602
D A Characterization of a Function 605
References 609
Topic Index 617

Publication date (per publisher) 15.8.2008
Series Advanced Information and Knowledge Processing
Additional info XII, 615 p.
Place of publication London
Language English
Subject areas Computer Science > Databases > Data Warehouse / Data Mining
Mathematics / Computer Science > Computer Science > Theory / Studies
Mathematics / Computer Science > Mathematics > Applied Mathematics
Engineering
Keywords Algebra • Boolean algebra • Cluster • Clustering • Collection • combinatorics • Covering dimension • Database • Databases • Data Mining • Dendrogram • Entropy • Ferrers diagram • flow • Frequent item set • functional dependency • Galois connections • Generalized entropy • Generalized measure • Graph • Hausdorff-Besicovitch dimension • Inclusion-exclusion principle • Inductive topological dimension • lattice • Levelwise algorithms • measure • metric • Möbius function • Operation • Outer measure • Perceptron • Sets • Similarity dimension • Sperner systems • Submodular function • Supramodular function • Topology • Tree • Tree metric • Ultrametric • Vapnik-Chervonenkis dimension
ISBN-10 1-84800-201-7 / 1848002017
ISBN-13 978-1-84800-201-2 / 9781848002012
PDF (watermark)
Size: 13.1 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thus personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices but is of only limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read with (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
