Sequence Data Mining (eBook)

Guozhu Dong, Jian Pei (Authors)

eBook Download: PDF
2007
XVI, 150 pages
Springer US (Publisher)
978-0-387-69937-0 (ISBN)

Sequence Data Mining - Guozhu Dong, Jian Pei
96.29 incl. VAT
  • Available for immediate download

Understanding sequence data, and being able to use the knowledge hidden in it, will have a significant impact on many aspects of our society. Examples of sequence data include DNA, protein, customer purchase history, web surfing history, and more.

This book provides thorough coverage of the existing results on sequence data mining, as well as of pattern types and their associated pattern mining methods. It offers balanced coverage of data mining and sequence data analysis, allowing readers to access the state-of-the-art results in one place.


Foreword
Biography
Preface
Contents
Introduction
Examples and Applications of Sequence Data
Examples of Sequence Data
Examples of Sequence Mining Applications
Basic Definitions
Sequences and Sequence Types
Characteristics of Sequence Data
Sequence Patterns and Sequence Models
General Data Mining Processes and Research Issues
Overview of the Book
Frequent and Closed Sequence Patterns
Sequential Patterns
GSP: An Apriori-like Method
PrefixSpan: A Pattern-growth, Depth-first Search Method
Apriori-like, Breadth-first Search versus Pattern-growth, Depth-first Search
PrefixSpan
Pseudo-Projection
Mining Sequential Patterns with Constraints
Categories of Constraints
Mining Sequential Patterns with Prefix-Monotone Constraints
Prefix-Monotone Property
Pushing Prefix-Monotone Constraints into Sequential Pattern Mining
Handling Tough Aggregate Constraints by Prefix-growth
Mining Closed Sequential Patterns
Closed Sequential Patterns
Efficiently Mining Closed Sequential Patterns
Summary
Classification, Clustering, Features and Distances of Sequence Data
Three Tasks on Sequence Classification/Clustering
Sequence Features
Sequence Feature Types
Sequence Feature Selection
Distance Functions over Sequences
Overview on Sequence Distance Functions
Edit, Hamming, and Alignment based Distances
Conditional Probability Distribution based Distance
An Example of Feature based Distance: d2
Web Session Similarity
Classification of Sequence Data
Support Vector Machines
Artificial Neural Networks
Other Methods
Evaluation of Classifiers and Classification Algorithms
Clustering Sequence Data
Popular Sequence Clustering Approaches
Quality Evaluation of Clustering Results
Sequence Motifs: Identifying and Characterizing Sequence Families
Motivations and Problems
Motivations
Four Motif Analysis Problems
Motif Representations
Consensus Sequence
Position Weight Matrix (PWM)
Markov Chain Model
Hidden Markov Model (HMM)
Representative Algorithms for Motif Problems
Dynamic Programming for Sequence Scoring and Explanation with HMM
Gibbs Sampling for Constructing PWM-based Motif
Expectation Maximization for Building HMM
Discussion
Mining Partial Orders from Sequences
Mining Frequent Closed Partial Orders
Problem Definition
How Is Frequent Closed Partial Order Mining Different from Other Data Mining Tasks?
TranClose: A Rudimentary Method
Algorithm Frecpo
Applications
Mining Global Partial Orders
Motivation and Preliminaries
Mining Algorithms
Mixture Models
Summary
Distinguishing Sequence Patterns
Categories of Distinguishing Sequence Patterns
Class-Characteristics Distinguishing Sequence Patterns
Definitions and Terminology
The ConSGapMiner Algorithm
Extending ConSGapMiner: Minimum Gap Constraints
Extending ConSGapMiner: Coverage and Prefix-Based Pattern Minimization
Surprising Sequence Patterns
Related Topics
Structured-Data Mining
Partial Periodic Pattern Mining
Bioinformatics
Sequence Alignment
Biological Sequence Databases and Biological Data Analysis Resources
References
Index

Publication date (per publisher) 31 October 2007
Series Advances in Database Systems
Additional info XVI, 150 p.
Place of publication New York
Language English
Subject areas Computer Science > Databases > Data Warehouse / Data Mining
Mathematics / Computer Science > Computer Science > Networks
Computer Science > Theory / Studies > Artificial Intelligence / Robotics
Natural Sciences > Biology
Keywords Bioengineering • Bioinformatics • Data Analysis • Data Mining • Genome • genomics • pattern mining • pattern types • sequence patterns • Web Services
ISBN-10 0-387-69937-6 / 0387699376
ISBN-13 978-0-387-69937-0 / 9780387699370
PDF (watermark)
Size: 1.9 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
