Introduction to Data Mining and its Applications - S. Sumathi, S. N. Sivanandam

Introduction to Data Mining and its Applications (eBook)

eBook Download: PDF
2006 | 1st edition
851 pages
Springer-Verlag
978-3-540-34351-6 (ISBN)
123,95 incl. VAT

This book explores the concepts of data mining and data warehousing, a promising and flourishing frontier in database systems and new database applications. It is designed to give a broad yet in-depth overview of the field of data mining.



Data mining is a multidisciplinary field, drawing on work from areas including database technology, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, knowledge acquisition, information retrieval, high-performance computing, and data visualization.



This book is intended for a wide audience of readers who are not necessarily experts in data warehousing and data mining but are interested in a general introduction to these areas and their many practical applications. Data mining technology has become a hot topic among students and decision makers alike because it can provide valuable hidden business and scientific intelligence from large amounts of historical data.



It is also written for technical managers and executives as well as for technologists interested in learning about data mining.

Contents 6
1 Introduction to Data Mining Principles 24
1.1 Data Mining and Knowledge Discovery 25
1.2 Data Warehousing and Data Mining - Overview 28
1.3 Summary 43
1.4 Review Questions 43
2 Data Warehousing, Data Mining, and OLAP 44
2.1 Data Mining Research Opportunities and Challenges 46
2.2 Evolving Data Mining into Solutions for Insights 58
2.3 Knowledge Extraction Through Data Mining 60
2.4 Data Warehousing and OLAP 80
2.5 Data Mining and OLAP 84
2.6 Summary 95
2.7 Review Questions 95
3 Data Marts and Data Warehouse: Information Architecture for the Millennium 98
3.1 Data Marts, Data Warehouse, and OLAP 100
3.2 Data Warehousing for Healthcare: The Greatest Weapon in your Competitive Arsenal 130
3.3 Data Warehousing in the Telecommunications Industry 135
3.4 The Telecommunications Lifecycle 145
3.5 Security Issues in Data Warehouse 152
3.6 Data Warehousing: To Buy or To Build a Fundamental Choice for Insurers 163
3.7 Summary 171
3.8 Review Questions 172
4 Evolution and Scaling of Data Mining Algorithms 174
4.1 Data-Driven Evolution of Data Mining Algorithms 175
4.2 Scaling Mining Algorithms to Large Databases 180
4.3 Summary 186
4.4 Review Questions 187
5 Emerging Trends and Applications of Data Mining 188
5.1 Emerging Trends in Business Analytics 189
5.2 Business Applications of Data Mining 193
5.3 Emerging Scientific Applications in Data Mining 200
5.4 Summary 205
5.5 Review Questions 206
6 Data Mining Trends and Knowledge Discovery 208
6.1 Getting a Handle on the Problem 209
6.2 KDD and Data Mining: Background 210
6.3 Related Fields 214
6.4 Summary 217
6.5 Review Questions 217
7 Data Mining Tasks, Techniques, and Applications 218
7.1 Reality Check for Data Mining 219
7.2 Data Mining: Tasks, Techniques, and Applications 227
7.3 Summary 238
7.4 Review Questions 239
8 Data Mining: an Introduction – Case Study 240
8.1 The Data Flood 241
8.2 Data Holds Knowledge 241
8.3 Data Mining: A New Approach to Information Overload 242
8.4 Summary 252
8.5 Review Questions 252
9 Data Mining & KDD
9.1 Data Mining and KDD – Overview 255
9.2 Data Mining: The Two Cultures 261
9.3 Summary 264
9.4 Review Questions 264
10 Statistical Themes and Lessons for Data Mining 266
10.1 Data Mining and Official Statistics 267
10.2 Statistical Themes and Lessons for Data Mining 269
10.3 Summary 285
10.4 Review Questions 286
11 Theoretical Frameworks for Data Mining 288
11.1 Two Simple Approaches 289
11.2 Microeconomic View of Data Mining 291
11.3 Inductive Databases 292
11.4 Summary 293
11.5 Review Questions 293
12 Major and Privacy Issues in Data Mining and Knowledge Discovery 294
12.1 Major Issues in Data Mining 295
12.2 Privacy Issues in Knowledge Discovery and Data Mining 298
12.3 Some Privacy Issues in Knowledge Discovery: The OECD Personal Privacy Guidelines 306
12.4 Summary 313
12.5 Review Questions 314
13 Active Data Mining 316
13.1 Shape Definitions 318
13.2 Queries 320
13.3 Triggers 322
13.4 Summary 325
13.5 Review Questions 325
14 Decomposition in Data Mining - A Case Study 326
14.1 Decomposition in the Literature 327
14.2 Typology of Decomposition in Data Mining 328
14.3 Hybrid Models 329
14.4 Knowledge Structuring 332
14.5 Rule-Structuring Model 333
14.6 Decision Tables, Maps, and Atlases 334
14.7 Summary 335
14.8 Review Questions 336
15 Data Mining System Products and Research Prototypes 338
15.1 How to Choose a Data Mining System 339
15.2 Examples of Commercial Data Mining Systems 341
15.3 Summary 342
15.4 Review Questions 343
16 Data Mining in Customer Value and Customer Relationship Management 344
16.1 Data Mining: A Concept of Customer Relationship Marketing 345
16.2 Introduction to Customer Acquisition 351
16.3 Customer Relationship Management (CRM) 358
16.4 Data Mining and Customer Value and Relationships 371
16.5 CRM: Technologies and Applications 379
16.6 Data Management in Analytical Customer Relationship Management 392
16.7 Summary 408
16.8 Review Questions 408
17 Data Mining in Business 410
17.1 Business Focus on Data Engineering 411
17.2 Data Mining for Business Problems 413
17.3 Data Mining and Business Intelligence 419
17.4 Data Mining in Business - Case Studies 422
18 Data Mining in Sales Marketing and Finance 434
18.1 Data Mining can Bring Pinpoint Accuracy to Sales 436
18.2 From Data Mining to Database Marketing 437
18.3 Data Mining for Marketing Decisions 442
18.4 Increasing Customer Value by Integrating Data Mining and Campaign Management Software 448
18.5 Completing a Solution for Market-Basket Analysis – Case Study 454
18.6 Data Mining in Finance 458
18.7 Data Mining for Financial Data Analysis 459
18.8 Summary 460
18.9 Review Questions 461
19 Banking and Commercial Applications 462
19.1 Bringing Data Mining to the Forefront of Business Intelligence in Wholesale Banking 464
19.2 Distributed Data Mining Through a Centralized Solution – A Case Study 465
19.3 Data Mining in Commercial Applications 467
19.4 Decision Support Systems – Case Study 469
19.5 Keys to the Commercial Success of Data Mining – Case Studies 475
19.6 Data Mining Supports E-Commerce 481
19.7 Data Mining for the Retail Industry 485
19.8 Business Intelligence and Retailing 486
19.9 Summary 494
19.10 Review Questions 495
20 Data Mining for Insurance 496
20.1 Insurance Underwriting: Data Mining as an Underwriting Decision Support Systems 497
20.2 Business Intelligence and Insurance – Application of Business Intelligence Tools like Data Warehousing, OLAP and Data Mining in Insurance 510
20.3 Summary 520
20.4 Review Questions 521
21 Data Mining in Biomedicine and Science 522
21.1 Applications in Medicine 524
21.2 Data Mining for Biomedical and DNA Data Analysis 525
21.3 An Unsupervised Neural Network Approach to Medical Data Mining Techniques: Case Study 527
21.4 Data Mining – Assisted Decision Support for Fever Diagnosis – Case Study 538
21.5 Data Mining and Science 543
21.6 Knowledge Discovery in Science as Opposed to Business – Case Study 545
21.7 Data Mining in a Scientific Environment 552
21.8 Flexible Earth Science Data Mining System Architecture 557
21.9 Summary 565
21.10 Review Questions 566
22 Text and Web Mining 568
22.1 Data Mining and the Web 570
22.2 An Overview on Web Mining 572
22.3 Text Mining 581
22.4 Discovering Web Access Patterns and Trends 586
22.5 Web Usage Mining on Proxy Servers: A Case Study 595
22.6 Text Data Mining in Biomedical Literature 604
Approach – Case Study 604
22.7 Related Work 608
22.8 Summary 611
22.9 Review Questions 612
23 Data Mining in Information Analysis and Delivery 614
23.1 Information Analysis: Overview 615
23.2 Intelligent Information Delivery – Case Study 618
23.3 A Characterization of Data Mining Technologies and Processes – Case Study 622
23.4 Summary 635
23.5 Review Questions 636
24 Data Mining in Telecommunications and Control 638
24.1 Data Mining for the Telecommunication Industry 639
24.2 Data Mining Focus Areas in Telecommunication 641
24.3 A Learning System for Decision Support in Telecommunications – Case Study 644
24.4 Knowledge Processing in Control Systems 646
24.5 Data Mining for Maintenance of Complex Systems – A Case Study 649
24.6 Summary 650
24.7 Review Questions 650
25 Data Mining in Security 652
25.1 Data Mining in Security Systems 653
25.2 Real Time Data Mining-Based Intrusion Detection Systems – Case Study 654
25.3 Summary 669
25.4 Review Questions 671
APPENDIX-I Data Mining Research Projects 672
A.1 National University of Singapore: Data Mining Research Projects 672
A.2 HP Labs Research: Software Technology Laboratory 681
A.3 CRISP-DM: An Overview 684
A.4 Data Mining Suite™ 686
A.5 The Quest Data Mining System, IBM Almaden Research Center, CA, USA 692
A.6 The Australian National University Research Projects 699
A.7 Data Mining Research Group, Monash University Australia 705
A.8 Current Projects, University of Alabama in Huntsville, AL 711
A.9 Kensington Approach Toward Enterprise Data Mining 719
APPENDIX-II Data Mining Standards 722
II.1 Data Mining Standards 723
II.2 Developing Data Mining Application Using Data Mining Standards 742
II.3 Analysis 745
II.4 Application Examples 746
II.5 Conclusion 753
Appendix 3A Intelligent Miner 754
3A.1 Data Mining Process 754
3A.2 Interpreting the Results 756
3A.3 Overview of the Intelligent Miner Components 757
3A.4 Running Intelligent Miner Servers 757
3A.5 How the Intelligent Miner Creates Output Data 759
3A.6 Performing Common Tasks 760
3A.7 Understanding Basic Concepts 761
3A.8 Main Window Areas 761
3A.9 Conclusion 763
Appendix 3B Clementine 764
3B.1 Key Findings 764
3B.2 Background Information 765
3B.3 Product Availability 766
3B.4 Software Description 767
3B.5 Architecture 768
3B.6 Methodology 769
3B.7 Clementine Server 776
3B.8 How Clementine Server Improves Performance on Large Datasets 777
3B.9 Conclusion 781
Appendix 3C Crisp 784
3C.1 Hierarchical Breakdown 784
3C.2 Mapping Generic Models to Specialized Models 785
3C.3 The CRISP-DM Reference Model 786
3C.4 Data Understanding 792
3C.5 Data Preparation 794
3C.6 Modeling 797
3C.7 Evaluation 799
3C.8 Conclusion 800
Appendix 3D Mineset 802
3D.1 Introduction 802
3D.2 Architecture 802
3D.3 MineSet Tools for Data Mining Tasks 803
3D.4 About the Raw Data 804
3D.5 Analytical Algorithms 804
3D.6 Visualization 805
3D.7 KDD Process Management 806
3D.8 History 807
3D.9 Commercial Uses 808
3D.10 Conclusion 809
Appendix 3E Enterprise Miner 810
3E.1 Tools For Data Mining Process 810
3E.2 Why Enterprise Miner 811
3E.3 Product Overview 812
3E.4 SAS Enterprise Miner 5.2 Key Features 813
3E.5 Enterprise Miner Software 816
3E.6 Enterprise Miner Process for Data Mining 819
3E.7 Client/Server Capabilities 819
3E.8 Client/Server Requirements 819
3E.9 Conclusion 820
References 822

2 Data Warehousing, Data Mining, and OLAP (p. 21)

Objectives:

• This chapter deals with the concept of data mining, its need and opportunities, trends and challenges, the data mining process, common and new applications of data mining, data warehousing, and OLAP concepts.

• It gives an introduction to data mining: what it is, why it is important, and how it can be used to provide increased understanding of critical relationships in rapidly expanding corporate data warehouses.

• Data mining and knowledge discovery are emerging as a new discipline with important applications in science, engineering, health care, education, and business.

• New disciplined approaches to data warehousing and mining are emerging as part of the vertical solutions approach.

• Information and knowledge are extracted in the form of new relationships, patterns, or clusters for decision-making purposes.

• We briefly describe some success stories involving data mining and knowledge discovery.

• We describe five external trends that promise to have a fundamental impact on data mining.

• The research challenges are divided into five broad areas: A) improving the scalability of data mining algorithms, B) mining nonvector data, C) mining distributed data, D) improving the ease of use of the data mining systems and environments, and E) privacy and security issues for data mining.

• We present the concept of data mining and aim to provide an understanding of the overall process and the tools involved: how the process unfolds, what can be done with it, what the main techniques behind it are, and what the operational aspects are.

• OLAP servers logically organize data in multiple dimensions, which allows users to quickly and easily analyze complex data relationships.

• OLAP database servers support common analytical operations, including consolidation, drill-down, and slicing and dicing.

• OLAP servers are very efficient when storing and processing multidimensional data.
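
The analytical operations named in the objectives above (consolidation, drill-down, and slicing and dicing) can be illustrated with a minimal Python sketch. It is not taken from the book: the fact table, the dimension names, and the helper functions `aggregate`, `slice_`, and `dice` are all hypothetical, and a real OLAP server would precompute and index such aggregates rather than scan rows.

```python
from collections import defaultdict

# Hypothetical fact table: (year, quarter, region, product, sales).
# All values are made up for illustration.
FACTS = [
    (2005, "Q1", "North", "Phone", 120),
    (2005, "Q1", "South", "Phone", 80),
    (2005, "Q2", "North", "Modem", 50),
    (2005, "Q2", "South", "Phone", 70),
    (2006, "Q1", "North", "Modem", 90),
    (2006, "Q1", "South", "Modem", 60),
]
DIMENSIONS = ("year", "quarter", "region", "product")


def aggregate(facts, by):
    """Sum the sales measure grouped by the named dimensions."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]
    return dict(totals)


def dice(facts, **allowed):
    """Keep rows whose value on every named dimension is in the allowed set."""
    return [
        row for row in facts
        if all(row[DIMENSIONS.index(name)] in values
               for name, values in allowed.items())
    ]


def slice_(facts, dimension, value):
    """Fix a single dimension to one value (a dice with one condition)."""
    return dice(facts, **{dimension: {value}})


# Consolidation (roll-up): collapse everything except the year.
by_year = aggregate(FACTS, ["year"])

# Drill-down: break the yearly totals into quarters.
by_year_quarter = aggregate(FACTS, ["year", "quarter"])

# Slice: fix region = "North", then summarize by product.
north_only = slice_(FACTS, "region", "North")
north_by_product = aggregate(north_only, ["product"])

# Dice: restrict two dimensions at once.
phones_2005 = dice(FACTS, year={2005}, product={"Phone"})

if __name__ == "__main__":
    print(by_year)
    print(by_year_quarter)
    print(north_by_product)
    print(sum(row[-1] for row in phones_2005))
```

Drill-down here is simply the same aggregation over a finer set of dimensions, which is why consolidation and drill-down share one helper in this sketch.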

Abstract.

This chapter deals with the concept of data mining, its need and opportunities, trends and challenges, process, common and new applications, data warehousing, and OLAP concepts. Data mining is also a promising computational paradigm that enhances traditional approaches to discovery and increases the opportunities for breakthroughs in the understanding of complex physical and biological systems.

Researchers from many intellectual communities have much to contribute to this field. Data mining refers to the act of extracting patterns or models from data. The growth rate of disk storage and the gap between the growth trends of Moore’s law and storage law represent a very interesting pattern in the state of technology evolution. The ability to capture and store data has produced a phenomenon we call data tombs: data stores that are effectively write-only.

"Data Mining" (DM) is a folkloric denomination of a complex activity, which aims at extracting synthesized and previously unknown information from large databases. It also denotes a multidisciplinary field of research and development of algorithms and software environments to support this activity in the context of real-life problems where often huge amounts of data are available for mining.

Publication date (per publisher): 1 January 2006
Language: English
Subject area: Computer Science > Databases > Data Warehouse / Data Mining
Technology
ISBN-10 3-540-34351-2 / 3540343512
ISBN-13 978-3-540-34351-6 / 9783540343516
PDF (watermark)
Size: 6.6 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device but is of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Read online
In addition to downloading it, you can also read this eBook online in your web browser.

