Enabling Real-Time Business Intelligence (eBook)
181 pages
Springer-Verlag
978-3-642-14559-9 (ISBN)
Preface 5
Organization 6
Table of Contents 7
Queries over Unstructured Data: Probabilistic Methods to the Rescue (Keynote) 8
Unstructured Data in Enterprises 8
Probabilistic Models for Information Extraction 10
Representing Noisy Extractions as Imprecise Databases 11
Multi-attribute Extractions 13
Imprecise Data Models for Representing Uncertainty of De-duplication 15
Probability of Two Records Being Duplicates 15
Probability over Entity Groupings 15
Queries over Imprecise Duplicates 16
Concluding Remarks 18
References 19
Federated Stream Processing Support for Real-Time Business Intelligence Applications 21
Introduction 21
Related Work 22
The MaxStream Federated Stream Processing System 24
Architecture 26
Two Key Building Blocks 28
Hybrid Queries: Using Persistence with Streams 30
Using MaxStream in Real-Time BI Scenarios 32
Reducing Latency in Event-Driven Business Intelligence 32
Persistent Events in Supply-Chain Monitoring 33
Other Real-Time BI Applications 34
Feasibility Study 34
Conclusions and Future Directions 36
References 37
VPipe: Virtual Pipelining for Scheduling of DAG Stream Query Plans 39
Introduction 39
Preliminaries 42
Review of the Chain Scheduling 42
Problem Definition 43
The VPipe Execution Scheme 44
Change of Operator Logic 45
Discussion 47
Stochastic Analysis of Chain 47
System Model Basis 48
Case 1: System Analysis for SOS Synchronization 48
Case 2: System Analysis for IDS Synchronization 50
Performance Study 53
Experiment 1: Response Time Comparison 53
Experiment 2: Broken Pipeline Probability 54
Related Work 54
Conclusion 55
References 55
Ad-Hoc Queries over Document Collections – A Case Study 57
Introduction 57
Query Planning and Query Plan Execution 59
Understanding “Human-Powered” Query Execution Strategies 59
Elementary Plan Operators 60
The Coverage-Join (CJ) and Density-Join (DJ) Operator 64
Example Query and Example Plans 64
Plan Enumeration 65
Case Study 66
Heuristics for Plan Selection 66
Results and Discussion 67
Related Work 69
Summary and Future Work 70
References 71
Appendix: Implementing the KEYWORD-Operator 72
ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP 73
Introduction 73
Grouping Analysis: A Retrospective 74
Group by 75
Cubes 75
Grouping Variables and the MD-Join 76
Windows 76
MapReduce 77
Associated Sets (ASSET) Queries 77
Definitions 77
SQL Syntax 78
DataMingler: A Spreadsheet-Like GUI 79
ASSET Queries and Data Streams (COSTES) 80
Financial Application Motivating Examples 81
COSTES: Continuous Spreadsheet-Like Computations 83
ASSET Queries and Persistent Data Sources (ASSET QE) 84
Social Networks: A Motivating Example 84
ASSET Query Engine (QE) 86
Conclusions and Future Work 88
References 89
Evaluation of Load Scheduling Strategies for Real-Time Data Warehouse Environments 91
Introduction 91
System Model and Problem Statement 93
System Architecture 93
Workload Model 94
Scheduling Performance Objective 95
Problem Statement 96
Scheduling Policies 97
Scheduling Algorithms for Push-Based Update Propagation 97
Evaluation and Discussion 98
Simulation Framework 98
Effect of the Data Production Process Length 99
Comparison of Local and Global Scheduling 100
Effects of Stage-Concurrent and Long-Running Updates 101
Ratio of Stage-Concurrent Updates 102
Pruning of Irretrievable Queries 103
Effects of Long-Running Update and Queries during Runtime 103
Related Work 104
Conclusion 105
References 106
Near Real-Time Data Warehousing Using State-of-the-Art ETL Tools 107
Near Real-Time Data Warehousing 107
Related Work 108
Data Warehouse Refreshment Anomalies 110
Properties of Operational Data Sources 114
Preventing Refreshment Anomalies 115
Preventing a Change Data Mismatch 116
Making Change Propagation Anomaly-Proof 119
Conclusion 123
References 123
Addressing BI Transactional Flows in the Real-Time Enterprise Using GoldenGate TDM (Industrial Paper) 125
Background 125
Operational Data, and the Real-Time Enterprise 126
Transactional Data 126
Real-Time 127
Heterogeneous Systems and Interoperability 129
Transactional Consistency 130
Emerging Trends and Problems 130
Amount of Data 130
Adoption of Newer Datatypes 131
Growing Number of Users 131
Changing Nature of Applications 131
Micro-batching 132
Real-Time Data Acquisition 132
ETL and Real-Time Challenges 132
ESB 133
Change Data Capture (CDC) 133
GoldenGate TDM Platform 136
GoldenGate Architecture, Key Components 137
Key Architectural Features and Benefits 141
Use Cases 143
Example Customer Case Studies with Business Challenges, Real-Time Solutions 144
Challenges 146
Conclusion 147
References 148
Near Real-Time Call Detail Record ETL Flows (Industrial Paper) 149
Introduction 149
MVNO Background 150
Problem Statement 151
Our Solution 153
Transformation Rules 154
ETL Flows 155
MVNO CDR Flows 157
Arroyo 159
Related Work 160
Conclusions 160
References 161
Comparing Global Optimization and Default Settings of Stream-Based Joins (Experimental Paper) 162
Introduction 162
Meshjoin 164
Basic Operation 164
Architecture 165
Algorithm 165
Problem Definition 166
Tuning and Performance Comparisons 168
Proposed Investigation 168
Experimental Setup 169
Tuning of Disk-Buffer for Different Memory Budgets 170
Performance Analysis Using Default and Optimum Values for the Disk-Buffer Size 170
Cost Validation 172
Approach for Choosing the Default Value 173
Related Work 174
Conclusions and Future Work 175
References 175
Merging OLTP and OLAP – Back to the Future (Panel) 178
Introduction 178
Panelists 179
Summary 180
Author Index 181
Publication date (per publisher) | 1.1.2010 |
---|---|
Language | English |
Subject area | Mathematics / Computer Science ► Computer Science ► Databases |
 | Mathematics / Computer Science ► Computer Science ► Theory / Studies |
 | Mathematics / Computer Science ► Mathematics ► Financial / Business Mathematics |
 | Social Sciences ► Politics / Administration ► State / Administration |
ISBN-10 | 3-642-14559-0 / 3642145590 |
ISBN-13 | 978-3-642-14559-9 / 9783642145599 |
Size: 4.9 MB
DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for technical books with columns, tables, and figures. A PDF can be displayed on almost any device but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.
Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.