Knowledge and Data Management in GRIDs (eBook)

eBook Download: PDF
2007
XVIII, 254 pages
Springer US (publisher)
978-0-387-37831-2 (ISBN)

106,99 incl. VAT
  • Download available immediately

Current research activities are leveraging the Grid to create generic- and domain-specific solutions and services for data management and knowledge discovery. Knowledge and Data Management in Grids is the third volume of the CoreGRID series; it gathers contributions by researchers and scientists working on storage, data, and knowledge management in Grid and Peer-to-Peer systems. This volume presents the latest Grid solutions and research results in key areas such as distributed storage management, Grid databases, Semantic Grid and Grid-aware data mining. Written for a professional audience of researchers and practitioners in industry, it is suitable for graduate-level students in computer science.


Data and knowledge play a key role in both current and future GRIDs. The issues concerning representation, discovery, and integration of data and knowledge in dynamic distributed environments can be addressed by exploiting features offered by GRID Technologies. Current research activities are leveraging the GRID for the provision of generic- and domain-specific solutions and services for data management and knowledge discovery.

Knowledge and Data Management in GRIDs is the third volume of the CoreGRID series and brings together scientific contributions by researchers and scientists working on storage, data, and knowledge management in GRID and Peer-to-Peer systems. This volume presents the latest GRID solutions and research results in key areas of knowledge and data management such as distributed storage management, GRID databases, Semantic GRID and GRID-aware data mining.

Knowledge and Data Management in GRIDs is designed for a professional audience, composed of researchers and practitioners in industry. This book is also suitable for graduate-level students in computer science.

Contents 6
Foreword 8
Preface 10
Contributing Authors 14
I GRID DATA MANAGEMENT 19
ACCESSING DATA IN GRIDS USING OGSA-DAI 21
1. Introduction 22
2. Web Services and Grids 23
3. Architectural Requirements for Data Middleware 24
4. An Overview of OGSA-DAI 25
5. Activities and Perform Documents 28
6. How OGSA-DAI is being Used 30
7. Related Work 31
8. Importance of Standards 32
9. Conclusions 33
Acknowledgments 34
References 34
SERVICE CHOREOGRAPHY FOR DATA INTEGRATION ON THE GRID 37
1. Introduction 38
2. Background 39
3. Architecture and Service Interactions 41
4. The XMAP Integration Framework 43
5. Combining query processing and reformulation services 44
6. Conclusions 49
Acknowledgments 50
References 50
ACCESSING WEB DATABASES USING OGSA-DAI IN BDWORLD 53
1. Introduction 54
2. Generic Bioinformatics Data Access and Integration Requirements 56
3. BDWorld Data Integration Issues 58
4. The BioDA Exemplar 60
5. Conclusion 65
Acknowledgments 66
References 66
FAILURE RECOVERY ALTERNATIVES IN GRID-BASED DISTRIBUTED QUERY PROCESSING: A CASE STUDY 69
1. Introduction 70
2. Related Work 70
3. Recovery Options 72
4. Implementation 74
5. Experimental Results 77
6. Conclusions 79
Acknowledgments 80
References 80
II GRID DATA STORAGE 83
CONDUCTOR: SUPPORT FOR AUTONOMOUS CONFIGURATION OF STORAGE SYSTEMS 85
1. Introduction 86
2. Related work 88
3. System Overview 89
4. Initial Configuration Mode 91
5. Evaluation 96
6. Conclusions and future work 98
Acknowledgments 99
References 99
VIOLIN: A FRAMEWORK FOR EXTENSIBLE BLOCK-LEVEL STORAGE 101
1. Introduction 102
2. System Architecture 103
3. Advanced Virtualization Scenarios 111
4. Related Work 114
5. Conclusions 115
Acknowledgments 115
References 115
CLUSTERIX DATA MANAGEMENT SYSTEM (CDMS) - ARCHITECTURE AND USE CASES 117
1. Introduction 118
2. Data Management System 118
3. System Architecture 124
4. System Interface 126
5. Integration of End-User Applications with CDMS 127
6. Integration with GRMS 127
7. Related work 131
8. Conclusions 132
References 132
III SEMANTIC GRID 135
ARCHITECTURAL PATTERNS FOR THE SEMANTIC GRID 137
1. Introduction 138
2. Semantic Grid concepts 139
3. The Grid scheduling use case 141
4. Service interaction patterns for the Semantic Grid 145
5. Discussion 148
6. Future Directions 150
References 151
A METADATA MODEL FOR THE DISCOVERY AND EXPLOITATION OF SCIENTIFIC STUDIES 153
1. Introduction 154
2. A Science Data Portal 155
3. The Metadata Structure 157
4. Metadata Conformance 162
5. An Example 162
6. Conclusions and Future Development 165
Acknowledgments 166
References 166
IDEAS FOR THE PROVISION OF ONTOLOGY ACCESS IN GRID ENVIRONMENTS 169
1. Introduction 170
2. Lessons Learnt from the Semantic Web 170
3. Possibilities for Providing Ontology Access in the Grid 171
4. WS-DAIOnt: a Proposal of an Ontology Access Mechanism in the Grid 183
5. Conclusions 185
Acknowledgments 185
References 185
SEMANTIC SUPPORT FOR META-SCHEDULING IN GRIDS 187
1. Introduction 188
2. Requirements for the Scheduling Domain Knowledge Model 191
3. A Semantic Model for Grid Scheduling 192
4. Environment for Semantic Exploitation 196
5. Future Perspectives 199
Acknowledgments 199
References 200
SEMANTIC GRID RESOURCE DISCOVERY IN ATLAS 203
1. Introduction 204
2. Related Work 205
3. The P2P System Atlas 206
4. Atlas in Operation: Service Discovery in OntoKit 213
5. Conclusions 215
References 215
IV DISTRIBUTED DATA MINING 219
WSRF-BASED SERVICES FOR DISTRIBUTED DATA MINING 221
1. Introduction 222
2. SOA and the WS-Resource Framework 223
3. WSRF-based Data Mining Services 224
4. Application Modeling and Representation 230
5. WSRF Service Execution Performance 235
6. Related work 236
7. Conclusions 237
Acknowledgments 237
References 237
MINING FREQUENT CLOSED ITEMSETS FROM DISTRIBUTED REPOSITORIES 239
1. Introduction 240
2. Frequent and Closed Itemsets 241
3. Distributed Frequent Itemsets 243
4. Distributed Frequent Closed Itemsets 245
5. Conclusion 250
References 250
DISTRIBUTED DATA MINING AND KNOWLEDGE MANAGEMENT WITH NETWORKS OF SENSOR ARRAYS 253
1. Introduction 254
2. Industrial context 255
3. Data mining in TELEMAC 256
4. The Grids context 260
5. Grids based approach to TELEMAC 263
Acknowledgments 267
References 267
Index 269

1. Introduction (p. 20)

The Grid, as an emerging infrastructure for the discovery, access and use of distributed computational resources [15], offers new opportunities and raises new challenges in data management. Many aspects differentiate the Grid from a traditional distributed environment, including the large-scale, dynamic, autonomous, and distributed nature of its data sources.

A Grid can include related data resources maintained in different syntaxes, managed by different software systems, and accessible through different protocols and interfaces. Due to this diversity in data resources, one of the most demanding issues in managing data on Grids is the reconciliation of data heterogeneity [8].

Therefore, in order to provide facilities for addressing requests over multiple heterogeneous data sources, it is necessary to provide data integration models and mechanisms.

Data integration is one of the most persistent problems that the database and information management community has to deal with. Although significant progress has been made in several aspects of data integration, the increase in availability of web-based data sources has led to new challenges. More specifically, efficient techniques and approaches have been developed for schema mediation languages, query answering algorithms, optimisation strategies, query execution policies, industrial development, and so on [17].

However, effective techniques for the generation and handling of semantic mappings are still in their infancy. The need for semantic correlation of data sources is particularly felt in Grid settings. Moreover, in a Grid, a centralized structure for coordinating all the nodes may not be practical because it can become a bottleneck and, more importantly, it cannot accommodate the dynamic and distributed nature of Grid resources.

Data access and integration services have been attracting significant interest from the Grid community. Data Grids that rely on the coordinated sharing of and interaction across multiple autonomous database management systems play a key role in many industrial and scientific initiatives. To this end, middleware services have been developed.

Two notable examples are the OGSA Data Access and Integration (OGSA-DAI) [6] and the OGSA Distributed Query Processor (OGSA-DQP) [5,4] projects. These projects moved toward a service-oriented architecture quite early in their lifecycle. OGSA-DAI exposes database management systems (including Oracle, MySQL, SQL Server, DB2, and so on) in a uniform way, whereas OGSA-DQP provides distributed query processing functionality on top of OGSA-DAI. As such, OGSA-DQP can combine and integrate data from multiple data sources. To enhance performance, it employs parallel query execution techniques. Nevertheless, it relies on the user for the semantic interpretation of the data and does not address any schema integration requirements.

To date, only a few projects (e.g., [11, 9]) actually meet the schema-integration requirements that are necessary for establishing semantic connections among heterogeneous data sources. To address this limitation, the use of the XMAP framework for integrating heterogeneous data sources distributed over a Grid has been proposed [12].



Publication date (per publisher) 15.2.2007
Additional info XVIII, 254 p. 76 illus.
Place of publication New York
Language English
Subject areas Computer Science Databases Data Warehouse / Data Mining
Mathematics / Computer Science Computer Science Networks
Computer Science Theory / Studies Algorithms
Computer Science Further Topics Hardware
Natural Sciences
Keywords Architecture • Configuration • currentsmp • Data • Database • data structures • Dom • Getov • Getov series editor • Grids • Knowledge • knowledge management • Management • mgmt • Scheduling • Talia
ISBN-10 0-387-37831-6 / 0387378316
ISBN-13 978-0-387-37831-2 / 9780387378312
PDF (watermark)
Size: 14.7 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is of only limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Read online
In addition to downloading, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
