Linked Data

Structured data on the Web. Foreword by Tim Berners-Lee

David Wood, Marsha Zaidman, Luke Ruth, Michael Hausenblas (Authors)

Book | Softcover

425 pages

2013
Manning Publications (Publisher)
978-1-61729-039-8 (ISBN)

This title is unfortunately out of print;
no new edition

Linked Data is a standards-driven model for representing structured data on the Web that gives developers, publishers, and information architects a consistent, predictable way to publish, merge and consume data. It's been adopted by many well-known institutions, including Google, Facebook, IBM, Oracle, and government agencies, as well as projects such as Drupal and WordPress. Linked Data presents the Linked Data model in plain, jargon-free language and offers practical techniques using everyday tools like JavaScript and Python. It works through examples of increasing complexity while explaining foundational concepts such as HTTP URIs, the Resource Description Framework (RDF), and the SPARQL query language. Readers will learn to use various Linked Data document formats to create powerful Web applications and mashups, and to effectively use emerging Web standards to access, find, and query structured data on the Web.

Written by Web developers for Web developers
A step-by-step, hands-on guide to using Linked Data
Shows how to utilize the power of tomorrow's Web today

Written for Web developers by Web developers, this book requires no previous exposure to Linked Data technologies.

The launch of Schema.org in June 2011 by Google, Microsoft and Yahoo! and the publication of Linked Data by retailers such as Best Buy, Sears and Volkswagen brought Linked Data into the mainstream.

Linked Data is a set of techniques to represent and connect structured data on the Web. This book shows you how to access, create, and use Linked Data. Linked Data has one amazing property: it can be easily combined with other Linked Data to form new knowledge.
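As a rough illustration of that property, the sketch below models two independently published datasets as sets of (subject, predicate, object) triples and merges them with a plain set union. Pure Python stands in for an RDF library here; the example.org URIs, the sample values, and the helper names `match` and `place_label_for` are assumptions made up for this illustration and are not taken from the book.

```python
# Minimal sketch: RDF-style triples as Python tuples, merged by set union.
# All example.org URIs and values are illustrative assumptions.

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Dataset A: a hypothetical personal profile.
profile = {
    ("http://example.org/people/alice", FOAF_NAME, "Alice"),
    ("http://example.org/people/alice", FOAF_BASED_NEAR,
     "http://example.org/places/galway"),
}

# Dataset B: a hypothetical gazetteer published by someone else.
places = {
    ("http://example.org/places/galway", RDFS_LABEL, "Galway"),
}


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def place_label_for(graph, person):
    """Follow person -> based_near -> label within one graph."""
    labels = []
    for _, _, place in match(graph, s=person, p=FOAF_BASED_NEAR):
        labels += [o for _, _, o in match(graph, s=place, p=RDFS_LABEL)]
    return labels


# Both datasets use the same URI for the place, so a union is enough to
# connect them; no schema mapping step is needed.
merged = profile | places

print(place_label_for(merged, "http://example.org/people/alice"))
```

Neither dataset can answer the question on its own: the profile knows only the place's URI, and the gazetteer knows nothing about people. The shared URI is what lets the merged graph answer it.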

Linked Data makes the World Wide Web into a global database that we call the Web of Data. Developers can query Linked Data using a query language called SPARQL from multiple sources at once and combine those results dynamically, something difficult or impossible to do with traditional data-management technologies. The examples in this book are intentionally drawn from public sources, but the techniques illustrated can just as easily be used with private data. You may be unfamiliar with some of the resources that we use, but they’re readily accessible on the Web, and we encourage you to check them out as you encounter them. We apologize in advance for any inconsistencies between the screen shots and URLs referenced in the text and the actual content when you visit those sites on the Web. The Web is a rapidly changing entity, and no printed matter can absolutely represent that. We do promise that all the screen shots and URLs were correct as we entered production.

The techniques of Linked Data enable us to more easily share our knowledge with others. Literally anything can be described by Linked Data. Linked Data on the World Wide Web may be found, shared, and combined with other people’s data. Unlike traditional data-management systems, Linked Data frees information from proprietary containers so anyone can use it. As with any data, the consumer is responsible for evaluating its quality and utility. We use sources whose data we trust.

Linked Data: Structured data on the Web should be read by application developers who want to appreciate, consume, and publish Linked Data. This book assumes that you have a basic familiarity with fundamental web technologies such as HTML, URIs, and HTTP. We introduce you to Linked Data, place it in context, outline its principles, and show you how to use it by walking you through the process of finding, consuming, and publishing Linked Data on the Web. We illustrate this process with real-world applications of gradually increasing complexity.

Roadmap

This book has eleven chapters divided into four parts, followed by two appendixes and a glossary.

Part 1 “The Linked Data Web” provides an introduction to the fundamentals of Linked Data, the Resource Description Framework (RDF) data model, and the common standard serializations used in representing this data. It guides the reader in identifying and consuming Linked Data on the Web.

Chapter 1, an introduction to Linked Data, places it in context, outlines its principles, and shows you how to use it by walking you through a Linked Data application.
Chapter 2 introduces the Resource Description Framework and its relationship to Linked Data. We describe the RDF data model along with the key concepts that you’re likely to use in your own Linked Data. In closing this chapter, we address common issues of file types and web servers and provide techniques for resolving those issues.
Chapter 3 acquaints you with the distributed nature of the Web and how data and documents are interlinked. You become aware of the relationship between the Web of Documents and the Web of Data. You learn how to find and consume Linked Data on the Web.

Part 2 “Taming Linked Data” emphasizes techniques for developing and publishing your own Linked Data and enhanced searching techniques for aggregating such data. You learn how to use the SPARQL query language to search for relevant Linked Data datasets and aggregate the results.

Chapter 4 covers methods of creating, linking, and publishing Linked Data on the Web using the Friend of a Friend (FOAF) and Relationship vocabularies.
Chapter 5 introduces the SPARQL query language for RDF. SPARQL enables you to query the Web of Data as if it were a database, albeit a very large one with many distributed datasets.
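To make the idea of querying the Web "as if it were a database" a little more concrete, here is a minimal sketch of how a SPARQL SELECT query can be packaged as an HTTP GET request, following the SPARQL Protocol convention of passing the query text in a `query` parameter. The endpoint URL and the query itself are assumptions chosen for illustration, not examples from the book, and the code only builds the URL; it does not send anything over the network.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: a public SPARQL endpoint; any endpoint URL could be substituted.
ENDPOINT = "https://dbpedia.org/sparql"

# A small SELECT query: up to five resources with their English labels.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?thing ?label
WHERE {
  ?thing rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 5
"""


def build_request_url(endpoint, query):
    """Encode a query as an HTTP GET URL using the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)

# Round trip: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

A client would typically send this URL with an `Accept` header such as `application/sparql-results+json` to ask for JSON results, though what a given endpoint supports should be checked against its own documentation.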

Part 3 “Linked Data in the wild” illustrates how to use RDFa to achieve search engine optimization of your web pages. It introduces you to RDF databases and illustrates the differences between these and the traditional RDBMS. We illustrate how you can best share your datasets and projects on the Web and optimize the inclusion of your projects and datasets in Semantic Web search results.

Chapter 6 illustrates how to use Resource Description Framework in Attributes (RDFa) to annotate your HTML web pages so that search engines return enhanced results for them. You’re introduced to the GoodRelations business-oriented vocabulary and similar techniques using schema.org.
Chapter 7 introduces RDF databases and the differences and benefits of such data stores over RDBMS. In general, integrating information already in RDF format is painless. But data that you need and would like to use is often stored in non-RDF sources. This chapter illustrates how non-RDF data can be transformed into RDF for ease of integration into other applications.
Chapter 8 provides an introduction to all the ways that new Linked Data should be described and linked into the larger Linked Data world. It describes and applies the Description of a Project (DOAP) vocabulary to describe projects, the Vocabulary of Interlinked Datasets (VoID) to describe datasets, and semantic sitemaps to describe the Linked Data offerings on a site. This chapter also presents guidelines to publishing your data on the LOD cloud.
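The transformation of non-RDF data that Chapter 7 describes can be sketched in a few lines. The example below is not the book's method; it is a minimal illustration that turns spreadsheet-style CSV rows into N-Triples lines, and the column names, the `BASE` and `VOCAB` URI prefixes, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical spreadsheet export; column names and values are made up.
CSV_TEXT = """id,name,city
1,Alice,Galway
2,Bob,Fredericksburg
"""

BASE = "http://example.org/people/"   # assumed URI prefix for row subjects
VOCAB = "http://example.org/vocab#"   # assumed vocabulary namespace


def escape_literal(value):
    """Escape characters that N-Triples string literals require escaping."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def rows_to_ntriples(csv_text, key="id"):
    """Turn each non-empty, non-key cell into one '<s> <p> "o" .' line."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[key]}>"
        for column, value in row.items():
            if column == key or not value:
                continue
            literal = escape_literal(value)
            lines.append(f'{subject} <{VOCAB}{column}> "{literal}" .')
    return lines


for line in rows_to_ntriples(CSV_TEXT):
    print(line)
```

Every cell becomes a plain string literal in this sketch. A real conversion would usually map columns to terms from established vocabularies and link values such as cities to existing URIs instead of leaving them as text.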

Part 4 “Pulling it all together” brings together all the concepts covered in parts 1, 2, and 3. We develop a complex, real-world application using an open source application server for Linked Data and help you summarize the process of publishing Linked Data, from preparation to publication.

Chapter 9 introduces the Callimachus Project, an open source application server for Linked Data. We show you how to get started with Callimachus, how to generate web pages from RDF data, and how to build applications using it.
Chapter 10 summarizes the process of publishing Linked Data from preparation to publication. We identify and clarify easily overlooked steps, like minting URIs and customizing vocabularies.
Chapter 11 surveys the current state of the Semantic Web and the role of Linked Data. We identify some interesting applications of Linked Data and attempt to predict the future direction of the Semantic Web and Linked Data.

The appendixes provide supplementary information.

Appendix A is a quick reference to the development environment setups of the tools used in the book.
Appendix B is a guide to interpreting SPARQL query results formats.
A glossary lists and defines terms used in this book.
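To give a feel for the kind of result format that Appendix B covers, the following sketch reads a small document in the W3C SPARQL Query Results JSON Format and flattens it into ordinary Python dictionaries. The sample payload and the function name `rows` are assumptions made up for illustration; they are not taken from the book.

```python
import json

# Made-up sample in the W3C SPARQL Query Results JSON Format.
SAMPLE = """
{
  "head": {"vars": ["thing", "label"]},
  "results": {"bindings": [
    {"thing": {"type": "uri", "value": "http://example.org/places/galway"},
     "label": {"type": "literal", "xml:lang": "en", "value": "Galway"}},
    {"thing": {"type": "uri", "value": "http://example.org/places/unnamed"}}
  ]}
}
"""


def rows(results_json):
    """Flatten bindings into dicts; unbound variables become None."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [{v: (b[v]["value"] if v in b else None) for v in variables}
            for b in doc["results"]["bindings"]]


for row in rows(SAMPLE):
    print(row)
```

The second binding omits `label`, which is how the format represents an unbound variable, so the sketch fills in `None` instead of raising a `KeyError`.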

How to use this book

We expect you to get the most from this material by reading the chapters in sequence, downloading and executing the sample applications, and then trying modifications of the applications to increase your understanding of the concepts. In those applications where you need particular software tools, we guide you in locating and obtaining those resources. We expect this book to provide you with a foundation to appreciate, consume, and publish Linked Data on the Web.

Code conventions and downloads

All source code in this book is in a fixed-width font like this, which sets it off from the surrounding text. In many listings, the code is annotated to point out the key concepts. In some cases, source code is in bold fixed-width font for emphasis. We have tried to format the code so that it fits within the available page space in the book by adding line breaks and using indentation carefully. Sometimes, however, very long lines include line-continuation markers.

Source code for all the working examples in the book is available from http://LinkedDataDeveloper.com or from the publisher’s website at www.manning.com/LinkedData.

A Readme.txt file is provided in the root folder and also in each chapter folder; the files provide details on how to install and run the code. Code examples appear throughout this book. Longer listings appear under clear listing headers; shorter listings appear between lines of text.

Author Online

Purchase of Linked Data includes free access to a private Web forum run by Manning Publications where you can make comments about the book, ask technical questions, and receive help from the authors and from other users. To access the forum and subscribe to it, point your browser to www.manning.com/LinkedData. This page provides information on how to get on the forum once you’re registered, what kind of help is available, and the rules of conduct on the forum.

Manning’s commitment to our readers is to provide a venue where a meaningful dialog between individual readers and between readers and the authors can take place. It’s not a commitment to any specific amount of participation on the part of the authors, whose contribution to the Author Online (AO) forum remains voluntary (and unpaid). We suggest you ask the authors challenging questions lest their interest stray!

David Wood architected the first large-scale RDF database (http://mulgara.org), re-architected the Persistent URL service (http://purl.org, http://purlz.org) to support Linked Data, and co-founded the Callimachus Project (http://callimachusproject.org). He is co-chair of the World Wide Web Consortium's RDF Working Group (http://w3.org/2011/rdf-wg/).

Marsha Zaidman is Associate Professor Emerita of Computer Science at the University of Mary Washington, where she served as chair of the Department of Computer Science from 1997 to 2009.

Luke Ruth is a Linked Data developer supporting the Callimachus Project (http://callimachusproject.org).

Michael Hausenblas leads the Linked Data Research Centre in Galway, Ireland. He is the project coordinator of the European Commission FP7 Support Action LOD Around-The-Clock (LATC) and other W3C standardization activities.

foreword by Tim Berners-Lee
preface
acknowledgments
about this book
about the cover illustration
Part 1 The Linked Data Web

1 Introducing Linked Data
1.1 Linked Data defined
1.2 What Linked Data won’t do for you
1.3 Linked Data in action
1.4 The Linked Data principles
1.5 The Linking Open Data project
1.6 Describing data
1.7 RDF: a data model for Linked Data
1.8 Anatomy of a Linked Data application
1.9 Summary
2 RDF: the data model for Linked Data
2.1 The Linked Data principles extend RDF
2.2 The RDF data model
2.3 RDF vocabularies
2.4 RDF formats for Linked Data
2.5 Issues related to web servers and published Linked Data
2.6 File types and web servers
2.7 When you have limited control over Apache
2.8 Linked Data platforms
2.9 Summary
3 Consuming Linked Data
3.1 Thinking like the Web
3.2 How to consume Linked Data
3.3 Tools for finding distributed Linked Data
3.4 Aggregating Linked Data
3.5 Crawling the Linked Data Web and aggregating data
3.6 Summary

Part 2 Taming Linked Data

4 Creating Linked Data with FOAF
4.1 Creating a personal FOAF profile
4.2 Adding more content to a FOAF profile
4.3 Publishing your FOAF profile
4.4 Visualization of a FOAF profile
4.5 Application: linking RDF documents using a custom vocabulary
4.6 Summary
5 SPARQL—querying the Linked Data Web
5.1 An overview of a typical SPARQL query
5.2 Querying flat RDF files with SPARQL
5.3 Querying SPARQL endpoints
5.4 Types of SPARQL queries
5.5 SPARQL result formats (XML, JSON)
5.6 Creating web pages from SPARQL queries
5.7 Summary

Part 3 Linked Data in the wild

6 Enhancing results from search engines
6.1 Enhancing HTML by embedding RDFa
6.2 Embedding RDFa using the GoodRelations vocabulary
6.3 Embedding RDFa using the schema.org vocabulary
6.4 How do you choose between using schema.org or GoodRelations?
6.5 Extracting RDFa from HTML and applying SPARQL
6.6 Summary
7 RDF database fundamentals
7.1 Classifying RDF databases
7.2 Transforming spreadsheet data to RDF
7.3 Application: collecting Linked Data in an RDF database
7.4 Summary
8 Datasets
8.1 Description of a Project
8.2 Documenting your datasets using VoID
8.3 Sitemaps
8.4 Linking to other people’s data
8.5 Examples of using owl:sameAs to interlink datasets
8.6 Joining Data Hub
8.7 Requesting outgoing links from DBpedia to your dataset
8.8 Summary

Part 4 Pulling it all together

9 Callimachus: a Linked Data management system
9.1 Getting started with Callimachus
9.2 Creating web pages using RDF classes
9.3 Creating and editing class instances
9.4 Application: creating a web page from multiple data sources
9.5 Summary
10 Publishing Linked Data—a recap
10.1 Preparing your data
10.2 Minting URIs
10.3 Selecting vocabularies
10.4 Customizing vocabulary
10.5 Interlinking your data to other datasets
10.6 Publishing your data
10.7 Summary
11 The evolving Web
11.1 The relationship between Linked Data and the Semantic Web
11.2 What’s coming
11.3 Conclusion

appendix A Development environments
appendix B SPARQL results formats
glossary
index

»A friendly introduction to the use and publication of structured data on the WWW.« Tim Berners-Lee, Director of W3C

»A practical guide for integrating and publishing structured data on the Web.« Christofer Weber, NeoGrid

»Takes a complex academic subject and makes it clear and relevant.« Mike Westaway, AstraZeneca

»Highly recommended to all explorers of the Semantic Web.« Rob Crowther, Author of »Hello! HTML5 & CSS«

Linked Data: Structured data on the Web, the book, is just what Linked Data, the technology, has needed. It is a friendly introduction to the use and publication of structured data on the World Wide Web. Linked Data was part of my initial vision for the Web and is an important part of the Web’s future.

The Web took off as a web of hyperlinked documents which were exciting to read, but which could not effectively be used as data. And, yes, in fact, much of the Web is data-driven, and the data has been hidden on files inside the server. In slides from my wrap-up talk at the very first WWW conference in 1994, I pointed out that while documents talk about people and things, such as a title deed saying who owns a house, the system was not capturing the data (the actual ownership fact) in a way that could be processed. As the Web evolved, and became more driven by data, there has been frustration that changing, hidden data is not exposed to the reader. Linked Data standards allow you to publish data in a way that can be read by people and processed by machines so that previously hidden flows of data become evident.

Linked Data may not be as exciting as a hypertext Web to read, but it is more exciting in terms of making everything work more effectively, from business to scientific research. Machines can read, follow, and combine Linked Data much more effectively than they can perform those actions using other forms of data currently on the Web. The role of machines has previously been subservient to the role of people in the technology used to allow people to communicate. Now machines are beginning to become active participants in the communication. Linked Data allows machines to become more useful partners in our daily lives.

Linked Data has come of age in the last couple of years. In the last two years we have seen Google announce its Knowledge Graph and adopt the JSON-LD serialization format for Gmail, and produce a large set of terms for general use at schema.org; IBM announce that the DB2 database will become a Linked Data server; and Facebook expose Linked Data via its Graph API. Other large companies and government organizations have followed suit. We have needed a book like this one to introduce Linked Data development to a new and wider group of programmers. Linked Data will provide you with the questions to ask, even if it doesn’t answer them all. It is a great place to begin your study and kick-start your development.

I have known Dave Wood for just about a decade. We met when he started his work with the World Wide Web Consortium. We later worked on a Web research project together. Dave has worked tirelessly to develop Semantic Web and Linked Data frameworks since the late 1990s. As a developer, he is well-placed to show others how it is done.

The building blocks of Linked Data are not particularly new. The original proposal for the World Wide Web that I wrote in 1989 for my bosses at CERN included hyperlinks with semantics. The proposal read, in part, “The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything.” In fact, the Enquire program I had written in 1980 captured the relationships between things in a graph. That was the vision. Now Linked Data is delivering on this vision, by adding meaning that computers can process. As we all know, in the basic hypertext Web, the arrows we ended up with all stood for the same thing: “There is some interesting information over here!” Linked Data extends the “document Web” by allowing arrows to stand for anything we can name with a URI. Hyperlinks gain the semantics they need, and, in the process, become much more useful. The Web of hypertext-linked documents is complemented by the very powerful Linked Web of Data.

Why linked? Well, think of how the value of a Web page is very much a function of what it links to, as well as the inherent value of the information within the Web page. So it is, in a way even more so, also in the Semantic Web of Linked Data. The data itself is valuable, but the links to other data make it much more so.

I believe that the Web should evolve to serve all of us, regardless of our nationality, language, economic motivation, or interests. Linked Data is just one part of that evolution. It is not the end; it is just another part of the beginning. There is still plenty to do, so come join us in building the next generation of the Web!

Tim Berners-Lee
Director of the World Wide Web Consortium (W3C)
3Com Founders Professor of Engineering, Massachusetts Institute of Technology
Professor in the Electronics and Computer Science Department, University of Southampton UK

We love the Web and we love the way it’s evolving from the rather simple web of linked documents of the early 1990s into the framework for the world’s information. Representing data on the Web is an obvious, but slightly harder, next step.

We each came to the Web in our own ways but came to Linked Data nearly together. David found the Web as a programmer and later as an entrepreneur, Marsha as an educator, and Luke as a student. Marsha and David are old enough to have started computing with punch cards and paper tape. The Web was a very welcome degree of abstraction from ones and zeros.

David was introduced to the Web at Digital Equipment Corporation’s fabled Western Research Lab in California in 1993. It was an eye-opener. One of the first large websites showed photos of thousands of pieces of artwork held by the Vatican. Another showed a list of projects that Digital researchers were working on and linked to each of their own individual web servers for detailed documents. David was hooked. Tellingly, it was the project website that he found most interesting. If only you could link into databases and spreadsheets the way you could link to documents.

Marsha also found the Web in the early days, when Gopher was the primary search tool and Web browsers worked in a terminal, and she kept up to date with its rapid changes in order to teach new generations of computer scientists. Her career has lasted long enough for her to see the incredible changes wrought by the invention of spreadsheets and databases on decision making, and this fostered an interest in moving data to the Web.

Marsha gave David the chance to teach at the University of Mary Washington just as the Linking Open Data project was starting. Luke took the first class offered to U.S. undergraduates on Linked Data in 2011, followed by an independent study and an internship, all with David. He was eventually hired by David to work on Linked Data projects. Luke and David contribute to the Callimachus Project, an open source Linked Data platform described in this book. We’ve used it to build applications for a variety of organizations, from U.S. government agencies and pharmaceutical companies to publishers and health-care companies. Each of those projects is based on the creation, manipulation, and use of Linked Data.

We decided to write a Linked Data book for Web developers because there simply wasn’t one. We all had to learn Linked Data from the specifications or by reading academic papers. There are some other books on Linked Data (David edited two of them), but none are aimed specifically at developers. We thought that our combination of real-world development experience and experience teaching technology would result in a useful book. We hope you agree.

It’s our privilege to work with a loosely affiliated international group of people working to bring data to the Web. We hope that you’ll read this book and then join us. We can’t wait to see what the Web will become next.

Publication date (per publisher)	24 January 2014
Foreword	Tim Berners-Lee
Place of publication	New York
Language	English
Weight	514 g
Binding	Paperback
Subject areas	Mathematics / Computer Science ► Computer Science ► Databases
	Computer Science ► Networks ► Web Servers
	Computer Science ► Software Development ► SOA / Web Services
	Computer Science ► Web / Internet ► Search Engines / Web Analytics
ISBN-10	1-61729-039-4 / 1617290394
ISBN-13	978-1-61729-039-8 / 9781617290398
Condition	New

eBook edition

EPUB (Adobe DRM)

€45.77
