Mining the Social Web - Matthew Russell

Mining the Social Web

Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More

Matthew Russell (Author)

Book | Softcover
448 pages
2013 | 2nd Revised edition
O'Reilly Media, Inc, USA (Publisher)
978-1-4493-6761-9 (ISBN)
49,35 incl. VAT
  • Title is unfortunately out of print;
    no reprint
A later edition of this item exists
Facebook, Twitter, LinkedIn, Google+, and other social web properties generate a wealth of valuable social data, but how can you tap into this data and discover who's connecting with whom, which insights are lurking just beneath the surface, and what people are talking about? This book shows you how to answer these questions and many more.
How can you tap into the wealth of social web data to discover who’s making connections with whom, what they’re talking about, and where they’re located? With this expanded and thoroughly revised edition, you’ll learn how to acquire, analyze, and summarize data from all corners of the social web, including Facebook, Twitter, LinkedIn, Google+, GitHub, email, websites, and blogs.
  • Employ the Natural Language Toolkit, NetworkX, and other scientific computing tools to mine popular social web sites
  • Apply advanced text-mining techniques, such as clustering and TF-IDF, to extract meaning from human language data
  • Bootstrap interest graphs from GitHub by discovering affinities among people, programming languages, and coding projects
  • Build interactive visualizations with D3.js, an extraordinarily flexible HTML5 and JavaScript toolkit
  • Take advantage of more than two dozen Twitter recipes, presented in O’Reilly’s popular "problem/solution/discussion" cookbook format

The example code for this unique data science book is maintained in a public GitHub repository. It’s designed to be easily accessible through a turnkey virtual machine that facilitates interactive learning with an easy-to-use collection of IPython Notebooks.

Matthew Russell, Chief Technology Officer at Digital Reasoning Systems (http://www.digitalreasoning.com/) and Principal at Zaffra (http://zaffra.com), is a computer scientist who is passionate about data mining, open source, and web application technologies. He is also the author of Dojo: The Definitive Guide (O'Reilly).

A Guided Tour of the Social Web
Prelude
Chapter 1 Mining Twitter: Exploring Trending Topics, Discovering What People Are Talking About, and More
Overview
Why Is Twitter All the Rage?
Exploring Twitter's API
Analyzing the 140 Characters
Closing Remarks
Recommended Exercises
Online Resources
Chapter 2 Mining Facebook: Analyzing Fan Pages, Examining Friendships, and More
Overview
Exploring Facebook's Social Graph API
Analyzing Social Graph Connections
Closing Remarks
Recommended Exercises
Online Resources
Chapter 3 Mining LinkedIn: Faceting Job Titles, Clustering Colleagues, and More
Overview
Exploring the LinkedIn API
Crash Course on Clustering Data
Closing Remarks
Recommended Exercises
Online Resources
Chapter 4 Mining Google+: Computing Document Similarity, Extracting Collocations, and More
Overview
Exploring the Google+ API
A Whiz-Bang Introduction to TF-IDF
Querying Human Language Data with TF-IDF
Closing Remarks
Recommended Exercises
Online Resources
Chapter 5 Mining Web Pages: Using Natural Language Processing to Understand Human Language, Summarize Blog Posts, and More
Overview
Scraping, Parsing, and Crawling the Web
Discovering Semantics by Decoding Syntax
Entity-Centric Analysis: A Paradigm Shift
Quality of Analytics for Processing Human Language Data
Closing Remarks
Recommended Exercises
Online Resources
Chapter 6 Mining Mailboxes: Analyzing Who's Talking to Whom About What, How Often, and More
Overview
Obtaining and Processing a Mail Corpus
Analyzing the Enron Corpus
Discovering and Visualizing Time-Series Trends
Analyzing Your Own Mail Data
Closing Remarks
Recommended Exercises
Online Resources
Chapter 7 Mining GitHub: Inspecting Software Collaboration Habits, Building Interest Graphs, and More
Overview
Exploring GitHub's API
Modeling Data with Property Graphs
Analyzing GitHub Interest Graphs
Closing Remarks
Recommended Exercises
Online Resources
Chapter 8 Mining the Semantically Marked-Up Web: Extracting Microformats, Inferencing over RDF, and More
Overview
Microformats: Easy-to-Implement Metadata
From Semantic Markup to Semantic Web: A Brief Interlude
The Semantic Web: An Evolutionary Revolution
Closing Remarks
Recommended Exercises
Online Resources
Twitter Cookbook
Chapter 9 Twitter Cookbook
Accessing Twitter's API for Development Purposes
Doing the OAuth Dance to Access Twitter’s API for Production Purposes
Discovering the Trending Topics
Searching for Tweets
Constructing Convenient Function Calls
Saving and Restoring JSON Data with Text Files
Saving and Accessing JSON Data with MongoDB
Sampling the Twitter Firehose with the Streaming API
Collecting Time-Series Data
Extracting Tweet Entities
Finding the Most Popular Tweets in a Collection of Tweets
Finding the Most Popular Tweet Entities in a Collection of Tweets
Tabulating Frequency Analysis
Finding Users Who Have Retweeted a Status
Extracting a Retweet’s Attribution
Making Robust Twitter Requests
Resolving User Profile Information
Extracting Tweet Entities from Arbitrary Text
Getting All Friends or Followers for a User
Analyzing a User’s Friends and Followers
Harvesting a User’s Tweets
Crawling a Friendship Graph
Analyzing Tweet Content
Summarizing Link Targets
Analyzing a User’s Favorite Tweets
Closing Remarks
Recommended Exercises
Online Resources
Appendixes
Appendix Information About This Book's Virtual Machine Experience
Appendix OAuth Primer
Overview
Appendix Python and IPython Notebook Tips & Tricks
Colophon

Publication date (per publisher): October 8, 2013
Additional info: black & white illustrations
Place of publication: Sebastopol
Language: English
Dimensions: 178 x 233 mm
Weight: 730 g
Binding: paperback
Subject area: Computer Science > Databases > Data Warehouse / Data Mining
Computer Science > Web / Internet > Social Web
ISBN-10: 1-4493-6761-5 / 1449367615
ISBN-13: 978-1-4493-6761-9 / 9781449367619
Condition: New