Agile Data Science - Russell Jurney

Agile Data Science

Building Data Analytics Applications with Hadoop

Russell Jurney (Author)

Book | Softcover
178 pages
2013
O'Reilly Media (Publisher)
978-1-4493-2626-5 (ISBN)
35.90 incl. VAT
  • This title is unfortunately out of print;
    no new edition
A subsequent edition of this item exists
Mining big data requires a deep investment in people and time. How can you be sure you’re building the right models? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.
  • Create analytics applications by using the agile big data development methodology
  • Build value from your data in a series of agile sprints, using the data-value stack
  • Gain insight by using several data structures to extract multiple features from a single dataset
  • Visualize data with charts, and expose different aspects through interactive reports
  • Use historical data to predict the future, and translate predictions into action
  • Get feedback from users after each sprint to keep your project on track

Russell Jurney cut his data teeth in casino gaming, building web apps to analyze the performance of slot machines in the US and Mexico. After dabbling in entrepreneurship, interactive media, and journalism, he moved to Silicon Valley to build analytics applications at scale at Ning and LinkedIn. He lives on the ocean in Pacifica, California, with his wife Kate and two fuzzy dogs.

Setup
Chapter 1 Theory
Agile Big Data
Big Words Defined
Agile Big Data Teams
Agile Big Data Process
Code Review and Pair Programming
Agile Environments: Engineering Productivity
Realizing Ideas with Large-Format Printing
Chapter 2 Data
Email
Working with Raw Data
SQL
NoSQL
Data Perspectives
Chapter 3 Agile Tools
Scalability = Simplicity
Agile Big Data Processing
Setting Up a Virtual Environment for Python
Serializing Events with Avro
Collecting Data
Data Processing with Pig
Publishing Data with MongoDB
Searching Data with ElasticSearch
Reflecting on our Workflow
Lightweight Web Applications
Presenting Our Data
Conclusion
Chapter 4 To the Cloud!
Introduction
GitHub
dotCloud
Amazon Web Services
Instrumentation
Climbing the Pyramid
Chapter 5 Collecting and Displaying Records
Putting It All Together
Collect and Serialize Our Inbox
Process and Publish Our Emails
Presenting Emails in a Browser
Agile Checkpoint
Listing Emails
Searching Our Email
Conclusion
Chapter 6 Visualizing Data with Charts
Good Charts
Extracting Entities: Email Addresses
Visualizing Time
Conclusion
Chapter 7 Exploring Data with Reports
Building Reports with Multiple Charts
Linking Records
Extracting Keywords from Emails with TF-IDF
Conclusion
Chapter 8 Making Predictions
Predicting Response Rates to Emails
Personalization
Conclusion
Chapter 9 Driving Actions
Properties of Successful Emails
Better Predictions with Naive Bayes
P(Reply | From & To)
P(Reply | Token)
Making Predictions in Real Time
Logging Events
Conclusion
Colophon

Additional info: black & white illustrations, figures
Place of publication: Sebastopol
Language: English
Dimensions: 178 x 233 mm
Weight: 299 g
Binding: Paperback
Subject area: Computer Science > Databases > Data Warehouse / Data Mining
Computer Science > Software Development > Object Orientation
Mathematics / Computer Science > Computer Science > Theory / Studies
ISBN-10: 1-4493-2626-9 / 1449326269
ISBN-13: 978-1-4493-2626-5 / 9781449326265
Condition: New