Principles of Data Wrangling - Joseph Hellerstein, Connor Carreras, Sean Kandel, Tye Rattenbury, Jeffrey Heer

Principles of Data Wrangling

Practical Techniques for Data Preparation
Book | Softcover
60 pages
2017
O'Reilly Media (publisher)
978-1-4919-3892-8 (ISBN)
29.95 incl. VAT
A key task that any aspiring data-driven organization needs to learn is data wrangling, the process of converting raw data into something truly useful.

This practical guide provides business analysts with an overview of various data wrangling techniques and tools, and puts the practice of data wrangling into context by asking, "What are you trying to do and why?"

Wrangling data consumes roughly 50-80% of an analyst's time before any kind of analysis is possible. Written by key executives at Trifacta, this book walks you through the wrangling process by exploring several factors (time, granularity, scope, and structure) that you need to consider as you begin to work with data.

You'll learn a shared language and a comprehensive understanding of data wrangling, with an emphasis on recent agile analytic processes used by many of today's data-driven organizations.

  • Appreciate the importance, and the satisfaction, of wrangling data the right way
  • Understand what kind of data is available
  • Choose which data to use and at what level of detail
  • Meaningfully combine multiple sources of data
  • Decide how to distill the results to a size and shape that can drive downstream analysis

Joseph M. Hellerstein is the Chief Strategy Officer at Trifacta and a Chancellor's Professor of Computer Science at UC Berkeley. His work focuses on data-centric systems and the way they drive computing. He is an ACM Fellow, an Alfred P. Sloan Fellow, and the recipient of three ACM-SIGMOD Test of Time awards for his research. Fortune Magazine has listed him among the 50 smartest people in technology, and MIT Technology Review included his work in its TR10 list of the 10 technologies most likely to change our world.

Jeffrey Heer is a co-founder and CXO (Chief Experience Officer) at Trifacta, a start-up company creating new tools for enhancing the productivity of data analysts. He is also a professor of Computer Science at Stanford University, where he leads the Stanford Visualization Group. His group has created a number of popular tools, including D3.js (Data-Driven Documents) and Data Wrangler. In Fall 2013, Jeff will join the faculty of Computer Science & Engineering at the University of Washington. In 2009 Jeff was named to MIT Technology Review's TR35; in 2012 he was named a Sloan Foundation Research Fellow. He holds BS, MS and PhD degrees in Computer Science from the University of California, Berkeley.

Publication date
Place of publication: Sebastopol
Language: English
Dimensions: 178 x 233 mm
Binding: paperback
Subject area: Computer Science > Databases > Data Warehouse / Data Mining
Business > Business Administration / Management > Business Informatics
Keywords: Business Analytics • Data Driven • datadriven Business • Data Preparation
ISBN-10: 1-4919-3892-7 / 1491938927
ISBN-13: 978-1-4919-3892-8 / 9781491938928
Condition: new