Applied Text Analysis with Python - Benjamin Bengfort, Rebecca Bilbro, Tony Ojeda

Applied Text Analysis with Python

Enabling Language-Aware Data Products with Machine Learning
Book | Softcover
350 pages
2018
O'Reilly Media (publisher)
978-1-4919-6304-3 (ISBN)
59,45 incl. VAT
From news and speeches to informal chatter on social media, natural language is one of the richest and most underutilized sources of data. Not only does it come in a constant stream, always changing and adapting in context; it also contains information that is not conveyed by traditional data sources. The key to unlocking natural language is the creative application of text analytics. This practical book presents a data scientist's approach to building language-aware products with applied machine learning.

You'll learn robust, repeatable, and scalable techniques for text analysis with Python, including contextual and linguistic feature engineering, vectorization, classification, topic modeling, entity resolution, graph analysis, and visual steering. By the end of the book, you'll be equipped with practical methods to solve any number of complex real-world problems.

Preprocess and vectorize text into high-dimensional feature representations
Perform document classification and topic modeling
Steer the model selection process with visual diagnostics
Extract key phrases, named entities, and graph structures to reason about data in text
Build a dialog framework to enable chatbots and language-driven interaction
Use Spark to scale processing power and neural networks to scale model complexity

Benjamin Bengfort is a data scientist who lives inside the Beltway but ignores politics (the normal business of DC), favoring technology instead. He is currently working to finish his PhD at the University of Maryland, where he studies machine learning and distributed computing. His lab does have robots (though this field of study is not one he favors) and, much to his chagrin, they seem to constantly arm said robots with knives and tools, presumably to pursue culinary accolades. Having seen a robot attempt to slice a tomato, Benjamin prefers his own adventures in the kitchen, where he specializes in fusion French and Guyanese cuisine as well as BBQ of all types. A professional programmer by trade and a data scientist by vocation, Benjamin writes on a diverse range of subjects, from natural language processing to data science with Python to analytics with Hadoop and Spark.

Tony Ojeda is the founder of District Data Labs and focuses on applied analytics for business strategy. He has published a book on practical data science and has experience with hands-on education and data science curricula.

Rebecca Bilbro is a data scientist at the U.S. Department of Commerce Data Service. She specializes in data visualization for machine learning and has given several talks related to improving the model selection process with visualization.

Publication date
Place of publication Sebastopol
Language English
Dimensions 182 x 233 mm
Weight 598 g
Binding paperback
Subject areas Computer Science, Databases, Data Warehouse / Data Mining
Computer Science, Programming Languages / Tools, Python
Computer Science, Theory / Studies, Artificial Intelligence / Robotics
Mathematics / Computer Science, Computer Science, Web / Internet
ISBN-10 1-4919-6304-2 / 1491963042
ISBN-13 978-1-4919-6304-3 / 9781491963043
Condition New