Machine Learning Upgrade - Kristen Kehrer, Caleb Kaiser

Machine Learning Upgrade

A Data Scientist's Guide to MLOps, LLMs, and ML Infrastructure
Book | Softcover
240 pages
2024
John Wiley & Sons Inc (publisher)
978-1-394-24963-3 (ISBN)
38.65 incl. VAT
A much-needed guide to implementing new technology in workspaces

From experts in the field comes Machine Learning Upgrade: A Data Scientist's Guide to MLOps, LLMs, and ML Infrastructure, a book that provides data scientists and managers with best practices at the intersection of management, large language models (LLMs), machine learning, and data science. This groundbreaking book will change the way that you view the pipeline of data science. The authors provide an introduction to modern machine learning, showing you how it can be viewed as a holistic, end-to-end system, not just a shiny new gadget in an otherwise unchanged operational structure. By adopting a data-centric view of the world, you can begin to see unstructured data and LLMs as the foundation upon which you can build countless applications and business solutions. This book explores a whole world of decision making that hasn't been codified yet, enabling you to forge the future using emerging best practices.



Gain an understanding of the intersection between large language models and unstructured data
Follow the process of building an LLM-powered application while leveraging MLOps techniques such as data versioning and experiment tracking
Discover best practices for training, fine-tuning, and evaluating LLMs
Integrate LLM applications within larger systems, monitor their performance, and retrain them on new data

This book is indispensable for data professionals and business leaders looking to understand LLMs and the entire data science pipeline.

Kristen Kehrer has been providing innovative and practical statistical modeling solutions since 2010. In 2018, she achieved recognition as a LinkedIn Top Voice in Data Science & Analytics. Kristen is also the founder of Data Moves Me, LLC. Caleb Kaiser is a Full Stack Engineer at Comet. Caleb was previously on the Founding Team at Cortex Labs. Caleb also worked at Scribe Media on the Author Platform Team.

Introduction ix

1 A Gentle Introduction to Modern Machine Learning 1

Data Science Is Diverging from Business Intelligence 3

From CRISP-DM to Modern, Multicomponent ML Systems 4

The Emergence of LLMs Has Increased ML’s Power and Complexity 7

What You Can Expect from This Book 9

2 An End-to-End Approach 11

Components of a YouTube Search Agent 13

Principles of a Production Machine Learning System 16

Observability 19

Reproducibility 19

Interoperability 20

Scalability 21

Improvability 22

A Note on Tools 23

3 A Data-Centric View 25

The Emergence of Foundation Models 25

The Role of Off-the-Shelf Components 27

The Data-Driven Approach 28

A Note on Data Ethics 28

Building the Dataset 30

Working with Vector Databases 34

Data Versioning and Management 50

Getting Started with Data Versioning 53

Knowing “Just Enough” Engineering 57

4 Standing Up Your LLM 61

Selecting Your LLM 61

What Type of Inference Do I Need to Perform? 65

How Open-Ended Is This Task? 66

What Are the Privacy Concerns for This Data? 66

How Much Will This Model Cost? 67

Experiment Management with LLMs 68

LLM Inference 74

Basics of Prompt Engineering 74

In-Context Learning 77

Intermediary Computation 85

Augmented Generation 89

Agentic Techniques 94

Optimizing LLM Inference with Experiment Management 102

Fine-Tuning LLMs 111

When to Fine-Tune an LLM 112

Quantization, QLoRA, and Parameter Efficient Fine-Tuning 113

Wrapping Things Up 121

5 Putting Together an Application 123

Prototyping with Gradio 125

Creating Graphics with Plotnine 128

Adding the Author Selector 137

Adding a Logo 138

Adding a Tab 139

Adding a Title and Subtitle 140

Changing the Color of the Buttons 140

Click to Download Button 141

Putting It All Together 141

Deploying Models as APIs 144

Implementing an API with FastAPI 146

Implementing Uvicorn 148

Monitoring an LLM 149

Dockerizing Your Service 151

Deploying Your Own LLM 154

Wrapping Things Up 159

6 Rounding Out the ML Life Cycle 161

Deploying a Simple Random Forest Model 161

An Introduction to Model Monitoring 167

Model Monitoring with Evidently AI 175

Building a Model Monitoring System 176

Final Thoughts on Monitoring 187

7 Review of Best Practices 189

Step 1: Understand the Problem 189

Step 2: Model Selection and Training 190

Step 3: Deploy and Maintain 192

Step 4: Collaborate and Communicate 196

Emerging Trends in LLMs 197

Next Steps in Learning 199

Appendix: Additional LLM Example 201

Index 209

Place of publication New York
Language English
Dimensions 150 x 226 mm
Weight 272 g
Subject area Computer Science > Databases > Data Warehouse / Data Mining
Computer Science > Theory / Studies > Artificial Intelligence / Robotics
ISBN-10 1-394-24963-2 / 1394249632
ISBN-13 978-1-394-24963-3 / 9781394249633
Condition New