Open Source MLOPs - Matthew Upson

Open Source MLOPs

Version Control and Automation for Machine Learning Pipelines With DVC and CML

Matthew Upson (Author)

Book | Softcover
2024
Packt Publishing Limited (Publisher)
978-1-80181-320-4 (ISBN)
37.40 incl. VAT
Build automated machine learning pipelines using CI/CD techniques applied to the domain of machine learning

Key Features

Create reproducible and automated machine learning pipelines using DVC and CML
Speed up your machine learning development and promote collaboration using CI/CD techniques
Ensure you stay ahead of the curve in the fiercely competitive machine learning market

Book Description

Deriving useful insights from machine learning can be arduous, though rewarding, even for data science practitioners. It's worth investing in any tools or techniques that can assist with the process.

Open Source MLOPs with DVC and CML will take you through two such tools, which allow you to automate your machine learning pipelines and make them reproducible.

You'll begin with an introduction to Data Version Control (DVC) and learn how it can help you keep track of your machine learning artifacts using a familiar Git-like approach. This will lead you on to building end-to-end machine learning pipelines, complete with visualizations of the results. You'll then move on to Continuous Machine Learning (CML), with which you can automate the training and testing of machine learning models so they can run alongside the rest of your CI/CD pipeline, ensuring stability and reproducibility.
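The "Git-like approach" to tracking artifacts can be pictured with a small sketch. It is only an illustration of the general idea of content-addressed tracking, not DVC's actual implementation: the pointer file name and layout (`.pointer.json`, the `path`/`md5`/`size` keys) and the helper names are assumptions made up for this example. A large data file is summarised by a hash in a tiny pointer file, and that pointer is what would be committed to Git.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_pointer(data_path: Path) -> Path:
    """Write a small pointer file next to the data file (hypothetical layout)."""
    pointer_path = data_path.with_name(data_path.name + ".pointer.json")
    pointer = {
        "path": data_path.name,
        "md5": content_hash(data_path),
        "size": data_path.stat().st_size,
    }
    pointer_path.write_text(json.dumps(pointer, indent=2))
    return pointer_path


def demo() -> tuple[str, str]:
    """Track a file, change it, track it again, and return both hashes."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "train.csv"
        data.write_text("x,y\n1,2\n")
        first = json.loads(write_pointer(data).read_text())["md5"]
        data.write_text("x,y\n1,2\n3,4\n")
        second = json.loads(write_pointer(data).read_text())["md5"]
        return first, second
```

Calling `demo()` returns two different hashes, which is the property that lets a version control system notice that a dataset changed without storing the dataset itself in the repository.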

By the end of this book, you will be able to develop reproducible pipelines as directed acyclic graphs and run those pipelines effortlessly in the cloud to speed up the development of your machine learning models.

What you will learn

Create an S3 bucket to act as a remote repository
Use remote storage and a GitHub repository to create a model registry
Construct pipelines in YAML format in the dvc.yaml file
Define for loops within the DVC pipeline to reduce repetition
Share experiments with a coworker
Access and save objects using DVC’s Python API
Run CML workloads on AWS EC2 instances including GPU-equipped machines
Report results such as DVC metrics and plots to a GitHub pull request
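To make the idea of a pipeline as a directed acyclic graph more concrete, here is a minimal sketch in plain Python. It assumes stages shaped like entries in a `dvc.yaml` file (`cmd`, `deps`, `outs`); the stage names, script names, and file paths are hypothetical, and the ordering logic uses the standard library rather than DVC itself. A stage depends on another stage when one of its `deps` is among that stage's `outs`.

```python
from graphlib import TopologicalSorter

# Hypothetical stages, mirroring the cmd/deps/outs keys used in dvc.yaml.
STAGES = {
    "prepare": {
        "cmd": "python prepare.py",
        "deps": ["data/raw.csv"],
        "outs": ["data/clean.csv"],
    },
    "train": {
        "cmd": "python train.py",
        "deps": ["data/clean.csv"],
        "outs": ["model.pkl"],
    },
    "evaluate": {
        "cmd": "python evaluate.py",
        "deps": ["model.pkl", "data/clean.csv"],
        "outs": ["metrics.json"],
    },
}


def stage_order(stages: dict) -> list:
    """Return stage names in an order that respects deps/outs relationships."""
    # Map each output file to the stage that produces it.
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    # For each stage, collect the stages whose outputs it consumes.
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(graph).static_order())


print(stage_order(STAGES))
```

Because every stage declares its inputs and outputs, the execution order never has to be written down by hand, and a circular dependency is reported as an error instead of silently producing stale results.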

Who this book is for

This book is predominantly for people who want to learn how to use DVC and CML to build pipelines for the deployment of machine learning models. These readers are most likely to be data scientists, software engineers, or students on PhD or MSc programs who are developing machine learning models. The book may also be useful for those interested in the data version control aspect who are not (or not currently) developing or deploying machine learning models.

A bare minimum knowledge of data analytics, a concern for producing analysis reproducibly, and an eagerness to learn are expected.

Matthew Upson is a Data Scientist and Founder of MantisNLP, experienced in Natural Language Processing and Machine Learning / Data Engineering problems. Previously he was the Lead Data Scientist at Juro, a legal tech startup where he used AI to make contracts faster, smarter, and more human. Prior to working at Juro, he worked as a Data Scientist in the UK Government, predominantly on Machine Learning services for Natural Language Processing. Version Control, Continuous Integration, and Cloud Computing.

Table of Contents

A Brief Introduction to MLOps
First Steps with DVC
Using Remote Storage
Sharing Data with Registries
Troubleshooting Issues with DVC
Building Pipelines with DVC
Advanced Pipelines – Parameterization and foreach Stages
Creating Plots with DVC
Experiment Tracking
Deploying Models with DVC
Automating Your Pipelines with GitHub Actions

Publication date
Place of publication Birmingham
Language English
Dimensions 191 x 235 mm
Subject area Computer Science Theory / Studies Artificial Intelligence / Robotics
ISBN-10 1-80181-320-5 / 1801813205
ISBN-13 978-1-80181-320-4 / 9781801813204
Condition New