Data Engineering Best Practices
Packt Publishing Limited (Publisher)
978-1-80324-498-3 (ISBN)
Key Features
Architect and engineer optimized data solutions in the cloud with best practices for performance and cost-effectiveness
Explore design patterns and use cases to balance roles, technology choices, and processes for a future-proof design
Learn from experts to avoid common pitfalls in data engineering projects
Purchase of the print or Kindle book includes a free PDF eBook
Book Description
Revolutionize your approach to data processing in the fast-paced business landscape with this essential guide to data engineering. Discover the power of scalable, efficient, and secure data solutions through expert guidance on data engineering principles and techniques. Written by two industry experts with over 60 years of combined experience, it offers deep insights into best practices, architecture, agile processes, and cloud-based pipelines.
You'll start by defining the challenges data engineers face and see how an agile, future-proof, comprehensive data solution architecture addresses them. As you explore the extensive toolkit and learn the capabilities of its various instruments, you'll gain the knowledge needed for independent research. Starting from data engineering fundamentals, the guide uses real-world examples to illustrate potential solutions. It builds your skills to architect scalable data systems, implement agile development processes, and design cloud-based data pipelines. The book further equips you with the knowledge to harness serverless computing and microservices to build resilient data applications.
By the end, you'll be armed with the expertise to design and deliver high-performance data engineering solutions that are not only robust, efficient, and secure but also future-ready.
What you will learn
Architect scalable data solutions within a well-architected framework
Implement agile software development processes tailored to your organization's needs
Design cloud-based data pipelines for analytics, machine learning, and AI-ready data products
Optimize data engineering capabilities to ensure performance and long-term business value
Apply best practices for data security, privacy, and compliance
Harness serverless computing and microservices to build resilient, scalable, and trustworthy data pipelines
Who this book is for
If you are a data engineer, ETL developer, or big data engineer who wants to master the principles and techniques of data engineering, this book is for you. A basic understanding of data engineering concepts, ETL processes, and big data technologies is expected. This book is also for professionals who want to explore advanced data engineering practices, including scalable data solutions, agile software development, and cloud-based data processing pipelines.
Richard J. Schiller is a chief architect, distinguished engineer, and startup entrepreneur with 40 years of experience delivering real-time large-scale data processing systems. He holds an MS in computer engineering from Columbia University's School of Engineering and Applied Science and a BA in computer science and applied mathematics. He has been involved with two prior successful startups and has coauthored three patents. He is a hands-on systems developer and innovator.
David Larochelle has been involved in data engineering for startups, Fortune 500 companies, and research institutes. He holds a BS in computer science from the College of William & Mary, a master's in computer science from the University of Virginia, and a master's in communication from the University of Pennsylvania. David's career spans over 20 years, and his strong background has enabled him to work in a wide range of organizations, including startups, established companies, and research labs.
Table of Contents
Overview of the Business Problem Statement
A Data Engineer's Journey – Background Challenges
A Data Engineer's Journey – IT's Vision and Mission
Architecture Principles
Architecture Framework – Conceptual Architecture Best Practices
Architecture Framework – Logical Architecture Best Practices
Architecture Framework – Physical Architecture Best Practices
Software Engineering Best Practice Considerations
Key Considerations for Agile SDLC Best Practices
Key Considerations for Quality Testing Best Practices
Key Considerations for IT Operational Service Best Practices
Key Considerations for Data Service Best Practices
Key Considerations for Management Best Practices
Key Considerations for Data Delivery Best Practices
Other Considerations – Measures, Calculations, Restatements, and Data Science Best Practices
Machine Learning Pipeline Best Practices and Processes
Takeaway Summary – Putting It All Together
Appendix and Use Cases
Publication date | 6 September 2024 |
---|---|
Place of publication | Birmingham |
Language | English |
Dimensions | 191 x 235 mm |
Subject area | Computer Science ► Databases ► Data Warehouse / Data Mining |
 | Computer Science ► Software Development ► User Interfaces (HCI) |
ISBN-10 | 1-80324-498-4 / 1803244984 |
ISBN-13 | 978-1-80324-498-3 / 9781803244983 |
Condition | New |