Streaming Architecture - Ted Dunning, Ellen Friedman

Streaming Architecture

New Designs Using Apache Kafka and MapR Streams
Book | Softcover
120 pages
2016
O'Reilly Media (publisher)
978-1-4919-5392-1 (ISBN)
22.40 incl. VAT
More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise book, you’ll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm.

Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases.

Ideal for developers and non-technical people alike, this book describes:
Key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layer
New messaging technologies, including Apache Kafka and MapR Streams, with links to sample code
Technology choices for streaming analytics: Apache Spark Streaming, Apache Flink, Apache Storm, and Apache Apex
How stream-based architectures are helpful to support microservices
Specific use cases such as fraud detection and geo-distributed data streams

Ted Dunning is Chief Applications Architect at MapR Technologies and active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. He developed the t-digest algorithm used to estimate extreme quantiles. T-digest has been adopted by several open source projects. He also developed the open source log-synth project described in the book Sharing Big Data Safely (O'Reilly). Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has been issued 24 patents to date. Ted has a PhD in computing science from the University of Sheffield. When he's not doing data science, he plays guitar and mandolin. Ted is on Twitter as @ted_dunning.

Ellen Friedman is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. She is a committer for the Apache Drill and Apache Mahout projects. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics, including molecular biology, nontraditional inheritance, and oceanography. Ellen is also coauthor of a book of magic-themed cartoons, A Rabbit Under the Hat (The Edition House). Ellen is on Twitter as @Ellen_Friedman.

Chapter 1: Why Stream?
Planes, Trains, and Automobiles: Connected Vehicles and the IoT
Streaming Data: Life As It Happens
Beyond Real Time: More Benefits of Streaming Architecture
Emerging Best Practices for Streaming Architectures
Healthcare Example with Data Streams
Streaming Data as a Central Aspect of Architectural Design
Chapter 2: Stream-based Architecture
A Limited View: Single Real-Time Application
Key Aspects of a Universal Stream-based Architecture
Importance of the Messaging Technology
Choices for Real-Time Analytics
Comparison of Capabilities for Streaming Analytics
Summary
Chapter 3: Streaming Architecture: Ideal Platform for Microservices
Why Microservices Matter
What Is Needed to Support Microservices
Microservices in More Detail
Designing a Streaming Architecture: Online Video Service Example
Importance of a Universal Microarchitecture
What’s in a Name?
Why Use Distributed Files and NoSQL Databases?
New Design for the Video Service
Summary: The Converged Platform View
Chapter 4: Kafka as Streaming Transport
Motivations for Kafka
Kafka Innovations
Kafka Basic Concepts
The Kafka APIs
Kafka Utility Programs
Kafka Gotchas
Summary
Chapter 5: MapR Streams
Innovations in MapR Streams
History and Context of MapR’s Streaming System
How MapR Streams Works
How to Configure MapR Streams
Geo-Distributed Replication
MapR Streams Gotchas
Chapter 6: Fraud Detection with Streaming Data
Card Velocity
Fast Response Decision to the Question: “Is It Fraud?”
Multiuse Streaming Data
Scaling Up the Fraud Detector
Summary
Chapter 7: Geo-Distributed Data Streams
Stakeholders
Design Goals
Design Choices
Advantages of Streams-based Geo-Replication
Chapter 8: Putting It All Together
Benefits of Stream-based Architectures
Making the Transition to Streaming Architecture
Conclusion
Appendix: Additional Resources
Streaming Data Topics
Selected O’Reilly Publications by the Authors

Publication date:
Additional info: colour illustrations
Place of publication: Sebastopol
Language: English
Dimensions: 164 x 230 mm
Weight: 180 g
Binding: paperback
Subject areas: Computer Science > Databases > Data Warehouse / Data Mining
Mathematics / Computer Science > Computer Science > Theory / Studies
Mathematics / Computer Science > Computer Science > Web / Internet
Keywords: Apache • Data Driven • data stream analysis • data stream processing • Kafka
ISBN-10: 1-4919-5392-6 / 1491953926
ISBN-13: 978-1-4919-5392-1 / 9781491953921
Condition: new