Practical Cassandra

A Developer's Approach

Russell Bradberry, Eric Lubow (Authors)

Book | Softcover

208 pages

2013
Addison-Wesley Educational Publishers Inc (Publisher)
978-0-321-93394-2 (ISBN)

This title is unfortunately out of print;
no new edition

“Eric and Russell were early adopters of Cassandra at SimpleReach. In Practical Cassandra, you benefit from their experience in the trenches administering Cassandra, developing against it, and building one of the first CQL drivers. If you are deploying Cassandra soon, or you inherited a Cassandra cluster to tend, spend some time with the deployment, performance tuning, and maintenance chapters… If you are new to Cassandra, I highly recommend the chapters on data modeling and CQL.”

–From the Foreword by Jonathon Ellis, Apache Cassandra Chair

Build and Deploy Massively Scalable, Super-fast Data Management Applications with Apache Cassandra

Practical Cassandra is the first hands-on developer’s guide to building Cassandra systems and applications that deliver breakthrough speed, scalability, reliability, and performance. Fully up to date, it reflects the latest versions of Cassandra, including Cassandra Query Language (CQL), which dramatically lowers the learning curve for Cassandra developers.

Pioneering Cassandra developers and DataStax MVPs Russell Bradberry and Eric Lubow walk you through every step of building a real production application that can store enormous amounts of structured, semi-structured, and unstructured data. Drawing on their exceptional expertise, Bradberry and Lubow share practical insights into issues ranging from querying to deployment, management, maintenance, monitoring, and troubleshooting.

The authors cover key issues, from architecture to migration, and guide you through crucial decisions about configuration and data modeling. They provide tested sample code, detailed explanations of how Cassandra works “under the covers,” and new case studies from three cutting-edge users: Ooyala, Hailo, and eBay.

Coverage includes

Understanding Cassandra’s approach, architecture, key concepts, and primary use cases, and why it’s so blazingly fast
Getting Cassandra up and running on single nodes and large clusters
Applying the new design patterns, philosophies, and features that make Cassandra such a powerful data store
Leveraging CQL to simplify your transition from SQL-based RDBMSes
Deploying and provisioning through the cloud or on bare-metal hardware
Choosing the right configuration options for each type of workload
Tweaking Cassandra to get maximum performance from your hardware, OS, and JVM
Mastering Cassandra’s essential tools for maintenance and monitoring
Efficiently solving the most common problems with Cassandra deployment, operation, and application development

Russell Bradberry (Twitter: @devdazed) is the principal architect at SimpleReach, where he is responsible for designing and building out highly scalable, high-volume, distributed data solutions. He has brought to market a wide range of products, including a real-time bidding ad server, a rich media ad management tool, a content recommendation system, and, most recently, a real-time social intelligence platform. He is a U.S. Navy veteran, a DataStax MVP for Apache Cassandra, and the author of the NodeJS Cassandra driver Helenus.

Eric Lubow (Twitter: @elubow) is currently chief technology officer of SimpleReach, where he builds highly scalable, distributed systems for processing social data. He began his career building secure Linux systems. Since then he has worked on building and administering various types of ad systems, maintaining and deploying large-scale Web applications, and building email delivery and analytics systems. He is also a U.S. Army combat veteran and a DataStax MVP for Apache Cassandra.

Eric and Russ are regular speakers about Cassandra and distributed systems, and both live in New York City.

Foreword by Jonathon Ellis xiii

Foreword by Paul Dix xv

Preface xvii

Acknowledgments xxi

About the Authors xxiii

Chapter 1: Introduction to Cassandra 1

A Greek Story 1

What Is NoSQL? 2

There’s No Such Thing as “Web Scale” 2

ACID, CAP, and BASE 2

Where Cassandra Fits In 5

What Is Cassandra? 5

Cassandra Terminology 8

Our Hope 9

Chapter 2: Installation 11

Prerequisites 11

Installation 11

Configuration 13

Cluster Setup 15

Summary 16

Chapter 3: Data Modeling 17

The Cassandra Data Model 17

Model Queries—Not Data 19

Collections 22

Summary 25

Chapter 4: CQL 27

A Familiar Way of Doing Things 27

Summary 39

Chapter 5: Deployment and Provisioning 41

Keyspace Creation 41

Replication Strategies 42

Snitches 43

Partitioners 46

Node Layout 48

Firewalls 49

Platforms 49

Summary 50

Chapter 6: Performance Tuning 51

Methodology 51

Tuning 52

System Tuning 62

Solid-State Drives 64

JVM Tuning 65

Summary 67

Chapter 7: Maintenance 69

Understanding nodetool 69

Ring Information 72

ColumnFamily Statistics 73

Thread Pool Statistics 74

Compactions 76

Backup and Restore 79

CommitLog Archiving 81

Summary 82

Chapter 8: Monitoring 83

Logging 83

JMX and MBeans 85

Health Checks 91

Summary 96

Chapter 9: Drivers and Sample Code 99

Java 100

C# 104

Python 108

Ruby 112

Summary 117

Chapter 10: Troubleshooting 119

Toolkit 119

Common Problems 121

Summary 126

Chapter 11: Architecture 127

Meta Keyspaces 127

Gossip Protocol 129

Failure Detection 130

HintedHandoffs 131

Bloom Filters 131

Summary 134

Chapter 12: Case Studies 135

Ooyala 135

Hailo 137

eBay 141

Summary 147

Appendix A: Getting Help 149

Preparing Information 149

IRC 149

Mailing Lists 149

Appendix B: Enterprise Cassandra 151

DataStax 151

Acunu 152

Titan by Aurelius 153

Pentaho 154

Instaclustr 154

Index 157

Publication date (per publisher)	31.12.2013
Place of publication	New Jersey
Language	English
Dimensions	178 x 231 mm
Weight	336 g
Subject area	Computer Science ► Databases ► Data Warehouse / Data Mining
Subject area	Mathematics / Computer Science ► Computer Science ► Graphics / Design
ISBN-10	0-321-93394-X / 032193394X
ISBN-13	978-0-321-93394-2 / 9780321933942
Condition	New