Cassandra: The Definitive Guide - Helion
ISBN: 978-14-493-9664-0
Pages: 332, Format: ebook
Publication date: 2010-11-12
Bookstore: Helion
Book price: 139,00 zł
What could you do with data if scalability wasn't a problem? With this hands-on guide, you'll learn how Apache Cassandra handles hundreds of terabytes of data while remaining highly available across multiple data centers -- capabilities that have attracted Facebook, Twitter, and other data-intensive companies. Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.
Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.
- Understand the tenets of Cassandra's column-oriented structure
- Learn how to write, update, and read Cassandra data
- Discover how to add or remove nodes from the cluster as your application requires
- Examine a working application that translates from a relational model to Cassandra's data model
- Use examples for writing clients in Java, Python, and C#
- Use the JMX interface to monitor a cluster's usage, memory patterns, and more
- Tune memory settings, data storage, and caching for better performance
Table of Contents
- Cassandra: The Definitive Guide
- Dedication
- SPECIAL OFFER: Upgrade this ebook with O'Reilly
- A Note Regarding Supplemental Files
- Foreword
- Preface
- Why Apache Cassandra?
- Is This Book for You?
- What's in This Book?
- Finding Out More
- Conventions Used in This Book
- Using Code Examples
- Safari Enabled
- How to Contact Us
- Acknowledgments
- 1. Introducing Cassandra
- What's Wrong with Relational Databases?
- A Quick Review of Relational Databases
- RDBMS: The Awesome and the Not-So-Much
- Transactions, ACID-ity, and two-phase commit
- Schema
- Sharding and shared-nothing architecture
- Summary
- Web Scale
- The Cassandra Elevator Pitch
- Cassandra in 50 Words or Less
- Distributed and Decentralized
- Elastic Scalability
- High Availability and Fault Tolerance
- Tuneable Consistency
- Brewer's CAP Theorem
- Row-Oriented
- Schema-Free
- High Performance
- Where Did Cassandra Come From?
- Use Cases for Cassandra
- Large Deployments
- Lots of Writes, Statistics, and Analysis
- Geographical Distribution
- Evolving Applications
- Who Is Using Cassandra?
- Summary
- 2. Installing Cassandra
- Installing the Binary
- Extracting the Download
- What's In There?
- Building from Source
- Additional Build Targets
- Building with Maven
- Running Cassandra
- On Windows
- On Linux
- Starting the Server
- Running the Command-Line Client Interface
- Basic CLI Commands
- Help
- Connecting to a Server
- Describing the Environment
- Creating a Keyspace and Column Family
- Writing and Reading Data
- Summary
- 3. The Cassandra Data Model
- The Relational Data Model
- A Simple Introduction
- Clusters
- Keyspaces
- Column Families
- Column Family Options
- Columns
- Wide Rows, Skinny Rows
- Column Sorting
- Super Columns
- Composite Keys
- Design Differences Between RDBMS and Cassandra
- No Query Language
- No Referential Integrity
- Secondary Indexes
- Sorting Is a Design Decision
- Denormalization
- Design Patterns
- Materialized View
- Valueless Column
- Aggregate Key
- Some Things to Keep in Mind
- Summary
- 4. Sample Application
- Data Design
- Hotel App RDBMS Design
- Hotel App Cassandra Design
- Hotel Application Code
- Creating the Database
- Loading the schema
- Data Structures
- Getting a Connection
- Prepopulating the Database
- The Search Application
- Twissandra
- Summary
- 5. The Cassandra Architecture
- System Keyspace
- Peer-to-Peer
- Gossip and Failure Detection
- Anti-Entropy and Read Repair
- Memtables, SSTables, and Commit Logs
- Hinted Handoff
- Compaction
- Bloom Filters
- Tombstones
- Staged Event-Driven Architecture (SEDA)
- Managers and Services
- Cassandra Daemon
- Storage Service
- Messaging Service
- Hinted Handoff Manager
- Summary
- 6. Configuring Cassandra
- Keyspaces
- Creating a Column Family
- Transitioning from 0.6 to 0.7
- Replicas
- Replica Placement Strategies
- Simple Strategy
- Old Network Topology Strategy
- Network Topology Strategy
- Replication Factor
- Increasing the Replication Factor
- Partitioners
- Random Partitioner
- Order-Preserving Partitioner
- Collating Order-Preserving Partitioner
- Byte-Ordered Partitioner
- Snitches
- Simple Snitch
- PropertyFileSnitch
- Creating a Cluster
- Changing the Cluster Name
- Adding Nodes to a Cluster
- Multiple Seed Nodes
- Dynamic Ring Participation
- Security
- Using SimpleAuthenticator
- Programmatic Authentication
- Using MD5 Encryption
- Providing Your Own Authentication
- Miscellaneous Settings
- Additional Tools
- Viewing Keys
- Importing Previous Configurations
- Summary
- 7. Reading and Writing Data
- Query Differences Between RDBMS and Cassandra
- No Update Query
- Record-Level Atomicity on Writes
- No Server-Side Transaction Support
- No Duplicate Keys
- Basic Write Properties
- Consistency Levels
- Basic Read Properties
- The API
- Ranges and Slices
- Setup and Inserting Data
- Using a Simple Get
- Seeding Some Values
- Slice Predicate
- Getting Particular Column Names with Get Slice
- Getting a Set of Columns with Slice Range
- Counts
- Reversed
- Getting All Columns in a Row
- Get Range Slices
- Multiget Slice
- Deleting
- Batch Mutates
- Batch Deletes
- Range Ghosts
- Programmatically Defining Keyspaces and Column Families
- Summary
- 8. Clients
- Basic Client API
- Thrift
- Thrift Support for Java
- Exceptions
- Thrift Summary
- Avro
- Avro Ant Targets
- Avro Specification
- Avro Summary
- A Bit of Git
- Connecting Client Nodes
- Client List
- Round-Robin DNS
- Load Balancer
- Cassandra Web Console
- Hector (Java)
- Features
- The Hector API
- HectorSharp (C#)
- Chirper
- Chiton (Python)
- Pelops (Java)
- Kundera (Java ORM)
- Fauna (Ruby)
- Summary
- 9. Monitoring
- Logging
- Tailing
- General Tips
- Following along
- Warning signs
- Overview of JMX and MBeans
- MBeans
- Integrating JMX
- Interacting with Cassandra via JMX
- Cassandra's MBeans
- org.apache.cassandra.concurrent
- org.apache.cassandra.db
- org.apache.cassandra.gms
- org.apache.cassandra.service
- StorageService
- StreamingService
- Custom Cassandra MBeans
- Runtime Analysis Tools
- Heap Analysis with JMX and JHAT
- Detecting Thread Problems
- Health Check
- Summary
- 10. Maintenance
- Getting Ring Information
- Info
- Ring
- Range Tokens
- Getting Statistics
- Using cfstats
- Using tpstats
- Basic Maintenance
- Repair
- Flush
- Cleanup
- Snapshots
- Taking a Snapshot
- Clearing a Snapshot
- Load-Balancing the Cluster
- loadbalance and streams
- Decommissioning a Node
- Updating Nodes
- Removing Tokens
- Compaction Threshold
- Changing Column Families in a Working Cluster
- Summary
- 11. Performance Tuning
- Data Storage
- Reply Timeout
- Commit Logs
- Memtables
- Concurrency
- Caching
- Buffer Sizes
- Using the Python Stress Test
- Generating the Python Thrift Interfaces
- Getting Thrift
- Running the Python Stress Test
- Startup and JVM Settings
- Tuning the JVM
- Summary
- 12. Integrating Hadoop
- What Is Hadoop?
- Working with MapReduce
- Cassandra Hadoop Source Package
- Running the Word Count Example
- Outputting Data to Cassandra
- Hadoop Streaming
- Tools Above MapReduce
- Pig
- Hive
- Cluster Configuration
- Use Cases
- Raptr.com: Keith Thornhill
- Imagini: Dave Gardner
- Summary
- A. The Nonrelational Landscape
- Nonrelational Databases
- Object Databases
- XML Databases
- SoftwareAG Tamino
- eXist
- Oracle Berkeley XML DB
- MarkLogic Server
- Apache Xindice
- Summary
- Document-Oriented Databases
- IBM Lotus
- Apache CouchDB
- MongoDB
- Riak
- Graph Databases
- FlockDB
- Neo4J
- Key-Value Stores and Distributed Hashtables
- Amazon Dynamo
- Project Voldemort
- Redis
- Columnar Databases
- Google Bigtable
- HBase
- Hypertable
- Polyglot Persistence
- Summary
- Glossary
- Index
- About the Author
- Colophon
- SPECIAL OFFER: Upgrade this ebook with O'Reilly
- Copyright