Architecting HBase Applications. A Guidebook for Successful Development and Design
ISBN: 978-14-919-1610-0
Pages: 252, Format: ebook
Publication date: 2016-07-18
Bookstore: Helion
Book price: 126.65 zł (previously: 147.27 zł)
You save: 14% (-20.62 zł)
HBase is a remarkable tool for indexing mass volumes of data, but getting started with this distributed database and its ecosystem can be daunting. With this hands-on guide, you’ll learn how to architect, design, and deploy your own HBase applications by examining real-world solutions. Along with HBase principles and cluster deployment guidelines, this book includes in-depth case studies that demonstrate how large companies solved specific use cases with HBase.
Authors Jean-Marc Spaggiari and Kevin O’Dell also provide draft solutions and code examples to help you implement your own versions of those use cases, from master data management (MDM) and document storage to near real-time event processing. You’ll also learn troubleshooting techniques to help you avoid common deployment mistakes.
- Learn exactly what HBase does, what its ecosystem includes, and how to set up your environment
- Explore how real-world HBase instances were deployed and put into production
- Examine documented use cases for tracking healthcare claims, digital advertising, data management, and product quality
- Understand how HBase works with tools and techniques such as Spark, Kafka, MapReduce, and the Java API
- Learn how to identify the causes and understand the consequences of the most common HBase issues
Table of contents
- Foreword
- Preface
- Who Should Read This Book?
- How This Book Is Organized
- Additional Resources
- Conventions Used in This Book
- Using Code Examples
- Safari Books Online
- How to Contact Us
- Acknowledgments
- From Kevin
- From Jean-Marc
- I. Introduction to HBase
- 1. What Is HBase?
- Column-Oriented Versus Row-Oriented
- Implementation and Use Cases
- 2. HBase Principles
- Table Format
- Table Layout
- Table Storage
- Regions
- Column family
- Stores
- HFiles
- Blocks
- Cells
- Internal Table Operations
- Compaction
- Minor compaction
- Major compaction
- Splits (Auto-Sharding)
- Balancing
- Dependencies
- HBase Roles
- Master Server
- RegionServer
- Thrift Server
- REST Server
- 3. HBase Ecosystem
- Monitoring Tools
- Cloudera Manager
- Apache Ambari
- Hannibal
- SQL
- Apache Phoenix
- Apache Trafodion
- Splice Machine
- Honorable Mentions (Kylin, Themis, Tephra, Hive, and Impala)
- Frameworks
- OpenTSDB
- Kite
- HappyBase
- AsyncHBase
- 4. HBase Sizing and Tuning Overview
- Hardware
- Storage
- Networking
- OS Tuning
- Hadoop Tuning
- HBase Tuning
- Different Workload Tuning
- 5. Environment Setup
- System Requirements
- Operating System
- Virtual Machine
- VM modes
- Hadoop distribution
- Resources
- Memory
- Disk space
- Java
- HBase Standalone Installation
- HBase in a VM
- Local Versus VM
- Local Mode
- Pros
- Cons
- Virtual Linux Environment
- Pros
- Cons
- QuickStart VM (or Equivalent)
- Pros
- Cons
- Troubleshooting
- IP/Name Configuration
- Access to the /tmp Folder
- Environment Variables
- Available Memory
- First Steps
- Basic Operations
- help
- create
- list
- Import Code Examples
- Download from command line
- Build from command line
- Download and build using Eclipse
- Testing the Examples
- From command line
- From Eclipse
- Pseudodistributed and Fully Distributed
- II. Use Cases
- 6. Use Case: HBase as a System of Record
- Ingest/Pre-Processing
- Processing/Serving
- User Experience
- 7. Implementation of an Underlying Storage Engine
- Table Design
- Table Schema
- Hashing keys
- Column qualifier
- Table Parameters
- Compression
- Data block encoding
- Bloom filter
- Presplitting
- Implementation
- Data conversion
- Generate Test Data
- Create Avro Schema
- Implement MapReduce Transformation
- HFile Validation
- Bulk Loading
- Data Validation
- Table Size
- Counting from the shell
- Counting from MapReduce
- File Content
- Using the shell
- Using Java
- Data Indexing
- Data Retrieval
- Going Further
- 8. Use Case: Near Real-Time Event Processing
- Ingest/Pre-Processing
- Near Real-Time Event Processing
- Processing/Serving
- 9. Implementation of Near Real-Time Event Processing
- Application Flow
- Kafka
- Flume
- HBase
- Lily
- Solr
- Implementation
- Data Generation
- Kafka
- Flume
- Flume Kafka source
- Flume Kafka channel
- Flume HBase sink
- Interceptor
- Conversion
- Lookup
- Serializer
- HBase
- Table design
- Table parameters
- Java implementation
- Lily
- Solr
- Testing
- Going Further
- 10. Use Case: HBase as a Master Data Management Tool
- Ingest
- Processing
- 11. Implementation of HBase as a Master Data Management Tool
- MapReduce Versus Spark
- Get Spark Interacting with HBase
- Run Spark over an HBase Table
- Calling HBase from Spark
- Implementing Spark with HBase
- Spark and HBase: Puts
- Spark on HBase: Bulk Load
- Spark Over HBase
- Going Further
- 12. Use Case: Document Store
- Serving
- Ingest
- Clean Up
- 13. Implementation of Document Store
- MOBs
- Storage
- Usage
- Too Big
- Consistency
- Going Further
- III. Troubleshooting
- 14. Too Many Regions
- Consequences
- Causes
- Misconfiguration
- Misoperation
- Over-splitting
- Improper presplitting
- Solution
- Before 0.98
- Offline merges
- Using HBase command
- Using the Java API
- Starting with 0.98
- Using HBase shell
- Using the Java API
- Prevention
- Regions Size
- Key and Table Design
- 15. Too Many Column Families
- Consequences
- Memory
- Compactions
- Split
- Causes, Solution, and Prevention
- Delete a Column Family
- Merge a Column Family
- Separate a Column Family into a New Table
- 16. Hotspotting
- Consequences
- Causes
- Monotonically Incrementing Keys
- Poorly Distributed Keys
- Small Reference Tables
- Applications Issues
- Meta Region Hotspotting
- Prevention and Solution
- 17. Timeouts and Garbage Collection
- Consequences
- Causes
- Storage Failure
- Power-Saving Features
- Network Failure
- Solutions
- Prevention
- Reduce Heap Size
- Off-Heap BlockCache
- Using the G1GC Algorithm
- Must-use parameters
- Additional HBase settings while exceeding 100 GB heaps
- Other interesting parameters
- Configure Swappiness to 0 or 1
- Disable Environment-Friendly Features
- Hardware Duplication
- 18. HBCK and Inconsistencies
- HBase Filesystem Layout
- Reading META
- Reading HBase on HDFS
- General HBCK Overview
- Using HBCK
- Index