Database Reliability Engineering. Designing and Operating Resilient Database Systems - Helion
ISBN: 978-14-919-2621-5
Pages: 294, Format: ebook
Publication date: 2017-10-26
Bookstore: Helion
Price: 152.15 zł (previously: 176.92 zł)
You save: 14% (-24.77 zł)
The infrastructure-as-code revolution in IT is also affecting database administration. With this practical book, developers, system administrators, and junior to mid-level DBAs will learn how the modern practice of site reliability engineering applies to the craft of database architecture and operations. Authors Laine Campbell and Charity Majors provide a framework for professionals looking to join the ranks of today’s database reliability engineers (DBRE).
You’ll begin by exploring core operational concepts that DBREs need to master. Then you’ll examine a wide range of database persistence options, including how to implement key technologies to provide resilient, scalable, and performant data storage and retrieval. With a firm foundation in database reliability engineering, you’ll be ready to dive into the architecture and operations of any modern database.
This book covers:
- Service-level requirements and risk management
- Building and evolving an architecture for operational visibility
- Infrastructure engineering and infrastructure management
- How to facilitate the release management process
- Data storage, indexing, and replication
- Identifying datastore characteristics and best use cases
- Datastore architectural components and data-driven architectures
Table of Contents
- Foreword
- Preface
- Why We Wrote This Book
- Who This Book Is For
- How This Book Is Organized
- Conventions Used in This Book
- O'Reilly Safari
- How to Contact Us
- 1. Introducing Database Reliability Engineering
- Guiding Principles of the DBRE
- Protect the Data
- Self-Service for Scale
- Elimination of Toil
- Databases Are Not Special Snowflakes
- Eliminate the Barriers Between Software and Operations
- Operations Core Overview
- Hierarchy of Needs
- Survival and Safety
- Love and Belonging
- Esteem
- Self-actualization
- Wrapping Up
- 2. Service-Level Management
- Why Do I Need Service-Level Objectives?
- Service-Level Indicators
- Latency
- Availability
- Throughput
- Durability
- Cost or Efficiency
- Defining Service Objectives
- Latency Indicators
- Availability Indicators
- Resiliency versus robustness in availability
- Designing for downtime allowed
- Throughput Indicators
- Cost/efficiency indicators
- Considerations
- Monitoring and Reporting on SLOs
- Monitoring Availability
- Monitoring Latency
- Monitoring Throughput
- Monitoring Cost and Efficiency
- Wrapping Up
- 3. Risk Management
- Risk Considerations
- Unknown Factors and Complexity
- Availability of Resources
- Human Factors
- Group Factors
- What Do We Do?
- What Not to Do
- A Working Process: Bootstrapping
- Service Risk Evaluation
- Architectural Inventory
- Prioritization
- Severe impact (immediate SLO violation)
- Major (imminent SLO violation)
- Moderate (could contribute to SLO violation with other incidents in the same period)
- Minor
- Control and Decision Making
- Identification
- Evaluation
- Mitigation and controls
- Implementation
- Ongoing Iterations
- Wrapping Up
- 4. Operational Visibility
- The New Rules of Operational Visibility
- Treat OpViz Systems Like BI Systems
- Distributed Ephemeral Environments Trending to the Norm
- Store at High Resolutions for Key Metrics
- Keep Your Architecture Simple
- An OpViz Framework
- Data In
- Telemetry/Metrics
- Events
- Logs
- Data Out
- Bootstrapping Your Monitoring
- Is the Data Safe?
- Is the Service Up?
- Are the Consumers in Pain?
- Instrumenting the Application
- Distributed Tracing
- Events and Logs
- Instrumenting the Server or Instance
- Events and Logs
- Instrumenting the Datastore
- Datastore Connection Layer
- Utilization
- Saturation
- Errors
- Internal Database Visibility
- Throughput and Latency Metrics
- Commits, Redo, and Journaling
- Replication State
- Memory Structures
- Locking and Concurrency
- Database Objects
- Database Queries
- Database Asserts and Events
- Wrapping Up
- 5. Infrastructure Engineering
- Hosts
- Physical Servers
- Operating a System and Kernel
- User resource limits
- I/O scheduler
- Memory allocation and fragmentation
- Swapping
- Non-Uniform memory access
- Network
- Storage
- Storage capacity
- Storage throughput
- Storage latency
- Storage availability
- Durability
- Storage Area Networks
- Benefits of Physical Servers
- Cons of Physical Servers
- Virtualization
- Hypervisor
- Concurrency
- Storage
- Use Cases
- Containers
- Database as a Service
- Challenges of DBaaS
- The DBRE and the DBaaS
- Wrapping Up
- 6. Infrastructure Management
- Version Control
- Configuration Definition
- Building from Configuration
- Maintaining Configuration
- Enforcement of Configuration Definitions
- Configuration synchronization
- Component redeploys
- Infrastructure Definition and Orchestration
- Monolithic Infrastructure Definitions
- Separating Vertically
- Separated Tiers (Horizontal Definitions)
- Acceptance Testing and Compliance
- Service Catalog
- Bringing It All Together
- Development Environments
- Wrapping Up
- 7. Backup and Recovery
- Core Concepts
- Physical versus Logical
- Online versus Offline
- Full, Incremental, and Differential
- Considerations for Recovery
- Recovery Scenarios
- Planned Recovery Scenarios
- New production nodes and clusters
- Building different environments
- ETL and pipeline processes for downstream datastores
- Operational tests
- Unplanned Scenarios
- User error
- Application errors
- Infrastructure services
- OS and hardware errors
- Hardware failures
- Datacenter failures
- Scenario scope
- Scenario Impact
- Anatomy of a Recovery Strategy
- Building Block 1: Detection
- User error
- Application errors
- Infrastructure services
- OS and hardware errors
- Hardware and datacenter failures
- Building Block 2: Tiered Storage
- Online, high performance storage
- Online, low-performance storage
- Offline storage
- Object storage
- Building Block 3: A Varied Toolbox
- Full physical backups
- Incremental physical backups
- Full and incremental logical backups
- Object stores
- Building Block 4: Testing
- A Recovery Strategy Defined
- Online, Fast Storage with Full and Incremental Backups
- Use Cases
- Detection
- Tiered storage
- Toolbox
- Testing
- Online, Slow Storage with Full and Incremental Backups
- Use cases
- Detection
- Tiered storage
- Toolbox
- Testing
- Offline Storage
- Use cases
- Detection
- Tiered storage
- Toolbox
- Testing
- Object Storage
- Use cases
- Detection
- Testing
- Wrapping Up
- 8. Release Management
- Education and Collaboration
- Become a Funnel
- Foster Conversations
- Domain-Specific Knowledge
- Architecture
- Data model
- Best Practices and Standards
- Tools
- Collaboration
- Integration
- Prerequisites
- Version control system
- Database build automation
- Test data
- Database migrations and packaging
- CI server and test framework
- Testing
- Test-Friendly Development Practices
- Abstraction and encapsulation
- Being efficient
- Post-Commit Testing
- Pre-build
- Build
- Post-build
- Full Dataset Testing
- Downstream Tests
- Operational Tests
- Deployment
- Migrations and Versioning
- Impact Analysis
- Locking of objects
- Saturation of resources
- Data integrity issues
- Replication stalls
- Migration Patterns
- Pattern: locking operations
- Pattern: high resource utilization operations
- Pattern: rolling migrations
- Migration testing
- Rollback testing
- Manual or Automated
- Wrapping Up
- 9. Security
- The Purpose of Security
- Protecting Data from Theft
- Protecting from Purposeful Damage
- Protecting from Accidental Damage
- Protecting Data from Exposure
- Compliance and Auditing Standards
- Database Security as a Function
- Education and Collaboration
- Self-Service
- Integration and Testing
- Operational Visibility
- Application layer instrumentation
- Database layer instrumentation
- OS instrumentation
- Vulnerabilities and Exploits
- STRIDE
- DREAD
- Basic Precautions
- Denial of Service
- Mitigation
- Resource management and load shedding
- Continual improvement of database access and workloads
- Logging and monitoring
- SQL Injection
- Mitigation
- Prepared statements
- Input validation
- Harm reduction
- Monitoring
- Network and Authentication Protocols
- Encryption of Data
- Financial Data
- Personal Health Data
- Private Individual Data
- Military or Government Data
- Confidential/Sensitive Business Data
- Data in Transit
- Anatomy of a cipher suite
- Communication within the network
- Communications outside of the network
- Establishing secure data connections
- Basic connection encryption
- Securely stored secrets
- Dynamically built database users
- Data in the Database
- Application-level security
- Database plug-in encryption
- Transparent database encryption
- Query performance considerations
- Data in the Filesystem
- Data encryption above the filesystem
- Filesystem encryption
- Device-level encryption
- Wrapping Up
- 10. Data Storage, Indexing, and Replication
- Data Structure Storage
- Database Row Storage
- B-tree structures
- Binary tree writes
- Sorted-String Tables and Log-Structured Merge Trees
- Bloom filters
- Implementations
- Indexing
- Hash indexes
- Bitmap indexes
- Permutations of B-trees
- Logs and Databases
- Data Replication
- Single-Leader
- Replication models
- Replication log formats
- Statement-based logs
- Write-ahead logs
- Row-based replication
- Block-level replication
- Other methods
- Single-leader replication uses
- Availability
- Scalability
- Locality
- Portability
- Single leader replication challenges
- Building replicas
- Keeping replicas synchronized
- Single leader failovers
- Single leader replication monitoring
- Replication lag and latency
- Replication availability and capacity
- Replication consistency
- Operational processes
- Multi-Leader Replication
- Multileader use cases
- Availability
- Locality
- Disaster recovery
- Conflict resolution in traditional multidirectional replication
- Eliminate conflicts
- Last write wins
- Custom resolution options
- Conflict-free replicated datatypes
- Write-anywhere replication
- Eventual consistency
- Read and write quorums
- Sloppy quorums
- Anti-entropy
- Wrapping Up
- 11. Datastore Field Guide
- Conceptual Attributes of a Datastore
- The Data Model
- The relational model
- The key-value model
- The document model
- The navigational model
- Transactions
- ACID
- Atomicity
- Consistency
- Isolation
- Durability
- BASE
- Internal Attributes of a Datastore
- Storage
- The Ubiquitous CAP Theorem Section
- Consistency
- Availability
- Partition tolerance
- Consistency-Latency Trade-offs
- Availability
- Wrapping Up
- 12. A Data Architecture Sampler
- Architectural Components
- Frontend Datastores
- Data Access Layer
- Database Proxies
- Availability
- Data Integrity
- Scalability
- Latency
- Event and Message Systems
- Availability
- Data integrity
- Scalability
- Latency
- Caches and Memory Stores
- Availability
- Data integrity
- Scalability
- Latency
- Data Architectures
- Lambda and Kappa
- Lambda architecture
- Kappa architecture
- Event Sourcing
- CQRS
- Wrapping Up
- 13. Making the Case For DBRE
- A Culture of Database Reliability
- Breaking-Down Barriers
- The architectural process
- Database development
- Production migrations
- Infrastructure design and deployment
- Data-Driven Decision Making
- Data Integrity and Recoverability
- Wrapping Up
- Index