Database Reliability Engineering. Designing and Operating Resilient Database Systems - Helion

ebook

Authors: Laine Campbell, Charity Majors
ISBN: 978-14-919-2621-5
Pages: 294, Format: ebook
Publication date: 2017-10-26
Bookstore: Helion

The infrastructure-as-code revolution in IT is also affecting database administration. With this practical book, developers, system administrators, and junior to mid-level DBAs will learn how the modern practice of site reliability engineering applies to the craft of database architecture and operations. Authors Laine Campbell and Charity Majors provide a framework for professionals looking to join the ranks of today’s database reliability engineers (DBRE).

You’ll begin by exploring core operational concepts that DBREs need to master. Then you’ll examine a wide range of database persistence options, including how to implement key technologies to provide resilient, scalable, and performant data storage and retrieval. With a firm foundation in database reliability engineering, you’ll be ready to dive into the architecture and operations of any modern database.

This book covers:

Service-level requirements and risk management
Building and evolving an architecture for operational visibility
Infrastructure engineering and infrastructure management
How to facilitate the release management process
Data storage, indexing, and replication
Identifying datastore characteristics and best use cases
Datastore architectural components and data-driven architectures

Table of Contents

Foreword
Preface
- Why We Wrote This Book
- Who This Book Is For
- How This Book Is Organized
- Conventions Used in This Book
- O'Reilly Safari
- How to Contact Us
1. Introducing Database Reliability Engineering
- Guiding Principles of the DBRE
  - Protect the Data
  - Self-Service for Scale
  - Elimination of Toil
  - Databases Are Not Special Snowflakes
  - Eliminate the Barriers Between Software and Operations
- Operations Core Overview
- Hierarchy of Needs
  - Survival and Safety
  - Love and Belonging
  - Esteem
  - Self-actualization
- Wrapping Up
2. Service-Level Management
- Why Do I Need Service-Level Objectives?
- Service-Level Indicators
  - Latency
  - Availability
  - Throughput
  - Durability
  - Cost or Efficiency
- Defining Service Objectives
  - Latency Indicators
  - Availability Indicators
    - Resiliency versus robustness in availability
    - Designing for downtime allowed
  - Throughput Indicators
    - Cost/efficiency indicators
    - Considerations
- Monitoring and Reporting on SLOs
  - Monitoring Availability
  - Monitoring Latency
  - Monitoring Throughput
  - Monitoring Cost and Efficiency
- Wrapping Up
3. Risk Management
- Risk Considerations
  - Unknown Factors and Complexity
  - Availability of Resources
  - Human Factors
  - Group Factors
- What Do We Do?
- What Not to Do
- A Working Process: Bootstrapping
  - Service Risk Evaluation
  - Architectural Inventory
  - Prioritization
    - Severe impact (immediate SLO violation)
    - Major (imminent SLO violation)
    - Moderate (could contribute to SLO violation with other incidents in the same period)
    - Minor
  - Control and Decision Making
    - Identification
    - Evaluation
    - Mitigation and controls
    - Implementation
- Ongoing Iterations
- Wrapping Up
4. Operational Visibility
- The New Rules of Operational Visibility
  - Treat OpViz Systems Like BI Systems
  - Distributed Ephemeral Environments Trending to the Norm
  - Store at High Resolutions for Key Metrics
  - Keep Your Architecture Simple
- An OpViz Framework
- Data In
  - Telemetry/Metrics
  - Events
  - Logs
- Data Out
- Bootstrapping Your Monitoring
  - Is the Data Safe?
  - Is the Service Up?
  - Are the Consumers in Pain?
- Instrumenting the Application
  - Distributed Tracing
  - Events and Logs
- Instrumenting the Server or Instance
  - Events and Logs
- Instrumenting the Datastore
- Datastore Connection Layer
  - Utilization
  - Saturation
  - Errors
- Internal Database Visibility
  - Throughput and Latency Metrics
  - Commits, Redo, and Journaling
  - Replication State
  - Memory Structures
  - Locking and Concurrency
- Database Objects
- Database Queries
- Database Asserts and Events
- Wrapping Up
5. Infrastructure Engineering
- Hosts
  - Physical Servers
  - Operating a System and Kernel
    - User resource limits
    - I/O scheduler
    - Memory allocation and fragmentation
    - Swapping
    - Non-uniform memory access
    - Network
    - Storage
    - Storage capacity
    - Storage throughput
    - Storage latency
    - Storage availability
    - Durability
  - Storage Area Networks
  - Benefits of Physical Servers
  - Cons of Physical Servers
- Virtualization
  - Hypervisor
  - Concurrency
  - Storage
  - Use Cases
- Containers
- Database as a Service
  - Challenges of DBaaS
  - The DBRE and the DBaaS
- Wrapping Up
6. Infrastructure Management
- Version Control
- Configuration Definition
- Building from Configuration
- Maintaining Configuration
  - Enforcement of Configuration Definitions
    - Configuration synchronization
    - Component redeploys
- Infrastructure Definition and Orchestration
  - Monolithic Infrastructure Definitions
  - Separating Vertically
  - Separated Tiers (Horizontal Definitions)
- Acceptance Testing and Compliance
- Service Catalog
- Bringing It All Together
- Development Environments
- Wrapping Up
7. Backup and Recovery
- Core Concepts
  - Physical versus Logical
  - Online versus Offline
  - Full, Incremental, and Differential
- Considerations for Recovery
- Recovery Scenarios
  - Planned Recovery Scenarios
    - New production nodes and clusters
    - Building different environments
    - ETL and pipeline processes for downstream datastores
    - Operational tests
  - Unplanned Scenarios
    - User error
    - Application errors
    - Infrastructure services
    - OS and hardware errors
    - Hardware failures
    - Datacenter failures
  - Scenario Scope
  - Scenario Impact
- Anatomy of a Recovery Strategy
  - Building Block 1: Detection
    - User error
    - Application errors
    - Infrastructure services
    - OS and hardware errors
    - Hardware and datacenter failures
  - Building Block 2: Tiered Storage
    - Online, high-performance storage
    - Online, low-performance storage
    - Offline storage
    - Object storage
  - Building Block 3: A Varied Toolbox
    - Full physical backups
    - Incremental physical backups
    - Full and incremental logical backups
    - Object stores
  - Building Block 4: Testing
- A Recovery Strategy Defined
  - Online, Fast Storage with Full and Incremental Backups
    - Use cases
    - Detection
    - Tiered storage
    - Toolbox
    - Testing
  - Online, Slow Storage with Full and Incremental Backups
    - Use cases
    - Detection
    - Tiered storage
    - Toolbox
    - Testing
  - Offline Storage
    - Use cases
    - Detection
    - Tiered storage
    - Toolbox
    - Testing
  - Object Storage
    - Use cases
    - Detection
    - Testing
- Wrapping Up
8. Release Management
- Education and Collaboration
  - Become a Funnel
  - Foster Conversations
  - Domain-Specific Knowledge
    - Architecture
    - Data model
    - Best Practices and Standards
    - Tools
  - Collaboration
- Integration
  - Prerequisites
    - Version control system
    - Database build automation
    - Test data
    - Database migrations and packaging
    - CI server and test framework
- Testing
  - Test-Friendly Development Practices
    - Abstraction and encapsulation
    - Being efficient
  - Post-Commit Testing
    - Pre-build
    - Build
    - Post-build
  - Full Dataset Testing
  - Downstream Tests
  - Operational Tests
- Deployment
  - Migrations and Versioning
  - Impact Analysis
    - Locking of objects
    - Saturation of resources
    - Data integrity issues
    - Replication stalls
  - Migration Patterns
    - Pattern: locking operations
    - Pattern: high resource utilization operations
    - Pattern: rolling migrations
    - Migration testing
    - Rollback testing
  - Manual or Automated
- Wrapping Up
9. Security
- The Purpose of Security
  - Protecting Data from Theft
  - Protecting from Purposeful Damage
  - Protecting from Accidental Damage
  - Protecting Data from Exposure
  - Compliance and Auditing Standards
- Database Security as a Function
  - Education and Collaboration
  - Self-Service
  - Integration and Testing
  - Operational Visibility
    - Application layer instrumentation
    - Database layer instrumentation
    - OS instrumentation
- Vulnerabilities and Exploits
  - STRIDE
  - DREAD
  - Basic Precautions
  - Denial of Service
    - Mitigation
    - Resource management and load shedding
    - Continual improvement of database access and workloads
    - Logging and monitoring
  - SQL Injection
    - Mitigation
    - Prepared statements
    - Input validation
    - Harm reduction
    - Monitoring
  - Network and Authentication Protocols
- Encryption of Data
  - Financial Data
  - Personal Health Data
  - Private Individual Data
  - Military or Government Data
  - Confidential/Sensitive Business Data
  - Data in Transit
    - Anatomy of a cipher suite
    - Communication within the network
    - Communications outside of the network
    - Establishing secure data connections
      - Basic connection encryption
      - Securely stored secrets
      - Dynamically built database users
  - Data in the Database
    - Application-level security
    - Database plug-in encryption
    - Transparent database encryption
    - Query performance considerations
  - Data in the Filesystem
    - Data encryption above the filesystem
    - Filesystem encryption
    - Device-level encryption
- Wrapping Up
10. Data Storage, Indexing, and Replication
- Data Structure Storage
  - Database Row Storage
    - B-tree structures
      - Binary tree writes
  - Sorted-String Tables and Log-Structured Merge Trees
    - Bloom filters
    - Implementations
  - Indexing
    - Hash indexes
    - Bitmap indexes
    - Permutations of B-trees
  - Logs and Databases
- Data Replication
  - Single-Leader
    - Replication models
    - Replication log formats
      - Statement-based logs
      - Write-ahead logs
      - Row-based replication
      - Block-level replication
      - Other methods
    - Single-leader replication uses
      - Availability
      - Scalability
      - Locality
      - Portability
    - Single-leader replication challenges
      - Building replicas
      - Keeping replicas synchronized
      - Single-leader failovers
    - Single-leader replication monitoring
      - Replication lag and latency
      - Replication availability and capacity
      - Replication consistency
      - Operational processes
  - Multi-Leader Replication
    - Multileader use cases
      - Availability
      - Locality
      - Disaster recovery
    - Conflict resolution in traditional multidirectional replication
      - Eliminate conflicts
      - Last write wins
      - Custom resolution options
      - Conflict-free replicated datatypes
    - Write-anywhere replication
      - Eventual consistency
      - Read and write quorums
      - Sloppy quorums
      - Anti-entropy
- Wrapping Up
11. Datastore Field Guide
- Conceptual Attributes of a Datastore
  - The Data Model
    - The relational model
    - The key-value model
    - The document model
    - The navigational model
  - Transactions
    - ACID
    - Atomicity
    - Consistency
    - Isolation
    - Durability
  - BASE
- Internal Attributes of a Datastore
  - Storage
  - The Ubiquitous CAP Theorem Section
    - Consistency
    - Availability
    - Partition tolerance
  - Consistency-Latency Trade-offs
  - Availability
- Wrapping Up
12. A Data Architecture Sampler
- Architectural Components
  - Frontend Datastores
  - Data Access Layer
  - Database Proxies
    - Availability
    - Data Integrity
    - Scalability
    - Latency
  - Event and Message Systems
    - Availability
    - Data integrity
    - Scalability
    - Latency
  - Caches and Memory Stores
    - Availability
    - Data integrity
    - Scalability
    - Latency
- Data Architectures
  - Lambda and Kappa
    - Lambda architecture
    - Kappa architecture
  - Event Sourcing
  - CQRS
- Wrapping Up
13. Making the Case for DBRE
- A Culture of Database Reliability
  - Breaking Down Barriers
    - The architectural process
    - Database development
    - Production migrations
    - Infrastructure design and deployment
  - Data-Driven Decision Making
  - Data Integrity and Recoverability
- Wrapping Up
Index