Observability Engineering - Helion
ISBN: 9781492076391
Pages: 320, Format: ebook
Publication date: 2022-05-06
Bookstore: Helion
Price: 194,65 zł (previously: 226,34 zł)
You save: 14% (-31,69 zł)
Observability is critical for building, changing, and understanding the software that powers complex modern systems. Teams that adopt observability are much better equipped to ship code swiftly and confidently, identify outliers and aberrant behaviors, and understand the experience of each and every user. This practical book explains the value of observable systems and shows you how to practice observability-driven development.
Authors Charity Majors, Liz Fong-Jones, and George Miranda from Honeycomb explain what constitutes good observability, show you how to improve upon what you're doing today, and provide practical dos and don'ts for migrating from legacy tooling, such as metrics monitoring and log management. You'll also learn the impact observability has on organizational culture (and vice versa).
You'll explore:
- How the concept of observability applies to managing software systems
- The value of practicing observability when delivering and managing complex cloud native applications and systems
- The impact observability has across the entire software development lifecycle
- How and why different functional teams use observability with service-level objectives (SLOs)
- How to instrument your code to help future engineers understand the code you wrote today
- How to produce quality code for context-aware system debugging and maintenance
- How data-rich analytics can help you debug elusive issues quickly
Observability Engineering eBook: Table of Contents
- Foreword
- Preface
- Who This Book Is For
- Why We Wrote This Book
- What You Will Learn
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- I. The Path to Observability
- 1. What Is Observability?
- The Mathematical Definition of Observability
- Applying Observability to Software Systems
- Mischaracterizations About Observability for Software
- Why Observability Matters Now
- Is This Really the Best Way?
- Why Are Metrics and Monitoring Not Enough?
- Debugging with Metrics Versus Observability
- The Role of Cardinality
- The Role of Dimensionality
- Debugging with Observability
- Observability Is for Modern Systems
- Conclusion
- 2. How Debugging Practices Differ Between Observability and Monitoring
- How Monitoring Data Is Used for Debugging
- Troubleshooting Behaviors When Using Dashboards
- The Limitations of Troubleshooting by Intuition
- Example 1: Insufficient correlation
- Example 2: Not drilling down
- Example 3: Tool-hopping
- Traditional Monitoring Is Fundamentally Reactive
- How Observability Enables Better Debugging
- Conclusion
- 3. Lessons from Scaling Without Observability
- An Introduction to Parse
- Scaling at Parse
- The Evolution Toward Modern Systems
- The Evolution Toward Modern Practices
- Shifting Practices at Parse
- Conclusion
- 4. How Observability Relates to DevOps, SRE, and Cloud Native
- Cloud Native, DevOps, and SRE in a Nutshell
- Observability: Debugging Then Versus Now
- Observability Empowers DevOps and SRE Practices
- Conclusion
- II. Fundamentals of Observability
- 5. Structured Events Are the Building Blocks of Observability
- Debugging with Structured Events
- The Limitations of Metrics as a Building Block
- The Limitations of Traditional Logs as a Building Block
- Unstructured Logs
- Structured Logs
- Properties of Events That Are Useful in Debugging
- Conclusion
- 6. Stitching Events into Traces
- Distributed Tracing and Why It Matters Now
- The Components of Tracing
- Instrumenting a Trace the Hard Way
- Adding Custom Fields into Trace Spans
- Stitching Events into Traces
- Conclusion
- 7. Instrumentation with OpenTelemetry
- A Brief Introduction to Instrumentation
- Open Instrumentation Standards
- Instrumentation Using Code-Based Examples
- Start with Automatic Instrumentation
- Add Custom Instrumentation
- Starting and finishing trace spans
- Adding wide fields to an event
- Recording process-wide metrics
- Send Instrumentation Data to a Backend System
- Conclusion
- 8. Analyzing Events to Achieve Observability
- Debugging from Known Conditions
- Debugging from First Principles
- Using the Core Analysis Loop
- Automating the Brute-Force Portion of the Core Analysis Loop
- This Misleading Promise of AIOps
- Conclusion
- 9. How Observability and Monitoring Come Together
- Where Monitoring Fits
- Where Observability Fits
- System Versus Software Considerations
- Assessing Your Organizational Needs
- Exceptions: Infrastructure Monitoring That Can't Be Ignored
- Real-World Examples
- Conclusion
- III. Observability for Teams
- 10. Applying Observability Practices in Your Team
- Join a Community Group
- Start with the Biggest Pain Points
- Buy Instead of Build
- Flesh Out Your Instrumentation Iteratively
- Look for Opportunities to Leverage Existing Efforts
- Prepare for the Hardest Last Push
- Conclusion
- 11. Observability-Driven Development
- Test-Driven Development
- Observability in the Development Cycle
- Determining Where to Debug
- Debugging in the Time of Microservices
- How Instrumentation Drives Observability
- Shifting Observability Left
- Using Observability to Speed Up Software Delivery
- Conclusion
- 12. Using Service-Level Objectives for Reliability
- Traditional Monitoring Approaches Create Dangerous Alert Fatigue
- Threshold Alerting Is for Known-Unknowns Only
- User Experience Is a North Star
- What Is a Service-Level Objective?
- Reliable Alerting with SLOs
- Changing Culture Toward SLO-Based Alerts: A Case Study
- Conclusion
- 13. Acting on and Debugging SLO-Based Alerts
- Alerting Before Your Error Budget Is Empty
- Framing Time as a Sliding Window
- Forecasting to Create a Predictive Burn Alert
- The Lookahead Window
- Extrapolating the future from current burn rate
- Short-term burn alerts
- Context-aware burn alerts
- The Baseline Window
- Acting on SLO Burn Alerts
- Using Observability Data for SLOs Versus Time-Series Data
- Conclusion
- 14. Observability and the Software Supply Chain
- Why Slack Needed Observability
- Instrumentation: Shared Client Libraries and Dimensions
- Case Studies: Operationalizing the Supply Chain
- Understanding Context Through Tooling
- Embedding Actionable Alerting
- Understanding What Changed
- Conclusion
- IV. Observability at Scale
- 15. Build Versus Buy and Return on Investment
- How to Analyze the ROI of Observability
- The Real Costs of Building Your Own
- The Hidden Costs of Using Free Software
- The Benefits of Building Your Own
- The Risks of Building Your Own
- The Real Costs of Buying Software
- The Hidden Financial Costs of Commercial Software
- The Hidden Nonfinancial Costs of Commercial Software
- The Benefits of Buying Commercial Software
- The Risks of Buying Commercial Software
- Buy Versus Build Is Not a Binary Choice
- Conclusion
- 16. Efficient Data Storage
- The Functional Requirements for Observability
- Time-Series Databases Are Inadequate for Observability
- Other Possible Data Stores
- Data Storage Strategies
- Case Study: The Implementation of Honeycomb's Retriever
- Partitioning Data by Time
- Storing Data by Column Within Segments
- Performing Query Workloads
- Querying for Traces
- Querying Data in Real Time
- Making It Affordable with Tiering
- Making It Fast with Parallelism
- Dealing with High Cardinality
- Scaling and Durability Strategies
- Notes on Building Your Own Efficient Data Store
- Conclusion
- 17. Cheap and Accurate Enough: Sampling
- Sampling to Refine Your Data Collection
- Using Different Approaches to Sampling
- Constant-Probability Sampling
- Sampling on Recent Traffic Volume
- Sampling Based on Event Content (Keys)
- Combining per Key and Historical Methods
- Choosing Dynamic Sampling Options
- When to Make a Sampling Decision for Traces
- Translating Sampling Strategies into Code
- The Base Case
- Fixed-Rate Sampling
- Recording the Sample Rate
- Consistent Sampling
- Target Rate Sampling
- Having More Than One Static Sample Rate
- Sampling by Key and Target Rate
- Sampling with Dynamic Rates on Arbitrarily Many Keys
- Putting It All Together: Head and Tail per Key Target Rate Sampling
- Conclusion
- 18. Telemetry Management with Pipelines
- Attributes of Telemetry Pipelines
- Routing
- Security and Compliance
- Workload Isolation
- Data Buffering
- Capacity Management
- Rate limiting
- Sampling
- Queuing
- Data Filtering and Augmentation
- Data Transformation
- Ensuring Data Quality and Consistency
- Managing a Telemetry Pipeline: Anatomy
- Challenges When Managing a Telemetry Pipeline
- Performance
- Correctness
- Availability
- Reliability
- Isolation
- Data Freshness
- Use Case: Telemetry Management at Slack
- Metrics Aggregation
- Logs and Trace Events
- Open Source Alternatives
- Managing a Telemetry Pipeline: Build Versus Buy
- Conclusion
- V. Spreading Observability Culture
- 19. The Business Case for Observability
- The Reactive Approach to Introducing Change
- The Return on Investment of Observability
- The Proactive Approach to Introducing Change
- Introducing Observability as a Practice
- Using the Appropriate Tools
- Instrumentation
- Data Storage and Analytics
- Rolling Out Tools to Your Teams
- Knowing When You Have Enough Observability
- Conclusion
- 20. Observability's Stakeholders and Allies
- Recognizing Nonengineering Observability Needs
- Creating Observability Allies in Practice
- Customer Support Teams
- Customer Success and Product Teams
- Sales and Executive Teams
- Using Observability Versus Business Intelligence Tools
- Query Execution Time
- Accuracy
- Recency
- Structure
- Time Windows
- Ephemerality
- Using Observability and BI Tools Together in Practice
- Conclusion
- 21. An Observability Maturity Model
- A Note About Maturity Models
- Why Observability Needs a Maturity Model
- About the Observability Maturity Model
- Capabilities Referenced in the OMM
- Respond to System Failure with Resilience
- If your team is doing well
- If your team is doing poorly
- How observability is related
- Deliver High-Quality Code
- If your team is doing well
- If your team is doing poorly
- How observability is related
- Manage Complexity and Technical Debt
- If your team is doing well
- If your team is doing poorly
- How observability is related
- Release on a Predictable Cadence
- If your team is doing well
- If your team is doing poorly
- How observability is related
- Understand User Behavior
- If your team is doing well
- If your team is doing poorly
- How observability is related
- Using the OMM for Your Organization
- Conclusion
- 22. Where to Go from Here
- Observability, Then Versus Now
- Additional Resources
- Predictions for Where Observability Is Going
- Index