Fundamentals of Data Observability - Helion
ISBN: 9781098133252
Pages: 266, Format: ebook
Publication date: 2023-08-14
Bookstore: Helion
Price: 211.65 zł (previously: 246.10 zł)
You save: 14% (-34.45 zł)
Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enables data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer who depends on the quality of your data, this book shows you how to focus on the practical aspects of introducing data observability in your everyday work.
Author Andy Petrella helps you build the right habits to identify and solve data issues, such as data drifts and poor quality, so you can stop their propagation in data applications, pipelines, and analytics. You'll learn ways to introduce data observability, including setting up a framework for generating and collecting all the information you need.
- Learn the core principles and benefits of data observability
- Use data observability to detect, troubleshoot, and prevent data issues
- Follow the book's recipes to implement observability in your data projects
- Use data observability to create a trustworthy communication framework with data consumers
- Learn how to educate your peers about the benefits of data observability
Table of contents
Fundamentals of Data Observability eBook -- table of contents
- Preface
- Overview of the Book
- Who Should Read This Book
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- I. Introducing Data Observability
- 1. Introducing Data Observability
- Scaling Data Teams
- Challenges of Scaling Data Teams
- Segregated Roles and Responsibilities and Organizational Complexity
- Anatomy of Data Issues and Consequences
- Impact of Data Issues on Data Team Dynamics
- Scaling AI Roadblocks
- Challenges with Current Data Management Practices
- Effects of Data Governance at Scale
- Data Observability to the Rescue
- The Areas of Observability
- How Data Teams Can Leverage Data Observability Now
- Low Latency Data Issues Detection
- Efficient Data Issues Troubleshooting
- Preventing Data Issues
- Decentralized Data Quality Management
- Complementing Existing Data Governance Capabilities
- The Future and Beyond
- Conclusion
- 2. Components of Data Observability
- Channels of Data Observability Information
- Logs
- Traces
- Metrics
- Observations Model
- Physical Space
- Server
- User
- Static Space
- Data source
- Schema
- Lineage
- Application
- Application repository
- Application version
- Dynamic Space
- Application execution
- Lineage execution
- Data metrics
- Expectations
- Rules
- Explicit rules
- Assisted rules
- Connections with SLAs/SLOs
- Automatic Anomaly Detection
- Prevent Garbage In, Garbage Out
- Pre- and post-conditions
- Circuit breaker
- Conclusion
- 3. Roles of Data Observability in a Data Organization
- Data Architecture
- Where Does Data Observability Fit in a Data Architecture?
- The data-observable system
- The data observability platform
- Data Architecture with Data Observability
- How Data Observability Helps with Data Engineering Undercurrents
- Security
- Data Management
- Data governance
- Discoverability
- Accountability
- Data quality
- Master data management and data modeling
- Data Integration and Interoperability
- Data lifecycle management
- Ethics and privacy
- DataOps
- Automation
- Observability
- Incident response
- Orchestration
- Software engineering
- Support for Data Mesh's Data as Products
- Conclusion
- II. Implementing Data Observability
- 4. Generate Data Observations
- At the Source
- Generating Data Observations at the Source
- Low-Level API in Python
- Description of the Data Pipeline
- Definition of the Status of the Data Pipeline
- Data Observations for the Data Pipeline
- Generate Contextual Data Observations
- Generate Data-Related Observations
- Generate Lineage-Related Data Observations
- Wrap-Up: The Data-Observable Data Pipeline
- Using Data Observations to Address Failures of the Data Pipeline
- Conclusion
- 5. Automate the Generation of Data Observations
- Abstraction Strategies
- Event Listeners
- Aspect-Oriented Programming
- Code enrichment
- Runtime manipulation
- Monkey patching
- Bytecode instrumentation
- Code generation
- AST manipulation
- Annotations, decorators, and attributes
- Macros
- High-Level Applications
- No-Code Applications
- Low-Code Applications
- Code generators
- Workflows
- Differences Among Monitoring Alternatives
- Conclusion
- 6. Implementing Expectations
- Introducing Expectations
- Shift-Left Data Quality
- Corner Cases Discovery
- Lifting Service Level Indicators
- Using Data Profilers
- Maintaining Expectations
- Overarching Practices
- Fail Fast and Fail Safe
- Simplify Tests and Extend CI/CD
- Conclusion
- III. Data Observability in Action
- 7. Integrating Data Observability in Your Data Stack
- Ingestion Stage
- Ingestion Stage Data Observability Recipes
- Airbyte Agent
- Transformation
- Transformation Stage Data Observability Recipes
- Apache Spark
- Spline
- Going beyond the compliance use case
- dbt Agent
- Serving
- Recipes
- BigQuery in Python
- Orchestrated SQL with Airflow
- Analytics
- Machine Learning Recipes
- Business Intelligence Recipes
- Databricks
- Superset
- Conclusion
- 8. Making Opaque Systems Translucent
- Data Translucence
- Opaque Systems
- SaaS
- Don't Touch It; It (Kinda) Works
- Inherited Systems
- Strategies for Data Translucence
- Strategies
- Data governance APIs
- Log analytics
- Code and binary analyzers
- Data scanning
- The Data Observability Connector
- Example: Building a dbt Data Observability Connector (SaaS)
- Conclusion
- Afterword: Future Observations
- Unification of Processing
- Generative Milestones
- Trustable Expanded Creativity
- Conclusion
- Index