Building an Event-Driven Data Mesh - Helion
ebook
Author: Adam Bellemare
ISBN: 9781098127565
Pages: 262, Format: ebook
Publication date: 2023-04-04
Bookstore: Helion
Price: 220,15 zł (previously: 255,99 zł)
You save: 14% (-35,84 zł)
The exponential growth of data combined with the need to derive real-time business value is a critical issue today. An event-driven data mesh can power real-time operational and analytical workloads, all from a single set of data product streams. With practical real-world examples, this book shows you how to successfully design and build an event-driven data mesh.
Building an Event-Driven Data Mesh provides:
- Practical tips for iteratively building your own event-driven data mesh, including hurdles you'll experience, possible solutions, and how to obtain real value as soon as possible
- Solutions to pitfalls you may encounter when moving your organization from monoliths to event-driven architectures
- A clear understanding of how events relate to systems and other events in the same stream and across streams
- A realistic look at event modeling options, such as fact, delta, and command type events, including how these choices will impact your data products
- Best practices for handling events at scale, privacy, and regulatory compliance
- Advice on asynchronous communication and handling eventual consistency
Table of contents
Building an Event-Driven Data Mesh eBook -- table of contents
- Preface
- Conventions Used in This Book
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- 1. Event-Driven Data Communication
- What Is Data Mesh?
- An Event-Driven Data Mesh
- Using Data in the Operational Plane
- The Data Monolith
- The Difficulties of Communicating Data for Operational Concerns
- Strategy 1: Replicate data between services
- Strategy 2: Use APIs to avoid data replication needs
- The Analytical Plane: Data Warehouses and Data Lakes
- The Organizational Impact of Schema on Read
- Problem 1: Violated data model boundaries
- Problem 2: Lack of single ownership
- Problem 3: Do-it-yourself and custom point-to-point data connections
- Bad Data: The Costs of Inaction
- Can We Unify Analytical and Operational Workflows?
- Rethinking Data with Data Mesh
- Common Objections to an Event-Driven Data Mesh
- Producers Cannot Model Data for Everyone's Use Cases
- Making Multiple Copies of Data Is Bad
- There should only be a single master copy of the data, and all systems should reference it directly
- It's too computationally expensive to create, store, and update multiple copies of the same data
- Managing information security policies across systems and distributed data sets is too hard
- Eventual Consistency Is Too Difficult to Manage
- Summary
- 2. Data Mesh
- Principle 1: Domain Ownership
- Domain-Driven Design in Brief
- Selecting the Data to Expose from Your Domain
- Principle 2: Data as a Product
- Data Products Provide Immutable and Time-Stamped Data
- Data Products Are Multimodal
- Accessing a Data Product Via Push or Pull
- The Three Data Product Alignment Types
- Source-aligned data products
- Aggregate-aligned data products
- Consumer-aligned data products
- Event-Driven Data Products as Inputs for Operational Systems
- Principle 3: Federated Governance
- Specifying Data Product Language, Framework, and API Support
- Establishing Data Product Life Cycle Requirements
- Establishing Data Handling and Infosec Policies
- Identifying and Standardizing Cross-Domain Polysemes
- Formalizing Self-Service Platform Requirements
- Principle 4: Self-Service Platform
- Discovering Data Products and Dependencies
- Data Product Management Controls
- Data Product Access Controls
- Compute and Storage Resources for Building and Using Data Products
- Providing Self-Service Through SaaS
- Summary
- 3. Event Streams for Data Mesh
- Events, Messages, and Records
- What's an Event Stream? What Is It Not?
- Ephemeral Message-Passing
- Queuing
- Consuming and Using Event-Driven Data Products
- State Events and Event-Carried State Transfer
- Materializing Events
- Aggregating Events
- The Kappa Architecture
- The Lambda Architecture and Why It Doesn't Work for Data Mesh
- Supporting the Requirements for Kappa Architecture
- Selecting an Event Broker
- Summary
- 4. Federated Governance
- Forming a Federated Governance Team
- Implementing Standards
- Supporting Multimodal Data Product Types
- Supporting Data Product Schemas
- Supporting Programming Languages and Frameworks
- Metadata Standards and Requirements
- Domain and owner
- Tiered service levels
- Data quality classifications
- Privacy, financial, and custom tagging
- Upstream metadata dependencies
- Metadata wrap-up example
- Ensuring Cross-Domain Data Product Compatibility and Interoperability
- Defining and Using Common Entities
- Event Stream Keying and Partitioning
- Time and Time Zones
- What Does a Governance Meeting Look Like?
- 1. Identifying Existing Problems
- 2. Drafting Proposals
- 3. Reviewing Proposals
- 4. Implementing Proposals
- 5. Archiving Proposals
- Data Security and Access Policies
- Disable Data Product Access by Default
- Consider End-to-End Encryption
- Field-Level Encryption
- Data Privacy, the Right to Be Forgotten, and Crypto-Shredding
- Data Product Lineage
- Topology-Based Lineage
- Record-Based Lineage
- Summary
- 5. Self-Service Data Platform
- The Self-Service Platform Maturity Model
- Level 1: The Minimal Viable Platform
- The Schema Registry
- An Extremely Basic Metadata Catalog
- Connectors
- Level 1 Wrap-Up: How Does It Work?
- Level 2: The Expanded Platform
- Full-Featured Metadata Catalog
- The Data Product Management Service and UI
- Service and User Identities
- Basic Access Controls
- Stream Processing for Building Data Products
- Level 2 Wrap-Up: How Does It Work?
- Level 3: The Mature Platform
- Authentication, Identification, and Access Management
- Integration with Existing Application Delivery Processes
- Programmatic Data Product Management API
- Monitoring and Alerting
- Multiregion and Multicloud Data Products
- Level 3 Wrap-Up: How Does It Work?
- Summary
- 6. Event Schemas
- A Brief Introduction to Serialization and Deserialization
- What Is a Schema?
- What Are Our Schema Technology Options?
- Google's Protocol Buffers, aka Protobuf
- Apache Avro
- JSON Schema
- Schema Evolution: Changing Your Schemas Through Time
- Negotiating a Breaking Schema Change
- Step 1: Design the New Data Model
- Step 2: Iterate with Your Existing Consumers and the Federated Governance Team
- Step 3: Create a Release Schedule, a Data Migration Plan, and a Deprecation Plan
- Step 4: Execute the Release
- The Role of the Schema Registry
- Best Practices for Managing Schemas in Your Codebase
- Choosing a Schema Technology
- Summary
- 7. Designing Events
- Introduction to Event Types
- Expanding on State Events and Event-Carried State Transfer
- Current State Events
- Before/After State Events
- Delta Events
- Event Sourcing with Delta Events
- Why Delta Events Don't Work for Event-Driven Data Products
- There is an infinite set of possible event types
- The logic to interpret the events must be replicated to each consumer
- These events map poorly to event streams
- Inversion of ownership: Consumers put their business logic into the producer
- Inability to maintain historical data without excessive complications
- Measurement Events
- Measurement Events Often Form Aggregate-Aligned Data Products
- Measurement Event Sources May Be Lossy
- Measurement Events May Power Time-Sensitive Applications
- Hybrid Events: State with a Bit of Delta
- Notification Events
- Summary
- 8. Bootstrapping Data Products
- Getting Started: Bootstrapping with Connectors
- Dual Writes
- Polling the Database to Create Data Products
- Change-Data Capture
- Change-Data Capture Using a Transactional Outbox
- Denormalization and Eventification
- Eventification at the Transactional Outbox
- Eventification in a Dedicated Service
- What Should Go In the Event? And What Should Stay Out?
- Slowly Changing Dimensions
- Type 1: Overwrite with the new value
- Type 2: Append the new value
- Bootstrapping Cloud Storage Files to an Event Stream
- Summary
- 9. Integrating Event-Driven Data into Data at Rest
- Analytics and the Medallion Architecture
- Connecting Event Streams Into Existing Batch-Data Flows
- Through the Lens of Data Mesh: What's Going On?
- Through the Lens of Data Mesh: How Do We Solve It?
- Balancing File Sizes, SLAs, and Latency
- Budget Blues: A Tale of Overspending
- Extending the Self-Service Platform for Nonstreaming Data Products
- Summary
- 10. Eventual Consistency
- Converging on Consistency, One Event at a Time
- Strategies for Dealing with Eventual Consistency
- Prevent Failures to Avoid Inconsistency
- Use Event-Driven Data Products Instead of Request-Response Server API Calls
- Expose Eventual Consistency in the Server Response
- Plan for New Services and Reprocessing of Data
- Synchronize Data Products on Time Boundaries
- Out-of-Order Events
- Resolving Late-Arriving Events
- Summary
- 11. Bringing It All Together
- Event Streams for Data Mesh
- Integrating with Existing Systems
- Operations, Analytics, and Everything in Between
- Summary
- Index