Streaming Systems. The What, Where, When, and How of Large-Scale Data Processing - Helion

Streaming Systems. The What, Where, When, and How of Large-Scale Data Processing
ebook
Authors: Tyler Akidau, Slava Chernyak, Reuven Lax
ISBN: 978-14-919-8382-9
Pages: 352, Format: ebook
Publication date: 2018-07-16
Bookstore: Helion

Book price: 152.15 zł (previously: 176.92 zł)
You save: 14% (-24.77 zł)

Streaming data is a big deal in big data these days. As more and more businesses seek to tame the massive unbounded data sets that pervade our world, streaming systems have finally reached a level of maturity sufficient for mainstream adoption. With this practical guide, data engineers, data scientists, and developers will learn how to work with streaming data in a conceptual and platform-agnostic way.

Expanded from Tyler Akidau’s popular blog posts "Streaming 101" and "Streaming 102", this book takes you from an introductory level to a nuanced understanding of the what, where, when, and how of processing real-time data streams. You’ll also dive deep into watermarks and exactly-once processing with co-authors Slava Chernyak and Reuven Lax.

You’ll explore:

  • How streaming and batch data processing patterns compare
  • The core principles and concepts behind robust out-of-order data processing
  • How watermarks track progress and completeness in infinite datasets
  • How exactly-once data processing techniques ensure correctness
  • How the concepts of streams and tables form the foundations of both batch and streaming data processing
  • The practical motivations behind a powerful persistent state mechanism, driven by a real-world example
  • How time-varying relations provide a link between stream processing and the world of SQL and relational algebra

Customers who bought "Streaming Systems. The What, Where, When, and How of Large-Scale Data Processing" also chose:

  • Windows Media Center. Domowe centrum rozrywki
  • Podręcznik startupu. Budowa wielkiej firmy krok po kroku
  • Ruby on Rails. Ćwiczenia
  • Prawa ludzkiej natury
  • Tajemnice sieci

Table of Contents

Streaming Systems. The What, Where, When, and How of Large-Scale Data Processing eBook: table of contents

  • Preface Or: What Are You Getting Yourself Into Here?
    • Navigating This Book
      • Takeaways
    • Conventions Used in This Book
    • Online Resources
      • Figures
      • Code Snippets
    • O'Reilly Safari
    • How to Contact Us
    • Acknowledgments
  • I. The Beam Model
  • 1. Streaming 101
    • Terminology: What Is Streaming?
      • On the Greatly Exaggerated Limitations of Streaming
      • Event Time Versus Processing Time
    • Data Processing Patterns
      • Bounded Data
      • Unbounded Data: Batch
        • Fixed windows
        • Sessions
      • Unbounded Data: Streaming
        • Time-agnostic
          • Filtering
          • Inner joins
        • Approximation algorithms
        • Windowing
          • Windowing by processing time
          • Windowing by event time
    • Summary
  • 2. The What, Where, When, and How of Data Processing
    • Roadmap
    • Batch Foundations: What and Where
      • What: Transformations
      • Where: Windowing
    • Going Streaming: When and How
      • When: The Wonderful Thing About Triggers Is Triggers Are Wonderful Things!
      • When: Watermarks
      • When: Early/On-Time/Late Triggers FTW!
      • When: Allowed Lateness (i.e., Garbage Collection)
      • How: Accumulation
    • Summary
  • 3. Watermarks
    • Definition
    • Source Watermark Creation
      • Perfect Watermark Creation
      • Heuristic Watermark Creation
    • Watermark Propagation
      • Understanding Watermark Propagation
      • Watermark Propagation and Output Timestamps
      • The Tricky Case of Overlapping Windows
    • Percentile Watermarks
    • Processing-Time Watermarks
    • Case Studies
      • Case Study: Watermarks in Google Cloud Dataflow
      • Case Study: Watermarks in Apache Flink
      • Case Study: Source Watermarks for Google Cloud Pub/Sub
    • Summary
  • 4. Advanced Windowing
    • When/Where: Processing-Time Windows
      • Event-Time Windowing
      • Processing-Time Windowing via Triggers
      • Processing-Time Windowing via Ingress Time
    • Where: Session Windows
    • Where: Custom Windowing
      • Variations on Fixed Windows
        • Unaligned fixed windows
        • Per-element/key fixed windows
      • Variations on Session Windows
        • Bounded sessions
      • One Size Does Not Fit All
    • Summary
  • 5. Exactly-Once and Side Effects
    • Why Exactly Once Matters
    • Accuracy Versus Completeness
      • Side Effects
      • Problem Definition
    • Ensuring Exactly Once in Shuffle
    • Addressing Determinism
    • Performance
      • Graph Optimization
      • Bloom Filters
      • Garbage Collection
    • Exactly Once in Sources
    • Exactly Once in Sinks
    • Use Cases
      • Example Source: Cloud Pub/Sub
      • Example Sink: Files
      • Example Sink: Google BigQuery
    • Other Systems
      • Apache Spark Streaming
      • Apache Flink
    • Summary
  • II. Streams and Tables
  • 6. Streams and Tables
    • Stream-and-Table Basics Or: a Special Theory of Stream and Table Relativity
      • Toward a General Theory of Stream and Table Relativity
    • Batch Processing Versus Streams and Tables
      • A Streams and Tables Analysis of MapReduce
        • Map as streams/tables
        • Reduce as streams/tables
      • Reconciling with Batch Processing
    • What, Where, When, and How in a Streams and Tables World
      • What: Transformations
      • Where: Windowing
        • Window merging
      • When: Triggers
      • How: Accumulation
      • A Holistic View of Streams and Tables in the Beam Model
    • A General Theory of Stream and Table Relativity
    • Summary
  • 7. The Practicalities of Persistent State
    • Motivation
      • The Inevitability of Failure
      • Correctness and Efficiency
    • Implicit State
      • Raw Grouping
      • Incremental Combining
    • Generalized State
      • Case Study: Conversion Attribution
      • Conversion Attribution with Apache Beam
    • Summary
  • 8. Streaming SQL
    • What Is Streaming SQL?
      • Relational Algebra
      • Time-Varying Relations
      • Streams and Tables
    • Looking Backward: Stream and Table Biases
      • The Beam Model: A Stream-Biased Approach
      • The SQL Model: A Table-Biased Approach
        • Materialized views
    • Looking Forward: Toward Robust Streaming SQL
      • Stream and Table Selection
      • Temporal Operators
        • Where: windowing
        • When: triggers
          • A SQL-ish default: per-record triggers
          • Watermark triggers
          • Repeated delay triggers
          • Data-driven triggers
        • How: accumulation
          • Retractions in a SQL world
          • Discarding mode, or lack thereof
    • Summary
  • 9. Streaming Joins
    • All Your Joins Are Belong to Streaming
    • Unwindowed Joins
      • FULL OUTER
      • LEFT OUTER
      • RIGHT OUTER
      • INNER
      • ANTI
      • SEMI
    • Windowed Joins
      • Fixed Windows
      • Temporal Validity
        • Temporal validity windows
        • Temporal validity joins
          • Watermarks and temporal validity joins
    • Summary
  • 10. The Evolution of Large-Scale Data Processing
    • MapReduce
    • Hadoop
    • Flume
    • Storm
    • Spark
    • MillWheel
    • Kafka
    • Cloud Dataflow
    • Flink
    • Beam
    • Summary
  • Index
