
Data Science: The Hard Parts - Helion

Data Science: The Hard Parts
ebook
Author: Daniel Vaughan
ISBN: 9781098146436
Pages: 256, Format: ebook
Publication date: 2023-11-01
Bookstore: Helion

Price: 220.15 zł (previously 255.99 zł)
You save: 14% (-35.84 zł)


This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline—machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one.

Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries.

With this book, you will:

  • Understand how data science creates value
  • Deliver compelling narratives to sell your data science project
  • Build a business case using unit economics principles
  • Create new features for an ML model using storytelling
  • Learn how to decompose KPIs
  • Perform growth decompositions to find root causes for changes in a metric

Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).


 


Table of Contents

  • Preface
    • Conventions Used in This Book
    • Using Code Examples
    • O'Reilly Online Learning
    • How to Contact Us
    • Acknowledgments
  • I. Data Analytics Techniques
  • 1. So What? Creating Value with Data Science
    • What Is Value?
    • What: Understanding the Business
    • So What: The Gist of Value Creation in DS
    • Now What: Be a Go-Getter
    • Measuring Value
    • Key Takeaways
    • Further Reading
  • 2. Metrics Design
    • Desirable Properties That Metrics Should Have
      • Measurable
      • Actionable
      • Relevance
      • Timeliness
    • Metrics Decomposition
      • Funnel Analytics
      • Stock-Flow Decompositions
      • P×Q-Type Decompositions
    • Example: Another Revenue Decomposition
    • Example: Marketplaces
    • Key Takeaways
    • Further Reading
  • 3. Growth Decompositions: Understanding Tailwinds and Headwinds
    • Why Growth Decompositions?
    • Additive Decomposition
      • Example
      • Interpretation and Use Cases
    • Multiplicative Decomposition
      • Example
      • Interpretation
    • Mix-Rate Decompositions
      • Example
      • Interpretation
    • Mathematical Derivations
      • Additive Decomposition
      • Multiplicative Decomposition
      • Mix-Rate Decomposition
    • Key Takeaways
    • Further Reading
  • 4. 2×2 Designs
    • The Case for Simplification
    • What's a 2×2 Design?
    • Example: Test a Model and a New Feature
    • Example: Understanding User Behavior
    • Example: Credit Origination and Acceptance
    • Example: Prioritizing Your Workflow
    • Key Takeaways
    • Further Reading
  • 5. Building Business Cases
    • Some Principles to Construct Business Cases
    • Example: Proactive Retention Strategy
    • Fraud Prevention
    • Purchasing External Datasets
    • Working on a Data Science Project
    • Key Takeaways
    • Further Reading
  • 6. What's in a Lift?
    • Lifts Defined
    • Example: Classifier Model
    • Self-Selection and Survivorship Biases
    • Other Use Cases for Lifts
    • Key Takeaways
    • Further Reading
  • 7. Narratives
    • What's in a Narrative: Telling a Story with Your Data
      • Clear and to the Point
      • Credible
      • Memorable
      • Actionable
    • Building a Narrative
      • Science as Storytelling
      • What, So What, and Now What?
        • What?
        • So what?
        • Now what?
    • The Last Mile
      • Writing TL;DRs
      • Tips to Write Memorable TL;DRs
      • Example: Writing a TL;DR for This Chapter
      • Delivering Powerful Elevator Pitches
      • Presenting Your Narrative
    • Key Takeaways
    • Further Reading
  • 8. Datavis: Choosing the Right Plot to Deliver a Message
    • Some Useful and Not-So-Used Data Visualizations
      • Bar Versus Line Plots
      • Slopegraphs
      • Waterfall Charts
      • Scatterplot Smoothers
      • Plotting Distributions
    • General Recommendations
      • Find the Right Datavis for Your Message
      • Choose Your Colors Wisely
      • Different Dimensions in a Plot
      • Aim for a Large Enough Data-Ink Ratio
      • Customization Versus Semiautomation
      • Get the Font Size Right from the Beginning
      • Interactive or Not
      • Stay Simple
      • Start by Explaining the Plot
    • Key Takeaways
    • Further Reading
  • II. Machine Learning
  • 9. Simulation and Bootstrapping
    • Basics of Simulation
    • Simulating a Linear Model and Linear Regression
    • What Are Partial Dependence Plots?
    • Omitted Variable Bias
    • Simulating Classification Problems
      • Latent Variable Models
      • Comparing Different Algorithms
    • Bootstrapping
    • Key Takeaways
    • Further Reading
  • 10. Linear Regression: Going Back to Basics
    • What's in a Coefficient?
    • The Frisch-Waugh-Lovell Theorem
    • Why Should You Care About FWL?
    • Confounders
    • Additional Variables
    • The Central Role of Variance in ML
    • Key Takeaways
    • Further Reading
  • 11. Data Leakage
    • What Is Data Leakage?
      • Outcome Is Also a Feature
      • A Function of the Outcome Is Itself a Feature
      • Bad Controls
      • Mislabeling of a Timestamp
      • Multiple Datasets with Sloppy Time Aggregations
      • Leakage of Other Information
    • Detecting Data Leakage
    • Complete Separation
    • Windowing Methodology
      • Choosing the Length of the Windows
      • The Training Stage Mirrors the Scoring Stage
      • Implementing the Windowing Methodology
    • I Have Leakage: Now What?
    • Key Takeaways
    • Further Reading
  • 12. Productionizing Models
    • What Does "Production Ready" Mean?
      • Batch Scores (Offline)
      • Real-Time Model Objects
    • Data and Model Drift
    • Essential Steps in any Production Pipeline
      • Get and Transform Data
      • Validate Data
      • Training and Scoring Stages
      • Validate Model and Scores
      • Deploy Model and Scores
    • Key Takeaways
    • Further Reading
  • 13. Storytelling in Machine Learning
    • A Holistic View of Storytelling in ML
    • Ex Ante and Interim Storytelling
      • Creating Hypotheses
        • Predicting human behavior
        • Predicting system behavior
        • Predicting downstream metrics
      • Feature Engineering
    • Ex Post Storytelling: Opening the Black Box
      • Interpretability-Performance Trade-Off
      • Linear Regression: Setting a Benchmark
      • Feature Importance
      • Heatmaps
      • Partial Dependence Plots
      • Accumulated Local Effects
    • Key Takeaways
    • Further Reading
  • 14. From Prediction to Decisions
    • Dissecting Decision Making
    • Simple Decision Rules by Smart Thresholding
      • Precision and Recall
      • Example: Lead Generation
    • Confusion Matrix Optimization
    • Key Takeaways
    • Further Reading
  • 15. Incrementality: The Holy Grail of Data Science?
    • Defining Incrementality
      • Causal Reasoning to Improve Prediction
      • Causal Reasoning as a Differentiator
      • Improved Decision Making
    • Confounders and Colliders
    • Selection Bias
    • Unconfoundedness Assumption
    • Breaking Selection Bias: Randomization
    • Matching
    • Machine Learning and Causal Inference
      • Open Source Codebases
      • Double Machine Learning
    • Key Takeaways
    • Further Reading
  • 16. A/B Tests
    • What Is an A/B Test?
    • Decision Criterion
    • Minimum Detectable Effects
      • Choosing the Statistical Power, Level, and P
      • Estimating the Variance of the Outcome
      • Simulations
      • Example: Conversion Rates
      • Setting the MDE
    • Hypotheses Backlog
      • Metric
      • Hypothesis
      • Ranking
    • Governance of Experiments
    • Key Takeaways
    • Further Reading
  • 17. Large Language Models and the Practice of Data Science
    • The Current State of AI
    • What Do Data Scientists Do?
    • Evolving the Data Scientist's Job Description
      • Case Study: A/B Testing
      • Case Study: Data Cleansing
      • Case Study: Machine Learning
    • LLMs and This Book
    • Key Takeaways
    • Further Reading
  • Index
