Building Machine Learning Powered Applications. Going from Idea to Product
ISBN: 978-14-920-4506-9
Pages: 260, Format: ebook
Publication date: 2020-01-21
Bookstore: Helion
Price: 186.15 zł (previously: 216.45 zł)
You save: 14% (-30.30 zł)
Learn the skills necessary to design, build, and deploy applications powered by machine learning (ML). Through the course of this hands-on book, you’ll build an example ML-driven application from initial idea to deployed product. Data scientists, software engineers, and product managers—including experienced practitioners and novices alike—will learn the tools, best practices, and challenges involved in building a real-world ML application step by step.
Author Emmanuel Ameisen, an experienced data scientist who led an AI education program, demonstrates practical ML concepts using code snippets, illustrations, screenshots, and interviews with industry leaders. Part I teaches you how to plan an ML application and measure success. Part II explains how to build a working ML model. Part III demonstrates ways to improve the model until it fulfills your original vision. Part IV covers deployment and monitoring strategies.
This book will help you:
- Define your product goal and set up a machine learning problem
- Build your first end-to-end pipeline quickly and acquire an initial dataset
- Train and evaluate your ML models and address performance bottlenecks
- Deploy and monitor your models in a production environment
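The bullets above trace the loop the book builds chapter by chapter: frame the problem, get an initial dataset, train a simple model, evaluate on held-out data. As a minimal sketch of such an end-to-end pipeline (using scikit-learn; the toy texts, labels, and model choice are illustrative assumptions, not the book's case-study data):

```python
# Minimal end-to-end sketch: acquire a (toy) dataset, train a simple
# text classifier, and evaluate it on held-out data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Illustrative stand-in dataset: 1 = "clearly written question", 0 = "unclear".
texts = [
    "How do I split a dataset for validation?",
    "What metric should I use for an imbalanced problem?",
    "stuff broken pls help",
    "it does not work why",
    "How can I cache inference results for repeated inputs?",
    "Which regularization reduces overfitting here?",
    "code bad fix",
    "help asap thx",
]
labels = [1, 1, 0, 0, 1, 1, 0, 0]

# Hold out part of the data so evaluation is not done on training examples.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Start simple: bag-of-words features plus a linear model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

In a real application the toy list would be a labeled dataset and the print a proper monitoring metric; the point is only that the full train/evaluate loop fits in a few lines before any complexity is added.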
Table of Contents
- Preface
- The Goal of Using Machine Learning Powered Applications
- Use ML to Build Practical Applications
- Additional Resources
- Practical ML
- What This Book Covers
- The entire process of ML
- A technical, practical case study
- Real business applications
- Prerequisites
- Our Case Study: ML-Assisted Writing
- The ML Process
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- I. Find the Correct ML Approach
- 1. From Product Goal to ML Framing
- Estimate What Is Possible
- Models
- Classification and regression
- Knowledge extraction from unstructured data
- Catalog organization
- Generative models
- Data
- Data types
- Data availability
- Datasets are iterative
- Framing the ML Editor
- Trying to Do It All with ML: An End-to-End Framework
- The Simplest Approach: Being the Algorithm
- Middle Ground: Learning from Our Experience
- Monica Rogati: How to Choose and Prioritize ML Projects
- Conclusion
- 2. Create a Plan
- Measuring Success
- Business Performance
- Model Performance
- Freshness and Distribution Shift
- Speed
- Estimate Scope and Challenges
- Leverage Domain Expertise
- Learning from experts
- Examining the data
- Stand on the Shoulders of Giants
- Open data
- Open source code
- Bring both together
- ML Editor Planning
- Initial Plan for an Editor
- Always Start with a Simple Model
- To Make Regular Progress: Start Simple
- Start with a Simple Pipeline
- Training
- Inference
- Pipeline for the ML Editor
- Conclusion
- II. Build a Working Pipeline
- 3. Build Your First End-to-End Pipeline
- The Simplest Scaffolding
- Prototype of an ML Editor
- Parse and Clean Data
- Tokenizing Text
- Generating Features
- Test Your Workflow
- User Experience
- Modeling Results
- Finding the impact bottleneck
- ML Editor Prototype Evaluation
- Model
- User Experience
- Conclusion
- 4. Acquire an Initial Dataset
- Iterate on Datasets
- Do Data Science
- Explore Your First Dataset
- Be Efficient, Start Small
- Insights Versus Products
- A Data Quality Rubric
- Data format
- Data quality
- Data quantity and distribution
- ML editor data inspection
- Label to Find Data Trends
- Summary Statistics
- Summary statistics for ML editor
- Explore and Label Efficiently
- Vectorizing
- Tabular data
- Text data
- Image data
- Dimensionality reduction
- Clustering
- Be the Algorithm
- Data Trends
- Let Data Inform Features and Models
- Build Features Out of Patterns
- Raw datetime
- Extracting day of week and day of month
- Feature crosses
- Giving your model the answer
- ML Editor Features
- Robert Munro: How Do You Find, Label, and Leverage Data?
- Conclusion
- III. Iterate on Models
- 5. Train and Evaluate Your Model
- The Simplest Appropriate Model
- Simple Models
- Quick to implement
- Understandable
- Deployable
- From Patterns to Models
- We want to ignore feature scale
- Our predicted variable is a linear combination of predictors
- Our data has a temporal aspect
- Each data point is a combination of patterns
- ML Editor model
- Split Your Dataset
- Validation set
- Test set
- Relative proportions
- Data leakage
- Temporal data leakage
- Sample contamination
- ML Editor Data Split
- Judge Performance
- Bias variance trade-off
- Going beyond aggregate metrics
- Evaluate Your Model: Look Beyond Accuracy
- Contrast Data and Predictions
- Confusion Matrix
- ROC Curve
- Calibration Curve
- Dimensionality Reduction for Errors
- The Top-k Method
- The k best performing examples
- The k worst performing examples
- The k most uncertain examples
- Top-k implementation tips
- Top-k method for the ML Editor
- Other Models
- Evaluate Feature Importance
- Directly from a Classifier
- Black-Box Explainers
- Conclusion
- 6. Debug Your ML Problems
- Software Best Practices
- ML-Specific Best Practices
- Debug Wiring: Visualizing and Testing
- Start with One Example
- Visualization steps
- Data loading
- Cleaning and feature selection
- Feature generation
- Data formatting
- Model output
- Systematizing our visual validation
- Separate your concerns
- Test Your ML Code
- Test data ingestion
- Test data processing
- Test model outputs
- Debug Training: Make Your Model Learn
- Task Difficulty
- Data quality, quantity, and diversity
- Data representation
- Model capacity
- Optimization Problems
- Debug Generalization: Make Your Model Useful
- Data Leakage
- Overfitting
- Regularization
- Data augmentation
- Dataset redesign
- Consider the Task at Hand
- Conclusion
- 7. Using Classifiers for Writing Recommendations
- Extracting Recommendations from Models
- What Can We Achieve Without a Model?
- Using feature statistics
- Extracting Global Feature Importance
- Using a Model's Score
- Extracting Local Feature Importance
- Comparing Models
- Version 1: The Report Card
- Version 2: More Powerful, More Unclear
- Version 3: Understandable Recommendations
- Generating Editing Recommendations
- Conclusion
- IV. Deploy and Monitor
- 8. Considerations When Deploying Models
- Data Concerns
- Data Ownership
- Data Bias
- Test sets
- Systemic Bias
- Modeling Concerns
- Feedback Loops
- Inclusive Model Performance
- Considering Context
- Adversaries
- Defeating a model
- Exploiting a model
- Abuse Concerns and Dual-Use
- Chris Harland: Shipping Experiments
- Conclusion
- 9. Choose Your Deployment Option
- Server-Side Deployment
- Streaming Application or API
- Batch Predictions
- Client-Side Deployment
- On Device
- Browser Side
- Federated Learning: A Hybrid Approach
- Conclusion
- 10. Build Safeguards for Models
- Engineer Around Failures
- Input and Output Checks
- Check inputs
- Model outputs
- Model Failure Fallbacks
- Filtering model
- Engineer for Performance
- Scale to Multiple Users
- Caching for ML
- Caching inference results
- Caching by indexing
- Model and Data Life Cycle Management
- Reproducibility
- Resilience
- Pipeline flexibility
- Data Processing and DAGs
- Ask for Feedback
- Chris Moody: Empowering Data Scientists to Deploy Models
- Conclusion
- 11. Monitor and Update Models
- Monitoring Saves Lives
- Monitoring to Inform Refresh Rate
- Monitor to Detect Abuse
- Choose What to Monitor
- Performance Metrics
- Business Metrics
- CI/CD for ML
- A/B Testing and Experimentation
- Choosing groups and duration
- Estimating the better variant
- Building the infrastructure
- Other Approaches
- Conclusion
- Index