Building Machine Learning Powered Applications. Going from Idea to Product
ISBN: 978-14-920-4506-9
Pages: 260, Format: ebook
Publication date: 2020-01-21
Bookstore: Helion
Price: 186.15 zł (previously: 216.45 zł)
You save: 14% (-30.30 zł)
Learn the skills necessary to design, build, and deploy applications powered by machine learning (ML). Through the course of this hands-on book, you’ll build an example ML-driven application from initial idea to deployed product. Data scientists, software engineers, and product managers—including experienced practitioners and novices alike—will learn the tools, best practices, and challenges involved in building a real-world ML application step by step.
Author Emmanuel Ameisen, an experienced data scientist who led an AI education program, demonstrates practical ML concepts using code snippets, illustrations, screenshots, and interviews with industry leaders. Part I teaches you how to plan an ML application and measure success. Part II explains how to build a working ML model. Part III demonstrates ways to improve the model until it fulfills your original vision. Part IV covers deployment and monitoring strategies.
This book will help you:
- Define your product goal and set up a machine learning problem
- Build your first end-to-end pipeline quickly and acquire an initial dataset
- Train and evaluate your ML models and address performance bottlenecks
- Deploy and monitor your models in a production environment
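The bullets above trace the loop the book builds chapter by chapter: frame the problem, get an initial dataset, train a simple model, evaluate on held-out data. As a minimal sketch of such an end-to-end pipeline (using scikit-learn; the toy texts, labels, and model choice are illustrative assumptions, not the book's case-study data):

```python
# Minimal end-to-end sketch: acquire a (toy) dataset, train a simple
# text classifier, and evaluate it on held-out data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Illustrative stand-in dataset: 1 = "clearly written question", 0 = "unclear".
texts = [
    "How do I split a dataset for validation?",
    "What metric should I use for an imbalanced problem?",
    "stuff broken pls help",
    "it does not work why",
    "How can I cache inference results for repeated inputs?",
    "Which regularization reduces overfitting here?",
    "code bad fix",
    "help asap thx",
]
labels = [1, 1, 0, 0, 1, 1, 0, 0]

# Hold out part of the data so evaluation is not done on training examples.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Start simple: bag-of-words features plus a linear model.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

In a real application the toy list would be a labeled dataset and the print a proper monitoring metric; the point is only that the full train/evaluate loop fits in a few lines before any complexity is added.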
Table of Contents
- Preface
- The Goal of Using Machine Learning Powered Applications
- Use ML to Build Practical Applications
- Additional Resources
- Practical ML
- What This Book Covers
- The entire process of ML
- A technical, practical case study
- Real business applications
- Prerequisites
- Our Case Study: ML-Assisted Writing
- The ML Process
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- I. Find the Correct ML Approach
- 1. From Product Goal to ML Framing
- Estimate What Is Possible
- Models
- Classification and regression
- Knowledge extraction from unstructured data
- Catalog organization
- Generative models
- Data
- Data types
- Data availability
- Datasets are iterative
- Framing the ML Editor
- Trying to Do It All with ML: An End-to-End Framework
- The Simplest Approach: Being the Algorithm
- Middle Ground: Learning from Our Experience
- Monica Rogati: How to Choose and Prioritize ML Projects
- Conclusion
- 2. Create a Plan
- Measuring Success
- Business Performance
- Model Performance
- Freshness and Distribution Shift
- Speed
- Estimate Scope and Challenges
- Leverage Domain Expertise
- Learning from experts
- Examining the data
- Stand on the Shoulders of Giants
- Open data
- Open source code
- Bring both together
- ML Editor Planning
- Initial Plan for an Editor
- Always Start with a Simple Model
- To Make Regular Progress: Start Simple
- Start with a Simple Pipeline
- Training
- Inference
- Pipeline for the ML Editor
- Conclusion
- II. Build a Working Pipeline
- 3. Build Your First End-to-End Pipeline
- The Simplest Scaffolding
- Prototype of an ML Editor
- Parse and Clean Data
- Tokenizing Text
- Generating Features
- Test Your Workflow
- User Experience
- Modeling Results
- Finding the impact bottleneck
- ML Editor Prototype Evaluation
- Model
- User Experience
- Conclusion
- 4. Acquire an Initial Dataset
- Iterate on Datasets
- Do Data Science
- Explore Your First Dataset
- Be Efficient, Start Small
- Insights Versus Products
- A Data Quality Rubric
- Data format
- Data quality
- Data quantity and distribution
- ML editor data inspection
- Label to Find Data Trends
- Summary Statistics
- Summary statistics for ML editor
- Explore and Label Efficiently
- Vectorizing
- Tabular data
- Text data
- Image data
- Dimensionality reduction
- Clustering
- Be the Algorithm
- Data Trends
- Let Data Inform Features and Models
- Build Features Out of Patterns
- Raw datetime
- Extracting day of week and day of month
- Feature crosses
- Giving your model the answer
- ML Editor Features
- Robert Munro: How Do You Find, Label, and Leverage Data?
- Conclusion
- III. Iterate on Models
- 5. Train and Evaluate Your Model
- The Simplest Appropriate Model
- Simple Models
- Quick to implement
- Understandable
- Deployable
- From Patterns to Models
- We want to ignore feature scale
- Our predicted variable is a linear combination of predictors
- Our data has a temporal aspect
- Each data point is a combination of patterns
- ML Editor model
- Split Your Dataset
- Validation set
- Test set
- Relative proportions
- Data leakage
- Temporal data leakage
- Sample contamination
- ML Editor Data Split
- Judge Performance
- Bias variance trade-off
- Going beyond aggregate metrics
- Evaluate Your Model: Look Beyond Accuracy
- Contrast Data and Predictions
- Confusion Matrix
- ROC Curve
- Calibration Curve
- Dimensionality Reduction for Errors
- The Top-k Method
- The k best performing examples
- The k worst performing examples
- The k most uncertain examples
- Top-k implementation tips
- Top-k method for the ML Editor
- Other Models
- Evaluate Feature Importance
- Directly from a Classifier
- Black-Box Explainers
- Conclusion
- 6. Debug Your ML Problems
- Software Best Practices
- ML-Specific Best Practices
- Debug Wiring: Visualizing and Testing
- Start with One Example
- Visualization steps
- Data loading
- Cleaning and feature selection
- Feature generation
- Data formatting
- Model output
- Systematizing our visual validation
- Separate your concerns
- Test Your ML Code
- Test data ingestion
- Test data processing
- Test model outputs
- Debug Training: Make Your Model Learn
- Task Difficulty
- Data quality, quantity, and diversity
- Data representation
- Model capacity
- Optimization Problems
- Debug Generalization: Make Your Model Useful
- Data Leakage
- Overfitting
- Regularization
- Data augmentation
- Dataset redesign
- Consider the Task at Hand
- Conclusion
- 7. Using Classifiers for Writing Recommendations
- Extracting Recommendations from Models
- What Can We Achieve Without a Model?
- Using feature statistics
- Extracting Global Feature Importance
- Using a Model's Score
- Extracting Local Feature Importance
- Comparing Models
- Version 1: The Report Card
- Version 2: More Powerful, More Unclear
- Version 3: Understandable Recommendations
- Generating Editing Recommendations
- Conclusion
- IV. Deploy and Monitor
- 8. Considerations When Deploying Models
- Data Concerns
- Data Ownership
- Data Bias
- Test sets
- Systemic Bias
- Modeling Concerns
- Feedback Loops
- Inclusive Model Performance
- Considering Context
- Adversaries
- Defeating a model
- Exploiting a model
- Abuse Concerns and Dual-Use
- Chris Harland: Shipping Experiments
- Conclusion
- 9. Choose Your Deployment Option
- Server-Side Deployment
- Streaming Application or API
- Batch Predictions
- Client-Side Deployment
- On Device
- Browser Side
- Federated Learning: A Hybrid Approach
- Conclusion
- 10. Build Safeguards for Models
- Engineer Around Failures
- Input and Output Checks
- Check inputs
- Model outputs
- Model Failure Fallbacks
- Filtering model
- Engineer for Performance
- Scale to Multiple Users
- Caching for ML
- Caching inference results
- Caching by indexing
- Model and Data Life Cycle Management
- Reproducibility
- Resilience
- Pipeline flexibility
- Data Processing and DAGs
- Ask for Feedback
- Chris Moody: Empowering Data Scientists to Deploy Models
- Conclusion
- 11. Monitor and Update Models
- Monitoring Saves Lives
- Monitoring to Inform Refresh Rate
- Monitor to Detect Abuse
- Choose What to Monitor
- Performance Metrics
- Business Metrics
- CI/CD for ML
- A/B Testing and Experimentation
- Choosing groups and duration
- Estimating the better variant
- Building the infrastructure
- Other Approaches
- Conclusion
- Index