Machine Learning Production Systems - Helion
ISBN: 9781098155971
Pages: 474, Format: ebook
Publication date: 2024-10-02
Bookstore: Helion
Book price: 228,65 zł (previously: 278,84 zł)
You save: 18% (-50,19 zł)
Using machine learning for products, services, and critical business processes is quite different from using ML in an academic or research setting—especially for recent ML graduates and those moving from research to a commercial environment. Whether you currently work to create products and services that use ML, or would like to in the future, this practical book gives you a broad view of the entire field.
Authors Robert Crowe, Hannes Hapke, Emily Caveness, and Di Zhu help you identify topics that you can dive into deeper, along with reference materials and tutorials that teach you the details. You'll learn the state of the art of machine learning engineering, including a wide range of topics such as modeling, deployment, and MLOps. You'll learn the basics and advanced aspects to understand the production ML lifecycle.
This book provides four in-depth sections that cover all aspects of machine learning engineering:
- Data: collecting, labeling, validating, automation, and data preprocessing; data feature engineering and selection; data journey and storage
- Modeling: high-performance modeling; model resource management techniques; model analysis and interpretability; neural architecture search
- Deployment: model serving patterns and infrastructure for ML models and LLMs; management and delivery; monitoring and logging
- Productionalizing: ML pipelines; classifying unstructured texts and images; genAI model pipelines
Table of Contents
- Foreword
- Preface
- Who Should Read This Book
- Why We Wrote This Book
- Navigating This Book
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- Robert
- Hannes
- Emily
- Di
- 1. Introduction to Machine Learning Production Systems
- What Is Production Machine Learning?
- Benefits of Machine Learning Pipelines
- Focus on Developing New Models, Not on Maintaining Existing Models
- Prevention of Bugs
- Creation of Records for Debugging and Reproducing Results
- Standardization
- The Business Case for ML Pipelines
- When to Use Machine Learning Pipelines
- Steps in a Machine Learning Pipeline
- Data Ingestion and Data Versioning
- Data Validation
- Feature Engineering
- Model Training and Model Tuning
- Model Analysis
- Model Deployment
- Looking Ahead
- 2. Collecting, Labeling, and Validating Data
- Important Considerations in Data Collection
- Responsible Data Collection
- Labeling Data: Data Changes and Drift in Production ML
- Labeling Data: Direct Labeling and Human Labeling
- Validating Data: Detecting Data Issues
- Validating Data: TensorFlow Data Validation
- Skew Detection with TFDV
- Types of Skew
- Example: Spotting Imbalanced Datasets with TensorFlow Data Validation
- Conclusion
- 3. Feature Engineering and Feature Selection
- Introduction to Feature Engineering
- Preprocessing Operations
- Feature Engineering Techniques
- Normalizing and Standardizing
- Bucketizing
- Feature Crosses
- Dimensionality and Embeddings
- Visualization
- Feature Transformation at Scale
- Choose a Framework That Scales Well
- Avoid Training-Serving Skew
- Consider Instance-Level Versus Full-Pass Transformations
- Using TensorFlow Transform
- Analyzers
- Code Example
- Feature Selection
- Feature Spaces
- Feature Selection Overview
- Filter Methods
- Wrapper Methods
- Forward selection
- Backward elimination
- Recursive feature elimination
- Code example
- Embedded Methods
- Feature and Example Selection for LLMs and GenAI
- Example: Using TF Transform to Tokenize Text
- Benefits of Using TF Transform
- Alternatives to TF Transform
- Conclusion
- 4. Data Journey and Data Storage
- Data Journey
- ML Metadata
- Using a Schema
- Schema Development
- Schema Environments
- Changes Across Datasets
- Enterprise Data Storage
- Feature Stores
- Metadata
- Precomputed features
- Time travel
- Data Warehouses
- Data Lakes
- Conclusion
- 5. Advanced Labeling, Augmentation, and Data Preprocessing
- Advanced Labeling
- Semi-Supervised Labeling
- Label propagation
- Sampling techniques
- Active Learning
- Margin sampling
- Other sampling techniques
- Weak Supervision
- Advanced Labeling Review
- Data Augmentation
- Example: CIFAR-10
- Other Augmentation Techniques
- Data Augmentation Review
- Preprocessing Time Series Data: An Example
- Windowing
- Sampling
- Conclusion
- 6. Model Resource Management Techniques
- Dimensionality Reduction: Dimensionality Effect on Performance
- Example: Word Embedding Using Keras
- Curse of Dimensionality
- Adding Dimensions Increases Feature Space Volume
- Dimensionality Reduction
- Three approaches
- Algorithmic dimensionality reduction
- Principal component analysis
- Quantization and Pruning
- Mobile, IoT, Edge, and Similar Use Cases
- Quantization
- Benefits and process of quantization
- MobileNets
- Post-training quantization
- Quantization-aware training
- Comparing results
- Example: Quantizing models with TF Lite
- Optimizing Your TensorFlow Model with TF Lite
- Optimization Options
- Pruning
- The Lottery Ticket Hypothesis
- Pruning in TensorFlow
- Knowledge Distillation
- Teacher and Student Networks
- Knowledge Distillation Techniques
- TMKD: Distilling Knowledge for a Q&A Task
- Increasing Robustness by Distilling EfficientNets
- Conclusion
- 7. High-Performance Modeling
- Distributed Training
- Data Parallelism
- Synchronous versus asynchronous training
- Distribution awareness
- tf.distribute: Distributed training in TensorFlow
- OneDeviceStrategy
- MirroredStrategy
- ParameterServerStrategy
- Fault tolerance
- Efficient Input Pipelines
- Input Pipeline Basics
- Input Pipeline Patterns: Improving Efficiency
- Optimizing Your Input Pipeline with TensorFlow Data
- Prefetching
- Parallelizing data transformation
- Caching
- Training Large Models: The Rise of Giant Neural Nets and Parallelism
- Potential Solutions and Their Shortcomings
- Gradient accumulation
- Swapping
- Parallelism, revisited in the context of giant neural nets
- Pipeline Parallelism to the Rescue?
- Conclusion
- 8. Model Analysis
- Analyzing Model Performance
- Black-Box Evaluation
- Performance Metrics and Optimization Objectives
- Advanced Model Analysis
- TensorFlow Model Analysis
- The Learning Interpretability Tool
- Advanced Model Debugging
- Benchmark Models
- Sensitivity Analysis
- Random attacks
- Partial dependence plots
- Vulnerability to attacks
- Measuring model vulnerability
- Hardening your models
- Residual Analysis
- Model Remediation
- Discrimination Remediation
- Fairness
- Fairness Evaluation
- True/false positive/negative rates
- Accuracy and AUC
- Fairness Considerations
- Continuous Evaluation and Monitoring
- Conclusion
- 9. Interpretability
- Explainable AI
- Model Interpretation Methods
- Method Categories
- Intrinsic or post hoc?
- Model specific or model agnostic?
- Local or global?
- Intrinsically Interpretable Models
- Feature importance
- Lattice models
- Model-Agnostic Methods
- Partial dependence plots
- Permutation feature importance
- Local Interpretable Model-Agnostic Explanations
- Shapley Values
- The SHAP Library
- Testing Concept Activation Vectors
- AI Explanations
- Integrated gradients
- XRAI
- Example: Exploring Model Sensitivity with SHAP
- Regression Models
- Natural Language Processing Models
- Conclusion
- 10. Neural Architecture Search
- Hyperparameter Tuning
- Introduction to AutoML
- Key Components of NAS
- Search Spaces
- Macro search space
- Micro search space
- Search Strategies
- Performance Estimation Strategies
- Simple approach to performance estimation
- More efficient performance estimation
- AutoML in the Cloud
- Amazon SageMaker Autopilot
- Microsoft Azure Automated Machine Learning
- Google Cloud AutoML
- Using AutoML
- Generative AI and AutoML
- Conclusion
- 11. Introduction to Model Serving
- Model Training
- Model Prediction
- Latency
- Throughput
- Cost
- Resources and Requirements for Serving Models
- Cost and Complexity
- Accelerators
- Feeding the Beast
- Model Deployments
- Data Center Deployments
- Mobile and Distributed Deployments
- Model Servers
- Managed Services
- Conclusion
- 12. Model Serving Patterns
- Batch Inference
- Batch Throughput
- Batch Inference Use Cases
- Product recommendations
- Sentiment analysis
- Demand forecasting
- ETL for Distributed Batch and Stream Processing Systems
- Introduction to Real-Time Inference
- Synchronous Delivery of Real-Time Predictions
- Asynchronous Delivery of Real-Time Predictions
- Optimizing Real-Time Inference
- Real-Time Inference Use Cases
- Serving Model Ensembles
- Ensemble Topologies
- Example Ensemble
- Ensemble Serving Considerations
- Model Routers: Ensembles in GenAI
- Data Preprocessing and Postprocessing in Real Time
- Training Transformations Versus Serving Transformations
- Windowing
- Options for Preprocessing
- Enter TensorFlow Transform
- Postprocessing
- Inference at the Edge and at the Browser
- Challenges
- Balancing energy consumption with processing power
- Performing model retraining and updates
- Securing the user data
- Model Deployments via Containers
- Training on the Device
- Federated Learning
- Runtime Interoperability
- Inference in Web Browsers
- Conclusion
- 13. Model Serving Infrastructure
- Model Servers
- TensorFlow Serving
- Servables
- Servable versions
- Models
- Loaders
- Sources
- Aspired versions
- Managers
- Core
- NVIDIA Triton Inference Server
- TorchServe
- Building Scalable Infrastructure
- Containerization
- Traditional Deployment Era
- Virtualized Deployment Era
- Container Deployment Era
- The Docker Containerization Framework
- Docker daemon
- Docker client
- Docker registry
- Docker objects
- Docker image
- Docker container
- Container Orchestration
- Kubernetes
- Kubernetes components
- Containers on clouds
- Kubeflow
- Reliability and Availability Through Redundancy
- Observability
- High Availability
- Automated Deployments
- Hardware Accelerators
- GPUs
- TPUs
- Conclusion
- 14. Model Serving Examples
- Example: Deploying TensorFlow Models with TensorFlow Serving
- Exporting Keras Models for TF Serving
- Setting Up TF Serving with Docker
- Basic Configuration of TF Serving
- Making Model Prediction Requests with REST
- Making Model Prediction Requests with gRPC
- Getting Predictions from Classification and Regression Models
- Using Payloads
- Getting Model Metadata from TF Serving
- Making Batch Inference Requests
- Example: Profiling TF Serving Inferences with TF Profiler
- Prerequisites
- TensorBoard Setup
- Model Profile
- Example: Basic TorchServe Setup
- Installing the TorchServe Dependencies
- Exporting Your Model for TorchServe
- Setting Up TorchServe
- Request handlers
- TorchServe configuration
- Making Model Prediction Requests
- Making Batch Inference Requests
- Setting batch configuration via config.properties
- Setting batch configuration via REST request
- Conclusion
- 15. Model Management and Delivery
- Experiment Tracking
- Experimenting in Notebooks
- Experimenting Overall
- Not just one big file
- Tracking runtime parameters
- Tools for Experiment Tracking and Versioning
- TensorBoard
- Tools for organizing experiment results
- Introduction to MLOps
- Data Scientists Versus Software Engineers
- ML Engineers
- ML in Products and Services
- MLOps
- MLOps Methodology
- MLOps Level 0
- MLOps Level 1
- MLOps Level 2
- Components of an Orchestrated Workflow
- Three Types of Custom Components
- Python Function-Based Components
- Container-Based Components
- Fully Custom Components
- TFX Deep Dive
- TFX SDK
- Intermediate Representation
- Runtime
- Implementing an ML Pipeline Using TFX Components
- Advanced Features of TFX
- Component dependency
- Data dependency
- Task dependency
- Importer
- Conditional execution
- Managing Model Versions
- Approaches to Versioning Models
- Versioning proposal
- Arbitrary grouping
- Black-box functional model
- Pipeline execution versioning
- Model Lineage
- Model Registries
- Continuous Integration and Continuous Deployment
- Continuous Integration
- Continuous Delivery
- Progressive Delivery
- Blue/Green Deployment
- Canary Deployment
- Live Experimentation
- A/B testing
- Multi-armed bandits
- Contextual bandits
- Conclusion
- 16. Model Monitoring and Logging
- The Importance of Monitoring
- Observability in Machine Learning
- What Should You Monitor?
- Custom Alerting in TFX
- Logging
- Distributed Tracing
- Monitoring for Model Decay
- Data Drift and Concept Drift
- Model Decay Detection
- Supervised Monitoring Techniques
- Statistical process control
- Sequential analysis
- Error distribution monitoring
- Unsupervised Monitoring Techniques
- Clustering
- Feature distribution monitoring
- Model-dependent monitoring
- Mitigating Model Decay
- Retraining Your Model
- When to Retrain
- Automated Retraining
- Conclusion
- 17. Privacy and Legal Requirements
- Why Is Data Privacy Important?
- What Data Needs to Be Kept Private?
- Harms
- Only Collect What You Need
- GenAI Data Scraped from the Web and Other Sources
- Legal Requirements
- The GDPR and the CCPA
- The GDPR's Right to Be Forgotten
- Pseudonymization and Anonymization
- Differential Privacy
- Local and Global DP
- Epsilon-Delta DP
- Applying Differential Privacy to ML
- Differentially Private Stochastic Gradient Descent
- Private Aggregation of Teacher Ensembles
- Confidential and Private Collaborative Learning
- TensorFlow Privacy Example
- Federated Learning
- Encrypted ML
- Conclusion
- 18. Orchestrating Machine Learning Pipelines
- An Introduction to Pipeline Orchestration
- Why Pipeline Orchestration?
- Directed Acyclic Graphs
- Pipeline Orchestration with TFX
- Interactive TFX Pipelines
- Converting Your Interactive Pipeline for Production
- Orchestrating TFX Pipelines with Apache Beam
- Orchestrating TFX Pipelines with Kubeflow Pipelines
- Introduction to Kubeflow Pipelines
- Installation and Initial Setup
- Accessing Kubeflow Pipelines
- The Workflow from TFX to Kubeflow
- OpFunc Functions
- Orchestrating Kubeflow Pipelines
- Google Cloud Vertex Pipelines
- Setting Up Google Cloud and Vertex Pipelines
- Setting Up a Google Cloud Service Account
- Orchestrating Pipelines with Vertex Pipelines
- Executing Vertex Pipelines
- Choosing Your Orchestrator
- Interactive TFX
- Apache Beam
- Kubeflow Pipelines
- Google Cloud Vertex Pipelines
- Alternatives to TFX
- Conclusion
- 19. Advanced TFX
- Advanced Pipeline Practices
- Configure Your Components
- Import Artifacts
- Use Resolver Node
- Execute a Conditional Pipeline
- Export TF Lite Models
- Warm-Starting Model Training
- Use Exit Handlers
- Trigger Messages from TFX
- Custom TFX Components: Architecture and Use Cases
- Architecture of TFX Components
- Use Cases of Custom Components
- Using Function-Based Custom Components
- Writing a Custom Component from Scratch
- Defining Component Specifications
- Defining Component Channels
- Writing the Custom Executor
- Writing the Custom Driver
- Assembling the Custom Component
- Using Our Basic Custom Component
- Implementation Review
- Reusing Existing Components
- Creating Container-Based Custom Components
- Which Custom Component Is Right for You?
- TFX-Addons
- Conclusion
- 20. ML Pipelines for Computer Vision Problems
- Our Data
- Our Model
- Custom Ingestion Component
- Data Preprocessing
- Exporting the Model
- Our Pipeline
- Data Ingestion
- Data Preprocessing
- Model Training
- Model Evaluation
- Model Export
- Putting It All Together
- Executing on Apache Beam
- Executing on Vertex Pipelines
- Model Deployment with TensorFlow Serving
- Conclusion
- 21. ML Pipelines for Natural Language Processing
- Our Data
- Our Model
- Ingestion Component
- Data Preprocessing
- Putting the Pipeline Together
- Executing the Pipeline
- Model Deployment with Google Cloud Vertex
- Registering Your ML Model
- Creating a New Model Endpoint
- Deploying Your ML Model
- Requesting Predictions from the Deployed Model
- Cleaning Up Your Deployed Model
- Conclusion
- 22. Generative AI
- Generative Models
- GenAI Model Types
- Agents and Copilots
- Pretraining
- Pretraining Datasets
- Embeddings
- Self-Supervised Training with Masks
- Fine-Tuning
- Fine-Tuning Versus Transfer Learning
- Fine-Tuning Datasets
- Fine-Tuning Considerations for Production
- Fine-Tuning Versus Model APIs
- Parameter-Efficient Fine-Tuning
- LoRA
- S-LoRA
- Human Alignment
- Reinforcement Learning from Human Feedback
- Reinforcement Learning from AI Feedback
- Direct Preference Optimization
- Prompting
- Chaining
- Retrieval Augmented Generation
- ReAct
- Evaluation
- Evaluation Techniques
- Benchmarking Across Models
- LMOps
- GenAI Attacks
- Jailbreaks
- Prompt Injection
- Responsible GenAI
- Design for Responsibility
- Conduct Adversarial Testing
- Constitutional AI
- Conclusion
- 23. The Future of Machine Learning Production Systems and Next Steps
- Let's Think in Terms of ML Systems, Not ML Models
- Bringing ML Systems Closer to Domain Experts
- Privacy Has Never Been More Important
- Conclusion
- Index