Practical Machine Learning for Computer Vision
ebook
Authors: Valliappa Lakshmanan, Martin Görner, Ryan Gillard
ISBN: 9781098102326
Pages: 482, Format: ebook
Publication date: 2021-07-21
Bookstore: Helion

Book price: 245,65 zł (previously: 285,64 zł)
You save: 14% (-39,99 zł)

This practical book shows you how to employ machine learning models to extract information from images. ML engineers and data scientists will learn how to solve a variety of image problems including classification, object detection, autoencoders, image generation, counting, and captioning with proven ML techniques. This book provides a great introduction to end-to-end deep learning: dataset creation, data preprocessing, model design, model training, evaluation, deployment, and interpretability.

Google engineers Valliappa Lakshmanan, Martin Görner, and Ryan Gillard show you how to develop accurate and explainable computer vision ML models and put them into large-scale production using robust ML architecture in a flexible and maintainable way. You'll learn how to design, train, evaluate, and predict with models written in TensorFlow or Keras.
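
As a taste of that workflow, here is a minimal sketch of designing, training, evaluating, and predicting with a small Keras image classifier. It is not code from the book; the "flowers/" directory layout, the image size, and the five-class output head are illustrative assumptions.

    # Minimal sketch of the design/train/evaluate/predict workflow described above.
    # The "flowers/" directories, image size, and five-class head are hypothetical.
    import tensorflow as tf

    IMG_SIZE = (224, 224)

    train_ds = tf.keras.utils.image_dataset_from_directory(
        "flowers/train", image_size=IMG_SIZE, batch_size=32)
    val_ds = tf.keras.utils.image_dataset_from_directory(
        "flowers/val", image_size=IMG_SIZE, batch_size=32)

    # Design: a small convolutional classifier.
    model = tf.keras.Sequential([
        tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(5, activation="softmax"),  # e.g., 5 flower classes
    ])

    # Train, evaluate, and predict.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_ds, validation_data=val_ds, epochs=10)
    model.evaluate(val_ds)
    probabilities = model.predict(val_ds.take(1))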

You'll learn how to:

  • Design ML architecture for computer vision tasks
  • Select a model (such as ResNet, SqueezeNet, or EfficientNet) appropriate to your task (see the sketch after this list)
  • Create an end-to-end ML pipeline to train, evaluate, deploy, and explain your model
  • Preprocess images for data augmentation and to support learnability
  • Incorporate explainability and responsible AI best practices
  • Deploy image models as web services or on edge devices
  • Monitor and manage ML models
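
To make the model-selection and augmentation bullets above concrete, the following hedged sketch puts a pretrained EfficientNetB0 backbone from tf.keras.applications behind a couple of Keras augmentation layers; the input size, frozen backbone, and five-class head are illustrative assumptions rather than the book's own code.

    # Hedged sketch: transfer learning from a pretrained backbone with simple
    # data augmentation; the backbone choice and parameters are illustrative only.
    import tensorflow as tf

    augment = tf.keras.Sequential([
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(0.1),
    ])

    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights="imagenet", input_shape=(224, 224, 3))
    base.trainable = False  # freeze the pretrained ImageNet weights

    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = augment(inputs)          # augmentation layers are active only during training
    x = base(x, training=False)  # Keras EfficientNet expects raw 0-255 pixel values
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(5, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

Swapping the backbone for ResNet50 or another tf.keras.applications model is a one-line change, which is what makes this pattern convenient for the kind of model comparison the book discusses.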

Customers who bought "Practical Machine Learning for Computer Vision" also chose:

  • Windows Media Center. Domowe centrum rozrywki
  • Ruby on Rails. Ćwiczenia
  • DevOps w praktyce. Kurs video. Jenkins, Ansible, Terraform i Docker
  • Przywództwo w świecie VUCA. Jak być skutecznym liderem w niepewnym środowisku
  • Scrum. O zwinnym zarządzaniu projektami. Wydanie II rozszerzone

Table of contents

Practical Machine Learning for Computer Vision eBook -- table of contents

  • Preface
    • Who Is This Book For?
    • How to Use This Book
    • Organization of the Book
    • Conventions Used in This Book
    • Using Code Examples
    • O'Reilly Online Learning
    • How to Contact Us
    • Acknowledgments
  • 1. Machine Learning for Computer Vision
    • Machine Learning
    • Deep Learning Use Cases
    • Summary
  • 2. ML Models for Vision
    • A Dataset for Machine Perception
      • 5-Flowers Dataset
      • Reading Image Data
      • Visualizing Image Data
      • Reading the Dataset File
    • A Linear Model Using Keras
      • Keras Model
        • Prediction function
        • Activation function
        • Optimizer
        • Training loss
        • Error metrics
      • Training the Model
        • Creating the datasets
        • Creating and viewing the model
        • Fitting the model
        • Plotting predictions
    • A Neural Network Using Keras
      • Neural Networks
        • Hidden layers
        • Training the neural network
        • Learning rate
        • Regularization
        • Early stopping
        • Hyperparameter tuning
      • Deep Neural Networks
        • Building a DNN
        • Dropout
        • Batch normalization
    • Summary
    • Glossary
  • 3. Image Vision
    • Pretrained Embeddings
      • Pretrained Model
      • Transfer Learning
      • Fine-Tuning
        • Learning rate schedule
        • Differential learning rate
    • Convolutional Networks
      • Convolutional Filters
      • Stacking Convolutional Layers
      • Pooling Layers
      • AlexNet
    • The Quest for Depth
      • Filter Factorization
      • 1x1 Convolutions
      • VGG19
      • Global Average Pooling
    • Modular Architectures
      • Inception
      • SqueezeNet
      • ResNet and Skip Connections
      • DenseNet
      • Depth-Separable Convolutions
      • Xception
    • Neural Architecture Search Designs
      • NASNet
      • The MobileNet Family
        • Depthwise convolutions
        • Inverted residual bottlenecks
        • MobileNetV2
        • EfficientNet: Putting it all together
    • Beyond Convolution: The Transformer Architecture
    • Choosing a Model
      • Performance Comparison
      • Ensembling
      • Recommended Strategy
    • Summary
  • 4. Object Detection and Image Segmentation
    • Object Detection
      • YOLO
        • YOLO grid
        • Object detection head
        • Loss function
        • YOLO limitations
      • RetinaNet
        • Feature pyramid networks
        • Anchor boxes
        • Architecture
        • Focal loss (for classification)
        • Smooth L1 loss (for box regression)
        • Non-maximum suppression
        • Other considerations
    • Segmentation
      • Mask R-CNN and Instance Segmentation
        • Region proposal networks
        • R-CNN
        • ROI resampling (ROI alignment)
        • Class and bounding box predictions
        • Transposed convolutions
        • Instance segmentation
      • U-Net and Semantic Segmentation
        • Images and labels
        • Architecture
        • Training
    • Summary
  • 5. Creating Vision Datasets
    • Collecting Images
      • Photographs
      • Imaging
        • Polar grids
        • Satellite channels
        • Geospatial layers
      • Proof of Concept
    • Data Types
      • Channels
        • Scaling
        • Channel order
        • Grayscale
      • Geospatial Data
        • Raster data
        • Remote sensing
      • Audio and Video
        • Spectrogram
        • Frame by frame
        • Conv3D
    • Manual Labeling
      • Multilabel
      • Object Detection
    • Labeling at Scale
      • Labeling User Interface
      • Multiple Tasks
      • Voting and Crowdsourcing
      • Labeling Services
    • Automated Labeling
      • Labels from Related Data
      • Noisy Student
      • Self-Supervised Learning
    • Bias
      • Sources of Bias
      • Selection Bias
      • Measurement Bias
      • Confirmation Bias
      • Detecting Bias
    • Creating a Dataset
      • Splitting Data
      • TensorFlow Records
        • Running at scale
        • TensorFlow Recorder
      • Reading TensorFlow Records
    • Summary
  • 6. Preprocessing
    • Reasons for Preprocessing
      • Shape Transformation
      • Data Quality Transformation
      • Improving Model Quality
    • Size and Resolution
      • Using Keras Preprocessing Layers
      • Using the TensorFlow Image Module
      • Mixing Keras and TensorFlow
      • Model Training
    • Training-Serving Skew
      • Reusing Functions
      • Preprocessing Within the Model
      • Using tf.transform
        • Writing the Beam pipeline
        • Transforming the data
        • Saving the transform
        • Reading the preprocessed data
        • Transformation during serving
        • Benefits of tf.transform
    • Data Augmentation
      • Spatial Transformations
      • Color Distortion
      • Information Dropping
    • Forming Input Images
    • Summary
  • 7. Training Pipeline
    • Efficient Ingestion
      • Storing Data Efficiently
        • TensorFlow Records
        • Storing preprocessed data
      • Reading Data in Parallel
        • Parallelizing
        • Measuring performance
      • Maximizing GPU Utilization
        • Efficient data handling
        • Vectorization
        • Staying in the graph
          • Iteration
          • Slicing and conditionals
          • Matrix math
          • Batching
    • Saving Model State
      • Exporting the Model
        • Invoking the model
        • Usable signature
        • Using the signature
      • Checkpointing
    • Distribution Strategy
      • Choosing a Strategy
      • Creating the Strategy
        • MirroredStrategy
        • MultiWorkerMirroredStrategy
          • Shuffling
          • Virtual epochs
        • TPUStrategy
    • Serverless ML
      • Creating a Python Package
        • Reusable modules
        • Invoking Python modules
        • Installing dependencies
      • Submitting a Training Job
        • Running on multiple GPUs
        • Distribution to multiple GPUs
        • Distribution to TPU
      • Hyperparameter Tuning
        • Specifying the search space
        • Using parameter values
        • Reporting accuracy
        • Result
        • Continuing tuning
      • Deploying the Model
    • Summary
  • 8. Model Quality and Continuous Evaluation
    • Monitoring
      • TensorBoard
      • Weight Histograms
      • Device Placement
      • Data Visualization
      • Training Events
    • Model Quality Metrics
      • Metrics for Classification
        • Binary classification
        • Multiclass, single-label classification
        • Multiclass, multilabel classification
      • Metrics for Regression
      • Metrics for Object Detection
    • Quality Evaluation
      • Sliced Evaluations
      • Fairness Monitoring
      • Continuous Evaluation
    • Summary
  • 9. Model Predictions
    • Making Predictions
      • Exporting the Model
      • Using In-Memory Models
      • Improving Abstraction
      • Improving Efficiency
    • Online Prediction
      • TensorFlow Serving
        • Deploying the model
        • Making predictions
      • Modifying the Serving Function
        • Changing the default signature
        • Multiple signatures
      • Handling Image Bytes
        • Loading the model
        • Adding a prediction signature
        • Exporting signatures
        • Base64 encoding
    • Batch and Stream Prediction
      • The Apache Beam Pipeline
      • Managed Service for Batch Prediction
      • Invoking Online Prediction
    • Edge ML
      • Constraints and Optimizations
      • TensorFlow Lite
      • Running TensorFlow Lite
      • Processing the Image Buffer
      • Federated Learning
    • Summary
  • 10. Trends in Production ML
    • Machine Learning Pipelines
      • The Need for Pipelines
      • Kubeflow Pipelines Cluster
      • Containerizing the Codebase
      • Writing a Component
      • Connecting Components
      • Automating a Run
    • Explainability
      • Techniques
        • LIME
        • KernelSHAP
        • Integrated Gradients
        • xRAI
      • Adding Explainability
        • Explainability signatures
        • Explanation metadata
        • Deploying the model
        • Obtaining explanations
    • No-Code Computer Vision
      • Why Use No-Code?
      • Loading Data
      • Training
      • Evaluation
    • Summary
  • 11. Advanced Vision Problems
    • Object Measurement
      • Reference Object
      • Segmentation
      • Rotation Correction
      • Ratio and Measurements
    • Counting
      • Density Estimation
      • Extracting Patches
      • Simulating Input Images
      • Regression
      • Prediction
    • Pose Estimation
      • PersonLab
      • The PoseNet Model
      • Identifying Multiple Poses
    • Image Search
      • Distributed Search
      • Fast Search
      • Better Embeddings
    • Summary
  • 12. Image and Text Generation
    • Image Understanding
      • Embeddings
      • Auxiliary Learning Tasks
      • Autoencoders
        • Architecture
        • Training
        • Latent vectors
      • Variational Autoencoders
        • Architecture
        • Loss
    • Image Generation
      • Generative Adversarial Networks
        • Creating the networks
        • Discriminator training
        • Generator training
        • Distribution changes
      • GAN Improvements
        • Conditional GANs
          • The cGAN generator
          • The cGAN discriminator
      • Image-to-Image Translation
      • Super-Resolution
      • Modifying Pictures (Inpainting)
      • Anomaly Detection
      • Deepfakes
    • Image Captioning
      • Dataset
      • Tokenizing the Captions
      • Batching
      • Captioning Model
        • Image encoder
        • Attention mechanism
        • Caption decoder
      • Training Loop
      • Prediction
    • Summary
  • Afterword
  • Index
