Machine Learning with Python Cookbook. 2nd Edition - Helion
ISBN: 9781098135683
Pages: 416, Format: ebook
Publication date: 2023-07-27
Bookstore: Helion
This practical guide provides more than 200 self-contained recipes to help you solve machine learning challenges you may encounter in your work. If you're comfortable with Python and its libraries, including pandas and scikit-learn, you'll be able to address specific problems, from loading data to training models and leveraging neural networks.
Each recipe in this updated edition includes code that you can copy, paste, and run with a toy dataset to ensure that it works. From there, you can adapt these recipes according to your use case or application. Recipes include a discussion that explains the solution and provides meaningful context.
Go beyond theory and concepts by learning the nuts and bolts you need to construct working machine learning applications. You'll find recipes for:
- Vectors, matrices, and arrays
- Working with data from CSV, JSON, SQL, databases, cloud storage, and other sources
- Handling numerical and categorical data, text, images, and dates and times
- Dimensionality reduction using feature extraction or feature selection
- Model evaluation and selection
- Linear and logistic regression, trees and forests, and k-nearest neighbors
- Support vector machines (SVM), naive Bayes, clustering, and tree-based models
- Saving, loading, and serving trained models from multiple frameworks
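The recipe format described above (a short, self-contained snippet run on a toy dataset) might look like the following sketch. It is not taken from the book: the dataset, the choice of model, and the variable names are assumptions made for illustration, and it assumes scikit-learn is installed.

```python
# Illustrative sketch only; not a recipe from the book.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a small toy dataset bundled with scikit-learn
features, target = load_iris(return_X_y=True)

# Standardize the features, then fit a logistic regression classifier
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Estimate accuracy with 5-fold cross-validation
scores = cross_val_score(pipeline, features, target, cv=5, scoring="accuracy")

print(f"Mean accuracy: {scores.mean():.3f}")
```

To adapt a snippet like this to a real use case, one would typically swap the toy dataset for one's own feature matrix and target vector and keep the rest of the pipeline unchanged.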
Table of Contents
- Preface
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- 1. Working with Vectors, Matrices, and Arrays in NumPy
- 1.0. Introduction
- 1.1. Creating a Vector
- 1.2. Creating a Matrix
- 1.3. Creating a Sparse Matrix
- 1.4. Preallocating NumPy Arrays
- 1.5. Selecting Elements
- 1.6. Describing a Matrix
- 1.7. Applying Functions over Each Element
- 1.8. Finding the Maximum and Minimum Values
- 1.9. Calculating the Average, Variance, and Standard Deviation
- 1.10. Reshaping Arrays
- 1.11. Transposing a Vector or Matrix
- 1.12. Flattening a Matrix
- 1.13. Finding the Rank of a Matrix
- 1.14. Getting the Diagonal of a Matrix
- 1.15. Calculating the Trace of a Matrix
- 1.16. Calculating Dot Products
- 1.17. Adding and Subtracting Matrices
- 1.18. Multiplying Matrices
- 1.19. Inverting a Matrix
- 1.20. Generating Random Values
- 2. Loading Data
- 2.0. Introduction
- 2.1. Loading a Sample Dataset
- 2.2. Creating a Simulated Dataset
- 2.3. Loading a CSV File
- 2.4. Loading an Excel File
- 2.5. Loading a JSON File
- 2.6. Loading a Parquet File
- 2.7. Loading an Avro File
- 2.8. Querying a SQLite Database
- 2.9. Querying a Remote SQL Database
- 2.10. Loading Data from a Google Sheet
- 2.11. Loading Data from an S3 Bucket
- 2.12. Loading Unstructured Data
- 3. Data Wrangling
- 3.0. Introduction
- 3.1. Creating a DataFrame
- 3.2. Getting Information about the Data
- 3.3. Slicing DataFrames
- 3.4. Selecting Rows Based on Conditionals
- 3.5. Sorting Values
- 3.6. Replacing Values
- 3.7. Renaming Columns
- 3.8. Finding the Minimum, Maximum, Sum, Average, and Count
- 3.9. Finding Unique Values
- 3.10. Handling Missing Values
- 3.11. Deleting a Column
- 3.12. Deleting a Row
- 3.13. Dropping Duplicate Rows
- 3.14. Grouping Rows by Values
- 3.15. Grouping Rows by Time
- 3.16. Aggregating Operations and Statistics
- 3.17. Looping over a Column
- 3.18. Applying a Function over All Elements in a Column
- 3.19. Applying a Function to Groups
- 3.20. Concatenating DataFrames
- 3.21. Merging DataFrames
- 4. Handling Numerical Data
- 4.0. Introduction
- 4.1. Rescaling a Feature
- 4.2. Standardizing a Feature
- 4.3. Normalizing Observations
- 4.4. Generating Polynomial and Interaction Features
- 4.5. Transforming Features
- 4.6. Detecting Outliers
- 4.7. Handling Outliers
- 4.8. Discretizing Features
- 4.9. Grouping Observations Using Clustering
- 4.10. Deleting Observations with Missing Values
- 4.11. Imputing Missing Values
- 5. Handling Categorical Data
- 5.0. Introduction
- 5.1. Encoding Nominal Categorical Features
- 5.2. Encoding Ordinal Categorical Features
- 5.3. Encoding Dictionaries of Features
- 5.4. Imputing Missing Class Values
- 5.5. Handling Imbalanced Classes
- 6. Handling Text
- 6.0. Introduction
- 6.1. Cleaning Text
- 6.2. Parsing and Cleaning HTML
- 6.3. Removing Punctuation
- 6.4. Tokenizing Text
- 6.5. Removing Stop Words
- 6.6. Stemming Words
- 6.7. Tagging Parts of Speech
- 6.8. Performing Named-Entity Recognition
- 6.9. Encoding Text as a Bag of Words
- 6.10. Weighting Word Importance
- 6.11. Using Text Vectors to Calculate Text Similarity in a Search Query
- 6.12. Using a Sentiment Analysis Classifier
- 7. Handling Dates and Times
- 7.0. Introduction
- 7.1. Converting Strings to Dates
- 7.2. Handling Time Zones
- 7.3. Selecting Dates and Times
- 7.4. Breaking Up Date Data into Multiple Features
- 7.5. Calculating the Difference Between Dates
- 7.6. Encoding Days of the Week
- 7.7. Creating a Lagged Feature
- 7.8. Using Rolling Time Windows
- 7.9. Handling Missing Data in Time Series
- 8. Handling Images
- 8.0. Introduction
- 8.1. Loading Images
- 8.2. Saving Images
- 8.3. Resizing Images
- 8.4. Cropping Images
- 8.5. Blurring Images
- 8.6. Sharpening Images
- 8.7. Enhancing Contrast
- 8.8. Isolating Colors
- 8.9. Binarizing Images
- 8.10. Removing Backgrounds
- 8.11. Detecting Edges
- 8.12. Detecting Corners
- 8.13. Creating Features for Machine Learning
- 8.14. Encoding Color Histograms as Features
- 8.15. Using Pretrained Embeddings as Features
- 8.16. Detecting Objects with OpenCV
- 8.17. Classifying Images with PyTorch
- 9. Dimensionality Reduction Using Feature Extraction
- 9.0. Introduction
- 9.1. Reducing Features Using Principal Components
- 9.2. Reducing Features When Data Is Linearly Inseparable
- 9.3. Reducing Features by Maximizing Class Separability
- 9.4. Reducing Features Using Matrix Factorization
- 9.5. Reducing Features on Sparse Data
- 10. Dimensionality Reduction Using Feature Selection
- 10.0. Introduction
- 10.1. Thresholding Numerical Feature Variance
- 10.2. Thresholding Binary Feature Variance
- 10.3. Handling Highly Correlated Features
- 10.4. Removing Irrelevant Features for Classification
- 10.5. Recursively Eliminating Features
- 11. Model Evaluation
- 11.0. Introduction
- 11.1. Cross-Validating Models
- 11.2. Creating a Baseline Regression Model
- 11.3. Creating a Baseline Classification Model
- 11.4. Evaluating Binary Classifier Predictions
- 11.5. Evaluating Binary Classifier Thresholds
- 11.6. Evaluating Multiclass Classifier Predictions
- 11.7. Visualizing a Classifier's Performance
- 11.8. Evaluating Regression Models
- 11.9. Evaluating Clustering Models
- 11.10. Creating a Custom Evaluation Metric
- 11.11. Visualizing the Effect of Training Set Size
- 11.12. Creating a Text Report of Evaluation Metrics
- 11.13. Visualizing the Effect of Hyperparameter Values
- 12. Model Selection
- 12.0. Introduction
- 12.1. Selecting the Best Models Using Exhaustive Search
- 12.2. Selecting the Best Models Using Randomized Search
- 12.3. Selecting the Best Models from Multiple Learning Algorithms
- 12.4. Selecting the Best Models When Preprocessing
- 12.5. Speeding Up Model Selection with Parallelization
- 12.6. Speeding Up Model Selection Using Algorithm-Specific Methods
- 12.7. Evaluating Performance After Model Selection
- 13. Linear Regression
- 13.0. Introduction
- 13.1. Fitting a Line
- 13.2. Handling Interactive Effects
- 13.3. Fitting a Nonlinear Relationship
- 13.4. Reducing Variance with Regularization
- 13.5. Reducing Features with Lasso Regression
- 14. Trees and Forests
- 14.0. Introduction
- 14.1. Training a Decision Tree Classifier
- 14.2. Training a Decision Tree Regressor
- 14.3. Visualizing a Decision Tree Model
- 14.4. Training a Random Forest Classifier
- 14.5. Training a Random Forest Regressor
- 14.6. Evaluating Random Forests with Out-of-Bag Errors
- 14.7. Identifying Important Features in Random Forests
- 14.8. Selecting Important Features in Random Forests
- 14.9. Handling Imbalanced Classes
- 14.10. Controlling Tree Size
- 14.11. Improving Performance Through Boosting
- 14.12. Training an XGBoost Model
- 14.13. Improving Real-Time Performance with LightGBM
- 15. K-Nearest Neighbors
- 15.0. Introduction
- 15.1. Finding an Observation's Nearest Neighbors
- 15.2. Creating a K-Nearest Neighbors Classifier
- 15.3. Identifying the Best Neighborhood Size
- 15.4. Creating a Radius-Based Nearest Neighbors Classifier
- 15.5. Finding Approximate Nearest Neighbors
- 15.6. Evaluating Approximate Nearest Neighbors
- 16. Logistic Regression
- 16.0. Introduction
- 16.1. Training a Binary Classifier
- 16.2. Training a Multiclass Classifier
- 16.3. Reducing Variance Through Regularization
- 16.4. Training a Classifier on Very Large Data
- 16.5. Handling Imbalanced Classes
- 17. Support Vector Machines
- 17.0. Introduction
- 17.1. Training a Linear Classifier
- 17.2. Handling Linearly Inseparable Classes Using Kernels
- 17.3. Creating Predicted Probabilities
- 17.4. Identifying Support Vectors
- 17.5. Handling Imbalanced Classes
- 18. Naive Bayes
- 18.0. Introduction
- 18.1. Training a Classifier for Continuous Features
- 18.2. Training a Classifier for Discrete and Count Features
- 18.3. Training a Naive Bayes Classifier for Binary Features
- 18.4. Calibrating Predicted Probabilities
- 19. Clustering
- 19.0. Introduction
- 19.1. Clustering Using K-Means
- 19.2. Speeding Up K-Means Clustering
- 19.3. Clustering Using Mean Shift
- 19.4. Clustering Using DBSCAN
- 19.5. Clustering Using Hierarchical Merging
- 20. Tensors with PyTorch
- 20.0. Introduction
- 20.1. Creating a Tensor
- 20.2. Creating a Tensor from NumPy
- 20.3. Creating a Sparse Tensor
- 20.4. Selecting Elements in a Tensor
- 20.5. Describing a Tensor
- 20.6. Applying Operations to Elements
- 20.7. Finding the Maximum and Minimum Values
- 20.8. Reshaping Tensors
- 20.9. Transposing a Tensor
- 20.10. Flattening a Tensor
- 20.11. Calculating Dot Products
- 20.12. Multiplying Tensors
- 21. Neural Networks
- 21.0. Introduction
- 21.1. Using Autograd with PyTorch
- 21.2. Preprocessing Data for Neural Networks
- 21.3. Designing a Neural Network
- 21.4. Training a Binary Classifier
- 21.5. Training a Multiclass Classifier
- 21.6. Training a Regressor
- 21.7. Making Predictions
- 21.8. Visualizing Training History
- 21.9. Reducing Overfitting with Weight Regularization
- 21.10. Reducing Overfitting with Early Stopping
- 21.11. Reducing Overfitting with Dropout
- 21.12. Saving Model Training Progress
- 21.13. Tuning Neural Networks
- 21.14. Visualizing Neural Networks
- 22. Neural Networks for Unstructured Data
- 22.0. Introduction
- 22.1. Training a Neural Network for Image Classification
- 22.2. Training a Neural Network for Text Classification
- 22.3. Fine-Tuning a Pretrained Model for Image Classification
- 22.4. Fine-Tuning a Pretrained Model for Text Classification
- 23. Saving, Loading, and Serving Trained Models
- 23.0. Introduction
- 23.1. Saving and Loading a scikit-learn Model
- 23.2. Saving and Loading a TensorFlow Model
- 23.3. Saving and Loading a PyTorch Model
- 23.4. Serving scikit-learn Models
- 23.5. Serving TensorFlow Models
- 23.6. Serving PyTorch Models in Seldon
- Index