Practical Machine Learning with H2O. Powerful, Scalable Techniques for Deep Learning and AI
ISBN: 978-14-919-6455-2
Pages: 300, Format: ebook
Publication date: 2016-12-05
Bookstore: Helion
Book price: 126,65 zł (previously: 147,27 zł)
You save: 14% (-20,62 zł)
Machine learning has finally come of age. With H2O software, you can perform machine learning and data analysis using a simple open source framework that’s easy to use, has a wide range of OS and language support, and scales for big data. This hands-on guide teaches you how to use H2O with only minimal math and theory behind the learning algorithms.
If you’re familiar with R or Python, know a bit of statistics, and have some experience manipulating data, author Darren Cook will take you through H2O basics and help you conduct machine-learning experiments on different sample data sets. You’ll explore several modern machine-learning techniques such as deep learning, random forests, unsupervised learning, and ensemble learning.
- Learn how to import, manipulate, and export data with H2O (see the sketch after this list)
- Explore key machine-learning concepts, such as cross-validation and validation data sets
- Work with three diverse data sets, including a regression, a multinomial classification, and a binomial classification
- Use H2O to analyze each sample data set with four supervised machine-learning algorithms
- Understand how cluster analysis and other unsupervised machine-learning algorithms work
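To give a feel for the workflow these bullets describe, here is a minimal Python sketch of an end-to-end H2O session: import a CSV, split it into training and test frames, train a random forest, and export the predictions. The file path and column names (including the target "Y1") are placeholders, not taken from the book, so treat this as an illustration of the API shape rather than code from the text.

```python
import h2o
from h2o.estimators.random_forest import H2ORandomForestEstimator

# Start (or connect to) a local H2O cluster.
h2o.init()

# Import a CSV into the cluster as an H2OFrame.
# "energy.csv" and the column names below are placeholders.
data = h2o.import_file("energy.csv")

# Hold out 20% of the rows as a test set.
train, test = data.split_frame(ratios=[0.8], seed=123)

y = "Y1"                                   # assumed target column
x = [c for c in data.columns if c != y]    # remaining columns as predictors

# Train a random forest with 5-fold cross-validation.
model = H2ORandomForestEstimator(ntrees=50, nfolds=5, seed=123)
model.train(x=x, y=y, training_frame=train)

# Evaluate on the held-out rows, then export predictions back out of H2O.
print(model.model_performance(test_data=test))
preds = model.predict(test)
h2o.export_file(preds, "predictions.csv")
```

The same steps can be run from R or from the Flow web UI; the book walks through each of them on its three sample data sets.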
Customers who bought "Practical Machine Learning with H2O. Powerful, Scalable Techniques for Deep Learning and AI" also chose:
- Scala for Machine Learning - Second Edition: 186,88 zł (29,90 zł, -84%)
- Data Analysis with IBM SPSS Statistics: 186,88 zł (29,90 zł, -84%)
- QlikView for Developers: 186,88 zł (29,90 zł, -84%)
- Practical Machine Learning Cookbook: 186,88 zł (29,90 zł, -84%)
- Mastering Hadoop 3: 175,88 zł (29,90 zł, -83%)
Table of Contents
- Preface
- Who Uses It and Why?
- About You
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Safari
- How to Contact Us
- Acknowledgments
- 1. Installation and Quick-Start
- Preparing to Install
- Installing R
- Installing Python
- Privacy
- Installing Java
- Install H2O with R (CRAN)
- Install H2O with Python (pip)
- Our First Learning
- Training and Predictions, with Python
- Training and Predictions, with R
- Performance Versus Predictions
- On Being Unlucky
- Flow
- Data
- Models
- Predictions
- Other Things in Flow
- Summary
- 2. Data Import, Data Export
- Memory Requirements
- Preparing the Data
- Getting Data into H2O
- Load CSV Files
- Load Other File Formats
- Load Directly from R
- Load Directly from Python
- Data Manipulation
- Laziness, Naming, Deleting
- Data Summaries
- Operations on Columns
- Aggregating Rows
- Indexing
- Split Data Already in H2O
- Rows and Columns
- Getting Data Out of H2O
- Exporting Data Frames
- POJOs
- Model Files
- Save All Models
- Summary
- 3. The Data Sets
- Data Set: Building Energy Efficiency
- Setup and Load
- The Data Columns
- Splitting the Data
- Let's Take a Look!
- About the Data Set
- Data Set: Handwritten Digits
- Setup and Load
- Taking a Look
- Helping the Models
- About the Data Set
- Data Set: Football Scores
- Correlations
- Missing Data And Yet More Columns
- How to Train and Test?
- Setup and Load
- The Other Third
- Missing Data (Again)
- Setup and Load (Again)
- About the Data Set
- Summary
- 4. Common Model Parameters
- Supported Metrics
- Regression Metrics
- Classification Metrics
- Binomial Classification
- The Essentials
- Effort
- Scoring and Validation
- Early Stopping
- Checkpoints
- Cross-Validation (aka k-folds)
- Data Weighting
- Sampling, Generalizing
- Regression
- Output Control
- Summary
- 5. Random Forest
- Decision Trees
- Random Forest
- Parameters
- Building Energy Efficiency: Default Random Forest
- Grid Search
- Cartesian
- RandomDiscrete
- High-Level Strategy
- Building Energy Efficiency: Tuned Random Forest
- MNIST: Default Random Forest
- MNIST: Tuned Random Forest
- Enhanced Data
- Football: Default Random Forest
- Football: Tuned Random Forest
- Summary
- 6. Gradient Boosting Machines
- Boosting
- The Good, the Bad, and the Mysterious
- Parameters
- Building Energy Efficiency: Default GBM
- Building Energy Efficiency: Tuned GBM
- MNIST: Default GBM
- MNIST: Tuned GBM
- Football: Default GBM
- Football: Tuned GBM
- Summary
- 7. Linear Models
- GLM Parameters
- Building Energy Efficiency: Default GLM
- Building Energy Efficiency: Tuned GLM
- MNIST: Default GLM
- MNIST: Tuned GLM
- Football: Default GLM
- Football: Tuned GLM
- Summary
- 8. Deep Learning (Neural Nets)
- What Are Neural Nets?
- Numbers Versus Categories
- Network Layers
- Activation Functions
- Parameters
- Deep Learning Regularization
- Deep Learning Scoring
- Building Energy Efficiency: Default Deep Learning
- Building Energy Efficiency: Tuned Deep Learning
- MNIST: Default Deep Learning
- MNIST: Tuned Deep Learning
- Football: Default Deep Learning
- Football: Tuned Deep Learning
- Summary
- Appendix: More Deep Learning Parameters
- 9. Unsupervised Learning
- K-Means Clustering
- Deep Learning Auto-Encoder
- Stacked Auto-Encoder
- Principal Component Analysis
- GLRM
- Missing Data
- GLRM
- Lose the R!
- Summary
- 10. Everything Else
- Staying on Top of and Poking into Things
- Installing the Latest Version
- Building from Source
- Running from the Command Line
- Clusters
- EC2
- Other Cloud Providers
- Hadoop
- Spark / Sparkling Water
- Naive Bayes
- Ensembles
- Stacking: h2o.ensemble
- Categorical Ensembles
- Summary
- 11. Epilogue: Didn't They All Do Well!
- Building Energy Results
- MNIST Results
- Football Data
- How Low Can You Go?
- The More the Merrier
- Still Desperate for More
- Filtering for Hardness
- Auto-Encoder
- Convolute and Shrink
- Ensembles
- That Was as Low as I Go
- Summary
- Index