Thoughtful Machine Learning with Python. A Test-Driven Approach - Helion

ebook

Author: Matthew Kirk
ISBN: 978-14-919-2408-2
Pages: 220, Format: ebook
Publication date: 2017-01-16
Bookstore: Helion

Price: 126,65 zł (previously: 147,27 zł)
You save: 14% (-20,62 zł)

Tags: Python - Programming | Machine learning

Gain the confidence you need to apply machine learning in your daily work. With this practical guide, author Matthew Kirk shows you how to integrate and test machine learning algorithms in your code, without the academic subtext.

Featuring graphs and highlighted code examples throughout, the book includes tests written with Python’s NumPy, Pandas, Scikit-Learn, and SciPy data science libraries. If you’re a software engineer or business analyst interested in data science, this book will help you:

- Reference real-world examples to test each algorithm through engaging, hands-on exercises
- Apply test-driven development (TDD) to write and run tests before you start coding
- Explore techniques for improving your machine-learning models with data extraction and feature development
- Watch out for the risks of machine learning, such as underfitting or overfitting data
- Work with K-Nearest Neighbors, neural networks, clustering, and other algorithms

Table of Contents

Preface
- Conventions Used in This Book
- Using Code Examples
- O’Reilly Safari
- How to Contact Us
- Acknowledgments
1. Probably Approximately Correct Software
- Writing Software Right
  - SOLID
    - Single Responsibility Principle
    - Open/Closed Principle
    - Liskov Substitution Principle
    - Interface Segregation Principle
    - Dependency Inversion Principle
  - Testing or TDD
  - Refactoring
- Writing the Right Software
  - Writing the Right Software with Machine Learning
  - What Exactly Is Machine Learning?
  - The High Interest Credit Card Debt of Machine Learning
  - SOLID Applied to Machine Learning
    - SRP
    - OCP
    - LSP
    - ISP
    - DIP
  - Machine Learning Code Is Complex but Not Impossible
  - TDD: Scientific Method 2.0
  - Refactoring Our Way to Knowledge
- The Plan for the Book
2. A Quick Introduction to Machine Learning
- What Is Machine Learning?
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
- What Can Machine Learning Accomplish?
- Mathematical Notation Used Throughout the Book
- Conclusion
3. K-Nearest Neighbors
- How Do You Determine Whether You Want to Buy a House?
- How Valuable Is That House?
- Hedonic Regression
- What Is a Neighborhood?
- K-Nearest Neighbors
- Mr. K’s Nearest Neighborhood
- Distances
  - Triangle Inequality
  - Geometrical Distance
    - Cosine similarity
  - Computational Distances
    - Manhattan distance
    - Levenshtein distance
  - Statistical Distances
    - Mahalanobis distance
    - Jaccard distance
- Curse of Dimensionality
- How Do We Pick K?
  - Guessing K
  - Heuristics for Picking K
    - Use coprime class and K combinations
    - Choose a K that is greater or equal to the number of classes plus one
    - Choose a K that is low enough to avoid noise
    - Algorithms for picking K
- Valuing Houses in Seattle
  - About the Data
  - General Strategy
  - Coding and Testing Design
  - KNN Regressor Construction
  - KNN Testing
- Conclusion
4. Naive Bayesian Classification
- Using Bayes’ Theorem to Find Fraudulent Orders
- Conditional Probabilities
- Probability Symbols
- Inverse Conditional Probability (aka Bayes’ Theorem)
- Naive Bayesian Classifier
  - The Chain Rule
- Naiveté in Bayesian Reasoning
- Pseudocount
- Spam Filter
  - Setup Notes
  - Coding and Testing Design
  - Data Source
  - Email Class
  - Tokenization and Context
  - SpamTrainer
    - Storing training data
    - Building the Bayesian classifier
    - Calculating a classification
  - Error Minimization Through Cross-Validation
    - Minimizing false positives
    - Building the two folds
    - Cross-validation and error measuring
- Conclusion
5. Decision Trees and Random Forests
- The Nuances of Mushrooms
- Classifying Mushrooms Using a Folk Theorem
- Finding an Optimal Switch Point
  - Information Gain
  - GINI Impurity
  - Variance Reduction
- Pruning Trees
  - Ensemble Learning
    - Bagging
    - Random forests
  - Writing a Mushroom Classifier
    - Coding and testing design
    - MushroomProblem
    - Testing
- Conclusion
6. Hidden Markov Models
- Tracking User Behavior Using State Machines
- Emissions/Observations of Underlying States
- Simplification Through the Markov Assumption
  - Using Markov Chains Instead of a Finite State Machine
- Hidden Markov Model
- Evaluation: Forward-Backward Algorithm
  - Mathematical Representation of the Forward-Backward Algorithm
  - Using User Behavior
- The Decoding Problem Through the Viterbi Algorithm
- The Learning Problem
- Part-of-Speech Tagging with the Brown Corpus
  - Setup Notes
  - Coding and Testing Design
  - The Seam of Our Part-of-Speech Tagger: CorpusParser
  - Writing the Part-of-Speech Tagger
  - Cross-Validating to Get Confidence in the Model
  - How to Make This Model Better
- Conclusion
7. Support Vector Machines
- Customer Happiness as a Function of What They Say
  - Sentiment Classification Using SVMs
- The Theory Behind SVMs
  - Decision Boundary
  - Maximizing Boundaries
  - Kernel Trick: Feature Transformation
  - Optimizing with Slack
- Sentiment Analyzer
  - Setup Notes
  - Coding and Testing Design
  - SVM Testing Strategies
  - Corpus Class
  - CorpusSet Class
  - Model Validation and the Sentiment Classifier
- Aggregating Sentiment
  - Exponentially Weighted Moving Average
- Mapping Sentiment to Bottom Line
- Conclusion
8. Neural Networks
- What Is a Neural Network?
- History of Neural Nets
- Boolean Logic
- Perceptrons
- How to Construct Feed-Forward Neural Nets
  - Input Layer
    - Standard inputs
    - Symmetric inputs
  - Hidden Layers
  - Neurons
  - Activation Functions
  - Output Layer
  - Training Algorithms
  - The Delta Rule
  - Back Propagation
  - QuickProp
  - RProp
- Building Neural Networks
  - How Many Hidden Layers?
  - How Many Neurons for Each Layer?
  - Tolerance for Error and Max Epochs
- Using a Neural Network to Classify a Language
  - Setup Notes
  - Coding and Testing Design
  - The Data
  - Writing the Seam Test for Language
  - Cross-Validating Our Way to a Network Class
  - Tuning the Neural Network
  - Precision and Recall for Neural Networks
  - Wrap-Up of Example
- Conclusion
9. Clustering
- Studying Data Without Any Bias
- User Cohorts
- Testing Cluster Mappings
  - Fitness of a Cluster
  - Silhouette Coefficient
  - Comparing Results to Ground Truth
- K-Means Clustering
  - The K-Means Algorithm
  - Downside of K-Means Clustering
- EM Clustering
  - Algorithm
    - Expectation
    - Maximization
- The Impossibility Theorem
- Example: Categorizing Music
  - Setup Notes
  - Gathering the Data
  - Coding Design
  - Analyzing the Data with K-Means
  - EM Clustering Our Data
  - The Results from the EM Jazz Clustering
- Conclusion
10. Improving Models and Data Extraction
- Debate Club
- Picking Better Data
  - Feature Selection
  - Exhaustive Search
  - Random Feature Selection
  - A Better Feature Selection Algorithm
  - Minimum Redundancy Maximum Relevance Feature Selection
- Feature Transformation and Matrix Factorization
  - Principal Component Analysis
  - Independent Component Analysis
- Ensemble Learning
  - Bagging
  - Boosting
- Conclusion
11. Putting It Together: Conclusion
- Machine Learning Algorithms Revisited
- How to Use This Information to Solve Problems
- What’s Next for You?
Index