Data Science for Business. What you need to know about data mining and data-analytic thinking - Helion

Authors: Foster Provost, Tom Fawcett
ISBN: 978-14-493-7428-0
Pages: 414, Format: ebook
Publication date: 2013-07-27
Bookstore: Helion

Tags: Business Intelligence | E-business

Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.

Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll learn not only how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.

Understand how data science fits in your organization—and how you can use it for competitive advantage
Treat data as a business asset that requires careful investment if you’re to gain real value
Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
Learn general concepts for actually extracting knowledge from data
Apply data science principles when interviewing data science job candidates

Table of contents

Data Science for Business. What You Need to Know about Data Mining and Data-Analytic Thinking eBook -- table of contents

Data Science for Business
Dedication
Preface
- Our Conceptual Approach to Data Science
- To the Instructor
- Other Skills and Concepts
- Sections and Notation
- Using Examples
- Safari Books Online
- How to Contact Us
- Acknowledgments
1. Introduction: Data-Analytic Thinking
- The Ubiquity of Data Opportunities
- Example: Hurricane Frances
- Example: Predicting Customer Churn
- Data Science, Engineering, and Data-Driven Decision Making
- Data Processing and Big Data
- From Big Data 1.0 to Big Data 2.0
- Data and Data Science Capability as a Strategic Asset
- Data-Analytic Thinking
- This Book
- Data Mining and Data Science, Revisited
- Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist
- Summary
2. Business Problems and Data Science Solutions
- From Business Problems to Data Mining Tasks
- Supervised Versus Unsupervised Methods
- Data Mining and Its Results
- The Data Mining Process
  - Business Understanding
  - Data Understanding
  - Data Preparation
  - Modeling
  - Evaluation
  - Deployment
- Implications for Managing the Data Science Team
- Other Analytics Techniques and Technologies
  - Statistics
  - Database Querying
  - Data Warehousing
  - Regression Analysis
  - Machine Learning and Data Mining
  - Answering Business Questions with These Techniques
- Summary
3. Introduction to Predictive Modeling: From Correlation to Supervised Segmentation
- Models, Induction, and Prediction
- Supervised Segmentation
  - Selecting Informative Attributes
  - Example: Attribute Selection with Information Gain
  - Supervised Segmentation with Tree-Structured Models
- Visualizing Segmentations
- Trees as Sets of Rules
- Probability Estimation
- Example: Addressing the Churn Problem with Tree Induction
- Summary
4. Fitting a Model to Data
- Classification via Mathematical Functions
  - Linear Discriminant Functions
  - Optimizing an Objective Function
  - An Example of Mining a Linear Discriminant from Data
  - Linear Discriminant Functions for Scoring and Ranking Instances
  - Support Vector Machines, Briefly
- Regression via Mathematical Functions
- Class Probability Estimation and Logistic Regression
  - * Logistic Regression: Some Technical Details
- Example: Logistic Regression versus Tree Induction
- Nonlinear Functions, Support Vector Machines, and Neural Networks
- Summary
5. Overfitting and Its Avoidance
- Generalization
- Overfitting
- Overfitting Examined
  - Holdout Data and Fitting Graphs
  - Overfitting in Tree Induction
  - Overfitting in Mathematical Functions
- Example: Overfitting Linear Functions
- * Example: Why Is Overfitting Bad?
- From Holdout Evaluation to Cross-Validation
- The Churn Dataset Revisited
- Learning Curves
- Overfitting Avoidance and Complexity Control
  - Avoiding Overfitting with Tree Induction
  - A General Method for Avoiding Overfitting
  - * Avoiding Overfitting for Parameter Optimization
- Summary
6. Similarity, Neighbors, and Clusters
- Similarity and Distance
- Nearest-Neighbor Reasoning
  - Example: Whiskey Analytics
  - Nearest Neighbors for Predictive Modeling
    - Classification
    - Probability Estimation
    - Regression
  - How Many Neighbors and How Much Influence?
  - Geometric Interpretation, Overfitting, and Complexity Control
  - Issues with Nearest-Neighbor Methods
    - Intelligibility
    - Dimensionality and domain knowledge
    - Computational efficiency
- Some Important Technical Details Relating to Similarities and Neighbors
  - Heterogeneous Attributes
  - * Other Distance Functions
  - * Combining Functions: Calculating Scores from Neighbors
- Clustering
  - Example: Whiskey Analytics Revisited
  - Hierarchical Clustering
  - Nearest Neighbors Revisited: Clustering Around Centroids
  - Example: Clustering Business News Stories
    - Data preparation
    - The news story clusters
  - Understanding the Results of Clustering
  - * Using Supervised Learning to Generate Cluster Descriptions
- Stepping Back: Solving a Business Problem Versus Data Exploration
- Summary
7. Decision Analytic Thinking I: What Is a Good Model?
- Evaluating Classifiers
  - Plain Accuracy and Its Problems
  - The Confusion Matrix
  - Problems with Unbalanced Classes
  - Problems with Unequal Costs and Benefits
- Generalizing Beyond Classification
- A Key Analytical Framework: Expected Value
  - Using Expected Value to Frame Classifier Use
  - Using Expected Value to Frame Classifier Evaluation
    - Error rates
    - Costs and benefits
- Evaluation, Baseline Performance, and Implications for Investments in Data
- Summary
8. Visualizing Model Performance
- Ranking Instead of Classifying
- Profit Curves
- ROC Graphs and Curves
- The Area Under the ROC Curve (AUC)
- Cumulative Response and Lift Curves
- Example: Performance Analytics for Churn Modeling
- Summary
9. Evidence and Probabilities
- Example: Targeting Online Consumers With Advertisements
- Combining Evidence Probabilistically
  - Joint Probability and Independence
  - Bayes Rule
- Applying Bayes Rule to Data Science
  - Conditional Independence and Naive Bayes
  - Advantages and Disadvantages of Naive Bayes
- A Model of Evidence Lift
- Example: Evidence Lifts from Facebook Likes
  - Evidence in Action: Targeting Consumers with Ads
- Summary
10. Representing and Mining Text
- Why Text Is Important
- Why Text Is Difficult
- Representation
  - Bag of Words
  - Term Frequency
  - Measuring Sparseness: Inverse Document Frequency
  - Combining Them: TFIDF
- Example: Jazz Musicians
- * The Relationship of IDF to Entropy
- Beyond Bag of Words
  - N-gram Sequences
  - Named Entity Extraction
  - Topic Models
- Example: Mining News Stories to Predict Stock Price Movement
  - The Task
  - The Data
  - Data Preprocessing
  - Results
- Summary
11. Decision Analytic Thinking II: Toward Analytical Engineering
- Targeting the Best Prospects for a Charity Mailing
  - The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces
  - A Brief Digression on Selection Bias
- Our Churn Example Revisited with Even More Sophistication
  - The Expected Value Framework: Structuring a More Complicated Business Problem
  - Assessing the Influence of the Incentive
  - From an Expected Value Decomposition to a Data Science Solution
- Summary
12. Other Data Science Tasks and Techniques
- Co-occurrences and Associations: Finding Items That Go Together
  - Measuring Surprise: Lift and Leverage
  - Example: Beer and Lottery Tickets
  - Associations Among Facebook Likes
- Profiling: Finding Typical Behavior
- Link Prediction and Social Recommendation
- Data Reduction, Latent Information, and Movie Recommendation
- Bias, Variance, and Ensemble Methods
- Data-Driven Causal Explanation and a Viral Marketing Example
- Summary
13. Data Science and Business Strategy
- Thinking Data-Analytically, Redux
- Achieving Competitive Advantage with Data Science
- Sustaining Competitive Advantage with Data Science
  - Formidable Historical Advantage
  - Unique Intellectual Property
  - Unique Intangible Collateral Assets
  - Superior Data Scientists
  - Superior Data Science Management
- Attracting and Nurturing Data Scientists and Their Teams
- Examine Data Science Case Studies
- Be Ready to Accept Creative Ideas from Any Source
- Be Ready to Evaluate Proposals for Data Science Projects
  - Example Data Mining Proposal
  - Flaws in the Big Red Proposal
- A Firm's Data Science Maturity
14. Conclusion
- The Fundamental Concepts of Data Science
  - Applying Our Fundamental Concepts to a New Problem: Mining Mobile Device Data
  - Changing the Way We Think about Solutions to Business Problems
- What Data Can't Do: Humans in the Loop, Revisited
- Privacy, Ethics, and Mining Data About Individuals
- Is There More to Data Science?
- Final Example: From Crowd-Sourcing to Cloud-Sourcing
- Final Words
A. Proposal Review Guide
- Business and Data Understanding
- Data Preparation
- Modeling
- Evaluation and Deployment
B. Another Sample Proposal
- Scenario and Proposal
  - Flaws in the GGC Proposal
Glossary
C. Bibliography
Index
Colophon
Copyright