Data Wrangling with Python. Tips and Tools to Make Your Life Easier - Helion

ebook
Authors: Jacqueline Kazil, Katharine Jarmul
ISBN: 978-14-919-4877-4
Pages: 508, Format: ebook
Publication date: 2016-02-04
Bookstore: Helion

Price: 126.65 zł (previously: 147.27 zł)
You save: 14% (-20.62 zł)

Tags: Python - Programming

How do you take your data analysis skills beyond Excel to the next level? By learning just enough Python to get stuff done. This hands-on guide shows non-programmers like you how to process information that’s initially too messy or difficult to access. You don't need to know a thing about the Python programming language to get started.

Through various step-by-step exercises, you’ll learn how to acquire, clean, analyze, and present data efficiently. You’ll also discover how to automate your data process, schedule file-editing and clean-up tasks, process larger datasets, and create compelling stories with data you obtain.

  • Quickly learn basic Python syntax, data types, and language concepts
  • Work with both machine-readable and human-consumable data
  • Scrape websites and APIs to find a bounty of useful information
  • Clean and format data to eliminate duplicates and errors in your datasets
  • Learn when to standardize data and when to test and script data cleanup
  • Explore and analyze your datasets with new Python libraries and techniques
  • Use Python solutions to automate your entire data-wrangling process

Customers who bought "Data Wrangling with Python. Tips and Tools to Make Your Life Easier" also chose:

  • Django 4. Praktyczne tworzenie aplikacji sieciowych. Wydanie IV
  • GraphQL. Kurs video. Buduj nowoczesne API w Pythonie
  • Flask. Kurs video. Od pierwszej linijki kodu do praktycznego zastosowania
  • Python na start. Kurs video. Tw
  • Python. Kurs video. Programowanie asynchroniczne

Table of contents

Data Wrangling with Python. Tips and Tools to Make Your Life Easier eBook -- table of contents

  • Preface
    • Who Should Read This Book
    • Who Should Not Read This Book
    • How This Book Is Organized
    • What Is Data Wrangling?
    • What to Do If You Get Stuck
    • Conventions Used in This Book
    • Using Code Examples
    • Safari Books Online
    • How to Contact Us
    • Acknowledgments
  • 1. Introduction to Python
    • Why Python
    • Getting Started with Python
      • Which Python Version
      • Setting Up Python on Your Machine
        • Mac OS X
        • Windows 8 and 10
      • Test Driving Python
      • Install pip
      • Install a Code Editor
      • Optional: Install IPython
    • Summary
  • 2. Python Basics
    • Basic Data Types
      • Strings
      • Integers and Floats
        • Integers
        • Floats, decimals, and other non-whole number types
    • Data Containers
      • Variables
      • Lists
      • Dictionaries
    • What Can the Various Data Types Do?
      • String Methods: Things Strings Can Do
      • Numerical Methods: Things Numbers Can Do
      • List Methods: Things Lists Can Do
      • Dictionary Methods: Things Dictionaries Can Do
    • Helpful Tools: type, dir, and help
      • type
      • dir
      • help
    • Putting It All Together
    • What Does It All Mean?
    • Summary
  • 3. Data Meant to Be Read by Machines
    • CSV Data
      • How to Import CSV Data
      • Saving the Code to a File; Running from Command Line
    • JSON Data
      • How to Import JSON Data
    • XML Data
      • How to Import XML Data
    • Summary
  • 4. Working with Excel Files
    • Installing Python Packages
    • Parsing Excel Files
    • Getting Started with Parsing
    • Summary
  • 5. PDFs and Problem Solving in Python
    • Avoid Using PDFs!
    • Programmatic Approaches to PDF Parsing
      • Opening and Reading Using slate
      • Converting PDF to Text
    • Parsing PDFs Using pdfminer
    • Learning How to Solve Problems
      • Exercise: Use Table Extraction, Try a Different Library
      • Exercise: Clean the Data Manually
      • Exercise: Try Another Tool
    • Uncommon File Types
    • Summary
  • 6. Acquiring and Storing Data
    • Not All Data Is Created Equal
    • Fact Checking
    • Readability, Cleanliness, and Longevity
    • Where to Find Data
      • Using a Telephone
      • US Government Data
      • Government and Civic Open Data Worldwide
        • EU and UK
        • Africa
        • Asia
        • Non-EU Europe, Central Asia, India, the Middle East, and Russia
        • South America and Canada
      • Organization and Non-Government Organization (NGO) Data
      • Education and University Data
      • Medical and Scientific Data
      • Crowdsourced Data and APIs
    • Case Studies: Example Data Investigation
      • Ebola Crisis
      • Train Safety
      • Football Salaries
      • Child Labor
    • Storing Your Data: When, Why, and How?
    • Databases: A Brief Introduction
      • Relational Databases: MySQL and PostgreSQL
        • MySQL and Python
        • PostgreSQL and Python
      • Non-Relational Databases: NoSQL
        • MongoDB with Python
      • Setting Up Your Local Database with Python
    • When to Use a Simple File
      • Cloud-Storage and Python
      • Local Storage and Python
    • Alternative Data Storage
    • Summary
  • 7. Data Cleanup: Investigation, Matching, and Formatting
    • Why Clean Data?
    • Data Cleanup Basics
      • Identifying Values for Data Cleanup
        • Replacing headers
        • Zipping questions and answers
      • Formatting Data
      • Finding Outliers and Bad Data
      • Finding Duplicates
      • Fuzzy Matching
      • RegEx Matching
      • What to Do with Duplicate Records
    • Summary
  • 8. Data Cleanup: Standardizing and Scripting
    • Normalizing and Standardizing Your Data
    • Saving Your Data
    • Determining What Data Cleanup Is Right for Your Project
    • Scripting Your Cleanup
    • Testing with New Data
    • Summary
  • 9. Data Exploration and Analysis
    • Exploring Your Data
      • Importing Data
      • Exploring Table Functions
      • Joining Numerous Datasets
      • Identifying Correlations
      • Identifying Outliers
      • Creating Groupings
      • Further Exploration
    • Analyzing Your Data
      • Separating and Focusing Your Data
      • What Is Your Data Saying?
      • Drawing Conclusions
      • Documenting Your Conclusions
    • Summary
  • 10. Presenting Your Data
    • Avoiding Storytelling Pitfalls
      • How Will You Tell the Story?
      • Know Your Audience
    • Visualizing Your Data
      • Charts
        • Charting with matplotlib
        • Charting with Bokeh
      • Time-Related Data
        • Time series data
        • Timeline data
      • Maps
      • Interactives
      • Words
      • Images, Video, and Illustrations
    • Presentation Tools
    • Publishing Your Data
      • Using Available Sites
        • Medium
        • Easy-to-start sites: WordPress, Squarespace
        • Your own blog
      • Open Source Platforms: Starting a New Site
        • Ghost
        • GitHub Pages and Jekyll
        • One-click deploys
      • Jupyter (Formerly Known as IPython Notebooks)
        • Shared Jupyter notebooks
    • Summary
  • 11. Web Scraping: Acquiring and Storing Data from the Web
    • What to Scrape and How
    • Analyzing a Web Page
      • Inspection: Markup Structure
      • Network/Timeline: How the Page Loads
      • Console: Interacting with JavaScript
        • Style basics
        • jQuery and JavaScript
      • In-Depth Analysis of a Page
    • Getting Pages: How to Request on the Internet
    • Reading a Web Page with Beautiful Soup
    • Reading a Web Page with LXML
      • A Case for XPath
    • Summary
  • 12. Advanced Web Scraping: Screen Scrapers and Spiders
    • Browser-Based Parsing
      • Screen Reading with Selenium
        • Selenium and headless browsers
      • Screen Reading with Ghost.Py
    • Spidering the Web
      • Building a Spider with Scrapy
      • Crawling Whole Websites with Scrapy
    • Networks: How the Internet Works and Why It's Breaking Your Script
    • The Changing Web (or Why Your Script Broke)
    • A (Few) Word(s) of Caution
    • Summary
  • 13. APIs
    • API Features
      • REST Versus Streaming APIs
      • Rate Limits
      • Tiered Data Volumes
      • API Keys and Tokens
        • Creating a Twitter API key and access token
    • A Simple Data Pull from Twitter's REST API
    • Advanced Data Collection from Twitter's REST API
    • Advanced Data Collection from Twitter's Streaming API
    • Summary
  • 14. Automation and Scaling
    • Why Automate?
    • Steps to Automate
    • What Could Go Wrong?
    • Where to Automate
    • Special Tools for Automation
      • Using Local Files, argv, and Config Files
        • Local files
        • Config files
        • Command-line arguments
      • Using the Cloud for Data Processing
        • Using Git to deploy Python
      • Using Parallel Processing
      • Using Distributed Processing
    • Simple Automation
      • CronJobs
      • Web Interfaces
      • Jupyter Notebooks
    • Large-Scale Automation
      • Celery: Queue-Based Automation
      • Ansible: Operations Automation
    • Monitoring Your Automation
      • Python Logging
      • Adding Automated Messaging
        • Email
        • SMS and voice
        • Chat integration
      • Uploading and Other Reporting
      • Logging and Monitoring as a Service
        • Logging and exceptions
        • Logging and monitoring
    • No System Is Foolproof
    • Summary
  • 15. Conclusion
    • Duties of a Data Wrangler
    • Beyond Data Wrangling
      • Become a Better Data Analyst
      • Become a Better Developer
      • Become a Better Visual Storyteller
      • Become a Better Systems Architect
    • Where Do You Go from Here?
  • A. Comparison of Languages Mentioned
    • C, C++, and Java Versus Python
    • R or MATLAB Versus Python
    • HTML Versus Python
    • JavaScript Versus Python
    • Node.js Versus Python
    • Ruby and Ruby on Rails Versus Python
  • B. Python Resources for Beginners
    • Online Resources
    • In-Person Groups
  • C. Learning the Command Line
    • Bash
      • Navigation
      • Modifying Files
      • Executing Files
      • Searching with the Command Line
      • More Resources
    • Windows CMD/Power Shell
      • Navigation
      • Modifying Files
      • Executing Files
      • Searching with the Command Line
      • More Resources
  • D. Advanced Python Setup
    • Step 1: Install GCC
    • Step 2: (Mac Only) Install Homebrew
    • Step 3: (Mac Only) Tell Your System Where to Find Homebrew
    • Step 4: Install Python 2.7
    • Step 5: Install virtualenv (Windows, Mac, Linux)
    • Step 6: Set Up a New Directory
    • Step 7: Install virtualenvwrapper
      • Installing virtualenvwrapper (Mac and Linux)
        • Updating your .bashrc
      • Installing virtualenvwrapper-win (Windows)
      • Testing Your Virtual Environment (Windows, Mac, Linux)
    • Learning About Our New Environment (Windows, Mac, Linux)
    • Advanced Setup Review
  • E. Python Gotchas
    • Hail the Whitespace
    • The Dreaded GIL
    • = Versus == Versus is, and When to Just Copy
    • Default Function Arguments
    • Python Scope and Built-Ins: The Importance of Variable Names
    • Defining Objects Versus Modifying Objects
    • Changing Immutable Objects
    • Type Checking
    • Catching Multiple Exceptions
    • The Power of Debugging
  • F. IPython Hints
    • Why Use IPython?
    • Getting Started with IPython
    • Magic Functions
    • Final Thoughts: A Simpler Terminal
  • G. Using Amazon Web Services
    • Spinning Up an AWS Server
      • AWS Step 1: Choose an Amazon Machine Image (AMI)
      • AWS Step 2: Choose an Instance Type
      • AWS Step 7: Review Instance Launch
      • AWS Extra Question: Select an Existing Key Pair or Create a New One
    • Logging into an AWS Server
      • Get the Public DNS Name of the Instance
      • Prepare Your Private Key
      • Log into Your Server
      • Summary
  • Index
