Practical Python Data Wrangling and Data Quality - Helion

Practical Python Data Wrangling and Data Quality
ebook
Author: Susan E. McGregor
ISBN: 9781492091455
Pages: 416, Format: ebook
Publication date: 2021-12-03
Bookstore: Helion

Price: 211.65 zł (previously: 246.10 zł)
You save: 14% (-34.45 zł)

The world around us is full of data that holds unique insights and valuable stories, and this book will help you uncover them. Whether you already work with data or want to learn more about its possibilities, the examples and techniques in this practical book will help you more easily clean, evaluate, and analyze data so that you can generate meaningful insights and compelling visualizations.

Complementing foundational concepts with expert advice, author Susan E. McGregor provides the resources you need to extract, evaluate, and analyze a wide variety of data sources and formats, along with the tools to communicate your findings effectively. This book delivers a methodical, jargon-free way for data practitioners at any level, from true novices to seasoned professionals, to harness the power of data.

  • Use Python 3.8+ to read, write, and transform data from a variety of sources
  • Understand and use programming basics in Python to wrangle data at scale
  • Organize, document, and structure your code using best practices
  • Collect data from structured data files, web pages, and APIs
  • Perform basic statistical analyses to make meaning from datasets
  • Visualize and present data in clear and compelling ways

Customers who bought "Practical Python Data Wrangling and Data Quality" also chose:

  • Windows Media Center. Domowe centrum rozrywki
  • Ruby on Rails. Ćwiczenia
  • DevOps w praktyce. Kurs video. Jenkins, Ansible, Terraform i Docker
  • Przywództwo w świecie VUCA. Jak być skutecznym liderem w niepewnym środowisku
  • Scrum. O zwinnym zarządzaniu projektami. Wydanie II rozszerzone

Practical Python Data Wrangling and Data Quality eBook -- table of contents

  • Preface
    • Who Should Read This Book?
    • Who Shouldn't Read This Book?
    • What to Expect from This Volume
    • Conventions Used in This Book
    • Using Code Examples
    • O'Reilly Online Learning
    • How to Contact Us
    • Acknowledgments
  • 1. Introduction to Data Wrangling and Data Quality
    • What Is Data Wrangling?
    • What Is Data Quality?
      • Data Integrity
      • Data Fit
    • Why Python?
      • Versatility
      • Accessibility
      • Readability
      • Community
      • Python Alternatives
    • Writing and Running Python
    • Working with Python on Your Own Device
      • Getting Started with the Command Line
      • Installing Python, Jupyter Notebook, and a Code Editor
        • Chromebook
          • Installing Python and Jupyter Notebook
          • Installing Atom
        • macOS
          • Installing Python and Jupyter Notebook
          • Installing Atom
        • Windows 10+
          • Installing Python and Jupyter Notebook
          • Installing Atom
        • Testing your setup
    • Working with Python Online
    • Hello World!
      • Using Atom to Create a Standalone Python File
      • Using Jupyter to Create a New Python Notebook
      • Using Google Colab to Create a New Python Notebook
    • Adding the Code
      • In a Standalone File
      • In a Notebook
    • Running the Code
      • In a Standalone File
      • In a Notebook
    • Documenting, Saving, and Versioning Your Work
      • Documenting
      • Saving
      • Versioning
        • Getting started with GitHub
          • For backing up local files: installing and configuring Git
          • Tying it all together
          • For backing up online Python files: connecting Google Colab to GitHub
          • Tying it all together
    • Conclusion
  • 2. Introduction to Python
    • The Programming Parts of Speech
      • Nouns ≈ Variables
        • What's in a name?
        • Best practices for naming variables
      • Verbs ≈ Functions
      • Cooking with Custom Functions
      • Libraries: Borrowing Custom Functions from Other Coders
    • Taking Control: Loops and Conditionals
      • In the Loop
      • One Condition
    • Understanding Errors
      • Syntax Snafus
      • Runtime Runaround
      • Logic Loss
    • Hitting the Road with Citi Bike Data
      • Starting with Pseudocode
      • Seeking Scale
    • Conclusion
  • 3. Understanding Data Quality
    • Assessing Data Fit
      • Validity
      • Reliability
      • Representativeness
    • Assessing Data Integrity
      • Necessary, but Not Sufficient
        • Of known provenance
        • Well-annotated
      • Important
        • Timely
        • Complete
        • High volume
        • Multivariate
        • Atomic
      • Achievable
        • Consistent
        • Clear
        • Dimensionally structured
    • Improving Data Quality
      • Data Cleaning
      • Data Augmentation
    • Conclusion
  • 4. Working with File-Based and Feed-Based Data in Python
    • Structured Versus Unstructured Data
    • Working with Structured Data
      • File-Based, Table-Type Data: Take It to Delimit
        • When to work with table-type data
        • Where to find table-type data
      • Wrangling Table-Type Data with Python
        • Reading data from CSVs
        • Reading data from TSV and TXT files
    • Real-World Data Wrangling: Understanding Unemployment
      • XLSX, ODS, and All the Rest
      • Finally, Fixed-Width
      • Feed-Based Data: Web-Driven Live Updates
        • When to work with feed-type data
        • Where to find feed-type data
      • Wrangling Feed-Type Data with Python
        • XML: One markup to rule them all
        • JSON: Web data, the next generation
    • Working with Unstructured Data
      • Image-Based Text: Accessing Data in PDFs
        • When to work with text in PDFs
        • Where to find PDFs
      • Wrangling PDFs with Python
      • Accessing PDF Tables with Tabula
    • Conclusion
  • 5. Accessing Web-Based Data
    • Accessing Online XML and JSON
    • Introducing APIs
    • Basic APIs: A Search Engine Example
    • Specialized APIs: Adding Basic Authentication
      • Getting a FRED API Key
      • Using Your API key to Request Data
    • Reading API Documentation
    • Protecting Your API Key When Using Python
      • Creating Your Credentials File
      • Using Your Credentials in a Separate Script
      • Getting Started with .gitignore
    • Specialized APIs: Working With OAuth
      • Applying for a Twitter Developer Account
      • Creating Your Twitter App and Credentials
      • Encoding Your API Key and Secret
      • Requesting an Access Token and Data from the Twitter API
        • Requesting an access token: get versus post
    • API Ethics
    • Web Scraping: The Data Source of Last Resort
      • Carefully Scraping the MTA
      • Using Browser Inspection Tools
      • The Python Web Scraping Solution: Beautiful Soup
    • Conclusion
  • 6. Assessing Data Quality
    • The Pandemic and the PPP
    • Assessing Data Integrity
      • Is It of Known Pedigree?
      • Is It Timely?
      • Is It Complete?
      • Is It Well-Annotated?
      • Is It High Volume?
      • Is It Consistent?
      • Is It Multivariate?
      • Is It Atomic?
      • Is It Clear?
      • Is It Dimensionally Structured?
    • Assessing Data Fit
      • Validity
      • Reliability
      • Representativeness
        • The denominator problem
    • Conclusion
  • 7. Cleaning, Transforming, and Augmenting Data
    • Selecting a Subset of Citi Bike Data
      • A Simple Split
      • Regular Expressions: Supercharged String Matching
      • Making a Date
    • De-crufting Data Files
    • Decrypting Excel Dates
    • Generating True CSVs from Fixed-Width Data
    • Correcting for Spelling Inconsistencies
    • The Circuitous Path to Simple Solutions
    • Gotchas That Will Get Ya!
    • Augmenting Your Data
    • Conclusion
  • 8. Structuring and Refactoring Your Code
    • Revisiting Custom Functions
      • Will You Use It More Than Once?
      • Is It Ugly and Confusing?
      • Do You Just Really Hate the Default Functionality?
    • Understanding Scope
    • Defining the Parameters for Function Ingredients
      • What Are Your Options?
      • Getting Into Arguments?
    • Return Values
    • Climbing the Stack
    • Refactoring for Fun and Profit
      • A Function for Identifying Weekdays
      • Metadata Without the Mess
    • Documenting Your Custom Scripts and Functions with pydoc
    • The Case for Command-Line Arguments
    • Where Scripts and Notebooks Diverge
    • Conclusion
  • 9. Introduction to Data Analysis
    • Context Is Everything
    • Same but Different
    • What's Typical? Evaluating Central Tendency
      • What's That Mean?
      • Embrace the Median
    • Think Different: Identifying Outliers
    • Visualization for Data Analysis
      • What's Our Data's Shape? Understanding Histograms
      • The Significance of Symmetry
      • Counting Clusters
    • The $2 Million Question
    • Proportional Response
    • Conclusion
  • 10. Presenting Your Data
    • Foundations for Visual Eloquence
    • Making Your Data Statement
    • Charts, Graphs, and Maps: Oh My!
      • Pie Charts
      • Bar and Column Charts
      • Line Charts
      • Scatter Charts
      • Maps
    • Elements of Eloquent Visuals
      • The Finicky Details Really Do Make a Difference
      • Trust Your Eyes (and the Experts)
      • Selecting Scales
      • Choosing Colors
      • Above All, Annotate!
    • From Basic to Beautiful: Customizing a Visualization with seaborn and matplotlib
    • Beyond the Basics
    • Conclusion
  • 11. Beyond Python
    • Additional Tools for Data Review
      • Spreadsheet Programs
      • OpenRefine
    • Additional Tools for Sharing and Presenting Data
      • Image Editing for JPGs, PNGs, and GIFs
      • Software for Editing SVGs and Other Vector Formats
    • Reflecting on Ethics
    • Conclusion
  • A. More Python Programming Resources
    • Official Python Documentation
    • Installing Python Resources
      • Where to Look for Libraries
    • Keeping Your Tools Sharp
    • Where to Learn More
  • B. A Bit More About Git
    • You Run git push/pull and End Up in a Weird Text Editor
    • Your git push/pull Command Gets Rejected
      • Run git pull
        • Fixing conflicts manually
        • Fixing conflicts by forcing an overwrite
    • Git Quick Reference
  • C. Finding Data
    • Data Repositories and APIs
    • Subject Matter Experts
    • FOIA/L Requests
    • Custom Data Collection
  • D. Resources for Visualization and Information Design
    • Foundational Books on Information Visualization
    • The Quick Reference You'll Reach For
    • Sources of Inspiration
  • Index

(c) 2005-2024 CATALIST interactive agency; trademarks belong to the publisher Helion S.A.