DuckDB: Up and Running - Helion
ISBN: 9781098159658
Pages: 308, Format: ebook
Publication date: 2024-12-05
Bookstore: Helion
Price: 220.15 zł (previously 255.99 zł)
You save: 14% (-35.84 zł)
DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool.
Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.
- Understand the purpose of DuckDB and its main functions
- Conduct data analytics tasks using DuckDB
- Integrate DuckDB with pandas, Polars, and JupySQL
- Use DuckDB to query your data
- Perform spatial analytics using DuckDB's spatial extension
- Work with a diverse range of data including Parquet, CSV, and JSON
Table of Contents
- Preface
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgements
- 1. Getting Started with DuckDB
- Introduction to DuckDB
- Why Use DuckDB?
- High-Performance Analytical Queries
- Versatile Integration and Ease of Use Across Multiple Programming Languages
- Open Source
- A Quick Look at DuckDB
- Loading Data into DuckDB
- Inserting a Record
- Querying a Table
- Performing Aggregation
- Joining Tables
- Reading Data from pandas
- Why DuckDB Is More Efficient
- Execution Speed
- Memory Usage
- Summary
- 2. Importing Data into DuckDB
- Creating DuckDB Databases
- Loading Data from Different Data Sources and Formats
- Working with CSV Files
- Loading using the SQL query method
- Loading using the register() method
- Exporting a table to CSV
- Working with Parquet Files
- Loading Parquet files
- Exporting Parquet files
- Working with Excel Files
- Loading Excel files
- Exporting tables to Excel
- Working with MySQL
- Summary
- 3. A Primer on SQL
- Using the DuckDB CLI
- Importing Data into DuckDB
- Dot Commands
- .database
- .open
- .table
- .dump
- .read
- Persisting the In-Memory Database on Disk
- DuckDB SQL Primer
- Creating a Database
- Creating Tables
- Viewing the Schemas of Tables
- Dropping a Table
- Working with Tables
- Populating Tables with Rows
- Updating Rows
- Deleting Rows
- Querying Tables
- Joining Tables
- Left join
- Right join
- Inner join
- Full join
- Multiple table joins
- Aggregating Data
- Analytics
- Summary
- 4. Using DuckDB with Polars
- Introduction to Polars
- Creating a Polars DataFrame
- Selecting columns
- Selecting rows
- Selecting rows and columns
- Using SQL on Polars
- Understanding Lazy Evaluation in Polars
- Implicit lazy evaluation
- Explicit lazy evaluation
- Querying Polars DataFrames Using DuckDB
- Using the sql() Function
- Using the DuckDBPyRelation Object
- Inserting rows
- Joining tables
- Filtering rows
- Aggregating rows
- Projecting columns
- Limiting rows
- Summary
- 5. Performing EDA with DuckDB
- Our Dataset: The 2015 Flight Delays Dataset
- Geospatial Analysis
- Displaying a Map
- Displaying All Airports on the Map
- Using the spatial Extension in DuckDB
- Converting latitude and longitude to the Point data type
- Converting a pandas DataFrame to a GeoPandas GeoDataFrame
- Displaying airport locations on the map
- Finding nearby airports
- Performing Descriptive Analytics
- Finding the Airports for Each State and City
- Aggregating the Total Number of Airports in Each State
- Obtaining the Flight Counts for Each Pair of Origin and Destination Airports
- Getting the Canceled Flights from Airlines
- Getting the Flight Count for Each Day of the Week
- Finding the Most Common Timeslot for Flight Delays
- Finding the Airlines with the Most and Fewest Delays
- Summary
- 6. Using DuckDB with JSON Files
- Primer on JSON
- Object
- String
- Boolean
- Number
- Nested Object
- Array
- null
- Loading JSON Files into DuckDB
- Using the read_json_auto() Function
- Using the read_json() Function
- Array of JSON objects
- Newline-delimited (ND) JSON
- Nested JSON
- Custom JSON file
- Loading multiple JSON files
- Using the COPY-FROM Statement
- Exporting Tables to JSON
- Summary
- 7. Using DuckDB with JupySQL
- What Is JupySQL?
- Installing JupySQL
- Loading the sql Extension
- Integrating with DuckDB
- Performing Queries
- Storing Snippets
- Visualization
- Histograms
- Box Plots
- Pie Charts
- Bar Plots
- Integrating with MySQL
- Using Environment Variables
- Using an .ini File
- Using keyring
- Summary
- 8. Accessing Remote Data Using DuckDB
- DuckDB's httpfs Extension
- Querying CSV and Parquet Files Remotely
- Accessing CSV Files
- Accessing Parquet Files
- Querying Hugging Face Datasets
- Using Hugging Face Datasets
- Reading the Dataset Using hf:// Paths
- Accessing Files Within a Folder
- Querying Multiple Files Using the Glob Syntax
- Working with Private Hugging Face Datasets
- Uploading a private dataset
- Creating an access token
- Performing authentication
- Summary
- 9. Using DuckDB in the Cloud with MotherDuck
- Introduction to MotherDuck
- Signing Up for MotherDuck
- MotherDuck Plans
- Getting Started with MotherDuck
- Adding Tables
- Creating Schemas
- Sharing Databases
- Creating a Database
- Detaching a Database
- Using the Databases in MotherDuck
- Querying Your Database
- Writing SQL Using AI
- Using MotherDuck Through the DuckDB CLI
- Connecting to MotherDuck
- Querying Databases on MotherDuck
- Creating Databases on MotherDuck
- Performing Hybrid Queries
- Summary
- Index