
Mastering Spark with R. The Complete Guide to Large-Scale Analysis and Modeling - Helion

Authors: Javier Luraschi, Kevin Kuo, Edgar Ruiz
ISBN: 9781492046325
Pages: 296, Format: ebook
Publication date: 2019-10-07
Bookstore: Helion

Book price: 143.65 zł (previously: 167.03 zł)
You save: 14% (-23.38 zł)


Tags: R - Programming

If you’re like most R users, you have a deep knowledge of, and love for, statistics. But as your organization continues to collect huge amounts of data, adding tools such as Apache Spark makes a lot of sense. With this practical book, data scientists and professionals working with large-scale data applications will learn how to use Spark from R to tackle big data and big compute problems.

Authors Javier Luraschi, Kevin Kuo, and Edgar Ruiz show you how to use R with Spark to solve different data analysis problems. This book covers relevant data science topics, cluster computing, and issues that should interest even the most advanced users.

  • Analyze, explore, transform, and visualize data in Apache Spark with R
  • Create statistical models to extract information and predict outcomes; automate the process in production-ready workflows
  • Perform analysis and modeling across many machines using distributed computing techniques
  • Use large-scale data from multiple sources and different formats with ease from within Spark
  • Learn about alternative modeling frameworks for graph processing, geospatial analysis, and genomics at scale
  • Dive into advanced topics including custom transformations, real-time data processing, and creating custom Spark extensions



Table of Contents

  • Foreword
  • Preface
    • Formatting
    • Acknowledgments
    • Conventions Used in This Book
    • Using Code Examples
    • O’Reilly Online Learning
    • How to Contact Us
  • 1. Introduction
    • Overview
    • Hadoop
    • Spark
    • R
    • sparklyr
    • Recap
  • 2. Getting Started
    • Overview
    • Prerequisites
      • Installing sparklyr
      • Installing Spark
    • Connecting
    • Using Spark
      • Web Interface
      • Analysis
      • Modeling
      • Data
      • Extensions
      • Distributed R
      • Streaming
      • Logs
    • Disconnecting
    • Using RStudio
    • Resources
    • Recap
  • 3. Analysis
    • Overview
    • Import
    • Wrangle
      • Built-in Functions
      • Correlations
    • Visualize
      • Using ggplot2
      • Using dbplot
    • Model
      • Caching
    • Communicate
    • Recap
  • 4. Modeling
    • Overview
    • Exploratory Data Analysis
    • Feature Engineering
    • Supervised Learning
      • Generalized Linear Regression
      • Other Models
    • Unsupervised Learning
      • Data Preparation
      • Topic Modeling
    • Recap
  • 5. Pipelines
    • Overview
    • Creation
    • Use Cases
      • Hyperparameter Tuning
    • Operating Modes
    • Interoperability
    • Deployment
      • Batch Scoring
      • Real-Time Scoring
    • Recap
  • 6. Clusters
    • Overview
    • On-Premises
      • Managers
        • Standalone
        • YARN
        • Apache Mesos
      • Distributions
    • Cloud
      • Amazon
      • Databricks
      • Google
      • IBM
      • Microsoft
      • Qubole
    • Kubernetes
    • Tools
      • RStudio
      • Jupyter
      • Livy
    • Recap
  • 7. Connections
    • Overview
      • Edge Nodes
      • Spark Home
    • Local
    • Standalone
    • YARN
      • YARN Client
      • YARN Cluster
    • Livy
    • Mesos
    • Kubernetes
    • Cloud
    • Batches
    • Tools
    • Multiple Connections
    • Troubleshooting
      • Logging
      • Spark Submit
        • Detailed troubleshooting
      • Windows
    • Recap
  • 8. Data
    • Overview
    • Reading Data
      • Paths
      • Schema
      • Memory
      • Columns
    • Writing Data
    • Copying Data
    • File Formats
      • CSV
      • JSON
      • Parquet
      • Others
    • File Systems
    • Storage Systems
      • Hive
      • Cassandra
      • JDBC
    • Recap
  • 9. Tuning
    • Overview
      • Graph
      • Timeline
    • Configuring
      • Connect Settings
      • Submit Settings
      • Runtime Settings
      • sparklyr Settings
    • Partitioning
      • Implicit Partitions
      • Explicit Partitions
    • Caching
      • Checkpointing
      • Memory
    • Shuffling
    • Serialization
    • Configuration Files
    • Recap
  • 10. Extensions
    • Overview
    • H2O
    • Graphs
    • XGBoost
    • Deep Learning
    • Genomics
    • Spatial
    • Troubleshooting
    • Recap
  • 11. Distributed R
    • Overview
    • Use Cases
      • Custom Parsers
      • Partitioned Modeling
      • Grid Search
      • Web APIs
      • Simulations
    • Partitions
    • Grouping
    • Columns
    • Context
    • Functions
    • Packages
    • Cluster Requirements
      • Installing R
      • Apache Arrow
    • Troubleshooting
      • Worker Logs
      • Resolving Timeouts
      • Inspecting Partitions
      • Debugging Workers
    • Recap
  • 12. Streaming
    • Overview
    • Transformations
      • Analysis
      • Modeling
      • Pipelines
      • Distributed R
    • Kafka
    • Shiny
    • Recap
  • 13. Contributing
    • Overview
    • The Spark API
    • Spark Extensions
    • Using Scala Code
    • Recap
  • A. Supplemental Code References
    • Preface
      • Formatting
    • Chapter 1
      • The World’s Capacity to Store Information
      • Daily Downloads of CRAN Packages
    • Chapter 2
      • Prerequisites
        • Installing R
        • Installing Java
        • Installing RStudio
        • Using RStudio
    • Chapter 3
      • Hive Functions
    • Chapter 4
      • MLlib Functions
        • Classification
        • Regression
        • Clustering
        • Recommendation
        • Frequent Pattern Mining
        • Feature Transformers
    • Chapter 6
      • Google Trends for On-Premises (Mainframes), Cloud Computing, and Kubernetes
    • Chapter 12
      • Stream Generator
      • Installing Kafka
  • Index
