

Natural Language Processing with Transformers, Revised Edition
ebook
Author: Lewis Tunstall, Leandro von Werra, Thomas Wolf
ISBN: 9781098136758
Pages: 408, Format: ebook
Publication date: 2022-05-26
Bookstore: Helion

Book price: 203.15 zł (previously: 236.22 zł)
You save: 14% (-33.07 zł)


Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book, now revised in full color, shows you how to train and scale these large models using Hugging Face Transformers, a Python-based deep learning library.

Transformers have been used to write realistic news stories, improve Google Search queries, and even create chatbots that tell corny jokes. In this guide, authors Lewis Tunstall, Leandro von Werra, and Thomas Wolf, who are among the creators of Hugging Face Transformers, use a hands-on approach to teach you how transformers work and how to integrate them into your applications. You'll quickly learn a variety of tasks they can help you solve, a few of which are sketched in the short example after the list below.

  • Build, debug, and optimize transformer models for core NLP tasks, such as text classification, named entity recognition, and question answering
  • Learn how transformers can be used for cross-lingual transfer learning
  • Apply transformers in real-world scenarios where labeled data is scarce
  • Make transformer models efficient for deployment using techniques such as distillation, pruning, and quantization
  • Train transformers from scratch and learn how to scale to multiple GPUs and distributed environments
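
To give a feel for the book's hands-on workflow, the sketch below applies the Hugging Face Transformers pipeline API to three of the tasks listed above. It is an illustrative example rather than code taken from the book, and the default pretrained models that each pipeline downloads are an assumption, not the book's exact choices.

    # Illustrative sketch (not from the book): the pipeline API applied to
    # three tasks covered in the chapters below. Each call downloads a
    # default pretrained model, which is an assumption, not the book's pick.
    from transformers import pipeline

    # Text classification: label a sentence with a sentiment and a score
    classifier = pipeline("text-classification")
    print(classifier("Transformers make state-of-the-art NLP surprisingly accessible."))

    # Named entity recognition: group subword tokens into labeled entity spans
    ner = pipeline("ner", aggregation_strategy="simple")
    print(ner("Hugging Face was founded in New York City."))

    # Question answering: extract an answer span from a context passage
    qa = pipeline("question-answering")
    print(qa(question="What does the book teach?",
             context="The book teaches how to train and deploy transformer models."))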


People who bought "Natural Language Processing with Transformers, Revised Edition" also chose:

  • Windows Media Center. Domowe centrum rozrywki
  • Ruby on Rails. Ćwiczenia
  • DevOps w praktyce. Kurs video. Jenkins, Ansible, Terraform i Docker
  • Przywództwo w świecie VUCA. Jak być skutecznym liderem w niepewnym środowisku
  • Scrum. O zwinnym zarządzaniu projektami. Wydanie II rozszerzone


Table of contents

Natural Language Processing with Transformers, Revised Edition eBook: table of contents

  • Foreword
  • Preface
    • Who Is This Book For?
    • What You Will Learn
    • Software and Hardware Requirements
    • Conventions Used in This Book
    • Using Code Examples
    • O'Reilly Online Learning
    • How to Contact Us
    • Acknowledgments
      • Lewis
      • Leandro
      • Thomas
  • 1. Hello Transformers
    • The Encoder-Decoder Framework
    • Attention Mechanisms
    • Transfer Learning in NLP
    • Hugging Face Transformers: Bridging the Gap
    • A Tour of Transformer Applications
      • Text Classification
      • Named Entity Recognition
      • Question Answering
      • Summarization
      • Translation
      • Text Generation
    • The Hugging Face Ecosystem
      • The Hugging Face Hub
      • Hugging Face Tokenizers
      • Hugging Face Datasets
      • Hugging Face Accelerate
    • Main Challenges with Transformers
    • Conclusion
  • 2. Text Classification
    • The Dataset
      • A First Look at Hugging Face Datasets
      • From Datasets to DataFrames
      • Looking at the Class Distribution
      • How Long Are Our Tweets?
    • From Text to Tokens
      • Character Tokenization
      • Word Tokenization
      • Subword Tokenization
      • Tokenizing the Whole Dataset
    • Training a Text Classifier
      • Transformers as Feature Extractors
        • Using pretrained models
        • Extracting the last hidden states
        • Creating a feature matrix
        • Visualizing the training set
        • Training a simple classifier
      • Fine-Tuning Transformers
        • Loading a pretrained model
        • Defining the performance metrics
        • Training the model
        • Error analysis
        • Saving and sharing the model
    • Conclusion
  • 3. Transformer Anatomy
    • The Transformer Architecture
    • The Encoder
      • Self-Attention
        • Scaled dot-product attention
        • Multi-headed attention
      • The Feed-Forward Layer
      • Adding Layer Normalization
      • Positional Embeddings
      • Adding a Classification Head
    • The Decoder
    • Meet the Transformers
      • The Transformer Tree of Life
      • The Encoder Branch
      • The Decoder Branch
      • The Encoder-Decoder Branch
    • Conclusion
  • 4. Multilingual Named Entity Recognition
    • The Dataset
    • Multilingual Transformers
    • A Closer Look at Tokenization
      • The Tokenizer Pipeline
      • The SentencePiece Tokenizer
    • Transformers for Named Entity Recognition
    • The Anatomy of the Transformers Model Class
      • Bodies and Heads
      • Creating a Custom Model for Token Classification
      • Loading a Custom Model
    • Tokenizing Texts for NER
    • Performance Measures
    • Fine-Tuning XLM-RoBERTa
    • Error Analysis
    • Cross-Lingual Transfer
      • When Does Zero-Shot Transfer Make Sense?
      • Fine-Tuning on Multiple Languages at Once
    • Interacting with Model Widgets
    • Conclusion
  • 5. Text Generation
    • The Challenge with Generating Coherent Text
    • Greedy Search Decoding
    • Beam Search Decoding
    • Sampling Methods
    • Top-k and Nucleus Sampling
    • Which Decoding Method Is Best?
    • Conclusion
  • 6. Summarization
    • The CNN/DailyMail Dataset
    • Text Summarization Pipelines
      • Summarization Baseline
      • GPT-2
      • T5
      • BART
      • PEGASUS
    • Comparing Different Summaries
    • Measuring the Quality of Generated Text
      • BLEU
      • ROUGE
    • Evaluating PEGASUS on the CNN/DailyMail Dataset
    • Training a Summarization Model
      • Evaluating PEGASUS on SAMSum
      • Fine-Tuning PEGASUS
      • Generating Dialogue Summaries
    • Conclusion
  • 7. Question Answering
    • Building a Review-Based QA System
      • The Dataset
      • Extracting Answers from Text
        • Span classification
        • Tokenizing text for QA
        • Dealing with long passages
      • Using Haystack to Build a QA Pipeline
        • Initializing a document store
        • Initializing a retriever
        • Initializing a reader
        • Putting it all together
    • Improving Our QA Pipeline
      • Evaluating the Retriever
        • Dense Passage Retrieval
      • Evaluating the Reader
      • Domain Adaptation
      • Evaluating the Whole QA Pipeline
    • Going Beyond Extractive QA
    • Conclusion
  • 8. Making Transformers Efficient in Production
    • Intent Detection as a Case Study
    • Creating a Performance Benchmark
    • Making Models Smaller via Knowledge Distillation
      • Knowledge Distillation for Fine-Tuning
      • Knowledge Distillation for Pretraining
      • Creating a Knowledge Distillation Trainer
      • Choosing a Good Student Initialization
      • Finding Good Hyperparameters with Optuna
      • Benchmarking Our Distilled Model
    • Making Models Faster with Quantization
    • Benchmarking Our Quantized Model
    • Optimizing Inference with ONNX and the ONNX Runtime
    • Making Models Sparser with Weight Pruning
      • Sparsity in Deep Neural Networks
      • Weight Pruning Methods
        • Magnitude pruning
        • Movement pruning
    • Conclusion
  • 9. Dealing with Few to No Labels
    • Building a GitHub Issues Tagger
      • Getting the Data
      • Preparing the Data
      • Creating Training Sets
      • Creating Training Slices
    • Implementing a Naive Bayesline
    • Working with No Labeled Data
    • Working with a Few Labels
      • Data Augmentation
      • Using Embeddings as a Lookup Table
      • Fine-Tuning a Vanilla Transformer
      • In-Context and Few-Shot Learning with Prompts
    • Leveraging Unlabeled Data
      • Fine-Tuning a Language Model
      • Fine-Tuning a Classifier
      • Advanced Methods
        • Unsupervised data augmentation
        • Uncertainty-aware self-training
    • Conclusion
  • 10. Training Transformers from Scratch
    • Large Datasets and Where to Find Them
      • Challenges of Building a Large-Scale Corpus
      • Building a Custom Code Dataset
        • Creating a dataset with Google BigQuery
      • Working with Large Datasets
        • Memory mapping
        • Streaming
      • Adding Datasets to the Hugging Face Hub
    • Building a Tokenizer
      • The Tokenizer Model
      • Measuring Tokenizer Performance
      • A Tokenizer for Python
      • Training a Tokenizer
      • Saving a Custom Tokenizer on the Hub
    • Training a Model from Scratch
      • A Tale of Pretraining Objectives
        • Causal language modeling
        • Masked language modeling
        • Sequence-to-sequence training
      • Initializing the Model
      • Implementing the Dataloader
      • Defining the Training Loop
      • The Training Run
    • Results and Analysis
    • Conclusion
  • 11. Future Directions
    • Scaling Transformers
      • Scaling Laws
      • Challenges with Scaling
      • Attention Please!
      • Sparse Attention
      • Linearized Attention
    • Going Beyond Text
      • Vision
        • iGPT
        • ViT
      • Tables
    • Multimodal Transformers
      • Speech-to-Text
      • Vision and Text
        • VQA
        • LayoutLM
        • DALL·E
        • CLIP
    • Where to from Here?
  • Index
