Hands-On Differential Privacy - Helion


Authors: Ethan Cowan, Michael Shoemate, Mayana Pereira
ISBN: 9781492097709
Pages: 362, Format: ebook
Publication date: 2024-05-16
Bookstore: Helion

Book price: 220,67 zł (previously: 275,84 zł)
You save: 20% (-55,17 zł)


Many organizations today analyze and share large, sensitive datasets about individuals. Whether these datasets cover healthcare details, financial records, or exam scores, it's become more difficult for organizations to protect an individual's information through deidentification, anonymization, and other traditional statistical disclosure limitation techniques. This practical book explains how differential privacy (DP) can help.

Authors Ethan Cowan, Michael Shoemate, and Mayana Pereira explain how these techniques enable data scientists, researchers, and programmers to run statistical analyses that hide the contribution of any single individual. You'll dive into basic DP concepts and understand how to use open source tools to create differentially private statistics, explore how to assess the utility/privacy trade-offs, and learn how to integrate differential privacy into workflows.

With this book, you'll learn:

- How DP guarantees privacy when other data anonymization methods don't
- What preserving individual privacy in a dataset entails
- How to apply DP in several real-world scenarios and datasets
- Potential privacy attack methods, including what it means to perform a reidentification attack
- How to use the OpenDP library in privacy-preserving data releases
- How to interpret guarantees provided by specific DP data releases


Table of Contents

Hands-On Differential Privacy eBook -- table of contents

Preface
- The Structure of This Book
  - Part 1: Differential Privacy Concepts
  - Part 2: Differential Privacy in Practice
  - Part 3: Deploying Differential Privacy
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
I. Differential Privacy Concepts
1. Welcome to Differential Privacy
- History
- Privatization Before Differential Privacy
- Case Study: Applying DP in a Classroom
  - Privacy and the Mean
  - How Could This Be Prevented?
    - Randomized response
    - Adding noise
- Adjacent Data Sets: What If Someone Else Had Dropped the Class?
- Sensitivity: How Much Can the Statistic Change?
- Adding Noise
  - What Is a Trusted Curator?
- Available Tools
- Summary
- Exercises
2. Differential Privacy Fundamentals
- Intuitive Privacy
  - Privacy Unit
  - Privacy Loss
- Formalizing the Concept of Differential Privacy
  - Randomized Response
    - Randomized response is ε-DP
  - Privacy Violation
- Models of Differential Privacy
- Sensitivity
- Differentially Private Mechanisms
  - Laplace Mechanism
  - The Laplace Mechanism Is ε-DP
  - Mechanism Accuracy
  - Most Common Family Type Among Students
  - Exponential Mechanism
- Composition
- Postprocessing Immunity
- Implementing Differentially Private Queries with SmartNoise
  - Example 1: Differentially Private Counts
  - Example 2: Differentially Private Sum
  - Example 3: Multiple Queries from a Single Database
- Summary
- Exercises
3. Stable Transformations
- Distance Metrics
  - Data Set Adjacency
  - Bounded Versus Unbounded Differential Privacy
- Definition of a c-Stable Transformation
  - Transformation: Double
  - Transformation: Row-by-Row
- Stability Is a Necessary and Sufficient Condition for Sensitivity
  - Transformation: Count
  - Transformation: Unknown-Size Sum
- Domain Descriptors
  - Transformation: Data Clipping
- Chaining
- Metric Spaces
- Definition of Stability
  - Transformation: Known-Size Sum
  - Transformation: Known-Size Mean
  - Transformation: Unknown-Size Mean
  - Transformation: Resize
  - Recap of Scalar Aggregators
- Vector-Valued Aggregators
  - Vector Norm, Distance, and Sensitivity
  - Aggregating Data with Bounded Norm
  - Grouped Data
- In Practice
- Summary
- Exercises
4. Private Mechanisms
- Privacy Measure
  - Privacy Measure: Max-Divergence
  - Metric Versus Divergence Versus Privacy Measure
- Private Mechanisms
  - Randomized Response
  - The Vector Laplace Mechanism
    - Discrete Laplace mechanism (geometric mechanism)
  - Exponential Mechanism
  - Quantile Score Transformation
    - Finite support exponential mechanism
    - Piecewise-Constant Support Exponential Mechanism
    - Top-K
  - Report Noisy Max Mechanisms
    - Alternative distributions
- Interactivity
- Above Threshold
  - Streams
  - Online Private Selection
  - Stable Transformations on Streams
- Summary
- Exercises
5. Definitions of Privacy
- The Privacy Loss Random Variable
- Approximate Differential Privacy
  - Truncated Noise Mechanisms
  - Propose-Test-Release
  - (Advanced) Composition
- The Gaussian Mechanism
- Rényi Differential Privacy
  - Zero-Concentrated Differential Privacy (zCDP)
  - Strength of Moments-Based Privacy Measures
- Bounded Range
- Privacy Loss Distributions
  - Numerical Composition
  - Characteristic Functions
- Hypothesis Testing Interpretation
  - f-differential privacy
- Summary
- Exercises
6. Fearless Combinators
- Chaining
  - Example: Bounds Estimation
  - Example: B-Tree
- Privacy Measure Conversion
- Composition
  - Adaptivity
  - Odometers and Filters
- Partitioned Data
  - Example: Grouping on Asylum Seeker Data
  - Parallel Composition
  - Example: Multi-Quantiles
- Privacy Amplification
  - Privacy Amplification by Simple Random Sampling
  - Privacy Amplification by Poisson Sampling
  - Privacy Amplification by Shuffling
- Sample and Aggregate
- Private Selection from Private Candidates
  - Example: k-Means
- Summary
- Exercises
II. Differential Privacy in Practice
7. Eyes on the Privacy Unit
- Levels of Privacy
  - User-Level Privacy in Practice
- Browser Logs Example: A Naive Event-Level Guarantee
- Data Sets with Unbounded Contributions
  - Statistics with Constant Sensitivity
- Data Set Truncation
  - Reservoir Sampling
  - Truncation on Partitioned Data
  - Hospital Visits Example: A Bias-Variance Trade-Off
    - Scenario 1: Truncation threshold matches the true maximum influence of any single individual in the data
    - Scenario 2: Maximum allowed number of visits per patient overestimates the true maximum influence of any single individual in the data
    - Scenario 3: Maximum allowed number of visits per patient underestimates the true maximum influence of any single individual in the data
- Privately Estimating the Truncation Threshold
  - Further Analysis with Unbounded Contributions
- Unknown Domain
- When to Apply Truncation
  - Stable Grouping Transformations
  - Stable Union Transformations
  - Stable Join Transformations
- Summary
- Exercises
8. Differentially Private Statistical Modeling
- Private Inference
- Differentially Private Linear Regression
  - Sufficient Statistics Perturbation
  - Private Theil-Sen Estimator
  - Objective Function Perturbation
- Algorithm Selection
- Differentially Private Naive Bayes
  - Categorical Naive Bayes
  - Continuous Naive Bayes
  - Mechanism Design
  - Example: Naive Bayes
- Tree-Based Algorithms
- Summary
- Exercises
9. Differentially Private Machine Learning
- Why Make Machine Learning Models Differentially Private?
- Machine Learning Terminology Recap
- Differentially Private Gradient Descent (DP-GD)
  - Example: Minimum Viable DP-GD
- Stochastic Batching (DP-SGD)
  - Parallel Composition
  - Privacy Amplification by Subsampling
  - Hyperparameter Tuning
    - Public holdout
    - Private selection from private candidates
- Private Aggregations of Teacher Ensembles
- Training Differentially Private Models with PyTorch
  - Example: Predicting Income Privately
- Summary
- Exercises
10. Differentially Private Synthetic Data
- Defining Synthetic Data
  - Types of Synthetic Data
- Practical Scenarios for Synthetic Data Usage
- Marginal-Based Synthesizers
  - Multiplicative Weights Update Rule with the Exponential Mechanism
- Graphical Models
  - PrivBayes
- GAN Synthesizers
  - Potential Problems
- Summary
- Exercises
III. Deploying Differential Privacy
11. Protecting Your Data Against Privacy Attacks
- Definition of a Privacy Violation
- Attacks on Tabular Data Sets
  - Record Linkage
    - Example: teacher survey
  - Singling Out
  - Differencing Attack
  - Reconstruction via Systems of Equations
  - Tracing
  - k-Anonymity Vulnerabilities
- Attacks on Machine Learning
- Summary
- Exercises
12. Defining Privacy Loss Parameters of a Data Release
- Sampling
- Metadata Parameters
- Allocating Privacy Loss Budget
- Practices That Aid Decision-Making
  - Codebook and Data Annotation
  - Translating Contextual Norms into Parameters
- Making These Decisions in the Context of Exploratory Data Analysis
- Adaptively Choosing Privacy Parameters
- Potential (Unexpected) Consequences of Transparent Parameter Selection
- Summary
- Exercises
13. Planning Your First DP Project
- DP Deployment Considerations
  - Frequency of DP Deployments
  - Composition and Budget Accountability
- DP Deployment Checklist
- An Example Project: Back to the Classroom
- Proper Real-World Data Publications
  - LinkedIn's Economic Graph
  - Microsoft's Broadband Data
- DP Release Table: A Standard for Releasing Details About Your Release
- That's All, Folks
Further Reading
- Theory
- Applications
A. Supplementary Definitions
B. Rényi Differential Privacy
- Theorem: RDP Is Immune to Postprocessing
  - Proof
- Theorem: Young's Inequality
  - Proof via Calculus
- Elementary Proof
- Theorem: Hölder's Inequality
  - Proof
- Theorem: Probability Preservation
  - Proof
- Theorem: RDP to (ε, δ)-DP
  - Proof
C. The Exponential Mechanism Satisfies Bounded Range
- Proof
D. Structured Query Language (SQL)
E. Composition Proofs
- Theorem: Basic Sequential Composition
  - Proof
- Theorem: General Sequential Composition
  - Proof
- Theorem: Parallel Composition
  - Proof
- Theorem: Immunity to Postprocessing
  - Proof
F. Machine Learning
- Supervised Versus Unsupervised Learning
- Gradient Descent
  - Using Gradient Descent to Learn Parameters
  - Stochastic Gradient Descent
G. Where to Find Solutions
Index