Hands-On Differential Privacy - Helion
ISBN: 9781492097709
Pages: 362, Format: ebook
Publication date: 2024-05-16
Bookstore: Helion
Book price: 220,67 zł (previously: 275,84 zł)
You save: 20% (-55,17 zł)
Many organizations today analyze and share large, sensitive datasets about individuals. Whether these datasets cover healthcare details, financial records, or exam scores, it's become more difficult for organizations to protect an individual's information through deidentification, anonymization, and other traditional statistical disclosure limitation techniques. This practical book explains how differential privacy (DP) can help.
Authors Ethan Cowan, Michael Shoemate, and Mayana Pereira explain how these techniques enable data scientists, researchers, and programmers to run statistical analyses that hide the contribution of any single individual. You'll dive into basic DP concepts and understand how to use open source tools to create differentially private statistics, explore how to assess the utility/privacy trade-offs, and learn how to integrate differential privacy into workflows.
With this book, you'll learn:
- How DP guarantees privacy when other data anonymization methods don't
- What preserving individual privacy in a dataset entails
- How to apply DP in several real-world scenarios and datasets
- Potential privacy attack methods, including what it means to perform a reidentification attack
- How to use the OpenDP library in privacy-preserving data releases
- How to interpret guarantees provided by specific DP data releases
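To make the idea of hiding any single individual's contribution concrete, here is a minimal sketch of a differentially private count using Laplace noise. It is an illustration only, not code from the book and not the OpenDP API: the function names `laplace_noise` and `dp_count` and the exam scores are made up, and it assumes each person contributes at most one record, so the count has sensitivity 1. Naive floating-point sampling like this has known weaknesses, so a vetted library should be used for real releases.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution centered at 0 with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    Assumption: adding or removing one person changes the true count by
    at most 1 (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only to make the demo repeatable
    scores = [55, 72, 91, 64, 88, 79]  # hypothetical exam scores
    noisy = dp_count(scores, lambda s: s >= 70, epsilon=1.0, rng=rng)
    print(f"noisy count of scores >= 70: {noisy:.2f}")
```

A smaller `epsilon` gives stronger privacy and a larger noise scale, which is the utility/privacy trade-off the description refers to.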
Table of Contents
- Preface
- The Structure of This Book
- Part 1: Differential Privacy Concepts
- Part 2: Differential Privacy in Practice
- Part 3: Deploying Differential Privacy
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- I. Differential Privacy Concepts
- 1. Welcome to Differential Privacy
- History
- Privatization Before Differential Privacy
- Case Study: Applying DP in a Classroom
- Privacy and the Mean
- How Could This Be Prevented?
- Randomized response
- Adding noise
- Adjacent Data Sets: What If Someone Else Had Dropped the Class?
- Sensitivity: How Much Can the Statistic Change?
- Adding Noise
- What Is a Trusted Curator?
- Available Tools
- Summary
- Exercises
- 2. Differential Privacy Fundamentals
- Intuitive Privacy
- Privacy Unit
- Privacy Loss
- Formalizing the Concept of Differential Privacy
- Randomized Response
- Randomized response is ε-DP
- Privacy Violation
- Models of Differential Privacy
- Sensitivity
- Differentially Private Mechanisms
- Laplace Mechanism
- The Laplace Mechanism Is ε-DP
- Mechanism Accuracy
- Most Common Family Type Among Students
- Exponential Mechanism
- Composition
- Postprocessing Immunity
- Implementing Differentially Private Queries with SmartNoise
- Example 1: Differentially Private Counts
- Example 2: Differentially Private Sum
- Example 3: Multiple Queries from a Single Database
- Summary
- Exercises
- 3. Stable Transformations
- Distance Metrics
- Data Set Adjacency
- Bounded Versus Unbounded Differential Privacy
- Definition of a c-Stable Transformation
- Transformation: Double
- Transformation: Row-by-Row
- Stability Is a Necessary and Sufficient Condition for Sensitivity
- Transformation: Count
- Transformation: Unknown-Size Sum
- Domain Descriptors
- Transformation: Data Clipping
- Chaining
- Metric Spaces
- Definition of Stability
- Transformation: Known-Size Sum
- Transformation: Known-Size Mean
- Transformation: Unknown-Size Mean
- Transformation: Resize
- Recap of Scalar Aggregators
- Vector-Valued Aggregators
- Vector Norm, Distance, and Sensitivity
- Aggregating Data with Bounded Norm
- Grouped Data
- In Practice
- Summary
- Exercises
- 4. Private Mechanisms
- Privacy Measure
- Privacy Measure: Max-Divergence
- Metric Versus Divergence Versus Privacy Measure
- Private Mechanisms
- Randomized Response
- The Vector Laplace Mechanism
- Discrete Laplace mechanism (geometric mechanism)
- Exponential Mechanism
- Quantile Score Transformation
- Finite support exponential mechanism
- Piecewise-Constant Support Exponential Mechanism
- Top-K
- Report Noisy Max Mechanisms
- Alternative distributions
- Interactivity
- Above Threshold
- Streams
- Online Private Selection
- Stable Transformations on Streams
- Summary
- Exercises
- 5. Definitions of Privacy
- The Privacy Loss Random Variable
- Approximate Differential Privacy
- Truncated Noise Mechanisms
- Propose-Test-Release
- (Advanced) Composition
- The Gaussian Mechanism
- Rényi Differential Privacy
- Zero-Concentrated Differential Privacy (zCDP)
- Strength of Moments-Based Privacy Measures
- Bounded Range
- Privacy Loss Distributions
- Numerical Composition
- Characteristic Functions
- Hypothesis Testing Interpretation
- f-differential privacy
- Summary
- Exercises
- 6. Fearless Combinators
- Chaining
- Example: Bounds Estimation
- Example: B-Tree
- Privacy Measure Conversion
- Composition
- Adaptivity
- Odometers and Filters
- Partitioned Data
- Example: Grouping on Asylum Seeker Data
- Parallel Composition
- Example: Multi-Quantiles
- Privacy Amplification
- Privacy Amplification by Simple Random Sampling
- Privacy Amplification by Poisson Sampling
- Privacy Amplification by Shuffling
- Sample and Aggregate
- Private Selection from Private Candidates
- Example: k-Means
- Summary
- Exercises
- II. Differential Privacy in Practice
- 7. Eyes on the Privacy Unit
- Levels of Privacy
- User-Level Privacy in Practice
- Browser Logs Example: A Naive Event-Level Guarantee
- Data Sets with Unbounded Contributions
- Statistics with Constant Sensitivity
- Data Set Truncation
- Reservoir Sampling
- Truncation on Partitioned Data
- Hospital Visits Example: A Bias-Variance Trade-Off
- Scenario 1: Truncation threshold matches the true maximum influence of any single individual in the data
- Scenario 2: Maximum allowed number of visits per patient overestimates the true maximum influence of any single individual in the data
- Scenario 3: Maximum allowed number of visits per patient underestimates the true maximum influence of any single individual in the data
- Privately Estimating the Truncation Threshold
- Further Analysis with Unbounded Contributions
- Unknown Domain
- When to Apply Truncation
- Stable Grouping Transformations
- Stable Union Transformations
- Stable Join Transformations
- Summary
- Exercises
- 8. Differentially Private Statistical Modeling
- Private Inference
- Differentially Private Linear Regression
- Sufficient Statistics Perturbation
- Private Theil-Sen Estimator
- Objective Function Perturbation
- Algorithm Selection
- Differentially Private Naive Bayes
- Categorical Naive Bayes
- Continuous Naive Bayes
- Mechanism Design
- Example: Naive Bayes
- Tree-Based Algorithms
- Summary
- Exercises
- 9. Differentially Private Machine Learning
- Why Make Machine Learning Models Differentially Private?
- Machine Learning Terminology Recap
- Differentially Private Gradient Descent (DP-GD)
- Example: Minimum Viable DP-GD
- Stochastic Batching (DP-SGD)
- Parallel Composition
- Privacy Amplification by Subsampling
- Hyperparameter Tuning
- Public holdout
- Private selection from private candidates
- Private Aggregations of Teacher Ensembles
- Training Differentially Private Models with PyTorch
- Example: Predicting Income Privately
- Summary
- Exercises
- 10. Differentially Private Synthetic Data
- Defining Synthetic Data
- Types of Synthetic Data
- Practical Scenarios for Synthetic Data Usage
- Marginal-Based Synthesizers
- Multiplicative Weights Update Rule with the Exponential Mechanism
- Graphical Models
- PrivBayes
- GAN Synthesizers
- Potential Problems
- Summary
- Exercises
- III. Deploying Differential Privacy
- 11. Protecting Your Data Against Privacy Attacks
- Definition of a Privacy Violation
- Attacks on Tabular Data Sets
- Record Linkage
- Example: teacher survey
- Singling Out
- Differencing Attack
- Reconstruction via Systems of Equations
- Tracing
- k-Anonymity Vulnerabilities
- Attacks on Machine Learning
- Summary
- Exercises
- 12. Defining Privacy Loss Parameters of a Data Release
- Sampling
- Metadata Parameters
- Allocating Privacy Loss Budget
- Practices That Aid Decision-Making
- Codebook and Data Annotation
- Translating Contextual Norms into Parameters
- Making These Decisions in the Context of Exploratory Data Analysis
- Adaptively Choosing Privacy Parameters
- Potential (Unexpected) Consequences of Transparent Parameter Selection
- Summary
- Exercises
- 13. Planning Your First DP Project
- DP Deployment Considerations
- Frequency of DP Deployments
- Composition and Budget Accountability
- DP Deployment Checklist
- An Example Project: Back to the Classroom
- Proper Real-World Data Publications
- LinkedIn's Economic Graph
- Microsoft's Broadband Data
- DP Release Table: A Standard for Releasing Details About Your Release
- That's All, Folks
- Further Reading
- Theory
- Applications
- A. Supplementary Definitions
- B. Rényi Differential Privacy
- Theorem: RDP Is Immune to Postprocessing
- Proof
- Theorem: Young's Inequality
- Proof via Calculus
- Elementary Proof
- Theorem: Hölder's Inequality
- Proof
- Theorem: Probability Preservation
- Proof
- Theorem: RDP to (ε, δ)-DP
- Proof
- C. The Exponential Mechanism Satisfies Bounded Range
- Proof
- D. Structured Query Language (SQL)
- E. Composition Proofs
- Theorem: Basic Sequential Composition
- Proof
- Theorem: General Sequential Composition
- Proof
- Theorem: Parallel Composition
- Proof
- Theorem: Immunity to Postprocessing
- Proof
- F. Machine Learning
- Supervised Versus Unsupervised Learning
- Gradient Descent
- Using Gradient Descent to Learn Parameters
- Stochastic Gradient Descent
- G. Where to Find Solutions
- Index