Instructors: Ryan Feng, Haizhong Zheng
GSI: Elisa Tsai
Contact information: (rtfeng, hzzheng, eltsai) @umich.edu

Class times: MW 3:00 - 4:30 PM, DOW 2150
Office Hours (starting 1/11):
Ryan: Th 2:00 - 3:00 PM, BBB 2717
Haizhong: W 11:00 AM - 12:00 PM, BBB 2717
Elisa: Th 10:30 - 11:30 AM, BBB 1637
Syllabus: [Link]

Schedule

Schedule subject to change. Each entry lists the date, lecture topic, assigned papers, and items due.
Week 1
Wed Jan 4
Introduction / Adversarial Machine Learning
Background reading for more info:
Adversarial Classification (Dalvi et al.)
Evasion Attacks against Machine Learning at Test Time (Biggio et al.)
Intriguing Properties of Neural Networks (Szegedy et al.)
Explaining and Harnessing Adversarial Examples (Goodfellow et al.)
Week 2
Mon Jan 9
Adversarial Machine Learning
Background reading for more info:
Intriguing Properties of Neural Networks, (Szegedy et al.)
Explaining and Harnessing Adversarial Examples, (Goodfellow et al.)
Towards Deep Learning Models Resistant to Adversarial Attacks (Madry et al.)
Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks (Xu et al.)
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples (Athalye et al.)
PyTorch tutorial
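As a companion to the PyTorch tutorial and the Madry et al. reading, here is a minimal, illustrative sketch of an L-infinity projected gradient descent (PGD) attack in PyTorch. The model, the [0, 1] input range, and the hyperparameters (epsilon, step size, number of steps) are assumptions for the example, not values prescribed by the course.

```python
# Minimal L-infinity PGD attack sketch (in the spirit of Madry et al.).
# Assumptions: `model` returns logits, inputs `x` lie in [0, 1], and the
# epsilon / alpha / steps defaults below are illustrative placeholders.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
    """Return adversarial examples within an L-inf ball of radius epsilon around x."""
    # Random start inside the epsilon ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Gradient ascent step, then projection back onto the epsilon ball and [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon).clamp(0, 1)
    return x_adv.detach()
```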
Week 2
Wed Jan 11
Model Stealing / Poisoning
Background reading for more info:
Stealing Machine Learning Models via Prediction APIs (Tramer et al.)
Knockoff Nets: Stealing Functionality of Black-Box Models (Orekondy et al.)
Data-Free Model Extraction (Truong et al.)
Towards Data-Free Model Stealing in a Hard Label Setting (Sanyal et al.)
BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain (Gu et al.)
Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning (Chen et al.)
Input-Aware Dynamic Backdoor Attack (Nguyen and Tran)
Anti-Backdoor Learning: Training Clean Models on Poisoned Data (Li et al.)
BackdoorBench: A Comprehensive Benchmark of Backdoor Learning (Wu et al.)
Week 3
Mon Jan 16
No class - MLK Day
Week 3
Wed Jan 18
Poisoning (cont.) / Privacy
Background reading for more info:
Membership Inference Attacks against Machine Learning Models (Shokri et al.)
Deep Learning with Differential Privacy (Abadi et al.)
Extracting Training Data from Large Language Models (Carlini et al.)
Paper Preferences Form due on Wed Jan 18

Homework 1 due on Fri Jan 20
Week 4
Mon Jan 23
Project Discussion
Week 4
Wed Jan 25
Deepfake Lecture / Early AML Papers
Background reading for more info for the deepfake lecture:
Progressive Growing of GANs for Improved Quality, Stability, and Variation (Karras et al.)
A Style-Based Generator Architecture for Generative Adversarial Networks (Karras et al.)
Leveraging Frequency Analysis for Deep Fake Image Recognition (Frank et al.)
Evading Deepfake-Image Detectors with White- and Black-Box Attacks (Carlini and Farid)

Paper Presentation:
(2) Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks (Papernot et al.)
Week 4 Paper Review
Week 5
Mon Jan 30
Adversarial Attacks
(3) Towards Evaluating the Robustness of Neural Networks (Carlini and Wagner)
(4) Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples (Athalye et al.)
Week 5 Paper Review

Homework 2 due Feb 1
Week 5
Wed Feb 1
Adversarial Attacks / Adversarial Training I
(5) Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks (Croce and Hein)
(6) Theoretically Principled Trade-off between Robustness and Accuracy (Zhang et al.)
Week 6
Mon Feb 6
Efficient Adversarial Training
(7) Fast is better than free: Revisiting adversarial training (Wong et al.)
(8) Efficient Adversarial Training with Transferable Adversarial Examples (Zheng et al.)
Week 6 Paper Review

Project proposal due
Week 6
Wed Feb 8
Black-box Attacks
(9) HopSkipJumpAttack: A Query-Efficient Decision-Based Attack (Chen et al.)
(10) Black-box Adversarial Attacks with Limited Queries and Information (Ilyas et al.)
Week 7
Mon Feb 13
Physical Attacks
(11) Synthesizing Robust Adversarial Examples (Athalye et al.)
(12) GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems (Feng et al.)
Week 7 Paper Review
Week 7
Wed Feb 15
Certified Defenses
(13) Certified Adversarial Robustness via Randomized Smoothing (Cohen et al.)
(14) (Certified!!) Adversarial Robustness for Free! (Carlini et al.)
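For intuition on the Cohen et al. paper above, the sketch below illustrates only the prediction step of randomized smoothing: classify many Gaussian-noised copies of the input and take a majority vote. The base classifier, noise level sigma, and sample count are assumptions for illustration; the paper's full procedure also includes a hypothesis test with abstention and a certified-radius computation, both omitted here.

```python
# Illustrative randomized-smoothing prediction (majority vote over Gaussian noise),
# omitting Cohen et al.'s abstention test and certified-radius computation.
# Assumptions: `base_classifier` maps a batch of images to logits; sigma and
# n_samples below are placeholders.
import torch

@torch.no_grad()
def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, batch_size=100):
    """Majority-vote prediction of the smoothed classifier for one input x of shape (C, H, W)."""
    counts = None
    remaining = n_samples
    while remaining > 0:
        b = min(batch_size, remaining)
        # Add i.i.d. Gaussian noise to b copies of the input and classify them.
        noisy = x.unsqueeze(0).repeat(b, 1, 1, 1) + sigma * torch.randn(b, *x.shape)
        logits = base_classifier(noisy)
        batch_counts = torch.bincount(logits.argmax(dim=1), minlength=logits.shape[1])
        counts = batch_counts if counts is None else counts + batch_counts
        remaining -= b
    return counts.argmax().item()  # predicted class index of the smoothed classifier
```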
Week 8
Mon Feb 20
Patch Attacks / Defenses
(15) Adversarial Patch (Brown et al.)
(16) PatchGuard: A Provably Robust Defense against Adversarial Patches via Small Receptive Fields and Masking (Xiang et al.)
Week 8 Paper Review
Week 8
Wed Feb 22
Automatic Speech Recognition
(17) Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition (Qin et al.)
(18) Towards More Robust Keyword Spotting for Voice Assistants (Ahmed et al.)

Mon Feb 27 and Wed Mar 1
No class - Spring Break
Week 9
Mon Mar 6
Theory
(19) Robustness May Be at Odds with Accuracy (Tsipras et al.)
(20) Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them (Tramer)
Week 9 Paper Review
Week 9
Wed Mar 8
Theory / Diffusion
(21) Adversarial Examples Are Not Bugs, They Are Features (Ilyas et al.)
(22) Diffusion Models for Adversarial Purification (Nie et al.)
Week 10
Mon Mar 13
Model Stealing Attacks
(23) Entangled Watermarks as a Defense against Model Extraction (Jia et al.)
(24) Dataset Inference: Ownership Resolution in Machine Learning (Maini et al.)
Week 10 Paper Review
Week 10
Wed Mar 15
Poisoning I
(25) Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks (Shafahi et al.)
(26) Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks (Schwarzschild et al.)
Week 11
Mon Mar 20
Poisoning II
(27) Spectral Signatures in Backdoor Attacks (Tran et al.)
(28) Adversarial Neuron Pruning Purifies Backdoored Deep Models (Wu and Wang)
Week 11 Paper Review
Midterm report due
Week 11
Wed Mar 22
Fairness
(29) Fairness Without Demographics in Repeated Loss Minimization (Hashimoto et al.)
(30) Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (Buolamwini and Gebru)
Week 12
Mon Mar 27
Privacy I
(31) Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning (Hitaj et al.)
(32) Label-Only Membership Inference Attacks (Choquette-Choo et al.)
Week 12 Paper Review
Week 12
Wed Mar 29
Privacy II
(33) Extracting Training Data from Large Language Models (Carlini et al.)
(34) Scalable Private Learning with PATE (Papernot et al.)
Week 13
Mon Apr 3
Unlearning / Research vs. Industry
(35) Machine Unlearning (Bourtoule et al.)
(36) "Real Attackers Don't Compute Gradients": Bridging the Gap Between Adversarial ML Research and Practice (Apruzzese et al.)
Week 13 Paper Review
Week 13
Wed Apr 5
LLM Security
(37) A Watermark for Large Language Models (Kirchenbauer et al.)
(38) Can AI-Generated Text be Reliably Detected? (Sadasivan et al.)
Week 14
Mon Apr 10
Guest Lecture: Neal Mangaokar
Week 14
Wed Apr 12
Final Project Presentations
Final project presentation due
Week 15
Mon Apr 17
Final Project Presentations
Final Project Report due (date TBD)

Acknowledgements

  • The instructors gratefully acknowledge the National Science Foundation for supporting and inspiring the preparation of this course material. This material is based upon work supported by the National Science Foundation under Grant No. 2039445. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.