
Commit a402db0

Initial commit
1 parent 3ab9c1f commit a402db0

1,244 files changed: +219,931 −0 lines changed


DVERGE/.gitignore

+6
@@ -0,0 +1,6 @@
data/
*__pycache__
results/
runs/
checkpoints/
.cph*

DVERGE/README.md

+41
@@ -0,0 +1,41 @@
# DVERGE

This repository contains code for reproducing our NeurIPS 2020 paper ["DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles"](https://papers.nips.cc/paper/2020/hash/3ad7c2ebb96fcba7cda0cf54a2e802f5-Abstract.html).

# Dependencies

Create the conda environment called `dverge` containing all the dependencies by running

```
conda env create -f environment.yml
```

We used PyTorch 1.4.0 for all the experiments. You may need to install a different version of PyTorch depending on the CUDA version of your machine/server.
The code was run and tested on a single TITAN Xp GPU; running on multiple GPUs with parallelism may require adjustments.

# Data and pre-trained models

The pre-trained models and the generated black-box transfer adversarial examples can be accessed via [this link](https://drive.google.com/drive/folders/1i96Bk_bCWXhb7afSNp1t3woNjO1kAMDH?usp=sharing). Specifically, the pre-trained models are stored in the folder named `checkpoints`. Download that folder and put `checkpoints` under this repo.

The black-box transfer adversarial examples (refer to the paper for more details) are stored in `transfer_adv_examples.zip`. Make a folder named `data` under this repo, then download the zip file, unzip it, and put the extracted folder `transfer_adv_examples/` under `data/`. One can then evaluate the black-box transfer robustness of the ensembles.
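
For reference, the expected layout after downloading (assuming the default folder names above) is:

```
DVERGE/
├── checkpoints/
└── data/
    └── transfer_adv_examples/
```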

# Usage

Examples of training and evaluation scripts can be found in `scripts/training.sh` and `scripts/evaluation.sh`.

Note that, for now, we extract models' intermediate features in a very naive way that may only support the ResNet20 architecture. One can implement more robust feature extraction with PyTorch forward hooks, as sketched below.
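
For illustration only, here is a minimal sketch of capturing intermediate features with forward hooks instead of a hard-coded `get_features`; the helper name and the layer names passed in are assumptions and must match the actual architecture:

```python
import torch.nn as nn


def register_feature_hooks(model: nn.Module, layer_names):
    """Attach forward hooks that record the outputs of the named submodules."""
    features, handles = {}, []
    modules = dict(model.named_modules())
    for name in layer_names:
        def hook(module, inputs, output, name=name):
            features[name] = output
        handles.append(modules[name].register_forward_hook(hook))
    return features, handles


# Hypothetical usage with a ResNet-style model:
# features, handles = register_feature_hooks(model, ['layer1', 'layer2', 'layer3'])
# _ = model(images)               # the forward pass fills `features`
# for h in handles: h.remove()    # detach the hooks when done
```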

Also, you may observe high variance in results when training DVERGE, which we suspect is due to the random layer sampling for distillation. Please refer to **Appendix C.5** of the paper for a discussion of the layer effects.

# Decision region plot

We have received many questions regarding the decision region plot in Figure 1. To understand how it works, a neat working example can be found in the "What is happening with these robust models?" section of [this fantastic tutorial](https://adversarial-ml-tutorial.org/adversarial_training/). Our code is adapted from that example; the only difference is that while they plot the loss, we plot the model's decision/predicted class. Our code can be found [here](https://drive.google.com/file/d/1KNoQGTXm3g_RBwE0a6IkrlSks4Wez_tN/view). It is pretty messy, but the essential part starts at line 177. When plotting Figure 1, we use `args.steps=1000` and `args.vmax=0.1`, which means that we perturb along each direction by a maximum distance of `0.1`, and along each direction we sample `1000` perturbations and record the model's decision on each corresponding perturbed sample. In total, we sample `1000*1000` data points to make each plot in Figure 1.
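
As a rough illustrative sketch (not the released plotting script), the grid of decisions could be built as follows; `model`, the base `image`, and the two perturbation directions `d1`/`d2` are assumed to be defined elsewhere, and the exact sampling range may differ from what the actual script does:

```python
import torch


@torch.no_grad()
def decision_region_grid(model, image, d1, d2, vmax=0.1, steps=1000):
    """Predicted class at each point of a (steps x steps) grid spanned by d1 and d2."""
    coords = torch.linspace(0., vmax, steps)   # perturbation magnitudes along each direction
    grid = torch.empty(steps, steps, dtype=torch.long)
    for i, a in enumerate(coords):
        # One row at a time: fix the step along d1, sweep every step along d2.
        batch = image.unsqueeze(0) + a * d1 + coords.view(-1, 1, 1, 1) * d2
        grid[i] = model(batch.clamp(0., 1.)).argmax(dim=1).cpu()
    return grid  # visualize with, e.g., matplotlib's imshow
```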

# Reference

If you find our paper/this repo useful for your research, please consider citing our work.

```
@article{yang2020dverge,
  title={DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles},
  author={Yang, Huanrui and Zhang, Jingyang and Dong, Hongliang and Inkawhich, Nathan and Gardner, Andrew and Touchet, Andrew and Wilkes, Wesley and Berry, Heath and Li, Hai},
  journal={Advances in Neural Information Processing Systems},
  volume={33},
  year={2020}
}
```

# Acknowledgement

The training code for [ADP](https://arxiv.org/pdf/1901.08846.pdf) (Adaptive Diversity Promoting regularizer) is adapted from [the official repo](https://github.com/P2333/Adaptive-Diversity-Promoting), which was originally written in TensorFlow; we ported it to PyTorch here.

DVERGE/__init__.py

Whitespace-only changes.

DVERGE/arguments.py

+175
@@ -0,0 +1,175 @@
# MODEL OPTS
def model_args(parser):
    group = parser.add_argument_group('Model', 'Arguments control Model')
    group.add_argument('--arch', default='ResNet', type=str, choices=['ResNet'],
                       help='model architecture')
    group.add_argument('--depth', default=20, type=int,
                       help='depth of the model')
    group.add_argument('--model-num', default=3, type=int,
                       help='number of submodels within the ensemble')
    group.add_argument('--model-file', default=None, type=str,
                       help='Path to the file that contains model checkpoints')
    group.add_argument('--gpu', default='0', type=str,
                       help='gpu id')
    group.add_argument('--seed', default=0, type=int,
                       help='random seed for torch')
    group.add_argument("--batch_size", default=20, type=int, help="batch_size as an integer")
    group.add_argument("--config_idx", default=101, type=int, help="experiment config index")


# DATALOADING OPTS
def data_args(parser):
    group = parser.add_argument_group('Data', 'Arguments control Data and loading for training')
    group.add_argument('--data-dir', type=str, default='./data',
                       help='Dataset directory')
    group.add_argument('--batch-size', type=int, default=128,
                       help='batch size of the train loader')


# BASE TRAINING ARGS
def base_train_args(parser):
    group = parser.add_argument_group('Base Training', 'Base arguments to configure training')
    group.add_argument('--epochs', default=200, type=int,
                       help='number of training epochs')
    group.add_argument('--lr', default=0.1, type=float,
                       help='learning rate')
    group.add_argument('--sch-intervals', nargs='*', default=[100, 150], type=int,
                       help='learning scheduler milestones')
    group.add_argument('--lr-gamma', default=0.1, type=float,
                       help='learning rate decay ratio')


# DVERGE TRAINING ARGS
def dverge_train_args(parser):
    group = parser.add_argument_group('DVERGE Training', 'Arguments to configure DVERGE training')
    group.add_argument('--distill-eps', default=0.07, type=float,
                       help='perturbation budget for distillation')
    group.add_argument('--distill-alpha', default=0.007, type=float,
                       help='step size for distillation')
    group.add_argument('--distill-steps', default=10, type=int,
                       help='number of steps for distillation')
    group.add_argument('--distill-fixed-layer', default=False, action="store_true",
                       help='whether fixing the layer for distillation')
    group.add_argument('--distill-layer', default=20, type=int,
                       help='which layer is used for distillation, only useful when distill-fixed-layer is True')
    group.add_argument('--distill-rand-start', default=False, action="store_true",
                       help='whether use random start for distillation')
    group.add_argument('--distill-no-momentum', action="store_false", dest='distill_momentum',
                       help='whether use momentum for distillation')
    group.add_argument('--plus-adv', default=False, action="store_true",
                       help='whether perform adversarial training in the mean time with diversity training')
    group.add_argument('--dverge-coeff', default=1., type=float,
                       help='the coefficient to balance diversity training and adversarial training')
    group.add_argument('--start-from', default='baseline', type=str, choices=['baseline', 'scratch'],
                       help='starting point of the training')
    group.add_argument('--eps', default=8./255., type=float,
                       help='perturbation budget for adversarial training')
    group.add_argument('--alpha', default=2./255., type=float,
                       help='step size for adversarial training')
    group.add_argument('--steps', default=10, type=int,
                       help='number of steps for adversarial training')


# ADVERSARIAL TRAINING ARGS
def adv_train_args(parser):
    group = parser.add_argument_group('Adversarial Training', 'Arguments to configure adversarial training')
    group.add_argument('--eps', default=8./255., type=float,
                       help='perturbation budget for adversarial training')
    group.add_argument('--alpha', default=2./255., type=float,
                       help='step size for adversarial training')
    group.add_argument('--steps', default=10, type=int,
                       help='number of steps for adversarial training')


# ADP TRAINING ARGS
# https://arxiv.org/abs/1901.08846
def adp_train_args(parser):
    group = parser.add_argument_group('ADP Training', 'Arguments to configure ADP training')
    group.add_argument('--alpha', default=2.0, type=float,
                       help='coefficient for ensemble entropy')
    group.add_argument('--beta', default=0.5, type=float,
                       help='coefficient for log determinant')
    group.add_argument('--plus-adv', default=False, action="store_true",
                       help='whether perform adversarial training in the mean time with diversity training')
    group.add_argument('--adv-eps', default=8./255., type=float,
                       help='perturbation budget for adversarial training')
    group.add_argument('--adv-alpha', default=2./255., type=float,
                       help='step size for adversarial training')
    group.add_argument('--adv-steps', default=10, type=int,
                       help='number of steps for adversarial training')


# GAL TRAINING ARGS
# https://arxiv.org/pdf/1901.09981.pdf
def gal_train_args(parser):
    group = parser.add_argument_group('GAL Training', 'Arguments to configure GAL training')
    group.add_argument('--lambda', default=.5, type=float,
                       help='coefficient for coherence')
    group.add_argument('--plus-adv', default=False, action="store_true",
                       help='whether perform adversarial training in the mean time with diversity training')
    group.add_argument('--adv-eps', default=8./255., type=float,
                       help='perturbation budget for adversarial training')
    group.add_argument('--adv-alpha', default=2./255., type=float,
                       help='step size for adversarial training')
    group.add_argument('--adv-steps', default=10, type=int,
                       help='number of steps for adversarial training')


# WBOX EVALUATION ARGS
def wbox_eval_args(parser):
    group = parser.add_argument_group('White-box Evaluation', 'Arguments to configure evaluation of white-box robustness')
    group.add_argument('--subset-num', default=1000, type=int,
                       help='number of samples of the subset, will use the full test set if none')
    group.add_argument('--random-start', default=5, type=int,
                       help='number of random starts for PGD')
    group.add_argument('--steps', default=50, type=int,
                       help='number of steps for PGD')
    group.add_argument('--loss-fn', default='xent', type=str, choices=['xent', 'cw'],
                       help='which loss function to use')
    group.add_argument('--cw-conf', default=.1, type=float,
                       help='confidence for cw loss function')
    group.add_argument('--save-to-csv', action="store_true",
                       help='whether save the results to a csv file')
    group.add_argument('--overwrite', action="store_false", dest="append_out",
                       help='when saving results, whether use append mode')
    group.add_argument('--convergence-check', action="store_true",
                       help='whether perform sanity check to make sure the attack converges')


# BBOX TRANSFER EVALUATION ARGS
def bbox_eval_args(parser):
    group = parser.add_argument_group('Black-box Evaluation', 'Arguments to configure evaluation of black-box robustness')
    group.add_argument('--folder', default='transfer_adv_examples', type=str,
                       help='name of the folder that contains transfer adversarial examples')
    group.add_argument('--steps', default=100, type=int,
                       help='number of PGD steps for convergence check')
    group.add_argument('--which-ensemble', default='baseline', choices=['baseline', 'dverge', 'adp', 'gal'],
                       help='transfer from which ensemble')
    group.add_argument('--save-to-csv', action="store_true",
                       help='whether save the results to a csv file')
    group.add_argument('--overwrite', action="store_false", dest="append_out",
                       help='when saving results, whether use append mode')


# TRANSFERABILITY EVALUATION ARGS
def transf_eval_args(parser):
    group = parser.add_argument_group('Transferability Evaluation', 'Arguments to configure evaluation of transferability among submodels')
    group.add_argument('--subset-num', default=1000, type=int,
                       help='number of samples of the subset')
    group.add_argument('--random-start', default=5, type=int,
                       help='number of random starts for PGD')
    group.add_argument('--steps', default=50, type=int,
                       help='number of steps for PGD')
    group.add_argument('--save-to-file', action="store_true",
                       help='whether save the results to a file')


# DIVERSITY EVALUATION ARGS
def diversity_eval_args(parser):
    group = parser.add_argument_group('Diversity Evaluation', 'Arguments to configure evaluation of diversity of the ensemble')
    group.add_argument('--subset-num', default=1000, type=int,
                       help='number of samples of the subset')
    group.add_argument('--save-to-file', action="store_true",
                       help='whether save the results to a file')
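
Purely as an illustration of how these helpers compose (the exact combination below is an assumption; the repo's actual training and evaluation scripts define their own argument sets), each function attaches one argument group to a shared `argparse` parser:

```python
import argparse

import arguments

parser = argparse.ArgumentParser(description='DVERGE training (illustrative)')
arguments.model_args(parser)
arguments.data_args(parser)
arguments.base_train_args(parser)
arguments.dverge_train_args(parser)
args = parser.parse_args()

# Defaults from the groups above, e.g. 3 submodels, lr 0.1, distillation eps 0.07.
print(args.model_num, args.lr, args.distill_eps)
```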

DVERGE/distillation.py

+102
@@ -0,0 +1,102 @@
import torch
import torch.nn as nn
import torch.nn.functional as F


def gradient_wrt_input(model, inputs, targets, criterion=nn.CrossEntropyLoss()):
    inputs.requires_grad = True

    outputs = model(inputs)
    loss = criterion(outputs, targets)
    model.zero_grad()
    loss.backward()

    data_grad = inputs.grad.data
    return data_grad.clone().detach()


def gradient_wrt_feature(model, source_data, target_data, layer, before_relu, criterion=nn.MSELoss()):
    source_data.requires_grad = True

    out = model.get_features(x=source_data, layer=layer, before_relu=before_relu)
    target = model.get_features(x=target_data, layer=layer, before_relu=before_relu).data.clone().detach()

    loss = criterion(out, target)
    model.zero_grad()
    loss.backward()

    data_grad = source_data.grad.data
    return data_grad.clone().detach()


def Linf_PGD(model, dat, lbl, eps, alpha, steps, is_targeted=False, rand_start=True, momentum=False, mu=1, criterion=nn.CrossEntropyLoss()):
    x_nat = dat.clone().detach()
    x_adv = None
    if rand_start:
        x_adv = dat.clone().detach() + torch.FloatTensor(dat.shape).uniform_(-eps, eps).cuda()
    else:
        x_adv = dat.clone().detach()
    x_adv = torch.clamp(x_adv, 0., 1.)  # respect image bounds
    g = torch.zeros_like(x_adv)

    # Iteratively perturb the data
    for i in range(steps):
        # Calculate gradient w.r.t. the data
        grad = gradient_wrt_input(model, x_adv, lbl, criterion)
        with torch.no_grad():
            if momentum:
                # Compute sample-wise L1 norm of the gradient
                flat_grad = grad.view(grad.shape[0], -1)
                l1_grad = torch.norm(flat_grad, 1, dim=1)
                grad = grad / torch.clamp(l1_grad, min=1e-12).view(grad.shape[0], 1, 1, 1)
                # Accumulate the gradient
                new_grad = mu * g + grad  # calc new grad with momentum term
                g = new_grad
            else:
                new_grad = grad
            # Get the sign of the gradient
            sign_data_grad = new_grad.sign()
            if is_targeted:
                x_adv = x_adv - alpha * sign_data_grad  # perturb the data to MINIMIZE loss on tgt class
            else:
                x_adv = x_adv + alpha * sign_data_grad  # perturb the data to MAXIMIZE loss on gt class
            # Clip the perturbations w.r.t. the original data so we still satisfy l_infinity
            # x_adv = torch.clamp(x_adv, x_nat-eps, x_nat+eps)  # Tensor min/max not supported yet
            x_adv = torch.max(torch.min(x_adv, x_nat + eps), x_nat - eps)
            # Make sure we are still in bounds
            x_adv = torch.clamp(x_adv, 0., 1.)
    return x_adv.clone().detach()


def Linf_distillation(model, dat, target, eps, alpha, steps, layer, before_relu=True, mu=1, momentum=True, rand_start=False):
    x_nat = dat.clone().detach()
    x_adv = None
    if rand_start:
        x_adv = dat.clone().detach() + torch.FloatTensor(dat.shape).uniform_(-eps, eps).cuda()
    else:
        x_adv = dat.clone().detach()
    x_adv = torch.clamp(x_adv, 0., 1.)  # respect image bounds
    g = torch.zeros_like(x_adv)

    # Iteratively perturb the data
    for i in range(steps):
        # Calculate gradient w.r.t. the data
        grad = gradient_wrt_feature(model, x_adv, target, layer, before_relu)
        with torch.no_grad():
            if momentum:
                # Compute sample-wise L1 norm of the gradient
                flat_grad = grad.view(grad.shape[0], -1)
                l1_grad = torch.norm(flat_grad, 1, dim=1)
                grad = grad / torch.clamp(l1_grad, min=1e-12).view(grad.shape[0], 1, 1, 1)
                # Accumulate the gradient
                new_grad = mu * g + grad  # calc new grad with momentum term
                g = new_grad
            else:
                new_grad = grad
            x_adv = x_adv - alpha * new_grad.sign()  # perturb the data to MINIMIZE the feature distance to the target
            # Clip the perturbations w.r.t. the original data so we still satisfy l_infinity
            # x_adv = torch.clamp(x_adv, x_nat-eps, x_nat+eps)  # Tensor min/max not supported yet
            x_adv = torch.max(torch.min(x_adv, x_nat + eps), x_nat - eps)
            # Make sure we are still in bounds
            x_adv = torch.clamp(x_adv, 0., 1.)
    return x_adv.clone().detach()
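
Purely as an illustrative usage sketch of the two attacks above: `ensemble_member` (a trained ResNet20 on the GPU that exposes `get_features(x, layer, before_relu)`) is an assumption, as are the batch shapes; the actual training loop lives in the repo's training scripts:

```python
import torch

from distillation import Linf_PGD, Linf_distillation

x = torch.rand(8, 3, 32, 32).cuda()      # hypothetical CIFAR-10-sized batch
y = torch.randint(0, 10, (8,)).cuda()    # hypothetical labels

# Standard L_inf PGD adversarial examples against one submodel.
x_adv = Linf_PGD(ensemble_member, x, y, eps=8./255., alpha=2./255., steps=10)

# DVERGE-style feature distillation: perturb x so that its features at `layer`
# approach those of a different batch x_target.
x_target = torch.rand(8, 3, 32, 32).cuda()
x_distilled = Linf_distillation(ensemble_member, x, x_target,
                                eps=0.07, alpha=0.007, steps=10, layer=10)
```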

DVERGE/environment.yml

+13
@@ -0,0 +1,13 @@
name: dverge
channels:
  - defaults
dependencies:
  - python=3.7
  - pip=19.1.1
  - pip:
    - torch==1.4.0
    - torchvision==0.5.0
    - tensorboard==2.2.0
    - advertorch==0.2.2
    - tqdm==4.46.1
    - pandas==1.0.1
