
Commit 4d59bc1

Add visu and Improve README
1 parent f1a6ccf commit 4d59bc1

7 files changed (+470 -20 lines)

.gitignore (+1 -1)

@@ -19,4 +19,4 @@ __pycache__/
 ._.DS_Store
 
 .ipynb_checkpoints
-*.html
+
README.md (+69 -19)
@@ -8,6 +8,31 @@ The goal of this repo is two folds:
 
 If you have any questions about our code or model, don't hesitate to contact us or to submit any issues. Pull requests are welcome!
 
+#### Summary:
+
+* [Introduction](#introduction)
+    * [What is the task about?](#what-is-the-task-about)
+    * [Quick insight about our method](#quick-insight-about-our-method)
+* [Installation](#installation)
+    * [Requirements](#requirements)
+    * [Submodules](#submodules)
+    * [Data](#data)
+* [Reproducing results](#reproducing-results)
+    * [Features](#features)
+    * [Pretrained models](#pretrained-models)
+* [Documentation](#documentation)
+    * [Architecture](#architecture)
+    * [Options](#options)
+    * [Datasets](#datasets)
+    * [Models](#models)
+* [Quick examples](#quick-examples)
+    * [Extract features from COCO](#extract-features-from-coco)
+    * [Train models on VQA](#train-models-on-vqa)
+    * [Monitor training](#monitor-training)
+    * [Restart training](#restart-training)
+    * [Evaluate models on VQA](#evaluate-models-on-vqa)
+* [Acknowledgment](#acknowledgment)
+
 ## Introduction
 
 ### What is the task about?
@@ -46,12 +71,10 @@ One of our claim is that the multimodal fusion between the image and the questio
 - our proposed Mutan (based on a Tucker Decomposition) for the fusion scheme,
 - an attention scheme with two "glimpses".
 
-## Using this code
+## Installation
 
 ### Requirements
 
-#### Installation
-
 First install python 3 (we don't provide support for python 2). We advise you to install python 3 and pytorch with Anaconda:
 
 - [python with anaconda](https://www.continuum.io/downloads)
@@ -72,21 +95,21 @@ cd vqa.pytorch
 pip install -r requirements.txt
 ```
 
-#### Submodules
+### Submodules
 
 Our code has two external dependencies:
 
 - [VQA](https://github.com/Cadene/VQA) is used to evaluate results files on the valset with the OpenEnded accuracy,
 - [skip-thoughts.torch](https://github.com/Cadene/skip-thoughts.torch) is used to import pretrained GRUs and embeddings.
 
-#### Data
+### Data
 
 Data will be automatically downloaded and preprocessed when needed. Links to data are stored in `vqa/datasets/vqa.py` and `vqa/datasets/coco.py`.
 
 
-### Reproducing results
+## Reproducing results
 
-#### Features
+### Features
 
 As we first developed on Lua/Torch7, we used the features of [Resnet-152 pretrained with Torch7](https://github.com/facebook/fb.resnet.torch). We plan to port the model to pytorch as well. Meanwhile, you can download the features as follows:
@@ -103,7 +126,7 @@ wget https://data.lip6.fr/coco/testset.txt
 
 /!\ Notice that we've tried the features of [Resnet-152 pretrained with pytorch](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) and got lower results.
 
-#### Pretrained models
+### Pretrained models
 
 We currently provide three models trained with our old Torch7 code and ported to Pytorch:
@@ -129,9 +152,9 @@ python train.py -e --path_opt options/vqa/mutan_att_trainval.yaml --resume ckpt
 
 To obtain test and testdev results, you will need to zip your results json file (name it `results.zip`) and submit it on the [evaluation server](https://competitions.codalab.org/competitions/6961).
 
-### Documentation
+## Documentation
 
-#### Architecture
+### Architecture
 
 ```
 .
@@ -150,10 +173,10 @@ To obtain test and testdev results, you will need to zip your result json file (
 ├── train.py     # train & eval models
 ├── eval_res.py  # eval results files with OpenEnded metric
 ├── extract.py   # extract features from coco with CNNs
-└── visu.ipynb   # visualizing logs (under development)
+└── visu.py      # visualize logs and monitor training
 ```
 
-#### Options
+### Options
 
 There are three kinds of options:
@@ -163,7 +186,7 @@ There are three kind of options:
 
 You can easily add new options in your custom yaml file if needed. Also, if you want to grid search a parameter, you can add an ArgumentParser option and modify the dictionary in `train.py:L80` (see the sketch below).
 
-#### Datasets
+### Datasets
 
 We currently provide three datasets:
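The grid-search note above follows a common pattern: load the yaml options, then let command-line flags override individual entries. A generic, hedged sketch of that pattern (the extra flag and the yaml keys are hypothetical, and this is not the repo's actual `train.py` code):

```
# Generic sketch of overriding yaml options from the command line.
# The --learning_rate flag and the options["optim"]["lr"] key are hypothetical.
import argparse

import yaml

parser = argparse.ArgumentParser()
parser.add_argument("--path_opt", default="options/default.yaml")
parser.add_argument("--learning_rate", type=float, default=None)  # flag added for a grid search
args = parser.parse_args()

with open(args.path_opt) as f:
    options = yaml.safe_load(f)

# Command-line values take precedence over the yaml file when provided.
if args.learning_rate is not None:
    options["optim"]["lr"] = args.learning_rate
```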

@@ -177,7 +200,7 @@ We plan to add:
 - [VQA2](http://www.visualqa.org/)
 - [CLEVR](http://cs.stanford.edu/people/jcjohns/clevr/)
 
-#### Models
+### Models
 
 We currently provide four models:
@@ -188,9 +211,9 @@ We currently provide four models:
 
 We plan to add several other strategies in the future.
 
-### Quick examples
+## Quick examples
 
-#### Extract features from COCO
+### Extract features from COCO
 
 The needed images will be automatically downloaded to `dir_data` and the features will be extracted with a resnet152 by default.
@@ -221,7 +244,7 @@ CUDA_VISIBLE_DEVICES=0 python extract.py
 CUDA_VISIBLE_DEVICES=1,2 python extract.py
 ```
 
-#### Train models on VQA
+### Train models on VQA
 
 Display the help message and the selected options, then run with the defaults. The needed data will be automatically downloaded and processed using the options in `options/default.yaml`.
@@ -249,8 +272,35 @@ Run a MutanAtt model on the trainset and valset (by default) and run throw the t
 python train.py --vqa_trainsplit trainval --path_opt options/vqa/mutan_att.yaml
 ```
 
+### Monitor training
+
+Create a visualization of an experiment using `plotly` to monitor the training, just like the picture below (**click the image to access the html/js file**):
+
+<p align="center">
+<a href="https://rawgit.com/Cadene/vqa.pytorch/master/doc/mutan_noatt.html">
+<img src="https://raw.githubusercontent.com/Cadene/vqa.pytorch/master/doc/mutan_noatt.png" width="600"/>
+</a>
+</p>
+
+Note that you have to wait until the first OpenEnded accuracy has been computed; the html file is then created and opens in your default browser. It is regenerated every 60 seconds, but you currently need to press F5 in your browser to see the changes.
+
+```
+python visu.py --dir_logs logs/vqa/mutan_noatt
+```
+
+Create a visualization of multiple experiments, to compare or monitor them, like the picture below (**click the image to access the html/js file**):
+
+<p align="center">
+<a href="https://rawgit.com/Cadene/vqa.pytorch/master/doc/mutan_noatt_vs_att.html">
+<img src="https://raw.githubusercontent.com/Cadene/vqa.pytorch/master/doc/mutan_noatt_vs_att.png" width="600"/>
+</a>
+</p>
+
+```
+python visu.py --dir_logs logs/vqa/mutan_noatt,logs/vqa/mutan_att
+```
 
-#### Restart training
+### Restart training
 
 Restart the model from the last checkpoint.
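The new Monitor training section above relies on `plotly` writing a standalone html file that is regenerated every 60 seconds. As a rough, hedged sketch of how such a script could work (this is not the `visu.py` added in this commit; the log file name and metric keys are hypothetical):

```
# Hedged sketch only, not the repo's visu.py. The log path and the metric
# keys ("train_epoch.loss", "val_epoch.acc1") are hypothetical.
import json
import time

import plotly.graph_objs as go
from plotly.offline import plot

LOG_FILE = "logs/vqa/mutan_noatt/logger.json"  # hypothetical per-epoch metrics
HTML_FILE = "logs/vqa/mutan_noatt/visu.html"

first = True
while True:
    with open(LOG_FILE) as f:
        logs = json.load(f)

    traces = [
        go.Scatter(y=logs["train_epoch.loss"], name="train loss"),
        go.Scatter(y=logs["val_epoch.acc1"], name="val acc@1"),
    ]
    # plot() writes a self-contained html file; auto_open pops it in the
    # default browser only the first time it is generated.
    plot(go.Figure(data=traces), filename=HTML_FILE, auto_open=first)
    first = False

    time.sleep(60)  # regenerate every 60 seconds
```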

@@ -264,7 +314,7 @@ Restart the model from the best checkpoint.
 python train.py --path_opt options/vqa/mutan_noatt.yaml --dir_logs logs/vqa/mutan_noatt --resume best
 ```
 
-#### Evaluate models on VQA
+### Evaluate models on VQA
 
 Evaluate the model from the best checkpoint. If your model has been trained on the training set only (`vqa_trainsplit=train`), the model will be evaluated on the valset and will be run through the testset. If it was trained on the trainset + valset (`vqa_trainsplit=trainval`), it will not be evaluated on the valset.
doc/mutan_noatt.html (+75, large diff not rendered)

doc/mutan_noatt.png (31.6 KB)

doc/mutan_noatt_vs_att.html (+75, large diff not rendered)

doc/mutan_noatt_vs_att.png (42.8 KB)
