The goal of this repo is twofold.

If you have any questions about our code or model, don't hesitate to contact us or to submit any issues. Pull requests are welcome!

#### Summary:
* [Introduction](#introduction)
  * [What is the task about?](#what-is-the-task-about)
  * [Quick insight about our method](#quick-insight-about-our-method)
* [Installation](#installation)
  * [Requirements](#requirements)
  * [Submodules](#submodules)
  * [Data](#data)
* [Reproducing results](#reproducing-results)
  * [Features](#features)
  * [Pretrained models](#pretrained-models)
* [Documentation](#documentation)
  * [Architecture](#architecture)
  * [Options](#options)
  * [Datasets](#datasets)
  * [Models](#models)
* [Quick examples](#quick-examples)
  * [Extract features from COCO](#extract-features-from-coco)
  * [Train models on VQA](#train-models-on-vqa)
  * [Monitor training](#monitor-training)
  * [Restart training](#restart-training)
  * [Evaluate models on VQA](#evaluate-models-on-vqa)
* [Acknowledgment](#acknowledgment)

## Introduction
### What is the task about?
One of our claims is that the multimodal fusion between the image and the question representations is a key component of the model. In particular, our model relies on (a rough fusion sketch is given right after this list):
- our proposed Mutan (based on a Tucker Decomposition) for the fusion scheme,
- an attention scheme with two "glimpses".
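
To make the fusion idea concrete, here is a minimal, simplified sketch of a rank-constrained (Tucker/Mutan-style) bilinear fusion in PyTorch. It only illustrates the principle, not the exact module shipped in this repo; the dimensions and the rank are made-up values.

```python
import torch
import torch.nn as nn

class MutanLikeFusion(nn.Module):
    """Rank-R bilinear fusion between a question vector q and an image vector v.
    Illustration only: the actual Mutan module has more machinery
    (dropout, normalisation, attention)."""
    def __init__(self, dim_q=2400, dim_v=2048, dim_out=510, rank=5):
        super(MutanLikeFusion, self).__init__()
        # One projection per rank and per modality (the rank constraint on the Tucker core).
        self.q_proj = nn.ModuleList([nn.Linear(dim_q, dim_out) for _ in range(rank)])
        self.v_proj = nn.ModuleList([nn.Linear(dim_v, dim_out) for _ in range(rank)])

    def forward(self, q, v):
        # Sum over ranks of the element-wise product of the projected modalities.
        z = 0
        for wq, wv in zip(self.q_proj, self.v_proj):
            z = z + torch.tanh(wq(q)) * torch.tanh(wv(v))
        return z

fusion = MutanLikeFusion()
q = torch.randn(4, 2400)  # e.g. a skip-thought question embedding
v = torch.randn(4, 2048)  # e.g. a pooled ResNet-152 image feature
print(fusion(q, v).shape)  # torch.Size([4, 510])
```

The rank trades off the expressiveness of the bilinear interaction against the number of parameters, which is the core idea behind the Tucker-based fusion.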
## Installation

### Requirements
First install Python 3 (we don't provide support for Python 2). We advise you to install Python 3 and PyTorch with Anaconda:

- [python with anaconda](https://www.continuum.io/downloads)

```
cd vqa.pytorch
pip install -r requirements.txt
```
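
Once the requirements are installed, a quick optional sanity check (plain PyTorch calls, nothing specific to this repo) is:

```python
import torch

print(torch.__version__)          # the installed PyTorch version
print(torch.cuda.is_available())  # True if the CUDA build can see a GPU
```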
### Submodules

Our code has two external dependencies:
- [VQA](https://github.com/Cadene/VQA) is used to evaluate results files on the valset with the OpenEnded accuracy,
- [skip-thoughts.torch](https://github.com/Cadene/skip-thoughts.torch) is used to import pretrained GRUs and embeddings.

### Data

Data will be automatically downloaded and preprocessed when needed. Links to data are stored in `vqa/datasets/vqa.py` and `vqa/datasets/coco.py`.

## Reproducing results

### Features

As we first developed on Lua/Torch7, we used the features of [Resnet-152 pretrained with Torch7](https://github.com/facebook/fb.resnet.torch). We plan to port the model to PyTorch as well. Meanwhile, you can download the precomputed features.
/!\ Notice that we've tried the features of [Resnet-152 pretrained with pytorch](https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py) and got lower results.
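
For reference, here is a rough torchvision sketch of the kind of pooled ResNet-152 feature extraction discussed above. It is not the Torch7 pipeline used for the reported results, and the image path, input size and preprocessing values are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# ImageNet normalisation; the exact preprocessing used for the released features may differ.
preprocess = transforms.Compose([
    transforms.Resize((448, 448)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

resnet = models.resnet152(pretrained=True)
extractor = nn.Sequential(*list(resnet.children())[:-1])  # drop the final fc layer
extractor.eval()

img = preprocess(Image.open('data/coco/example.jpg').convert('RGB')).unsqueeze(0)  # hypothetical path
with torch.no_grad():
    feat = extractor(img).flatten(1)  # pooled feature of shape (1, 2048)
print(feat.shape)
```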

### Pretrained models

We currently provide three models trained with our old Torch7 code and ported to PyTorch.

To obtain test and testdev results, you will need to zip your result json file (name it `results.zip`) and submit it on the [evaluation server](https://competitions.codalab.org/competitions/6961).
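
The zipping step can also be done from Python with the standard library; `results.json` below is just a placeholder for whatever json file your evaluation run produced.

```python
import zipfile

# 'results.json' is a placeholder name: use the json produced by your evaluation run.
with zipfile.ZipFile('results.zip', 'w', zipfile.ZIP_DEFLATED) as zf:
    zf.write('results.json')
```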

## Documentation

### Architecture

```
.
├── train.py # train & eval models
├── eval_res.py # eval results files with OpenEnded metric
├── extract.py # extract features from coco with CNNs
```

### Options

There are three kinds of options.
You can easily add new options in your custom yaml file if needed. Also, if you want to grid search a parameter, you can add an ArgumentParser option and modify the dictionary in `train.py:L80`.
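
The general pattern, default yaml options overridden by a custom yaml file and then by command-line arguments, looks roughly like the sketch below. It is not the exact code in `train.py`; `--learning_rate` and the nested option keys are made-up examples.

```python
import argparse
import yaml

parser = argparse.ArgumentParser()
parser.add_argument('--path_opt', default='options/default.yaml')
parser.add_argument('--learning_rate', type=float, default=None)  # example grid-searchable option
args = parser.parse_args()

with open('options/default.yaml') as f:
    options = yaml.safe_load(f)        # defaults
with open(args.path_opt) as f:
    options.update(yaml.safe_load(f))  # custom yaml overrides defaults (shallow merge for brevity)

# Command-line values override the yaml values when they are provided.
if args.learning_rate is not None:
    options.setdefault('optim', {})['lr'] = args.learning_rate  # 'optim'/'lr' keys are illustrative

print(options)
```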

## Quick examples

### Train models on VQA

Display the help message and the selected options, and run the default. The needed data will be automatically downloaded and processed using the options in `options/default.yaml`.

Run a MutanAtt model on the trainset and valset (by default) and run through the testset.

### Monitor training

Create a visualization of an experiment using `plotly` to monitor the training, just like the picture below (**click the image to access the html/js file**):

Note that you have to wait until the first OpenEnded accuracy has finished processing; the html file is then created and pops up in your default browser. The html is refreshed every 60 seconds, but you currently need to press F5 in your browser to see the change.

```
python visu.py --dir_logs logs/vqa/mutan_noatt
```
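
Under the hood, the visualization is simply an html file built with plotly's offline mode. A minimal sketch of that idea follows; the log path and the json keys are assumptions, not the actual `visu.py` internals.

```python
import json
import plotly.graph_objs as go
from plotly.offline import plot

# Hypothetical log format: a json file with a list of per-epoch OpenEnded accuracies.
with open('logs/vqa/mutan_noatt/logger.json') as f:  # path is an assumption
    logs = json.load(f)

acc = logs['val_acc']  # hypothetical key
trace = go.Scatter(x=list(range(1, len(acc) + 1)), y=acc,
                   mode='lines+markers', name='val OpenEnded accuracy')
plot({'data': [trace], 'layout': go.Layout(title='mutan_noatt')},
     filename='logs/vqa/mutan_noatt/view.html', auto_open=False)
```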
Create a visualization of multiple experiments to compare them or monitor them like the picture below (**click the image to access the html/js file**):

### Restart training

Restart the model from the best checkpoint.

```
python train.py --path_opt options/vqa/mutan_noatt.yaml --dir_logs logs/vqa/mutan_noatt --resume best
```
### Evaluate models on VQA
Evaluate the model from the best checkpoint. If your model has been trained on the training set only (`vqa_trainsplit=train`), the model will be evaluated on the valset and will run through the testset. If it was trained on the trainset + valset (`vqa_trainsplit=trainval`), it will not be evaluated on the valset.