Visual Discrimination Puzzles Website #

CLEVR VDP Dataset #

The CLEVR VDP dataset contains 825 visual discrimination puzzles, along with our solver’s performance on them. The pipeline uses a Mask R-CNN and an attribute network (both from the NS-VQA project) as the backend neural module.

GQA VDP Dataset #

The GQA VDP dataset consists of 5000 puzzles that were randomly generated from the GQA dataset. More information about the generation process can be found here.

Image Licenses #

The GQA images are part of the COCO dataset. More information about the respective licenses for these images can be found here.

Citations #

The following external tools were used for this project:

