Visual Learning and Recognition

Weakly Supervised Object Localization

Implements a weakly supervised object detector on the PASCAL VOC 2007 dataset, trained using only image-level labels and no bounding-box annotations.
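The page does not say which architecture is used, so the following is only a minimal sketch of one common way to train a detector from image-level labels: a WSDDN-style two-stream scoring over region proposals. The function names, the toy logits, and the choice of a two-stream head are all assumptions made for illustration. Each region gets a class distribution (softmax over classes) and each class gets a region distribution (softmax over regions); their product, summed over regions, gives an image-level score that can be trained with binary cross-entropy against the image labels.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def two_stream_scores(cls_logits, det_logits):
    """Combine per-region logits, each shaped [num_regions][num_classes].

    Hypothetical helper: returns (region_scores, image_scores).
    """
    num_regions = len(cls_logits)
    num_classes = len(cls_logits[0])
    # Classification stream: softmax over classes, separately for each region.
    cls_probs = [softmax(row) for row in cls_logits]
    # Detection stream: softmax over regions, separately for each class.
    det_probs = [
        softmax([det_logits[r][c] for r in range(num_regions)])
        for c in range(num_classes)
    ]  # indexed as det_probs[class][region]
    region_scores = [
        [cls_probs[r][c] * det_probs[c][r] for c in range(num_classes)]
        for r in range(num_regions)
    ]
    # Summing over regions yields one score per class, each in [0, 1].
    image_scores = [
        sum(region_scores[r][c] for r in range(num_regions))
        for c in range(num_classes)
    ]
    return region_scores, image_scores


def image_level_bce(image_scores, labels, eps=1e-7):
    """Binary cross-entropy between image-level scores and 0/1 image labels."""
    loss = 0.0
    for score, label in zip(image_scores, labels):
        score = min(max(score, eps), 1.0 - eps)
        loss -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return loss / len(labels)


# Toy example: 3 region proposals, 2 classes (made-up logits).
CLS_LOGITS = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
DET_LOGITS = [[3.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
region_scores, image_scores = two_stream_scores(CLS_LOGITS, DET_LOGITS)
loss = image_level_bce(image_scores, labels=[1, 1])
```

At test time, the highest-scoring regions for a class in `region_scores` serve as that class's detections, even though no box was ever used as a training target.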

Visual Question Answering

In Visual Question Answering (VQA), given an image and a question about it, the goal is to select the correct answer from a large pool of candidate answers. We implement a transformer-based architecture that uses a pre-trained ResNet18 to featurize the input image and a pre-trained RoBERTa to featurize the question text.
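To make the "select an answer from a pool" formulation concrete, here is a minimal sketch of answer selection as classification. It is not the architecture described above: the feature vectors are tiny hand-written stand-ins for ResNet18 and RoBERTa outputs, and the single-head cross-attention, the concatenation-based fusion, and all names (`cross_attend`, `select_answer`, `answer_weights`) are hypothetical choices for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query, keys, values):
    """Single-head scaled dot-product attention of one query over many keys."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in keys])
    dim = len(values[0])
    attended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return attended, weights


def select_answer(question_vec, image_feats, answer_weights, answers):
    """Fuse question and attended image features, then classify over the pool."""
    attended, _ = cross_attend(question_vec, image_feats, image_feats)
    fused = question_vec + attended  # list concatenation, not addition
    logits = [dot(w, fused) for w in answer_weights]
    probs = softmax(logits)
    best = max(range(len(answers)), key=lambda i: probs[i])
    return answers[best], probs


# Toy inputs (assumed): a 2-d question vector and two 2-d image region features.
question_vec = [1.0, 0.0]
image_feats = [[5.0, 0.0], [0.0, 5.0]]
answers = ["red", "blue"]
# One weight row per candidate answer over the 4-d fused vector.
answer_weights = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
answer, probs = select_answer(question_vec, image_feats, answer_weights, answers)
```

Treating VQA as classification over a fixed answer pool means the output layer has one logit per candidate answer, and training reduces to cross-entropy against the ground-truth answer index.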

Generative Adversarial Networks

Trained GANs with several losses (vanilla, least squares, and Wasserstein with gradient penalty) on the CUB 2011 dataset to generate realistic-looking bird images.
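The three GAN variants listed on this page differ mainly in their loss functions. The sketch below writes out the standard textbook forms of those losses on plain lists of discriminator outputs. It is not taken from the project code: the function names are hypothetical, the vanilla generator loss is assumed to be the non-saturating variant, the least squares targets are assumed to be 1 for real and 0 for fake, and the gradient norms for the penalty are passed in as numbers because computing them requires an autodiff framework. The default `gp_weight=10.0` is a commonly used value, not one confirmed by the source.

```python
import math


def mean(values):
    return sum(values) / len(values)


def softplus(x):
    """Stable log(1 + exp(x)); note -log(sigmoid(x)) == softplus(-x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Vanilla GAN: binary cross-entropy on discriminator logits.
def vanilla_d_loss(real_logits, fake_logits):
    return mean([softplus(-r) for r in real_logits]) + mean(
        [softplus(f) for f in fake_logits]
    )


def vanilla_g_loss(fake_logits):
    # Non-saturating form: the generator maximizes log D(G(z)).
    return mean([softplus(-f) for f in fake_logits])


# Least Squares GAN: squared error to targets 1 (real) and 0 (fake).
def lsgan_d_loss(real_scores, fake_scores):
    return 0.5 * mean([(r - 1.0) ** 2 for r in real_scores]) + 0.5 * mean(
        [f ** 2 for f in fake_scores]
    )


def lsgan_g_loss(fake_scores):
    return 0.5 * mean([(f - 1.0) ** 2 for f in fake_scores])


# Wasserstein GAN with gradient penalty: unbounded critic scores plus a
# penalty pushing the critic's gradient norm (at points interpolated
# between real and fake samples) towards 1.
def wgan_gp_d_loss(real_scores, fake_scores, grad_norms, gp_weight=10.0):
    penalty = mean([(g - 1.0) ** 2 for g in grad_norms])
    return mean(fake_scores) - mean(real_scores) + gp_weight * penalty


def wgan_g_loss(fake_scores):
    return -mean(fake_scores)
```

For example, a discriminator that outputs a logit of 0 for every sample (maximally unsure) gives `vanilla_d_loss([0.0], [0.0])` equal to `2 * log(2)`, while a critic with unit gradient norm incurs no penalty in `wgan_gp_d_loss`.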

Ninaad Damis

Generative Adversarial Networks

Vanilla GAN

Least Squares GAN

Wasserstein GAN with Gradient Penalty
