Gunjan Aggarwal

I am a second-year graduate student in the department of Computer Science at Georgia Institute of Technology. My interests span a broad range of sub-fields in Computer Science, including Deep Learning, Machine Learning, Computer Vision, and Natural Language Processing. I have worked on various projects and internships involving Computer Vision and NLP tasks. I am currently pursuing graduate research under Professors Devi Parikh and Dhruv Batra, working on problems related to multi-modal AI.

In the summer of 2022, I interned with the Adobe Applied Science ML team, working on real-time generation of temporally consistent videos for face makeup transfer.

Prior to pursuing my Master's, I was a Software Development Engineer-2 at Adobe, where I worked on Adobe's Chat Application and explored various Computer Vision projects. I joined Adobe after graduating from BITS Pilani, where I majored in Computer Science.

In my free time, I like to take part in adventure sports and appreciate the unexplored beauty of nature through trekking.


LinkedIn  /  Resume  /  Google Scholar


ZSON: Zero-Shot Object-Goal Navigation using Multimodal Goal Embeddings

Paper accepted at NeurIPS 2022.

Proposed a zero-shot approach for object-goal navigation by encoding goal images into a multi-modal, semantic embedding space.

Achieved a 4-20% improvement on the object-goal navigation task over state-of-the-art methods.

Showed the importance of using a self-supervised pre-trained visual encoder for zero-shot transfer.


Dance2Music: Automatic Dance-driven Music Generation

Paper accepted as a workshop paper at NeurIPS 2021.
Project Page

Proposed an approach to generate music conditioned on dance in real-time.

Designed an offline approach to generate a paired dance and music dataset which was then used to train a deep neural network.

Human subjects preferred our approach over the baselines.


Neuro-Symbolic Generative Art: A Preliminary Study

Paper accepted as a short paper at ICCC 2020.
Project Page

Proposed a new hybrid genre of art: neuro-symbolic generative art (NSG).

Trained a progressive GAN on a dataset collected from a symbolic art approach.

Evaluated the creativity of NSG vs the creativity of the original symbolic data through human studies. Human subjects rated the NSG art as more creative than the symbolic art.


On the Benefits of Models with Perceptually-Aligned Gradients

Paper accepted as a workshop paper at ICLR 2020.

Explored the benefits of adversarial training for neural networks.

Adversarial training with a small epsilon improved the model's performance on downstream tasks.

Showed improved performance on domain adaptation tasks, such as SVHN to MNIST transfer, and on the weakly supervised object localization task.


cFineGAN: Unsupervised multi-conditional fine-grained image generation

Paper accepted as a workshop paper at NeurIPS 2019.

Developed a multi-conditional image generation pipeline in an unsupervised way using a hierarchical GAN framework.

Given a shape image and a texture image, the pipeline generates an output that preserves the shape of the former and the texture of the latter.

Used shape-biased and standard ImageNet pre-trained ResNet50 models to identify the shape and texture characteristics of the inputs, respectively.

The work was also selected as one of the top 11 projects to be showcased live on stage in front of 15,000 people at Adobe MAX SNEAKS, 2019 - Video link.


Unsupervised Domain Adaptation

Used FixMatch consistency to achieve a 4% improvement over the state-of-the-art approach for Unsupervised Domain Adaptation from SVHN to MNIST.


Sentiment Analysis using Deep Learning

Applied deep learning to perform sentiment analysis over different Indian languages.

Experimented with different optimizers, such as Adam and Momentum, and with different network architectures, such as CNN and RNN.

Achieved 85% and 83% mean validation accuracy with CNN and RNN respectively over different languages.

Extended the project to contrast the impacts of product-centric and social cause marketing advertisements on users by analyzing their comments.


Knowledge Extraction

Analyzed the UCI Student Performance dataset and classified student grades using several models, such as KNN, decision trees, and SVM.

Applied different pre-processing techniques to the Census Income dataset, classified income using Logistic Regression, and computed the correlation between different features.


Textual Search Engine

Implemented sentence tokenization, normalization, inverted index construction, and wildcard query processing for document retrieval on the Reuters Corpus.
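The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names, the regex-based sentence splitter, the three toy documents, and the choice to answer wildcard queries by scanning the vocabulary with `fnmatchcase` are all assumptions made for the example. A larger system would more likely use a permuterm or k-gram index instead of a linear scan.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def split_sentences(text):
    """Naive sentence tokenizer: split on whitespace that follows ., ! or ?."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def normalize(sentence):
    """Lowercase the sentence and keep runs of letters and digits as terms."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def build_inverted_index(docs):
    """Map each normalized term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for sentence in split_sentences(text):
            for term in normalize(sentence):
                index[term].add(doc_id)
    return index


def wildcard_search(index, pattern):
    """Return sorted ids of documents with a term matching the pattern.

    '*' matches any run of characters and '?' matches one character.
    Every vocabulary term is checked (a linear scan, assumed for brevity).
    """
    pattern = pattern.lower()
    hits = set()
    for term, postings in index.items():
        if fnmatchcase(term, pattern):
            hits |= postings
    return sorted(hits)


# Hypothetical toy documents standing in for corpus articles.
docs = {
    1: "Oil prices rose sharply. Traders expect more gains.",
    2: "The central bank raised interest rates!",
    3: "Grain traders reported weak exports. Prices fell.",
}
index = build_inverted_index(docs)
print(wildcard_search(index, "trad*"))
print(wildcard_search(index, "pric?s"))
```

The index stores only document ids per term, which is enough for Boolean retrieval; ranking or phrase queries would need term frequencies or positions as well.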



Selected as one of 11 presenters Adobe-wide to present my work on multi-conditional image generation at Adobe MAX SNEAKS, 2019 - Video link.


Google Code Jam

Achieved a global rank of 27 in "Code Jam to I/O for Women" and got invited to attend Google I/O, 2018.


ML Intern at Adobe, San Jose | June 2022 - August 2022

Worked on real-time generation of temporally consistent videos for face makeup transfer.

Integrated blind video temporal consistency to create paired video data using videos from image-based models.

Incorporated Face Mesh to improve lip segmentation, and trained a Pix2Pix generative model and a ConvGRU-based recurrent model to achieve superior qualitative and quantitative performance (2.5% increase in color consistency).


Graduate Research Assistant, Visual Intelligence Lab, Georgia Tech | Aug 2021 - Present

Working under the supervision of Prof. Devi Parikh and Prof. Dhruv Batra.

Exploring the creative applications of Deep Learning.


Software Development Engineer-2 at Adobe, India | July 2018-August 2021

Built the chatbot framework for the Adobe Messaging platform from scratch, starting with Microsoft LUIS and Rasa, and moving on to designing an in-house multi-lingual intent classifier using embeddings from Google's Universal Sentence Encoder (USE) model. The chatbot serves ~20,000 customers daily.

Applied the HDBSCAN clustering algorithm to embeddings of low-confidence user utterances to identify new intents.

Worked on a PoC for a zero-shot pipeline for user intent identification using a pre-trained BART model, which removed the need to retrain the model for each new intent.


Master of Science in Computer Science

Georgia Institute of Technology, Atlanta | Specialization in Machine Learning.

Thesis advised by Prof. Devi Parikh.

Expected Graduation: May 2023.


Bachelor of Engineering (Hons.) in Computer Science

BITS Pilani | Aug 2014 - July 2018.
