VQA: Asking Questions About Image Content
Published in North South University, 2022
Visual Question Answering (VQA) has piqued the interest of researchers in deep learning, computer vision, and natural language processing, a relatively recent challenge in these fields. VQA requires an algorithm to respond to text-based inquiries about pictures. Because many open-ended answers comprise a few words or a closed set of options in a multiple choice format, VQA adapts itself to automated review. We used the combination of algorithms CNN (for the Image part) and LSTM/GRU/CNN (for the linguistic part), and a second algorithm which uses Vision Transformer (ViT)(for the Image part) and BERT(for the linguistic part).
Project Demo
Download Capstone Poster here
Supervisor: Dr. Mohammad Ashrafuzzman Khan
Authors: S M Gazzali Arafat Nishan, Ashfaq Uddin Ahmed,Tashfia Tabassum