EarlyFusion-on-EasyVQA

This repository contains the Streamlit demo for Episode 1 of the Vision Language Modelling Series by "Donkey Stereotype by PrithiviDa".

YouTube: Video Link

Original Reference: Training Notebook

Dataset: Training and Testing Dataset

Demo: Host Link

The test_samples directory contains some images to try with the demo. Their corresponding questions are in questions.txt. If you are new to all this, pick an image and a question from the directory and play around.

Note: The model demonstrated here is the EarlyFusion one from the video.
