A toolkit for vision-language processing to support the increasing popularity of multi-modal transformer-based models
Updated Oct 30, 2022 - HTML