This work aims to produce enriched semantic embeddings for action classes by combining knowledge sources from text and images.
- Scraping images for each action class from Google Images.
  See the code in the `google images` folder.
- Downloaded images are saved, according to the given keywords, in the `downloads` sub-folder.
- Using a pre-trained ResNet-101 as the feature extractor [1].
- Averaging the image representations to obtain one embedding per action class.
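The feature-extraction and averaging steps above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper names (`build_extractor`, `extract_feature`, `average_class_embedding`) are hypothetical, and it assumes `torchvision`'s pre-trained ResNet-101 with the final classification layer stripped off, which yields a 2048-d pooled feature per image.

```python
import numpy as np

def average_class_embedding(features):
    """Average a list of 1-D per-image feature vectors into one class embedding."""
    return np.stack(features, axis=0).mean(axis=0)

def build_extractor():
    """Hypothetical helper: pre-trained ResNet-101 with the final FC layer
    removed, so a forward pass yields a pooled 2048-d feature per image."""
    import torch
    import torchvision
    resnet = torchvision.models.resnet101(pretrained=True)
    return torch.nn.Sequential(*list(resnet.children())[:-1]).eval()

def extract_feature(model, image_path):
    """Hypothetical helper: preprocess one image with the standard ImageNet
    transforms and return its feature vector as a NumPy array."""
    import torch
    from PIL import Image
    from torchvision import transforms
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return model(img).flatten().numpy()
```

A class embedding would then be `average_class_embedding([extract_feature(model, p) for p in class_image_paths])` over all downloaded images of that action class.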
[1] Narayan, S., Gupta, A., Khan, F. S., Snoek, C. G., & Shao, L. (2020). Latent embedding feedback and discriminative features for zero-shot classification. In Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXII (pp. 479–495). Springer International Publishing. GitHub repo.