python -m pip install dfn
# Download CLIP ViT-B/32 and put the files into ./clip_image_search/clip/bmodels/EN
python3 -m dfn --url https://disk.sophgo.vip/sharing/optDG3uDs
# Download ChineseCLIP ViT-16 and put the files into ./clip_image_search/clip/bmodels/CH
python3 -m dfn --url https://disk.sophgo.vip/sharing/qw6hvmVWs
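Before launching the app, it can help to confirm that the downloads landed in the expected folders. The following is a minimal sketch, not part of this repository: the helper name `missing_model_dirs` is hypothetical, and it only assumes the `EN` and `CH` directory layout named in the comments above.

```python
from pathlib import Path


def missing_model_dirs(root, languages=("EN", "CH")):
    """Return the language codes whose model directory is absent or empty.

    Hypothetical helper: it assumes one sub-directory per language under
    `root`, as in ./clip_image_search/clip/bmodels/EN and .../CH.
    """
    missing = []
    for lang in languages:
        model_dir = Path(root) / lang
        # A directory counts as missing if it does not exist or holds no files.
        if not model_dir.is_dir() or not any(model_dir.iterdir()):
            missing.append(lang)
    return missing


# Prints the languages that still need their model files downloaded.
print(missing_model_dirs("./clip_image_search/clip/bmodels"))
```

An empty list means both model folders contain at least one file. The check does not validate the files themselves.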
CLIP (Contrastive Language–Image Pre-training) is a technique that efficiently learns visual concepts from natural language supervision. CLIP has found applications in Stable Diffusion.
This repository aims to act as a proof of concept (POC) for using CLIP to search video with natural language, as outlined in the article found here.
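The core idea behind natural-language search with CLIP is that text and images are embedded into the same vector space, so frames can be ranked by their similarity to a query. The sketch below illustrates only that ranking step, under stated assumptions: the function names are hypothetical, the 3-dimensional vectors are made-up stand-ins for real CLIP embeddings, and nothing here reflects how this repository actually implements the search.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def rank_frames(text_embedding, frame_embeddings, top_k=3):
    """Return the top_k (frame_id, score) pairs, best match first."""
    scored = [
        (frame_id, cosine_similarity(text_embedding, emb))
        for frame_id, emb in frame_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy placeholder vectors; real CLIP embeddings have hundreds of dimensions
# and would come from the image and text encoders.
frames = {
    "frame_000": [1.0, 0.0, 0.0],
    "frame_030": [0.0, 1.0, 0.0],
    "frame_060": [0.7, 0.7, 0.0],
}
query = [0.9, 0.1, 0.0]  # Stand-in for the embedded text query.

results = rank_frames(query, frames, top_k=2)
for frame_id, score in results:
    print(f"{frame_id}: {score:.3f}")
```

In a real pipeline the frame embeddings would typically be computed once per video and cached, so that each new query needs only one text encoding plus the similarity ranking.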
- python >= 3.8
streamlit run app.py EN