'CLIP' (Radford et al., 2021) implementation from scratch in PyTorch

Pretrained Model

Linear Classification on ImageNet1k (mini) Dataset

# e.g., (`--n_cpus` is optional)
python3 linear_classification.py \
    --ckpt_path="../clip_flickr.pth" \
    --data_dir="../imagenet-mini/" \
    --n_epochs=64 \
    --batch_size=128 \
    --n_cpus=4
  • Top-5 accuracy on validation set: 5.8%
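
For reference, top-k accuracy counts a sample as correct when its true label is among the k highest-scoring classes. The following is a minimal framework-free sketch of that metric; the function name `top_k_accuracy` and the toy scores are illustrative assumptions and are not taken from `linear_classification.py`.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-sample score lists (one score per class).
    labels: list of true class indices, one per sample.
    """
    correct = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        correct += label in top_k
    return correct / len(labels)


# Toy example with 3 samples and 3 classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],  # ranking: class 1, class 2, class 0
    [0.5, 0.3, 0.2],  # ranking: class 0, class 1, class 2
    [0.2, 0.3, 0.5],  # ranking: class 2, class 1, class 0
]
labels = [2, 0, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

With these toy inputs, only the second sample is correct at k=1, while the first two are correct at k=2.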

Zero-shot Classification on ImageNet1k (mini) Dataset

# e.g., (`--n_cpus`, `--max_len`, and `--k` are optional)
python3 zero_shot_classification.py \
    --ckpt_path="../clip_flickr.pth" \
    --data_dir="../imagenet-mini/" \
    --batch_size=16 \
    --n_cpus=4 \
    --max_len=128 \
    --k=10
  • Top-10 accuracy on train + validation set: 3.0%
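
As a rough illustration of the general zero-shot idea in CLIP, an image embedding is compared against one text embedding per class, and the classes are ranked by cosine similarity. The sketch below uses plain Python lists in place of real encoder outputs; the helper names `cosine_similarity` and `zero_shot_top_k` and the 2-dimensional embeddings are hypothetical and do not reflect how `zero_shot_classification.py` is actually written.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_top_k(image_embedding, text_embeddings, k):
    """Return the k class names whose text embedding best matches the image."""
    sims = {
        label: cosine_similarity(image_embedding, emb)
        for label, emb in text_embeddings.items()
    }
    # Rank class names by descending similarity and keep the best k.
    return sorted(sims, key=sims.get, reverse=True)[:k]


# Made-up embeddings standing in for text-encoder outputs of class prompts.
text_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "car": [-1.0, 0.0],
}
# Made-up embedding standing in for an image-encoder output.
image_embedding = [0.9, 0.3]

predictions = zero_shot_top_k(image_embedding, text_embeddings, k=2)
```

A sample counts toward top-k accuracy when its true class name appears in the returned list.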

Implementation Details

  • The temperature-related part of the paper has not been implemented.
    • "The learnable temperature parameter was clipped to prevent scaling the logits by more than 100 which we found necessary to prevent training instability."
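
Since this repository does not implement the temperature, the following is only a framework-free sketch of what the quoted clipping could look like. It assumes the scale is stored as a log value and initialized to the equivalent of a temperature of 0.07, as described in the paper; the names `clipped_logit_scale` and `scale_logits` are hypothetical. In a real training loop the log value would be a learnable parameter.

```python
import math

MAX_LOGIT_SCALE = 100.0  # upper bound on the logit multiplier, per the quote


def clipped_logit_scale(log_scale, max_scale=MAX_LOGIT_SCALE):
    """Return exp(log_scale), capped so logits are scaled by at most max_scale."""
    return min(math.exp(log_scale), max_scale)


def scale_logits(similarities, log_scale):
    """Multiply every cosine similarity by the clipped scale (1 / temperature)."""
    scale = clipped_logit_scale(log_scale)
    return [[scale * s for s in row] for row in similarities]


# Assumed initialization: temperature 0.07, i.e. a scale of about 14.29.
init_log_scale = math.log(1 / 0.07)

# Toy image-text cosine similarities for a batch of two pairs (made-up numbers).
similarities = [[0.9, 0.1], [0.2, 0.8]]
logits = scale_logits(similarities, init_log_scale)

# If training pushed the log value very high, the multiplier stays at 100.
runaway_scale = clipped_logit_scale(10.0)
```

Capping the multiplier bounds the magnitude of the logits fed to the softmax, which is the instability the quoted sentence refers to.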