
FID of Tiny-ImageNet or ImageNet 64x64 #4

Closed
Yeez-lee opened this issue Nov 21, 2023 · 5 comments
Comments

@Yeez-lee commented Nov 21, 2023

Hi,

Thanks for your code. I have a question about FID when the dataset is larger. If my dataset is Tiny-ImageNet or ImageNet 64x64, how many images should I generate to calculate FID? The exact number of images in Tiny-ImageNet or ImageNet 64x64 (larger than 50k)? And should I change the batch size (125) and the number of batches (400, since 125*400 = 50k) in sample.py?

BTW, I see that other codebases use total_training_steps instead of epochs. What is the relationship between the two?

@FutureXiang (Owner)

Hi,

Thank you for your interest.

  1. The standard way to report FID is to (1) generate 50k images and (2) compare them to the source dataset. For example, you may want to use 50k generated images and 1.28m ImageNet images to calculate FID on IN64x64. It is NOT necessary to keep the source set size (e.g., 50k CIFAR / 100k Tiny-IN / 1.28m IN) the same as the target set size (50k), because 50k is large enough.
    • However, for experimental purposes (e.g., checkpoint & hyper-parameter selection), you may want to monitor the FID using only 10k images, which is more efficient.
    • Please note that the FID calculation module used in this repo (i.e., pytorch-fid) may NOT be suitable for very large datasets, because it re-extracts features and statistics for the source set on every calculation. To do this more efficiently and elegantly, please check the EDM repo. (A minimal pytorch-fid invocation is sketched after this list.)
  2. I prefer using num_epochs because the total number of training images can be determined, given a specific dataset. Likewise, the EDM training code uses total_kimg to represent training duration. In contrast, total_training_steps is fairly meaningless as a measure of training budget, because different implementations may use different batch sizes.
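
For reference, a minimal FID run with pytorch-fid might look like the sketch below. The folder names are placeholders and the exact function signature may differ across pytorch-fid versions; note that this recomputes the source-set statistics on every call, which is exactly why caching them (as EDM does) is preferable for large datasets.

```python
# Hedged sketch: compute FID between 50k generated images and the full
# source dataset using pytorch-fid. Paths are illustrative placeholders.
import torch
from pytorch_fid.fid_score import calculate_fid_given_paths

device = "cuda" if torch.cuda.is_available() else "cpu"

fid = calculate_fid_given_paths(
    ["./samples_50k",        # folder of 50k generated images
     "./imagenet64_train"],  # full source set (e.g., 1.28m images)
    batch_size=128,
    device=device,
    dims=2048,               # default InceptionV3 pool3 feature dimension
)
print(f"FID: {fid:.2f}")
```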

@Yeez-lee (Author)

Hi,
Thank you for your reply. I have two follow-up questions.
(1) Even for a larger dataset (100k Tiny-IN / 1.28m IN), comparing only 50k generated images against the source images (100k or 1.28m) is enough for FID. Is this correct?
(2) In your code, does num_epochs mean that in each epoch every sample in the dataset is seen once? For CIFAR-10, if I train for 1000 epochs, do I process 1000*50k training images in total (like total_kimg in EDM)? If total_kimg is smaller than the dataset size, does that mean not every sample is seen?

@FutureXiang (Owner)

(1) Yes.
(2) Yes. But typically, we have num_epochs >> 1 and total_kimg >> |dataset|. For example, DDPM trains 2048 epochs on CIFAR-10, while EDM trains 4000 epochs on CIFAR-10 and 1950 epochs on ImageNet-64.
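
To make the bookkeeping concrete, here is a small arithmetic sketch relating num_epochs, total_kimg, and total_training_steps (the batch size is an arbitrary example, not a value from this repo):

```python
# Rough conversion between epochs, EDM-style total_kimg, and step counts.
dataset_size = 50_000   # CIFAR-10
batch_size   = 128      # illustrative value only
num_epochs   = 2048     # e.g., DDPM on CIFAR-10

images_seen = num_epochs * dataset_size      # total training images
total_kimg  = images_seen // 1000            # EDM-style duration: 102_400
total_steps = images_seen // batch_size      # 800_000 steps at batch 128

print(total_kimg, total_steps)
```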

@Yeez-lee (Author) commented Dec 4, 2023

Thanks for your help. I notice that your work uses unconditional models (DDPM or EDM). What if these models are conditional ones (with CFG, as in your other repository)? DDAE (DiT-XL/2) is evaluated in an unconditional manner, but what about the results of conditional (DDPM or EDM) models?

@FutureXiang (Owner)

The CFG models (which are joint 10% uncond + 90% cond models) yield worse representations than pure unconditional models, despite achieving SOTA generative FIDs.

  • If we use y=null to retrieve features (i.e. in an unconditional manner), the features are still somewhat good, but the performance degrades (e.g. unconditional models reach ~90% K-NN acc on CIFAR-10, while CFG ones reach ~84%). I think (1) the insufficient 10% training of the unconditional version and (2) the joint parameterization limit the performance.
  • If we pass a non-null y in [1...C] (i.e., actual label embeddings) to retrieve features, those features are basically meaningless.
  • An alternative way to use conditional models for classification is to probe the correct label in a zero-shot manner arxiv1 arxiv2, similar to CLIP. However, this approach is somewhat inefficient because it relies on $T\times C$ forward passes to infer an answer (see the sketch after this list).
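
As a rough illustration of that last point, a zero-shot diffusion classifier in the spirit of those references can score each candidate label by its denoising error and pick the smallest one. The sketch below assumes a hypothetical conditional denoiser `model(x_t, t, y)` and uses a placeholder noise schedule, not the actual schedule from this repo:

```python
# Hedged sketch of zero-shot classification with a conditional diffusion
# model: lower average denoising error for a label => more likely class.
import torch

@torch.no_grad()
def zero_shot_classify(model, x0, num_classes, num_steps=50):
    device = x0.device
    losses = torch.zeros(num_classes, device=device)
    for t in torch.linspace(0.02, 0.98, num_steps, device=device):
        noise = torch.randn_like(x0)
        # Placeholder schedule: x_t = sqrt(1 - t) * x0 + sqrt(t) * noise
        x_t = (1 - t).sqrt() * x0 + t.sqrt() * noise
        for y in range(num_classes):
            y_batch = torch.full((x0.shape[0],), y, device=device)
            pred = model(x_t, t.expand(x0.shape[0]), y_batch)
            losses[y] += torch.mean((pred - noise) ** 2)
    # Requires num_steps * num_classes forward passes per input batch.
    return losses.argmin().item()
```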

@FutureXiang pinned this issue Dec 12, 2023