
can't reproduce results on cub dataset #1

Open

Dyfine opened this issue Aug 25, 2021 · 4 comments

Comments

@Dyfine

Dyfine commented Aug 25, 2021

Hi @JennySeidenschwarz, thanks very much for sharing the code with clear instructions. I recently tried training on the CUB dataset, but I only get a best R@1 of 67.07%, which is worse than the reported result (70.3%). I created the env with the provided environment.yaml and used the training command from the instructions. The only change I made was the bssampling setting below, from "NumberSampler" to "no".

sampling: "no"
bssampling: "NumberSampler"
val: 0

Besides, I have tested the provided best_weights on the CUB dataset and the result is 70.3%, so I think the downloaded dataset is correct. Am I missing anything, and could you give me some suggestions for reaching the reported result? Thanks!

@m990130

m990130 commented Oct 11, 2022

Same issue here. I've created the conda env with the provided environment.yaml and used the training command in the instructions, but I can only reach R@1 ≈ 68% on the CUB dataset with the config as it is.

The package versions match what environment.yaml states. I tried the setup with both CUDA 10 and CUDA 11 (since this can sometimes produce different results), but neither gets above 68% on CUB.

I can only reproduce the result with the provided weights. It would be nice if you could provide some further insight on how to reach the reported performance.

@JennySeidenschwarz
Collaborator

Hey everyone :)

Sorry for the late reply! We recently experienced something similar. Some of the layers we use (especially torch_scatter) are unfortunately non-deterministic. Batch sampling also has a significant impact on performance. If you re-train the network you should be able to reproduce the results (at least, that is what our experiments showed).
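A quick aside on why scatter-style GPU reductions end up non-deterministic: floating-point addition is not associative, so the order in which parallel threads accumulate partial sums can change the result. A minimal pure-Python illustration of the underlying effect (not the project's code, and no torch needed):

```python
# Floating-point addition is not associative, so parallel reductions
# (like a GPU scatter-add) can return different totals depending on
# the order in which partial sums are accumulated.
vals = [1e16, 1.0, -1e16, 1.0]

# Left to right: 1e16 + 1.0 rounds back to 1e16, so one of the 1.0s is lost.
left_to_right = ((vals[0] + vals[1]) + vals[2]) + vals[3]

# Reordered: the big terms cancel first, so both 1.0s survive.
reordered = ((vals[0] + vals[2]) + vals[1]) + vals[3]

print(left_to_right)  # 1.0
print(reordered)      # 2.0
```

The same mechanism, at kernel scale, is why two identical training runs can diverge after a single scatter-add on the GPU.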

Please let me know if you have more questions!

@m990130

m990130 commented Nov 7, 2022

Hi Jenny,

thanks for your reply. I am using the same config as provided in your repo and the command shown in the README. Did you use any settings different from the ones in [config_cub_train.yaml](./config/config_cub_train.yaml)? I used that file to re-train the model; however, the performance was only around R@1 = 68% on average (across several runs on different GPUs and CUDA versions).

Did you find any way to make the process more deterministic, or closer to your runs? As you mentioned above, I think this is the most likely reason I cannot reproduce the result.

Thanks in advance! :D

@JennySeidenschwarz
Collaborator

JennySeidenschwarz commented Nov 16, 2022

We used the same config file and my students were actually able to reproduce the performance :(

My students used the following to make it more deterministic:

```python
import os
import random

import numpy as np
import torch


def set_seeds(seed: int):
    """Seed every RNG we rely on, for reproducibility."""
    random.seed(seed)
    np.random.seed(seed)
    rng = np.random.default_rng(seed)  # generator for the new NumPy random API

    os.environ["PYTHONHASHSEED"] = f"{seed}"
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    torch.use_deterministic_algorithms(True)
    torch.set_num_threads(1)

    return rng
```

However, torch.backends.cudnn.deterministic = True, for example, does not work with the torch version we used in our project, so you have to turn it off. This should bring you closer to the results we got.
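If the seeding has to run across different torch versions, one workaround (a sketch, not part of the original repo; the function name is mine) is to apply the stricter flags best-effort and skip whatever the installed version does not support:

```python
import torch


def enable_determinism_best_effort() -> None:
    # These cudnn flags are available in all recent torch versions.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    # torch.use_deterministic_algorithms() only exists from torch 1.8 on;
    # on older versions we fall back to the cudnn flags above.
    try:
        torch.use_deterministic_algorithms(True)
    except AttributeError:
        pass
```

Note that even with all flags set, ops without a deterministic implementation (such as some scatter kernels) will either raise an error or stay non-deterministic, so bitwise-identical runs are not guaranteed.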

I'm sorry that I cannot help you more, reproducibility in pytorch is a pain...
