
Cannot reproduce linear evaluation performance on UCF-101 #4

Closed
KT27-A opened this issue Dec 16, 2020 · 7 comments

Comments

KT27-A commented Dec 16, 2020

Dear friend, thank you very much for your work; I learned a lot from it. It is impressive that after training only on speed prediction for 50 epochs, the model reaches 49.3% accuracy on UCF-101 under linear evaluation. I have been trying to reproduce this performance in PyTorch following your code, but I only get 15% accuracy on UCF-101 with linear evaluation. Could you please give me some advice on how to reach the reported performance? I have checked many times that I followed your code, but I may have neglected something important. Thank you very much.
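For context, "linear evaluation" here means freezing the pre-trained backbone and fitting only a linear classifier on its features. A minimal numpy sketch of that protocol (all names, sizes, and data below are synthetic stand-ins, not the repo's pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, c = 200, 64, 5                      # samples, feature dim, classes
feats = rng.normal(size=(n, d))           # stand-in for frozen backbone features
labels = rng.integers(0, c, size=n)

# Only the linear layer is fit; here via a closed-form least-squares probe.
Y = np.eye(c)[labels]                     # one-hot targets
W = np.linalg.lstsq(feats, Y, rcond=None)[0]
acc = (feats @ W).argmax(1) == labels     # training accuracy of the probe
```

In the actual setup the features would come from the frozen C3D network and the probe would be trained with SGD, but the key point is the same: no gradients flow into the backbone.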

sjenni (Owner) commented Dec 16, 2020

Hi,
It is difficult to tell what the issue is since there are many factors involved.
Did you check what performance supervised pre-training on UCF gives (approximately 60% in my case)?
Another test would be to see what random initialization achieves. Also, how is the performance on the pre-training task itself?

KT27-A (Author) commented Dec 18, 2020

Thank you very much for your quick response. I trained from scratch and got 55% accuracy on UCF-101, which seems OK. Could you please tell me the details of the random initialization of the conv and FC layers? Is the random initialization of the FC layers the more important part? I used a normal distribution with mean 0 and std 1/np.sqrt(number of features) for the FC layers.
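A sketch of the initialization described above (this is an assumption about the PyTorch re-implementation being discussed, not code from the repo): normal with mean 0 and std 1/sqrt(fan_in) on the FC weights.

```python
import numpy as np

def init_fc(fan_in, fan_out, seed=0):
    # Normal init with std = 1/sqrt(fan_in), as described in the comment.
    rng = np.random.default_rng(seed)
    std = 1.0 / np.sqrt(fan_in)
    W = rng.normal(0.0, std, size=(fan_in, fan_out))
    b = np.zeros(fan_out)
    return W, b

# e.g. a 4096-d feature vector into 101 UCF-101 classes
W, b = init_fc(4096, 101)
```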

KT27-A (Author) commented Dec 19, 2020

Also, could you please tell me what accuracy you got when training only on speed prediction? I got ~57% on it. Thanks.

KT27-A closed this as completed Dec 20, 2020
KT27-A reopened this Dec 20, 2020
sjenni (Owner) commented Dec 21, 2020

> Thank you very much for your quick response. I trained from scratch and got 55% acc on UCF101, which seems OK. Would you please tell me the details of random initialization on Conv and FC layers? Random initialization on FC layer is more important? I used Normal distribution with (0, 1/np.sqrt(NUMBER of features)) on FC layer.

Hi, 55% with supervised training sounds reasonable (although a bit lower than the 60% I got). I used TF's default initialization (glorot uniform, I believe). Do you mean the training accuracy on the speed prediction task? I got around 60% with 4 speed classes.
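For reference, TensorFlow's default kernel initializer is glorot (Xavier) uniform: samples from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). A minimal numpy sketch of that distribution (the sizes are illustrative, not from the repo):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=0):
    # Glorot/Xavier uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = glorot_uniform(4096, 101)
limit = np.sqrt(6.0 / (4096 + 101))
```

Matching this in PyTorch would mean `nn.init.xavier_uniform_` rather than the 1/sqrt(fan_in) normal scheme described above, which could be one source of discrepancy.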

KT27-A (Author) commented Dec 22, 2020

Hi, thanks for your response. I need to check my code further. I found something strange when training with your source code. To pre-train only on the speed prediction task, I set --transform 'orig' and deleted skip_label = tf.concat([skip_label, skip_label], 0) at line 41 in train/VideoSSLTrainer.py. I got 65% in evaluation, which is far higher than the 49.3% reported in the paper. Although my training epochs and batch size differ from those in the paper, the gap seems strange. Did I misunderstand something? Thank you.

sjenni (Owner) commented Dec 22, 2020

Hi, if you used train_test_C3D.py for this, then the setup differs from Table 1 in the paper: that script does full fine-tuning of all the layers, following the setup of Table 2. Otherwise, your steps to train only on speed prediction sound correct. You could specify train_scopes=''.join(['{}/fc_{}'.format(net_scope, i+1) for i in range(3)]) at line 50 to keep the conv layers fixed.
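The train_scopes mechanism restricts training to variables whose names start with the given scopes. A pure-Python sketch of that filtering (the variable names below are hypothetical stand-ins for the C3D scopes, not the repo's exact ones):

```python
# Hypothetical variable names under a network scope 'v'.
net_scope = 'v'
all_vars = ['v/conv_1/weights', 'v/conv_2/weights', 'v/conv_3/weights',
            'v/fc_1/weights', 'v/fc_2/weights', 'v/fc_3/weights']

# Only the FC scopes are declared trainable; conv layers stay frozen.
train_scopes = ['{}/fc_{}'.format(net_scope, i + 1) for i in range(3)]
trainable = [v for v in all_vars
             if any(v.startswith(s) for s in train_scopes)]
```

In a PyTorch re-implementation, the analogous step would be setting `requires_grad = False` on the conv parameters and passing only the FC parameters to the optimizer.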

sjenni closed this as completed Dec 22, 2020
KT27-A (Author) commented Jan 4, 2021

Hi Jenni, thanks for your patience. I have nearly reproduced the performance in PyTorch with different learning-rate settings on the speed prediction task. One thing still confuses me: why did you use net = tf.pad(net, [[0, 0], [1, 1], [1, 1], [1, 1], [0, 0]]) before the 5th conv layer? I think directly using conv3d with the same padding as the earlier conv layers does the same thing.
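The equivalence in question can be checked in one dimension: explicit zero-padding of 1 on each side followed by a 'VALID' convolution matches 'SAME' padding for a size-3 kernel with stride 1 (the 3-D case pads each spatial dimension the same way). A minimal numpy sketch, not the repo's code:

```python
import numpy as np

def corr_valid(x, k):
    # 'VALID' cross-correlation: output length len(x) - len(k) + 1
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

x = np.arange(8, dtype=float)
k = np.array([1.0, -2.0, 0.5])

# Pad-then-VALID: output has the same length as the input, i.e. 'SAME' behavior.
out = corr_valid(np.pad(x, 1), k)
```

With stride 1 and an odd kernel the two are identical; the explicit tf.pad only matters if the conv op would otherwise use a different padding mode or stride there.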
