Single card(RTX 3090) training results #65

Open
sjhfdl opened this issue Jun 8, 2023 · 14 comments
sjhfdl commented Jun 8, 2023

Thanks for the great work and sharing the code!

I trained with a single RTX 3090 graphics card using the maptr_tiny_r50_24e.py configuration file. The results after training are shown in the figure below, and they are not ideal: there is a big gap from the results in the paper. Have you tried single-card training, and is there anything that should be paid attention to?

[Figure: evaluation results after training]


adasfag commented Jun 20, 2023

I have also encountered the same problem.


adasfag commented Jun 20, 2023

We trained it on two A100 GPUs, and the mAP is about 0.35 at epoch 24.


sjhfdl commented Jun 20, 2023

> We trained it on two A100 GPUs, and the mAP is about 0.35 at epoch 24.

Hello, I solved this problem. The paper used 8 GPUs for training, while I trained on a single card, so I reduced the initial learning rate lr and weight_decay by a factor of 8, to lr=0.75e-4 and weight_decay=0.00125, and enlarged warmup_iters in lr_config by a factor of 8, to 4000.
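For reference, the recipe above can be written as config overrides. This is a minimal sketch, assuming the stock 8-GPU defaults of maptr_tiny_r50_24e.py (lr=6e-4, weight_decay=0.01, warmup_iters=500) and the mmcv-style `optimizer`/`lr_config` dicts; verify the field names and defaults against your local config:

```python
# Sketch of single-GPU overrides for maptr_tiny_r50_24e.py, following the
# recipe above. Assumes the 8-GPU defaults lr=6e-4, weight_decay=0.01,
# warmup_iters=500; field names follow the mmcv config convention.
n_gpus_paper = 8
n_gpus_local = 1
scale = n_gpus_local / n_gpus_paper  # 1/8

optimizer = dict(
    type='AdamW',
    lr=6e-4 * scale,            # 0.75e-4
    weight_decay=0.01 * scale,  # 0.00125
)

lr_config = dict(
    policy='CosineAnnealing',
    warmup='linear',
    warmup_iters=int(500 / scale),  # 4000: warmup stretched by 8x
    warmup_ratio=1.0 / 3,
    min_lr_ratio=1e-3,
)
```

Note that scaling weight_decay with the GPU count goes beyond the standard linear scaling rule (which covers only the learning rate), but it is what the recipe above reports.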


adasfag commented Jun 20, 2023

Thanks


adasfag commented Jun 23, 2023

Hello. I trained the model again following your advice, but the mAP is about 45.7 at epoch 24. Could you provide your single-GPU result? Thank you very much.


sjhfdl commented Jun 23, 2023

> Hello. I trained the model again following your advice, but the mAP is about 45.7 at epoch 24. Could you provide your single-GPU result? Thank you very much.

First of all, I would like to apologize. Due to the limited compute of my graphics card, after adjusting the learning rate I only trained the author's code for two epochs; the accuracy at the second epoch had already reached 0.15, so I did not continue the training. I then went on to verify my method, and the training accuracy was similar to the results given by the author. My reasoning is that multi-card results will be slightly lower than single-card ones, so I assumed the author's method would also run on my own machine and produce results similar to those in the paper.


adasfag commented Jun 23, 2023

> Hello. I trained the model again following your advice, but the mAP is about 45.7 at epoch 24. Could you provide your single-GPU result? Thank you very much.

> First of all, I would like to apologize. Due to the limited compute of my graphics card, after adjusting the learning rate I only trained the author's code for two epochs; the accuracy at the second epoch had already reached 0.15, so I did not continue the training. I then went on to verify my method, and the training accuracy was similar to the results given by the author. My reasoning is that multi-card results will be slightly lower than single-card ones, so I assumed the author's method would also run on my own machine and produce results similar to those in the paper.

It seems that the single-card result is lower than the multi-card one; maybe it needs a more suitable lr. This confuses me.


sjhfdl commented Jun 23, 2023

> It seems that the single-card result is lower than the multi-card one; maybe it needs a more suitable lr. This confuses me.

Yes, you need a good learning rate configuration; you may need to try a few values, perhaps because our graphics card models are different.
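When transferring the recipe to a different card, the linear scaling rule can at least generate a few candidate learning rates to sweep over. A hypothetical helper sketch (`lr_candidates` is my name, not part of MapTR; the base values assume the 8-GPU defaults discussed above):

```python
def lr_candidates(base_lr=6e-4, base_gpus=8, gpus=1,
                  factors=(0.5, 1.0, 2.0)):
    """Return candidate learning rates around the linearly scaled value.

    The linear scaling rule shrinks the learning rate in proportion to
    the drop in effective batch size (here: the GPU count, assuming the
    per-GPU batch size is unchanged); the extra factors give a small
    manual sweep around that point.
    """
    scaled = base_lr * gpus / base_gpus
    return [scaled * f for f in factors]

# Candidates for a single card against the paper's 8-GPU baseline:
candidates = lr_candidates()  # [3.75e-05, 7.5e-05, 0.00015]
```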


sjhfdl commented Jun 23, 2023

> Hello. I trained the model again following your advice, but the mAP is about 45.7 at epoch 24. Could you provide your single-GPU result? Thank you very much.

Could you show me the test results of the training?


adasfag commented Jun 23, 2023

0.214-0.5, 0.498-1.0, 0.659-1.0


lrx02 commented Jul 1, 2023

I have run into this problem as well.

@dynamic721

Hello guys, how much time did you spend training with a single card?

@VanHelen

Hello, have you solved this problem? I trained with a single RTX 4090 using the maptr_tiny_r50_24e.py configuration file, but it always stops training at epoch 10, with the following problem:
[screenshot of the error]
May I ask what parameters are required to complete 24 epochs of training?

@VanHelen

> Hello guys, how much time did you spend training with a single card?

Hello, have you successfully trained the model on a single GPU? May I ask what parameters need to be modified? My training currently fails at epoch 10.
[screenshot of the error]
