
Slow training? #66

Open
OiOchai opened this issue Jan 12, 2024 · 6 comments
OiOchai commented Jan 12, 2024

Hi,

Thanks for the amazing work.

I have an issue with very slow training speed. I am using a 3080. When training just starts, the coarse stage can reach ~10 it/s, but afterwards it drops to 1 it/s, which is 10x slower, and it took 10 hours for a 30-frame video. Any clue why this happens?

guanjunwu (Collaborator) commented Jan 12, 2024

Hi, are you using my latest code, and have you set PipelineParams.debug=False?
Meanwhile, what is the resolution of your training images? A higher resolution and a larger number of Gaussians will also lead to slower rendering/training speed.
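
A minimal sketch of how the flag can be checked before training, assuming the arguments module follows the original gaussian-splatting layout (PipelineParams registered on an ArgumentParser and read back with .extract()):

```python
# Sketch only: assumes the arguments module matches the original
# gaussian-splatting layout, where PipelineParams exposes a `debug` field.
from argparse import ArgumentParser
from arguments import PipelineParams

parser = ArgumentParser()
pp = PipelineParams(parser)
args = parser.parse_args([])       # no --debug flag passed on the command line
pipe = pp.extract(args)

print("pipe.debug =", pipe.debug)  # expect False; True slows training heavily
pipe.debug = False                 # force it off before rendering/training
```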

OiOchai (Author) commented Jan 12, 2024

Hi, are you using my latest code, and have you set PipelineParams.debug=False? Meanwhile, what is the resolution of your training images? A higher resolution and a larger number of Gaussians will also lead to slower rendering/training speed.

Thanks for the quick response. I am using an old commit, 5a80f11. Let me pull the latest code and see what happens. My training images are 1080p; I have 30 cameras and very long video sequences. What do you think is the maximum number of frames per camera for 4D Gaussians?

guanjunwu (Collaborator) commented
Thanks, I think this old commit should also work. What I want to note is that setting PipelineParams.debug=False is important (debug=True causes CPU overload).
Did you check the memory usage? On the Neu3D dataset (presented as 'dynerf' in my paper), 1352x1014 resolution with 300 frames per video also works.
Maybe you can also check the size of the initialized point cloud? Too many points will also cause slow training, or even an OOM error.
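
A quick way to check the size of the initialization point cloud is to read the PLY file directly, for example (a sketch; the path below is a placeholder and plyfile is assumed to be installed):

```python
# Sketch: count the points in the initialization PLY (path is a placeholder).
from plyfile import PlyData

ply = PlyData.read("data/your_scene/points3d.ply")
print("initial point cloud size:", ply["vertex"].count, "points")
```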

OiOchai (Author) commented Jan 12, 2024

I think PipelineParams.debug is False by default, no? For the point cloud initialization, I was just using 2000 points. Let me dig more. Thanks!

OiOchai (Author) commented Jan 13, 2024

Thanks, I think this old commit should also work. What I want to note is that setting PipelineParams.debug=False is important (debug=True causes CPU overload). Did you check the memory usage? On the Neu3D dataset (presented as 'dynerf' in my paper), 1352x1014 resolution with 300 frames per video also works. Maybe you can also check the size of the initialized point cloud? Too many points will also cause slow training, or even an OOM error.

I also have another question; I would appreciate it if you could answer it.

I currently have a dataset with a structure similar to the Neu3D data, except that it does not use NDC and we have ground-truth calibration for it. So I follow the way you read the Neu3D data, but instead of using readdynerfInfo I modified readCamerasFromTransforms to append m x n CameraInfo entries, where m is the number of cameras per frame and n is the number of frames. Roughly, the modification looks like the sketch after the next paragraph.

However, the console always returns 'Killed' during the data loading stage, so I have to reduce the number of frames to make it work. I cannot even load 100 frames of 1080p images from 30 cameras. Have you had this issue? Is it normal?
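
The sketch of my reader change, with illustrative and simplified field names (the actual CameraInfo fields in scene/dataset_readers.py may differ); every image is decoded eagerly, so roughly m x n full-resolution images end up in host RAM:

```python
# Illustrative sketch of the m x n reader; names are assumptions, not the
# repository's exact API. Each iteration decodes a full image, so memory
# grows with num_cameras * num_frames.
import os
from PIL import Image

cam_infos, uid = [], 0
for frame_idx in range(num_frames):            # n frames
    for cam in calibrated_cameras:             # m cameras
        image_path = cam.image_paths[frame_idx]
        image = Image.open(image_path)         # eager load -> high RAM usage
        cam_infos.append(CameraInfo(
            uid=uid, R=cam.R, T=cam.T, FovX=cam.FovX, FovY=cam.FovY,
            image=image, image_path=image_path,
            image_name=os.path.basename(image_path),
            width=image.width, height=image.height,
            time=frame_idx / num_frames,       # normalized timestamp
        ))
        uid += 1
```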

guanjunwu (Collaborator) commented Feb 17, 2024

Hi, I don't think that's normal.
Actually, in my original code, I set up a dataloader to load the training images dynamically. So if your data format is similar to Neu3D, the training process will start quickly.
You can also use colmap.sh to generate the point cloud in the latest version of the code.
Btw, you can check your machine's memory (e.g. watch -n 0.1 free -g) to monitor the data loading process.
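
A minimal sketch of what this dynamic (lazy) loading looks like in spirit; the class and the cam_metas format below are illustrative, not the repository's actual implementation:

```python
# Illustrative lazy-loading dataset: keep only paths and metadata in memory,
# decode each image when it is requested. Not the repository's actual classes.
from torch.utils.data import Dataset, DataLoader
from PIL import Image
import torchvision.transforms.functional as TF

class LazyCameraDataset(Dataset):
    def __init__(self, cam_metas):
        # cam_metas: list of (image_path, pose, time) tuples, built once up front
        self.cam_metas = cam_metas

    def __len__(self):
        return len(self.cam_metas)

    def __getitem__(self, idx):
        image_path, pose, time = self.cam_metas[idx]
        image = TF.to_tensor(Image.open(image_path))  # decoded only on access
        return image, pose, time

# loader = DataLoader(LazyCameraDataset(cam_metas), batch_size=1,
#                     shuffle=True, num_workers=4)
```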
