Multithreaded and Vsync Significantly Degrade Performance #48

Open
TokisanGames opened this issue Aug 16, 2019 · 13 comments
Labels
Waiting for Godot (Godot needs an improvement)

Comments

@TokisanGames
Contributor

TokisanGames commented Aug 16, 2019

I've added user editability to my fps demo for voxelgame. I'm almost ready to submit it. However, I noticed an issue with updating the edited terrain.

With vsync turned on, my game runs at 60fps. Though 1-2 changes might be processed, the majority wait until the entire VoxelTerrain is done loading before making changes. This is about 20-30 seconds.

With vsync turned off, I could get anywhere between 40-200fps, yet even at the slower FPS, I can start editing the terrain the instant the game starts without delay. The terrain blocks are definitely still loading, but I can start digging into the terrain right away.

Is there any way to fix this?
Occurs with VoxelTerrain on blocky or smooth.

Edit: Also occurs on both my demo and yours, Zylann.

Edit 2:

Multithreaded mode and vsync on, both together and independently, have a significant negative impact on performance for both terrain generation and user editability.

* changed title for clarity

See this comment below for more tests.

@Zylann
Owner

Zylann commented Aug 16, 2019

Which terrain type are you referring to?

@TokisanGames
Contributor Author

VoxelTerrain. Blocky or smooth. My demos or yours. They all do it.

@Zylann
Owner

Zylann commented Aug 17, 2019

Though 1-2 changes might be processed, the majority wait until the entire VoxelTerrain is done loading before making changes. This is about 20-30 seconds.

So that's when loading the world for the first time? 20-30 seconds sounds really slow, are you using a debug build? Also it's only taking me a few seconds in debug mode to be able to start editing terrain.

@Zylann Zylann added the Needs more info Couldn't repro / more info needed label Aug 19, 2019
@TokisanGames
Contributor Author

TokisanGames commented Aug 23, 2019

So that's when loading the world for the first time?

Yes, as it's building out the blocks.

20-30 seconds sounds really slow, are you using a debug build?

My target was release_debug, but I recompiled with release and experienced the same issue.

Check out the video below. It shows four modes with vastly distinct performance metrics:
From slowest to fastest:

  • Multithreaded, vsync on
  • Singlethreaded, vsync on
  • Multithreaded, vsync off
  • Singlethreaded, vsync off

Multithreaded mode and vsync on, both together and independently, have a significant negative impact on performance for both terrain generation and user editability.

With vsync off, the FPS is at most 2x higher, and more like 1.3x. But with vsync on, the updates are far more than 2x slower, more like 10x or worse, so it's not FPS bound, it's vsync bound. And multithreaded has its own performance issue.

See it all in the video:
https://youtu.be/lVFxiHwQsJU

@TokisanGames TokisanGames changed the title User destructability delayed with vsync on Multithreaded and Vsync Significantly Degrade Performance Aug 23, 2019
@Zylann
Owner

Zylann commented Aug 23, 2019

I'm really suspicious of multithreaded rendering in Godot 3. So far it hasn't given me any advantage (mostly because it's OpenGL underneath, and it sucks at threading, so no real effort is put into uploading resources efficiently). But regardless of that, there isn't much the module can do about it.

For V-sync I really wonder what's going on because if framerate doesn't change it shouldn't affect how fast the module updates. Also threads are used anyways, the only part in _process is the one checking the visible area and sending meshes to VisualServer.
I've been testing this module so far with default settings, so with V-sync on and single-safe threading, and I haven't noticed such an issue, even in debug mode.

At first I thought the update sorting didn't work, because it's really supposed to prioritise loading and updating terrain that is closer to you. But then I realized that because I had to call VisualServer from the main thread, I made a queue of blocks to send in _process. This queue is consumed once per frame until a maximum time budget has elapsed (currently hardcoded to 8ms), after which it stops sending them and will continue the next frame. If that wasn't time-bound, your game would stutter terribly.
But here, it seems the settings you had make uploading meshes to VisualServer considerably slower. And because you built physics in the same place, you also don't have physics updates either.
I bet Godot's implementation of multithreaded upload of meshes is actually blocking the caller until the rendering thread syncs, and that's a real shame xD For V-sync I don't know what's up but it's all about VisualServer performance here. Maybe you could try printing how much that queue grows, or profiling this section:

Ref<ArrayMesh> mesh;
mesh.instance();
int surface_index = 0;
const VoxelMeshUpdater::OutputBlockData &data = ob.data;
for (int i = 0; i < data.blocky_surfaces.surfaces.size(); ++i) {
    Array surface = data.blocky_surfaces.surfaces[i];
    if (surface.empty()) {
        continue;
    }
    CRASH_COND(surface.size() != Mesh::ARRAY_MAX);
    mesh->add_surface_from_arrays(data.blocky_surfaces.primitive_type, surface);
    mesh->surface_set_material(surface_index, _materials[i]);
    ++surface_index;
}
for (int i = 0; i < data.smooth_surfaces.surfaces.size(); ++i) {
    Array surface = data.smooth_surfaces.surfaces[i];
    if (surface.empty()) {
        continue;
    }
    CRASH_COND(surface.size() != Mesh::ARRAY_MAX);
    mesh->add_surface_from_arrays(data.smooth_surfaces.primitive_type, surface);
    mesh->surface_set_material(surface_index, _materials[i]);
    ++surface_index;
}
if (is_mesh_empty(mesh)) {
    mesh = Ref<Mesh>();
}
block->set_mesh(mesh, world);
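
For readers following along, the per-frame queue described above (consumed in _process until a time budget such as 8 ms has elapsed) can be sketched roughly as follows. This is not the module's actual code: `UploadTask`, `DrainStats` and `drain_with_budget` are hypothetical names, and always running at least one task per call is an assumption of this sketch. It also reports how many tasks remain, which is one way to print how much the queue grows.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical stand-in for "send one block's mesh to VisualServer".
using UploadTask = std::function<void()>;

struct DrainStats {
    std::size_t processed; // tasks run during this call
    std::size_t remaining; // tasks left for later frames
    double elapsed_ms;     // wall time spent in this call
};

// Runs queued tasks in FIFO order until the queue is empty or the time
// budget is spent. The budget is checked after each task, so at least one
// task runs per call (an assumption of this sketch, so the queue always
// makes progress).
DrainStats drain_with_budget(std::deque<UploadTask> &queue, double budget_ms) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    DrainStats stats{0, 0, 0.0};
    while (!queue.empty()) {
        UploadTask task = std::move(queue.front());
        queue.pop_front();
        task();
        ++stats.processed;
        stats.elapsed_ms =
            std::chrono::duration<double, std::milli>(Clock::now() - start).count();
        if (stats.elapsed_ms >= budget_ms) {
            break; // out of budget, continue next frame
        }
    }
    stats.remaining = queue.size();
    return stats;
}
```

Calling something like `drain_with_budget(queue, 8.0)` once per frame and logging `remaining` and `processed` would show whether each upload got slower (fewer tasks fit in the budget) or whether the queue is simply being filled faster than it drains.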

You could also check if multithreaded physics also ruins it, in which case the way it was implemented would be at fault.

Edit: reduz confirmed that threaded mode makes VisualServer wait for sync. Vulkan will be much better in that aspect.
Could be interesting to see if multithreaded physics has this issue as well?
Edit2: reduz said physics would also wait for sync...
Bottom line is, will be better with Vulkan for graphics stuff, and hopefully 4.0 will improve the physics side.
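
To illustrate why a call that waits for sync cancels the benefit of a separate render thread, here is a toy model. It is not Godot's implementation: `RenderThreadModel`, `push` and `push_and_sync` are hypothetical names, and the only point is that a caller using the synchronous variant is blocked until the worker has caught up with everything queued before it.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Toy command queue served by one worker thread (the "render thread").
class RenderThreadModel {
public:
    RenderThreadModel() : worker_([this] { run(); }) {}

    ~RenderThreadModel() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Fire-and-forget: the caller returns immediately.
    void push(std::function<void()> cmd) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(cmd));
        }
        cv_.notify_one();
    }

    // Synchronous: the caller blocks until the worker has executed cmd,
    // which also means every command queued earlier has run.
    void push_and_sync(std::function<void()> cmd) {
        auto done = std::make_shared<std::promise<void>>();
        std::future<void> finished = done->get_future();
        push([cmd, done] {
            cmd();
            done->set_value();
        });
        finished.wait();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> cmd;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return quit_ || !queue_.empty(); });
                if (queue_.empty()) {
                    return; // quit requested and nothing left to run
                }
                cmd = std::move(queue_.front());
                queue_.pop();
            }
            cmd(); // executed outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    bool quit_ = false;
    std::thread worker_; // declared last so the other members exist before it starts
};
```

In this model, a main thread that calls `push_and_sync` for every mesh pays the worker's full latency each time, so it behaves like single-threaded code plus the hand-off cost.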

@Zylann Zylann added Waiting for Godot Godot needs an improvement and removed Needs more info Couldn't repro / more info needed labels Aug 24, 2019
@TokisanGames
Contributor Author

Thanks for asking. I'll try looking at the queue and profiling that section in a while.

I turned off generate collision and terrain updates are still vsync bound. They do happen but queue up and are sluggish until the terrain is finished loading.

Even your blocky demo is vsync bound. Try this with vsync on and off: immediately dig straight down into the terrain, as deep and fast as you can get. You move down and the boxmover provides collision, but it doesn't draw for 10-20 seconds with vsync on. Whereas with vsync off, the visuals catch up in 2-3 seconds.

The only physics threads settings I see are in 2d and they don't make a difference that I can tell.

@Zylann
Owner

Zylann commented Aug 25, 2019

I also realize that creating the physics mesh is literally asking VisualServer for the vertices, which further destroys framerate, not only because it again triggers the wait for the render thread, but also because it might pull the data back from the graphics card. I hope I'm wrong about the latter...
Those vertices are generated for unique voxel meshes so the cache in Mesh is also useless. But more importantly, since they are in RAM already (since I generate them), there is no point using VisualServer for anything physics-related.

Also Godot builds a BVH when TriangleMesh is created, which is probably completely useless when Bullet is used anyways. There is a huge overhead creating those collision shapes...
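
Since the vertices are already in RAM, the collision faces could in principle be built straight from the mesher output rather than read back through VisualServer. A minimal sketch of that idea, assuming the mesher output is a vertex array plus a triangle index array: `Vec3` and `build_collision_faces` are hypothetical names, not part of the module or of Godot. As far as I know, the resulting flat list (three vertices per triangle) is the layout that `ConcavePolygonShape::set_faces` takes in Godot 3, but treat that as an assumption to verify.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal vector type standing in for Godot's Vector3.
struct Vec3 {
    float x, y, z;
};

// Expands an indexed triangle list into a flat list of face vertices,
// three consecutive entries per triangle. An incomplete trailing triangle
// and triangles with out-of-range indices are skipped.
std::vector<Vec3> build_collision_faces(const std::vector<Vec3> &vertices,
                                        const std::vector<int> &indices) {
    std::vector<Vec3> faces;
    faces.reserve(indices.size());
    for (std::size_t i = 0; i + 2 < indices.size(); i += 3) {
        bool valid = true;
        for (std::size_t k = 0; k < 3; ++k) {
            const int idx = indices[i + k];
            if (idx < 0 || static_cast<std::size_t>(idx) >= vertices.size()) {
                valid = false;
            }
        }
        if (!valid) {
            continue;
        }
        for (std::size_t k = 0; k < 3; ++k) {
            faces.push_back(vertices[static_cast<std::size_t>(indices[i + k])]);
        }
    }
    return faces;
}
```

This only covers the data path. It does not address the BVH that Godot builds when a TriangleMesh is created.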

@blockspacer

What's the state of that issue after #54?

@TokisanGames
Contributor Author

No change, unrelated. This issue is "waiting for Godot" to improve their visual server.

@Zylann
Owner

Zylann commented Sep 18, 2021

This might explain the problems you had: godotengine/godot#52801
Turning off v-sync kinda "improves" the issue, but not as much as when verbose output is on. There is something fishy with OpenGL/drivers.

@Zylann
Owner

Zylann commented Oct 1, 2021

@tinmanjuggernaut you reported multi-threaded rendering with v-sync had bad performance, however when I test it today, that doesn't seem to be the case. The "slowness" from the first OpenGL calls per frame happens in the rendering thread, so they don't stall the main thread.

@TokisanGames
Contributor Author

It was 2 years ago and I documented it all with video. So if it's working well now, it's probably due to changes you've made or those made in the core engine. Feel free to close this ticket if you can no longer duplicate it.

@Zylann
Owner

Zylann commented Oct 1, 2021

Actually, I'm having a closer look and it's still bad for the same reason. Despite multithreading, VisualServer stalls the main thread. The irony!
image


3 participants