mach: linux/vulkan needs frame rate limiter to avoid severe issues with some distro+WM+nvidia combinations #444

Open
slimsag opened this issue Aug 4, 2022 · 7 comments
Labels: bug (Something isn't working), os/linux

@slimsag
Member

slimsag commented Aug 4, 2022

We've had two reports of this type of behavior on setups combining Linux, Vulkan, X11, NVIDIA, and less common distros. We request no vsync (I believe?), and instead of the frame rate being smoothed, e.g. 60 frames are rendered in a burst almost instantaneously, then a Vulkan swapchain call pauses for an entire second rendering nothing, then another burst, and so on:

KZUTLez.mp4

All instances appear to be solved by adding a frame rate limiter, e.g. opening any example application and adding `std.time.sleep(16 * std.time.ns_per_ms);` to the top of `pub fn update`.
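A fixed 16 ms sleep adds to whatever time the frame already took, so the effective rate ends up below 60 fps. As a rough sketch only (not Mach's API), a limiter can instead sleep for just the remainder of the frame budget. The names `FrameLimiter` and `remainingFrameNs` are hypothetical, and the code assumes a Zig version from around the time of this issue (0.11 / 0.12-dev) where `std.time.sleep` and `std.time.Timer` are available; newer Zig releases may expose sleep elsewhere.

```zig
const std = @import("std");

/// Hypothetical helper: nanoseconds left to sleep so that a frame
/// lasts at least `target_ns`.
fn remainingFrameNs(elapsed_ns: u64, target_ns: u64) u64 {
    // Saturating subtraction: a frame that already ran long needs no sleep.
    return target_ns -| elapsed_ns;
}

/// Hypothetical limiter for illustration; not part of Mach.
const FrameLimiter = struct {
    timer: std.time.Timer,
    target_ns: u64,

    /// `target_fps` must be non-zero.
    fn init(target_fps: u64) !FrameLimiter {
        return FrameLimiter{
            .timer = try std.time.Timer.start(),
            .target_ns = std.time.ns_per_s / target_fps,
        };
    }

    /// Call once per frame, e.g. at the top of `pub fn update`.
    fn wait(self: *FrameLimiter) void {
        // Time spent since the previous call to `wait` returned.
        const elapsed = self.timer.read();
        std.time.sleep(remainingFrameNs(elapsed, self.target_ns));
        // Start measuring the next frame.
        self.timer.reset();
    }
};
```

In an example application this would be used roughly as `var limiter = try FrameLimiter.init(60);` during setup and `limiter.wait();` at the top of `update`. Sleep granularity depends on the OS scheduler, so the resulting frame pacing is approximate.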

The reports:

  • adayoldbagel on Discord:
    • Distro: Arch
    • X11/Wayland: X11
    • Window manager: bspwm, no desktop environment
    • Vulkan driver: proprietary nvidia 515.57
    • GPU: discrete RTX 2070 SUPER
    • Using GPU_BACKEND=opengl works around the issue? Yes
    • Using frame rate limiter solves the issue? Yes
  • boopinski on Discord:
    • Distro: lubuntu
    • X11/Wayland: X11
    • Window manager: LXDE
    • Vulkan driver: proprietary nvidia 510.73.05
    • GPU: GTX 750 Ti
    • Using GPU_BACKEND=opengl works around the issue? Yes
    • Using frame rate limiter solves the issue? Yes
  • bitflip on Zig Discord:
    • xwayland + sway
    • Disabling vsync fixed the issue
    • Facing it in their own GLX usage (not webgpu/Mach)

I've found that all example programs using Vulkan tie up a CPU core at 100% and lock me out of all input for 10 or more seconds per frame. I've tracked the behaviour down to the `Swapchain.getCurrentTextureView()` call. It seems to just hang for ages.

@slimsag slimsag added the bug Something isn't working label Aug 4, 2022
@slimsag slimsag changed the title linux/vulkan needs frame rate limiter to avoid severe issues with some distro+WM+nvidia combinations mach: linux/vulkan needs frame rate limiter to avoid severe issues with some distro+WM+nvidia combinations Aug 6, 2022
slimsag added a commit that referenced this issue Aug 18, 2022
This correctly sets presentation modes for vsync, both at startup and at runtime via
a `setOptions` request.

Note: There may still be platforms where setting vsync is not enough, and a frame rate
limiter is needed to achieve proper synchronization. This is tracked in #444
and not fixed by this change.

Fixes #307

Signed-off-by: Stephen Gutekanst <[email protected]>
@slimsag slimsag added this to the Mach 0.2 milestone Sep 1, 2022
@slimsag slimsag modified the milestones: Mach 0.2, Mach 0.3 Jul 28, 2023
@slimsag
Member Author

slimsag commented Jul 28, 2023

decision: before imposing a frame rate limiter on all our Linux users by default, we're going to wait and see if anyone can reproduce this behavior again. If you can, please do comment.

Our hope/belief is that (a) some Dawn changes or (b) the multi-threaded rendering support we added (always on) may have resolved the poor vsync behavior.

@slimsag slimsag closed this as completed Aug 7, 2023
@drunderscore

I've been able to reproduce this behavior (on 9250310) using the textured cube example and "Getting started", though I assume every Mach example causes this. The cube's spin speeds up and slows down, and the frame rate jumps between 22, 61, 181, etc. My window manager and entire system become quite unresponsive; switching window focus is very slow and makes it worse, causing all windows to become unresponsive.

I attempted to record this, but it turned out to be pretty difficult: my encoder quickly fails when I try, and OBS stops recording if the encoder gets too far behind. A CPU encoder might work if I try it later.

I tried setting MACH_GPU_BACKEND=opengl to see if that fixes it, but that causes a panic:

info(mach): found Vulkan backend on Discrete GPU adapter: NVIDIA GeForce RTX 2070 SUPER, NVIDIA: 525.116.04 525.116.4.0

info(mach): gamemode: activated
error(mach): mach: device lost: CreateSwapChain failed with VK_ERROR_NATIVE_WINDOW_IN_USE_KHR
 - While handling unexpected error type Internal when allowed errors are (Validation|DeviceLost).
    at CheckVkSuccessImpl (/home/runner/work/mach-gpu-dawn/mach-gpu-dawn/libs/dawn/src/dawn/native/vulkan/VulkanError.cpp:88)
    at Initialize (/home/runner/work/mach-gpu-dawn/mach-gpu-dawn/libs/dawn/src/dawn/native/vulkan/SwapChainVk.cpp:344)
    at Create (/home/runner/work/mach-gpu-dawn/mach-gpu-dawn/libs/dawn/src/dawn/native/vulkan/SwapChainVk.cpp:250)
    at CreateSwapChain (/home/runner/work/mach-gpu-dawn/mach-gpu-dawn/libs/dawn/src/dawn/native/Device.cpp:1779)

thread 584035 panic: mach: device lost
/home/james/mach-core/src/platform/native/Core.zig:128:5: 0xc49d9b in deviceLostCallback (textured-cube)
    @panic("mach: device lost");
    ^
???:?:?: 0x60719e in ??? (???)
???:?:?: 0x61c419 in ??? (???)
???:?:?: 0x623c1a in ??? (???)
Unwind error at address `:0x623c1a` (error.UnimplementedUserOpcode), trace may be incomplete

/home/james/zig-linux-x86_64-0.12.0-dev.1092+68ed78775/lib/std/Thread.zig:412:13: 0xccb721 in callFn__anon_43598 (textured-cube)
            @call(.auto, f, args);
            ^
/home/james/zig-linux-x86_64-0.12.0-dev.1092+68ed78775/lib/std/Thread.zig:685:30: 0xc681c2 in entryFn (textured-cube)
                return callFn(f, args_ptr.*);
                             ^
???:?:?: 0x7efcb63ac9ea in ??? (libc.so.6)

I can supply much more information here, or create a new issue, just let me know what's most important and where I should put it.

@slimsag slimsag modified the milestones: Mach 0.4, Mach 0.3 Jan 29, 2024
@slimsag slimsag reopened this Mar 4, 2024