Wayland demos hang in _eglutNativeEventLoop on nVidia with __NV_PRIME_RENDER_OFFLOAD
I have a dual-GPU setup. My primary GPU is an AMD RX 580 and my secondary is an nVidia GTX 970.
Any attempt to run the mesa demos with __NV_PRIME_RENDER_OFFLOAD=1 causes them to block in __GI___poll, with the following backtrace:
(gdb) bt full
#0 0x00007ffff7b90f6f in __GI___poll (fds=0x7fffffffe300, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
sc_ret = -516
sc_cancel_oldtype = 0
#1 0x000055555555b947 in event_loop () at ../src/egl/eglut/wsi/wayland.c:549
poll_count = 3
ret = 56
pollfds = {{fd = 3, events = 1, revents = 0}, {fd = 3, events = 1, revents = 0}, {fd = 16, events = 1, revents = 0}}
#2 0x000055555555a615 in _eglutNativeEventLoop () at ../src/egl/eglut/wsi/wsi.c:70
#3 0x000055555555a226 in eglutMainLoop () at ../src/egl/eglut/eglut.c:269
win = 0x555555629a70
#4 0x00005555555597fe in main (argc=1, argv=0x7fffffffe488) at ../src/egl/opengl/eglgears.c:319
I'm running: Gnome 45.3, Linux 6.6.10, Mesa 23.3.2, nVidia 545.29.06.
This is admittedly a minor issue. Under normal circumstances (without __NV_PRIME_RENDER_OFFLOAD) these demos run just fine.
The underlying issue appears to lie upstream, in one of nVidia, Wayland, or Mutter, since the window frame surface itself never seems to be created or presented.
With WAYLAND_DEBUG enabled, the last messages reported are:
wl_seat@23.capabilities(3)
-> wl_seat@23.get_pointer(new id wl_pointer@18)
wl_seat@23.name("seat0")
xdg_toplevel@14.configure_bounds(3840, 2128)
xdg_toplevel@14.wm_capabilities(array[16])
xdg_toplevel@14.configure(0, 0, array[0])
xdg_surface@15.configure(35)
-> xdg_toplevel@14.set_min_size(0, 0)
-> xdg_toplevel@14.set_max_size(0, 0)
-> wl_compositor@12.create_surface(new id wl_surface@25)
-> wl_subcompositor@22.get_subsurface(new id wl_subsurface@26, wl_surface@25, wl_surface@10)
-> wl_shm@17.create_pool(new id wl_shm_pool@27, fd 20, 484416)
-> wl_shm_pool@27.create_buffer(new id wl_buffer@28, 0, 348, 348, 1392, 0)
-> wl_shm_pool@27.destroy()
-> wl_surface@25.attach(wl_buffer@28, 0, 0)
-> wl_surface@25.set_buffer_scale(1)
-> wl_surface@25.commit()
-> wl_surface@25.damage_buffer(0, 0, 348, 348)
-> wl_subsurface@26.set_position(-24, -24)
-> wl_compositor@12.create_surface(new id wl_surface@29)
-> wl_subcompositor@22.get_subsurface(new id wl_subsurface@30, wl_surface@29, wl_surface@10)
-> wl_shm@17.create_pool(new id wl_shm_pool@31, fd 21, 44400)
-> wl_shm_pool@31.create_buffer(new id wl_buffer@32, 0, 300, 37, 1200, 0)
-> wl_shm_pool@31.destroy()
-> wl_surface@29.attach(wl_buffer@32, 0, 0)
-> wl_surface@29.set_buffer_scale(1)
-> wl_surface@29.commit()
-> wl_surface@29.damage_buffer(0, 0, 300, 37)
-> wl_subsurface@30.set_position(0, -37)
-> xdg_surface@15.set_window_geometry(0, -37, 300, 337)
-> xdg_surface@15.ack_configure(35)
-> wl_compositor@4.create_region(new id wl_region@33)
-> wl_region@33.add(0, 0, 300, 300)
-> wl_surface@10.set_opaque_region(wl_region@33)
-> wl_region@33.destroy()
What seems to be missing in response to that xdg_surface.configure is everything from this point on:
-> wl_surface@10.frame(new id wl_callback@34)
-> zwp_linux_dmabuf_v1@9.create_params(new id zwp_linux_buffer_params_v1@35)
-> zwp_linux_buffer_params_v1@35.add(fd 20, 0, 0, 1216, 16777215, 4294967295)
-> zwp_linux_buffer_params_v1@35.create_immed(new id wl_buffer@36, 300, 300, 808669784, 0)
-> zwp_linux_buffer_params_v1@35.destroy()
-> wl_surface@10.attach(wl_buffer@36, 0, 0)
-> wl_surface@10.damage(0, 0, 2147483647, 2147483647)
-> wl_surface@10.commit()
After which a normal execution would yield:
wl_display@1.delete_id(27)
wl_display@1.delete_id(31)
wl_display@1.delete_id(33)
wl_display@1.delete_id(35)
wl_buffer@28.release()
wl_buffer@32.release()
-> wl_compositor@4.create_region(new id wl_region@35)
-> wl_region@35.add(0, 0, 300, 300)
-> wl_surface@10.set_opaque_region(wl_region@35)
-> wl_region@35.destroy()
wl_display@1.delete_id(34)
wl_callback@34.done(672637)
-> wl_surface@10.frame(new id wl_callback@34)
-> zwp_linux_dmabuf_v1@9.create_params(new id zwp_linux_buffer_params_v1@33)
-> zwp_linux_buffer_params_v1@33.add(fd 20, 0, 0, 1216, 16777215, 4294967295)
-> zwp_linux_buffer_params_v1@33.create_immed(new id wl_buffer@31, 300, 300, 808669784, 0)
-> zwp_linux_buffer_params_v1@33.destroy()
-> wl_surface@10.attach(wl_buffer@31, 0, 0)
-> wl_surface@10.damage(0, 0, 2147483647, 2147483647)
-> wl_surface@10.commit()
xdg_toplevel@14.configure_bounds(3840, 2128)
xdg_toplevel@14.configure(300, 337, array[4])
xdg_surface@15.configure(37)
-> xdg_toplevel@14.set_min_size(0, 0)
-> xdg_toplevel@14.set_max_size(0, 0)
... and so on.
As for a solution, it's arguable whether one is even necessary for these demos. Perhaps a timeout on the poll() call?
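A minimal sketch of what such a timeout could look like, assuming a small wrapper around poll(2). The helper name poll_with_timeout and the choice of timeout value are hypothetical; this is not the existing eglut code, which (per the backtrace above) calls poll with timeout=-1.

```c
#include <errno.h>
#include <poll.h>

/* Hypothetical helper, not part of eglut: wait on the given descriptors,
 * but give up after timeout_ms milliseconds instead of blocking forever.
 * Returns >0 if any descriptor is ready, 0 on timeout, -1 on error. */
int
poll_with_timeout(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
   int ret;

   /* Retry if a signal interrupts the call. Each retry restarts the
    * full timeout, so the total wait can exceed timeout_ms. */
   do {
      ret = poll(fds, nfds, timeout_ms);
   } while (ret < 0 && errno == EINTR);

   return ret;
}
```

With something along these lines, the event loop could treat a return value of 0 as "the compositor never answered" and print a diagnostic or exit, rather than blocking indefinitely.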
My personal project does not appear to wait on any poll events, but without a window frame there isn't much left to do, since the resulting OpenGL context ends up with an undefined default framebuffer.
I've attached the Wayland debug logs in any case: wayland.noprime.txt and wayland.prime.txt.
I've confirmed that this affects both the current release (mesa-demos-9) and the current main (2e40dee9).