Vulkan applications don't work in Xvnc #1674

Closed
DocMAX opened this issue Sep 13, 2023 · 28 comments
Labels
bug Something isn't working

Comments

@DocMAX

DocMAX commented Sep 13, 2023

is it possible to add vulkan support in an Xvnc server?

@CendioOssman
Member

I'm not terribly familiar with Vulkan, but I would assume it has a software fallback just like OpenGL that should work fine in Xvnc. What errors are you seeing?

@CendioOssman
Member

No response. Closing.

@CendioOssman closed this as not planned on Nov 14, 2023
@DocMAX
Author

DocMAX commented Dec 9, 2023

Sorry, can you reopen? The problem is simply this in an Xvnc session:

[screenshot]

@CendioOssman
Member

Thanks, that's a bit more clear.

The DRI3 complaint seems to be just a warning, so I am not convinced that's what's causing things to fail. Will need to investigate more.

@CendioOssman reopened this on Dec 11, 2023
@CendioOssman added the bug label on Dec 11, 2023
@CendioOssman changed the title from "dri3 / vulkan support?" to "Vulkan applications don't work in Xvnc" on Dec 11, 2023
@CendioOssman
Member

vkcube works for me here, so it might be application-specific issues. I had to specify --gpu_number 1 for it to choose the software fallback, though.

@DocMAX
Author

DocMAX commented Dec 11, 2023

[screenshot]

Not working here. It works with my Xrdp server. I remember i had to enable "glamor" for this, but i don't remember how i did it.
Maybe the same is needed for Xvnc server?

@CendioOssman
Member

That's odd. It doesn't look like you have a CPU fallback on that system. GPU 0 is clearly a real GPU. Perhaps some more packages need to be installed?

This is what should be expected:

$ vkcube --gpu_number 1
Selected GPU 1: llvmpipe (LLVM 16.0.6, 256 bits), type: Cpu

@DrasLorus

DrasLorus commented Dec 30, 2023

Hello!
I have kind of the same issue. I have a VirtualGL-enabled PC, so glxgears runs at around 5k FPS on my AMD RX7900XTX using the radeonsi driver.

When I run vkcube, I either get:

  • a correct GPU detection (AMD RX7900XTX NAVI41 RADV) but DRI3 not found,
  • a pretty fast cube with software rendering (llvmpipe) using --gpu_number 2, but no hardware acceleration I guess, which for games, is not ideal.

Is there a way to get Vulkan calls or DRI3 working? Or do I misunderstand something?

Note: the server has gone to sleep thanks to GDM during testing, and I will not be able to restart it before at least a week.

@CendioOssman
Member

Hardware acceleration is an entirely different story, so let's focus on just getting basic software rendering up and running.

It sounds like that is working for you, though. What about vulkaninfo?

It looks like the current issues are:

  • vulkaninfo doesn't work, for unknown reasons
  • vkcube doesn't work for everyone, again for unknown reasons
  • The system isn't automatically picking the software renderer

@DrasLorus

Hi again !

Understood for hardware acceleration. I will wait.

Concerning the current issue, I can confirm that when llvmpipe is used, vkcube --gpu_number 2 works fine, as well as vkgears (using VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.i686.json:/usr/share/vulkan/icd.d/lvp_icd.x86_64.json vkgears).

I also have observed that forcing the use of amdvlk or RADV drivers removes llvmpipe from the GPU lists. While maybe expected (since llvmpipe is not an AMD GPU), it may be worth checking if such force loading is enabled.

However, no luck for vulkaninfo. I got the following:

# VK_DRIVER_FILES=/usr/share/vulkan/icd.d/lvp_icd.i686.json:/usr/share/vulkan/icd.d/lvp_icd.x86_64.json vulkaninfo
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  1 (X_CreateWindow)
  Serial number of failed request:  7
  Current serial number in output stream:  8

Here is the output of vulkaninfo on a local X11 gnome session:

VP_VULKANINFO_llvmpipe_(LLVM_16_0_6,_256_bits)_0_0_1.json

I have tried to use GDB on vulkaninfo, but I don't have any knowledge on X11 so...
Here is the backtrace, maybe it is useful, maybe not.

#0  __GI_exit (status=status@entry=1) at exit.c:140
#1  0x00007ffff7e4662c in _XDefaultError (event=<optimized out>, dpy=0x5555556b5cf0) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1449
#2  _XDefaultError (dpy=0x5555556b5cf0, event=<optimized out>) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1434
#3  0x00007ffff7e4674c in _XError (dpy=dpy@entry=0x5555556b5cf0, rep=rep@entry=0x5555556ad6c0) at /usr/src/debug/libx11/libX11-1.8.7/src/XlibInt.c:1503
#4  0x00007ffff7e46858 in handle_error (dpy=0x5555556b5cf0, err=0x5555556ad6c0, in_XReply=<optimized out>) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:211
#5  0x00007ffff7e46915 in handle_response (dpy=dpy@entry=0x5555556b5cf0, response=0x5555556ad6c0, in_XReply=in_XReply@entry=1) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:403
#6  0x00007ffff7e482fd in _XReply (dpy=dpy@entry=0x5555556b5cf0, rep=rep@entry=0x7fffffffd1c0, extra=extra@entry=0, discard=discard@entry=1) at /usr/src/debug/libx11/libX11-1.8.7/src/xcb_io.c:722
#7  0x00007ffff7e48691 in XSync (dpy=0x5555556b5cf0, discard=0) at /usr/src/debug/libx11/libX11-1.8.7/src/Sync.c:44
#8  0x0000555555567f70 in AppCreateXlibWindow (inst=...) at /usr/src/debug/vulkan-tools/Vulkan-Tools-1.3.269/vulkaninfo/./vulkaninfo.h:1010
#9  0x00005555555653ba in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/vulkan-tools/Vulkan-Tools-1.3.269/vulkaninfo/vulkaninfo.cpp:1154

I hope I provided useful information.

@CendioOssman
Member

Yeah, it's very unclear what vulkaninfo is upset about. Might be a general X11 thing and doesn't have anything to do with vulkan per se.

Have you reported the issue to the vulkaninfo developers? They are probably in a better position to understand what their tool needs.

@ilylily

ilylily commented Jan 10, 2024

same issue as DrasLorus. virtualgl works, vulkan apps crash with X BadMatch and/or missing DRI3

vkcube under tigervnc with default (hardware) gpu selected:

Selected GPU 0: AMD Radeon RX 580 Series (RADV POLARIS10), type: DiscreteGpu
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
Could not find both graphics and present queues

vulkaninfo, also under tigervnc:

vulkan: No DRI3 support detected - required for presentation
Note: you can probably enable DRI3 in your Xorg config
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  1 (X_CreateWindow)
  Serial number of failed request:  7
  Current serial number in output stream:  8

vkcube worked after installing mesa-vulkan-swrast and selecting gpu 1 instead of the default 0. i'm also using radv on amd, like the others

here's a minimal example copied from https://github.com/KhronosGroup/Vulkan-Tools/blob/sdk-1.3.261.1/vulkaninfo/vulkaninfo.h#L979

i call it crashable.c

#include <stdio.h>
#include <X11/Xutil.h>

int main() {
    const int width = 640;
    const int height = 480;

    long visualMask = VisualScreenMask;
    int numberOfVisuals;

    Display *xlib_display = XOpenDisplay(NULL);
    if (xlib_display == NULL) {
        printf("XLib failed to connect to the X server.\nExiting...\n");
        return 1;
    }

    XVisualInfo vInfoTemplate = {};
    vInfoTemplate.screen = DefaultScreen(xlib_display);
    XVisualInfo *visualInfo = XGetVisualInfo(xlib_display, visualMask, &vInfoTemplate, &numberOfVisuals);
    Window xlib_window = XCreateWindow(xlib_display, RootWindow(xlib_display, vInfoTemplate.screen), 0, 0, width,
                                     height, 0, visualInfo->depth, InputOutput, visualInfo->visual, 0, NULL);

    XSync(xlib_display, 0);
    XFree(visualInfo);

    printf("%p\n", (void *)xlib_window); // silence analyzer, prevent optimizing out our window
}

build with cc -o crashable crashable.c -lX11. it prints a pointer on :0, crashes on :1 (the tigervnc display). still crashes in the same place if we XSync before the XCreateWindow, so that's definitely the call that's doing it

honestly, i can't make heads nor tails of it. looks like a normal XCreateWindow to me. but then, it's bedtime. hopefully i'm missing something obvious

@CendioOssman
Member

Thanks. A minimal example is very helpful. It also clearly shows that the BadMatch error has nothing to do with Vulkan.

I'm guessing the issue is with the visual. That code is probably too simplistic and is not guaranteed to pick a useful one. We've seen issues like that before, where applications assume a certain order of visuals.
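
The guess above can be illustrated with a small sketch. It is not TigerVNC's fix or vulkaninfo's code; it only shows the idea of choosing the visual that matches the screen's default instead of whichever entry happens to be listed first. `struct visual_entry` and `pick_default_visual` are hypothetical names. With Xlib, the ID to match would come from `XVisualIDFromVisual(DefaultVisual(dpy, screen))` and the list from `XGetVisualInfo`.

```c
#include <stddef.h>

/* Hypothetical stand-in for the XVisualInfo fields that matter here. */
struct visual_entry {
    unsigned long visual_id; /* corresponds to XVisualInfo.visualid */
    int depth;               /* corresponds to XVisualInfo.depth */
};

/*
 * Return the index of the entry whose ID equals the screen's default
 * visual, or -1 if no entry matches. The order of the list is never
 * trusted.
 */
int pick_default_visual(const struct visual_entry *list, size_t count,
                        unsigned long default_visual_id)
{
    for (size_t i = 0; i < count; i++) {
        if (list[i].visual_id == default_visual_id)
            return (int)i;
    }
    return -1;
}
```

A window created with its parent's visual and depth needs no extra attributes. With any other visual, XCreateWindow expects at least a colormap created for that visual, and without one the request fails with BadMatch.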

@CendioOssman
Member

It was indeed that bug. I thought we already fixed that ages ago, but apparently not.

With 7ad74d1 in place, vulkaninfo works just fine.

vkcube still doesn't pick the right GPU automatically, though. And to be honest, I wouldn't be fully sure if vulkan is even supposed to. Most systems will just have one GPU, so they might have figured it to be good enough that it picks the first "real" one it finds.

Need to dig more into how vulkan enumerates and picks GPUs.

@ilylily

ilylily commented Jan 10, 2024

nice! clean fix :)

it makes sense to use the software renderer only as fallback for hardware. i think the problem is it seems to be failing to fall back when a hardware device is present but not presentable. this is visible in vulkaninfo in tigervnc with 7ad74d1 applied - search GPU id, note devices under Layers vs ids under Presentable Surfaces. this may be an issue for mesa, but may also be widespread improper device selection by vulkan applications? not my area of expertise

i was able to force vkcube to use llvmpipe on my alpine linux system with the env var VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json. the json file for the icd is provided by the mesa-vulkan-swrast package. if the radeon icd is in the list, before or after lvp, it fails to start and complains about dri3
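
The same restriction can be applied from a small launcher instead of the shell. This is only a sketch: `force_vulkan_icd` is a hypothetical helper, and it assumes the variables are set before the Vulkan loader looks for drivers (in practice, before the process creates an instance or executes the real application). It sets both variable names that appear in this thread.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/*
 * Point the Vulkan loader at a single ICD manifest for this process and
 * any children it starts. Returns 0 on success, -1 on failure.
 */
int force_vulkan_icd(const char *manifest_path)
{
    if (manifest_path == NULL || manifest_path[0] == '\0')
        return -1;
    /* Variable name used in this comment. */
    if (setenv("VK_ICD_FILENAMES", manifest_path, 1) != 0)
        return -1;
    /* Variable name used earlier in the thread. */
    return setenv("VK_DRIVER_FILES", manifest_path, 1);
}
```

Passing only the lvp manifest matches the observation above: as soon as the radeon ICD is also listed, vkcube picks it and fails on DRI3.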

it's worth noting that VirtualGL/virtualgl#37 indicates that nvidia's drivers enable hardware rendering in an x proxy. i'd be interested to know if this change improves compatibility for nvidia users, since everyone with the problem in this thread is using amd+mesa

@DocMAX
Author

DocMAX commented Jan 11, 2024

glad this thread actually led to something. but what about that DRI3 thing now? is there a switch at Xvnc like -dri3 to enable it or is DRI3 in general not working on "virtual" displays?

[screenshot]

Also tried to enable with Xvnc +extension DRI3 ...

@DocMAX
Author

DocMAX commented Jan 11, 2024

Just found another "version" of your server. It's here: https://github.com/kasmtech/KasmVNC.
I can launch the server like this:
Xvnc -SecurityTypes=none -geometry 1280x720 -ac -listen tcp -nowebsocket -hw3d :20
But it seems it's a different protocol. The VNC client just disconnects with "unknown message type ...." messages.
The -hw3d switch causes DRI3 to be activated. We need an implementation like this :-)

Edit: Just checked the source code; it looks like a modified TigerVNC version to me. Are you aware of this?
Here is the DRI3 implementation: kasmtech/KasmVNC@d049821

And there is more to read... TurboVNC/turbovnc#373

@CendioOssman
Member

> it makes sense to use the software renderer only as fallback for hardware. i think the problem is it seems to be failing to fall back when a hardware device is present but not presentable. this is visible in vulkaninfo in tigervnc with 7ad74d1 applied - search GPU id, note devices under Layers vs ids under Presentable Surfaces. this may be an issue for mesa, but may also be widespread improper device selection by vulkan applications? not my area of expertise

Looking at the code for vkcube, it seems like it's up to each application to pick a sensible device. And vkcube simply picks the "fastest", without checking if it will actually work. Should probably file a bug with them for that.
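
As a rough illustration of what "sensible" could mean here, the sketch below ranks devices by type but skips any that cannot present. It is not vkcube's code. `struct gpu_candidate`, `enum gpu_kind` and `pick_presentable_gpu` are hypothetical, and the ranking order is an assumption. In a real application, `can_present` would be filled in by calling `vkGetPhysicalDeviceSurfaceSupportKHR` for each queue family of each device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical preference order: a larger value is assumed to be faster. */
enum gpu_kind { GPU_OTHER, GPU_CPU, GPU_VIRTUAL, GPU_INTEGRATED, GPU_DISCRETE };

/* Hypothetical summary of one enumerated physical device. */
struct gpu_candidate {
    enum gpu_kind kind; /* derived from the device type the driver reports */
    bool can_present;   /* true if some queue family can present to the surface */
};

/*
 * Return the index of the most preferred device that can actually present,
 * or -1 if none can. A discrete GPU without presentation support (for
 * example because DRI3 is missing) loses to a presentable CPU device.
 */
int pick_presentable_gpu(const struct gpu_candidate *gpus, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!gpus[i].can_present)
            continue; /* would fail later when looking for a present queue */
        if (best < 0 || gpus[i].kind > gpus[best].kind)
            best = (int)i;
    }
    return best;
}
```

With the two devices from the reports above (a Radeon that cannot present in Xvnc and llvmpipe that can), this picks llvmpipe without needing --gpu_number.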

Do you have a more "real" vulkan application we can test with and see how it behaves?

> glad this thread actually led to something. but what about that DRI3 thing now? is there a switch at Xvnc like -dri3 to enable it or is DRI3 in general not working on "virtual" displays?

DRI3 is a buffer sharing system, with some hardware handling thrown in to complicate things. It's not something we've implemented in TigerVNC yet.

> Just found another "version" of your server. It's here: https://github.com/kasmtech/KasmVNC.

Yeah, we are aware of them. Unfortunately, they aren't terribly active in actually working with us.

> But it seems it's a different protocol. The VNC client just disconnects with "unknown message type ...." messages.

That doesn't surprise me. I don't think they have any intention of being compatible with VNC. Just with their fork of noVNC they include with the server.

> The -hw3d switch causes DRI3 to be activated. We need an implementation like this :-)

I actually tested their patch, and as DRC also noticed, there are still some issues to be resolved. If someone feels up to it, then feel free to submit a PR once you have something that works. :)

@euuurgh

euuurgh commented Feb 12, 2024

Hi there!
I'm having the exact same problem.
I was trying to launch Steam remotely in an LXC container on Proxmox, but TigerVNC's X session does not support DRI3, and I'm also getting an error that zink does not work either.

I would really love to see this fixed, so if anyone needs more info about my system/setup, let me know, and we can debug

@euuurgh

euuurgh commented Feb 12, 2024

Ok, I have spent the last hours trying to somehow make my vision work, but here is the problem:

TigerVNC works great, but is not 3d accelerated
Sunshine would probably have an incredible latency, but cannot run headless.
I tried very long and very hard to make both X11 and Wayland run headless, but nothing I tried also worked with sunshine

This is why I am once again posting here: 3d acceleration is a big part of modern computing, and I am actually shocked no normal VNC client supports Vulkan.
If anyone has the skills to implement it, I would be eternally grateful

@bphinz
Member

bphinz commented Feb 12, 2024 via email

@CendioOssman
Member

TurboVNC works the same as TigerVNC, so I'm afraid that will not improve things.

Perhaps you are thinking of VirtualGL, TurboVNC's sister project? That will allow you to accelerate OpenGL on both TigerVNC and TurboVNC. No idea about Vulkan support though. @dcommander?

@CendioOssman
Member

Anyway, the Vulkan issues seem to be application issues. Software Vulkan works fine for well-behaved applications. We've added a workaround for vulkaninfo's bug, and unfortunately vkcube needs to be fixed on its end.

3D acceleration is an entirely different beast, and we have #1626 for that. So I'm going to go ahead and close this issue as done.

@dcommander
Contributor

VirtualGL/virtualgl#37 has more details. nVidia's Vulkan implementation does something VirtualGL-like if it detects that it is running in an X proxy, so it will be GPU-accelerated in TigerVNC. However, that implementation unfortunately doesn't allow you to select the GPU, the last time I checked. Other Vulkan drivers probably won't allow GPU acceleration in Xvnc at all.

I spent a great deal of time trying to figure out a way to interpose Vulkan in the same way that VGL interposes GLX and EGL/X11. The main problem is that Vulkan interfaces with the X server at the driver level, not at the API level, so I can't interpose a few function calls and redirect rendering to another X display or to a DRI device, as VirtualGL does. It would be necessary to add VirtualGL-like functionality to an existing Vulkan driver, such as Mesa. I am happy to embark on that mission if someone forks over the labor costs, but those costs would be well into five figures in US dollars.

The end goal would be to re-implement VirtualGL as an extension of Mesa so that VGL ships its own GLX, EGL, and Vulkan vendor libraries to provide the VGL front end, but the VGL back end would use GPU-specific vendor libraries. This is just a vision at this point, and I haven't explored the technical or licensing issues that might arise. Even if it is possible, it would undoubtedly be messy, and I question whether it is worth putting that much labor into X11 when the labor might be better spent figuring out how to do a Wayland VNC server. Theoretically, Wayland already has the ability to do GPU-accelerated remote display.

Referring to kasmtech/KasmVNC#193, however, the main problem with Wayland from a VNC server's point of view is that there is no single Wayland compositor. You essentially have a different compositor depending on which window manager family you decide to use. Until there is more convergence around the set of Wayland extensions that a remote desktop server would need, any attempt at a Wayland remote desktop server would be tied to a specific family of compositors, such as Weston or wlroots or GNOME. Even if that weren’t the case, moving to Wayland raises questions regarding whether it is time to abandon RFB, a protocol designed around the limitations of 1980s machines just as X11 was designed around the limitations of those machines, and use a more modern protocol that has seamless window capabilities (which would eliminate the need to use a server-side window manager at all.) The remote desktop paradigm has always been clunky. What users really want is network transparency: to run applications remotely and have them behave as if they were local. Wayland has the technical infrastructure necessary to do that, with GPU acceleration, but it would take a great deal of work to make it happen. (I’m not even sure if the necessary remote display protocol even exists.) It may not even be feasible to get funding for all of that as an open source project. The aforementioned idea regarding GPU-accelerated Vulkan in Xvnc is a smaller project, but it may not be feasible to get funding for that either.

End of the day, this all ties back into the problem that, to most Linux infrastructure developers these days, remote display is either an afterthought or isn’t even considered at all. People seem to be acting as if Linux still has a chance to capture more than 3% of the desktop market, when the truth is that it’s a server O/S and needs to be treated like one. That means having remote desktop capabilities at least as good as Microsoft’s.

Probably more information than you wanted, but hopefully it goes a long way toward explaining why this isn’t a simple problem, even for someone who solved the same problem with OpenGL, and why a potential solution to this problem leads to a cascade of questions regarding how long the open source community can reasonably prop up 1980s display technologies.

@dcommander
Contributor

NOTE: I have a machine with both an AMD Radeon Pro WX2100 and an nVidia Quadro P620. If I select the Quadro in vkcube, everything works fine in Xvnc. If I select the Radeon Pro, I get the same complaint about the lack of a DRI3 extension. There doesn't seem to be a way to force it to use software Vulkan. (I tried the aforementioned VK_ICD_FILENAMES trick, but for some reason, that doesn't work on my system. It just complains: "Cannot find a compatible Vulkan installable client driver (ICD).")

Referring to the links in #1674 (comment), DRI3 in Xvnc is not an easy problem to solve. Kasm's implementation is easy enough to port into TigerVNC or TurboVNC, and in fact, I have done so in an experimental branch of TurboVNC that I haven't pushed to GitHub. The problem, however, is that it creates Pixmaps in system memory and synchronizes them with their associated GPU buffers on a schedule (60 times/second), rather than as needed, so it has a lot of overhead and doesn't perform very well when compared to VirtualGL. I am also skeptical as to whether that approach is fully conformant. (It seems like a mixed 3D/X11 rendering workload might break it.) I spent some time trying to figure out how to reduce the overhead and/or synchronize the GPU buffers as needed but wasn't able to. I was able to use the Xvnc X11 hooks to determine when to synchronize the pixels from a GBM buffer object into the corresponding DRI3-managed Pixmap, but those hooks were insufficient to determine when to synchronize the pixels from the Pixmap back into the corresponding BO. The BO is apparently read outside of X11, in the 3D driver, so it would probably be necessary to hook into the various 3D rendering APIs in order to perform that synchronization on an as-needed basis. At that point, the solution would look a lot like VirtualGL.

That effort led me to question why we couldn't just create the Xvnc framebuffer in GPU memory (tl;dr: you can't without an Xorg driver and the other infrastructure associated with a full-blown non-virtual X server), and answering that question led me down the same rabbit hole of questioning whether a Wayland VNC server would be a simpler solution to all of the above. The DRI3 feature would also be useless for nVidia GPU users, so if I did implement it, I would both have to apologize for it as well as disable it by default. It seems like the only purpose for it (that isn't already covered by VirtualGL) would be to get GPU-accelerated Vulkan in Xvnc with non-nVidia GPUs.

@dcommander
Contributor

I went ahead and cleaned up my experimental port of Kasm's DRI3 feature and pushed it into the dev branch of TurboVNC. It is in the 3.2 Evolving pre-release build if anyone wants to play with it. (You enable it by passing -drinode /dev/dri/renderDXXX to /opt/TurboVNC/bin/vncserver.) It definitely does allow GPU acceleration with Vulkan if you are using Mesa-based drivers, including AMDGPU. However, VirtualGL is still generally faster and has a better feature set for OpenGL applications, particularly professional applications.

@DocMAX
Author

DocMAX commented May 18, 2024

Getting a compile error just about the new DRI3...:
[screenshot]

Strange, searching "repo:TurboVNC/turbovnc xvnc_dri3_sync_pixmaps_to_bos" in GitHub returns no results!?

@CendioOssman
Member

Issues compiling TurboVNC are probably best discussed in the TurboVNC forums.
