Dynamically expose NVIDIA X.Org X11 display server libraries and configure the container correctly #563

Open
ehfd opened this issue Jun 24, 2024 · 0 comments

ehfd commented Jun 24, 2024

Also refer to NVIDIA/libnvidia-container#118.

This issue exists because the X11 graphical libraries are not pushed into the Windows Subsystem for Linux (WSL), so a full-fledged X11 server cannot be deployed there.

Because RM_VERSION in WSL and in regular Linux tends to differ, it is also not possible to download the driver libraries inside the container and unpack them.

This will also benefit regular Linux container environments.


On top of PR #548, it would be ideal to push nvidia_drv.so and libglxserver_nvidia.so.* into /usr/lib/x86_64-linux-gnu/nvidia/xorg/ (or, in a different notation, libRoot + /nvidia/xorg/) inside the container, regardless of where they were found on the host. The container toolkit would then generate a /usr/share/X11/xorg.conf.d/10-nvidia.conf file with the following content, regardless of whether 10-nvidia.conf exists on the host (mind that the module path differs between distributions and architectures):

Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    ModulePath "/usr/lib/x86_64-linux-gnu/nvidia/xorg"
EndSection

The above is the default behavior of nvidia-driver-550 from Ubuntu APT, and I generally like this approach a lot.
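As a rough illustration of how the configuration above could be generated for an arbitrary library root, here is a minimal Python sketch. The function name `render_nvidia_xorg_conf` and the template constant are hypothetical and are not part of the container toolkit; the only inputs taken from this issue are the section contents and the libRoot + /nvidia/xorg convention.

```python
from pathlib import PurePosixPath

# Template mirroring the OutputClass section quoted in this issue.
_TEMPLATE = '''Section "OutputClass"
    Identifier "nvidia"
    MatchDriver "nvidia-drm"
    Driver "nvidia"
    Option "AllowEmptyInitialConfiguration"
    ModulePath "{module_path}"
EndSection
'''


def render_nvidia_xorg_conf(lib_root: str) -> str:
    """Return 10-nvidia.conf content for a given library root.

    Hypothetical helper: lib_root is the distribution- and
    architecture-specific library directory inside the container.
    """
    module_path = str(PurePosixPath(lib_root) / "nvidia" / "xorg")
    return _TEMPLATE.format(module_path=module_path)


if __name__ == "__main__":
    # Ubuntu x86_64 layout, as in the example above.
    print(render_nvidia_xorg_conf("/usr/lib/x86_64-linux-gnu"), end="")
```

Passing a different root (for example `/usr/lib64`) only changes the ModulePath line, which is the part that varies between distributions and architectures.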


nvidia-xconfig (and also possibly nvidia-config, combined with the NVIDIA GTK libraries and libnvidia-wayland-client.so which are dependencies) should also be pushed into the container for a full X11 experience.

Then, the X11 aspect will be solved and we developers can call it a day.

Also, injecting 32-bit libraries is definitely desirable for usage with Wine/Proton/etc.


Notes:

  --x-prefix=X-PREFIX
      The prefix under which the X components of the NVIDIA driver will be installed; the default is '/usr/X11R6'
      unless nvidia-installer detects that X.Org >= 7.0 is installed, in which case the default is '/usr'.  Only under
      rare circumstances should this option be used.

  --xfree86-prefix=XFREE86-PREFIX
      This is a deprecated synonym for --x-prefix.

  --x-module-path=X-MODULE-PATH
      The path under which the NVIDIA X server modules will be installed.  If this option is not specified,
      nvidia-installer uses the following search order and selects the first valid directory it finds: 1) `X
      -showDefaultModulePath`, 2) `pkg-config --variable=moduledir xorg-server`, or 3) the X library path (see the
      '--x-library-path' option) plus either 'modules' (for X servers older than X.Org 7.0) or 'xorg/modules' (for
      X.Org 7.0 or later).

  --x-library-path=X-LIBRARY-PATH
      The path under which the NVIDIA X libraries will be installed.  If this option is not specified, nvidia-installer
      uses the following search order and selects the first valid directory it finds: 1) `X -showDefaultLibPath`, 2)
      `pkg-config --variable=libdir xorg-server`, or 3) the X prefix (see the '--x-prefix' option) plus 'lib' on 32bit
      systems, and either 'lib64' or 'lib' on 64bit systems, depending on the installed Linux distribution.

  --x-sysconfig-path=X-SYSCONFIG-PATH
      The path under which X system configuration files will be installed.  If this option is not specified,
      nvidia-installer uses the following search order and selects the first valid directory it finds: 1) `pkg-config
      --variable=sysconfigdir xorg-server`, or 2) /usr/share/X11/xorg.conf.d.

For example, the options above (visible through ./nvidia-installer -A after running sh NVIDIA-Linux-x86_64-550.78.run -x) are what cause the issue here when someone installs the NVIDIA driver on a host that is meant to be a K8s node and has no X-related libraries installed.

WARNING: nvidia-installer was forced to guess the X library path '/usr/lib64' and X module path
           '/usr/lib64/xorg/modules'; these paths were not queryable from the system.  If X fails to find the NVIDIA X
           driver module, please install the `pkg-config` utility and the X.Org SDK/development package for your
           distribution and reinstall the driver.

Leading to:

/usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so
/usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so.550.78
/usr/lib64/xorg/modules/drivers/nvidia_drv.so
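To make the "regardless of where it was found" idea concrete, the following Python sketch maps discovered host paths such as the three above onto the proposed libRoot + /nvidia/xorg/ location inside the container. `plan_xorg_mounts` is a hypothetical helper for illustration only; how the toolkit actually discovers and mounts files is not described in this issue.

```python
import posixpath

# File names this issue asks to expose; the versioned GLX server
# module (libglxserver_nvidia.so.*) is matched by prefix.
X_DRIVER = "nvidia_drv.so"
GLX_PREFIX = "libglxserver_nvidia.so"


def plan_xorg_mounts(host_paths, lib_root):
    """Map each matching host path to lib_root/nvidia/xorg/<file name>.

    Hypothetical helper: paths that are not NVIDIA X modules are ignored.
    """
    dest_dir = posixpath.join(lib_root, "nvidia", "xorg")
    plan = {}
    for path in host_paths:
        name = posixpath.basename(path)
        if name == X_DRIVER or name.startswith(GLX_PREFIX):
            plan[path] = posixpath.join(dest_dir, name)
    return plan


if __name__ == "__main__":
    found = [
        "/usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so",
        "/usr/lib64/xorg/modules/extensions/libglxserver_nvidia.so.550.78",
        "/usr/lib64/xorg/modules/drivers/nvidia_drv.so",
    ]
    for src, dst in plan_xorg_mounts(found, "/usr/lib/x86_64-linux-gnu").items():
        print(f"{src} -> {dst}")
```

The sketch flattens the `drivers/` and `extensions/` subdirectories into a single directory, which matches the single ModulePath in the proposed 10-nvidia.conf.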

On container hosts, neither X (provided by xserver-xorg in Ubuntu) nor the pkg-config file for xorg-server (provided by xserver-xorg-dev in Ubuntu) is typically installed (especially on K8s clusters).

If they are installed on Ubuntu, they report:
X -showDefaultModulePath: /usr/lib/xorg/modules
pkg-config --variable=moduledir xorg-server: /usr/lib/xorg/modules
X -showDefaultLibPath: /usr/lib/x86_64-linux-gnu
pkg-config --variable=libdir xorg-server: /usr/lib/x86_64-linux-gnu
pkg-config --variable=sysconfigdir xorg-server: /usr/share/X11/xorg.conf.d

Similarly for libglvnd: pkg-config --variable=datadir libglvnd: /usr/share

These are the candidate "environment variables" we are looking for, at least for .run installers. The same values are not guaranteed when a distro-provided NVIDIA driver package is installed instead of the .run file.
