Also check /usr/bin/ path for nvidia cards e.g. in WSL #18

RJKeevil · 2024-12-02T13:51:51Z

Use an additional path to search for nvidia GPUs

janpfeifer · 2024-12-03T07:12:46Z

So this function is called when a CUDA PJRT plugin was found, but at that point it is not yet known whether an actual GPU card is installed.

The use case is: a demo docker with all the PJRTs installed shouldn't attempt to run a cuda PJRT if it is running on a computer with no GPUs.

Looking at /usr/bin/nvidia will only detect that the nvidia programs are installed, not whether there is an actual GPU card installed.

Now, looking at /dev/nvidia* doesn't seem to be foolproof either ... Let's chat later. Maybe we could look at:

ls -ld /sys/module/nvidia*
ls -ld /sys/bus/pci/drivers/nvidia*

Is either of those non-empty in your container setup?

In the meantime, I'm adding documentation and logging to the function, including logging of the workaround: providing the absolute path to the CUDA PJRT plugin. See #19
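
For reference, the two `ls` checks above could be sketched in Go roughly as follows. This is only an illustration, not the code in this repository: `nvidiaSysfsPatterns` and `anyGlobMatches` are hypothetical names, and whether these sysfs paths are populated in a given environment (for instance under WSL2) is exactly the open question here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// nvidiaSysfsPatterns lists the two locations suggested in the comment above.
// Assumption: a loaded NVIDIA kernel driver shows up under at least one of them.
var nvidiaSysfsPatterns = []string{
	"/sys/module/nvidia*",
	"/sys/bus/pci/drivers/nvidia*",
}

// anyGlobMatches reports whether at least one pattern matches an existing path.
func anyGlobMatches(patterns []string) bool {
	for _, pattern := range patterns {
		matches, err := filepath.Glob(pattern)
		if err != nil {
			// filepath.Glob only returns an error for a malformed pattern; skip it.
			continue
		}
		if len(matches) > 0 {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("nvidia driver entries found:", anyGlobMatches(nvidiaSysfsPatterns))
}
```

A `false` result only means that neither path exists; it does not prove that no GPU is reachable.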

RJKeevil · 2024-12-03T08:07:48Z

Both of these paths are empty. I think there's some wizardry with Docker Desktop and WSL2 where the container somehow just delegates to the CUDA drivers on the host OS. nvidia-smi is added to the PATH; perhaps issuing that command is a reasonably generic way to see if CUDA is present on a system?

janpfeifer · 2024-12-03T08:12:20Z

I hesitate to make the test depend on nvidia-smi being installed. For instance, the demo docker doesn't contain it, even though it works with NVidia CUDA. Also, I'm not sure about the distribution rights of these nvidia tools. The legalese is not clear to me ... but maybe it's an option. Let me search around for alternatives in Windows WSL.

RJKeevil · 2024-12-03T08:22:43Z

Agreed, I don't think it should depend on it. The current check could still look for the nvidia files, and calling nvidia-smi could be a fallback for this case? I've looked further in the container, and the only other evidence I can find for the presence of CUDA is /usr/lib/wsl/drivers/nv_dispi.inf_amd64_adf5a840df867035

janpfeifer · 2024-12-03T08:27:20Z

I posted the question in NVidia forums:

https://forums.developer.nvidia.com/t/how-to-detect-the-presence-of-a-gpu-card-in-windows-wsl-wsl2/315453

janpfeifer · 2024-12-03T08:28:48Z

Yes, checking whether nvidia-smi is available and then executing it to check is a very viable option. Do you want to implement that?

RJKeevil force-pushed the nvidia-detection branch from 6b7aa2e to 0c1cfc0 Compare December 3, 2024 09:09

Backup check of nvidia-smi to detect nvidia cards e.g. in WSL

fd2179a

RJKeevil force-pushed the nvidia-detection branch from 0c1cfc0 to fd2179a Compare December 3, 2024 09:22

janpfeifer merged commit c8c81d9 into gomlx:main Dec 3, 2024
1 check passed
