I'm not sure where the problem lies, so I'll start out with where I'm coming from...
I built a machine 2 years ago to support the Folding@Home project using F32 and an Nvidia 2060 card.
It's been running non-stop ever since. But I want to update the system to the latest and greatest release and drivers (since it appears that my old? CUDA drivers have stopped working with F@H)
So first I updated all my packages using dnf and attempted a reboot (into the new kernel) and that's where the first problem appeared. The boot process hangs with the last message on the screen being "Notify NFS peers of a restart"
(I don't have any NFS configured on this machine.)
So I rebooted with my old working kernel and examined /var/log/messages and I see:
Jan 1 12:36:10 localhost sh[972]: (bad exit status: 2)
Jan 1 12:36:10 localhost sh[4378]: Error! Bad return status for module build on kernel: 5.11.22-100.fc32.x86_64 (x86_64)
Jan 1 12:36:10 localhost sh[4378]: Consult /var/lib/dkms/nvidia/440.82/build/make.log for more information.
Looking at that file, I see a variety of errors basically telling me there are incompatibilities between the stuff Nvidia uses to get itself 'linked in' at boot time, and what the kernel is now providing.
If I can't get to a GUI screen to be able to fetch and install newer NVIDIA drivers, how can I get my system updated?
Suggestions welcome. TIA Fulko
Hi.
On Sat, 01 Jan 2022 12:52:46 -0500 Fulko Hew wrote:
But I want to update the system to the latest and greatest release and drivers (since it appears that my old? CUDA drivers have stopped working with F@H)
Can you confirm that you need CUDA and not only the nvidia drivers?
If yes, I suggest not using the rpmfusion repositories since they don't provide CUDA. See below.
The boot process hangs with the last message on the screen being "Notify NFS peers of a restart"
When the GUI fails to start, the last messages left on the screen are not relevant, and this is not actually a hang.
Jan 1 12:36:10 localhost sh[4378]: Consult /var/lib/dkms/nvidia/440.82/build/make.log for more information.
Looking at that file, I see a variety of errors basically telling me there are incompatibilities between the stuff Nvidia uses to get itself 'linked in' at boot time, and what the kernel is now providing.
If I can't get to a GUI screen to be able to fetch and install newer NVIDIA drivers, how can I get my system updated?
Switch to a text console with Ctrl-Alt-F2 (or F3, F4, ...), or ssh to the machine if you can.
Then, if you need CUDA:
- uninstall the current version:
  - if you installed it with a nvidiaXXX.run script, run: nvidiaXXX.run --uninstall
  - otherwise try: dnf remove '*nvidia*' '*cuda*'
- follow the instructions at: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=...
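The removal step above can be sketched as a small shell function. This is only an illustration, not a tested recipe: the `run` helper and the `DRY_RUN` variable are hypothetical names introduced here so the command is printed rather than executed by default. The one real point it makes is that the globs should be quoted, so that dnf rather than the shell expands them.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
# "run" and DRY_RUN are illustrative helpers, not part of dnf.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_nvidia_packages() {
    # Quote the globs so that dnf, not the shell, expands them. Unquoted, a
    # matching file in the current directory (for example
    # nvidia-bug-report.log.gz) would be passed to dnf instead of the pattern.
    run sudo dnf remove '*nvidia*' '*cuda*'
}
```

Calling `remove_nvidia_packages` prints the command; setting `DRY_RUN=0` first would actually run it, so review dnf's transaction summary before answering y.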
On Sat, Jan 1, 2022 at 1:07 PM Francis.Montagnac@inria.fr wrote:
Can you confirm that you need CUDA and not only the nvidia drivers?
If yes, I suggest not using the rpmfusion repositories since they don't provide CUDA. See below.
They don't? Then what are the xorg-x11-drv-nvidia-cuda and xorg-x11-drv-nvidia-cuda-libs packages provided by rpmfusion?
There have been several posts in the last few days arising from the shift of some nvidia cards to 'legacy' status. Cards supported by the 470 series driver but not by 495 now have to add the 470xx tag to the rpmfusion package name.
The RTX 2060 is listed as supported by 495, but there won't be F32 packages for it. It would probably be best to upgrade to F35 (or F34) to get them; that process now seems to be straightforward and reliable.
I found that the rpmfusion cuda package was not installed by the default akmod build process; "dnf install xorg-x11-drv-nvidia-cuda" should work for you.
HTH
John P
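A minimal sketch of the rpmfusion route described above, assuming the RPM Fusion nonfree repository is already enabled and the system is on a supported Fedora release. The unsuffixed package name `akmod-nvidia` is an assumption based on the remark that the 2060 is supported by the current driver series (cards on the legacy series would use the 470xx names instead). The `run` helper and `DRY_RUN` variable are hypothetical and only make the function print its commands by default.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
# "run" and DRY_RUN are illustrative helpers, not part of dnf or akmods.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_rpmfusion_nvidia() {
    # Driver (built per kernel by akmods) plus the CUDA driver package,
    # which is not pulled in by default.
    run sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
    # Ask akmods to (re)build the kernel module for the running kernel.
    run sudo akmods --force
    # Once the build has finished, this reports the module version.
    run modinfo -F version nvidia
}
```

If `modinfo` reports a version after a real run, the module was built for the running kernel; a reboot is then usually needed before the driver is in use.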
On Jan 1, 2022, at 16:59, John Pilkington johnpilk222@gmail.com wrote:
There have been several posts in the last few days arising from the shift of some nvidia cards to 'legacy' status. Cards supported by the 470 series driver but not by 495 now have to add the 470xx tag to the rpmfusion package name.
That may be true, but that shouldn’t prevent the kernel module from *building*, which this thread is about.
My suggestion is to make sure you are booted into the kernel that exactly matches the version of “kernel-devel” you have installed. As long as you have an internet connection, you can fix that from the text login; you don’t need a graphical login for it. You can kick off a new dkms build from there too.
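The check suggested here can be sketched as follows. This is an illustration under assumptions, not a verified fix: `kernel_devel_matches`, `rebuild_nvidia_module`, `run` and `DRY_RUN` are hypothetical names, and the sketch assumes the driver is registered with dkms, as the log lines earlier in the thread suggest.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Succeeds if one of the given kernel-devel packages matches the release.
#   $1   kernel release, as printed by `uname -r`
#   $2.. package names, as printed by `rpm -q kernel-devel`
kernel_devel_matches() {
    release=$1
    shift
    for pkg in "$@"; do
        [ "$pkg" = "kernel-devel-$release" ] && return 0
    done
    return 1
}

# Install the matching kernel-devel, then let dkms rebuild its modules
# for the running kernel.
rebuild_nvidia_module() {
    release=$1
    run sudo dnf install "kernel-devel-$release"
    run sudo dkms autoinstall
    run dkms status
}
```

On the affected machine this might be used as `kernel_devel_matches "$(uname -r)" $(rpm -q kernel-devel) || rebuild_nvidia_module "$(uname -r)"`. Note that a rebuild can still fail if the installed driver source is too old for the new kernel, which is what the make.log errors in the original post point to.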
— Jonathan Billings
Fulko Hew wrote on 2022/01/02 2:52:
I'm not sure where the problem lies, so I'll start out with where I'm coming from...
I built a machine 2 years ago to support the Folding@Home project using F32 and an Nvidia 2060 card.
If I can't get to a GUI screen to be able to fetch and install newer NVIDIA drivers, how can I get my system updated?
Suggestions welcome. TIA Fulko
Well, Fedora 32 (as well as rpmfusion 32) has been EOL since May 2021, and even Fedora 33 has been EOL since November 2021, so no updates will be released for Fedora 32 or 33.
I suggest upgrading your system to at least Fedora 34 first.
Regards, Mamoru
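For reference, the release upgrade can be done entirely from a text console with the dnf system-upgrade plugin. The following is a hedged sketch rather than a tested procedure for this machine: `upgrade_fedora`, `run` and `DRY_RUN` are hypothetical names, and it assumes the old dkms-based driver has been removed first so it cannot interfere with the new kernel.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: target Fedora release number, e.g. 34
upgrade_fedora() {
    target=$1
    # Bring the current release fully up to date first.
    run sudo dnf upgrade --refresh
    # The system-upgrade subcommand is provided by this plugin.
    run sudo dnf install dnf-plugin-system-upgrade
    # Download all packages for the target release.
    run sudo dnf system-upgrade download --releasever="$target"
    # Reboot into the offline upgrade.
    run sudo dnf system-upgrade reboot
}
```

`upgrade_fedora 34` prints the four steps in order; running them for real requires `DRY_RUN=0` and enough free disk space for the download.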
John Pilkington wrote:
There have been several posts in the last few days arising from the shift of some nvidia cards to 'legacy' status. Cards supported by the 470 series driver but not by 495 now have to add the 470xx tag to the rpmfusion package name.
The RTX 2060 is listed as supported by 495, but there won't be F32 packages for it. It would probably be best to upgrade to F35 (or F34) to get them; that process now seems to be straightforward and reliable.
hi,
F34 has them, I believe. I would consider upgrading to it _at least_ for the packages...
cheers, slade
Hi
On Sat, 01 Jan 2022 13:26:47 -0700 Jerry James wrote:
On Sat, Jan 1, 2022 at 1:07 PM Francis.Montagnac@inria.fr wrote:
Can you confirm that you need CUDA and not only the nvidia drivers?
If yes, I suggest not using the rpmfusion repositories since they don't provide CUDA. See below.
They don't? Then what are the xorg-x11-drv-nvidia-cuda and xorg-x11-drv-nvidia-cuda-libs packages provided by rpmfusion?
This is only the CUDA driver. It may not be sufficient for this purpose.
Installing them takes less than 2 M of disk space. The full CUDA takes 6.3 G.
On 02/01/2022 09:39, Francis.Montagnac@inria.fr wrote:
This is only the CUDA driver. It may not be sufficient for this purpose.
Installing them takes less than 2 M of disk space. The full CUDA takes 6.3 G.
Running F34, and using the packages from rpmfusion, I just tried
dnf erase *nvidia*
It offered to remove 16 packages, freeing 681 M. I typed n.
Regards,
John P
On 02/01/2022 19:07, John Pilkington wrote:
Running F34, and using the packages from rpmfusion, I just tried
dnf erase *nvidia*
It offered to remove 16 packages, freeing 681 M. I typed n.
It isn't clear if you are saying this is a problem or just informational.
In my case, I get only 12 and they all seem rational.
Removing:
 akmod-nvidia-470xx                         x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  22 k
 kmod-nvidia-470xx-5.15.10-200.fc35.x86_64  x86_64  3:470.94-1.fc35  @@commandline               103 M
 kmod-nvidia-470xx-5.15.11-200.fc35.x86_64  x86_64  3:470.94-1.fc35  @@commandline               103 M
 kmod-nvidia-470xx-5.15.12-200.fc35.x86_64  x86_64  3:470.94-1.fc35  @@commandline               103 M
 nvidia-persistenced                        x86_64  3:495.46-1.fc35  @rpmfusion-nonfree-updates  50 k
 nvidia-settings-470xx                      x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  4.5 M
 xorg-x11-drv-nvidia-470xx                  x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  55 M
 xorg-x11-drv-nvidia-470xx-cuda             x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  4.7 M
 xorg-x11-drv-nvidia-470xx-cuda-libs        x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  138 M
 xorg-x11-drv-nvidia-470xx-kmodsrc          x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  24 M
 xorg-x11-drv-nvidia-470xx-libs             x86_64  3:470.94-1.fc35  @rpmfusion-nonfree-updates  322 M
Removing unused dependencies:
 egl-wayland                                x86_64  1.1.9-3.fc35     @updates
Freed space: 856 M
On 02/01/2022 11:42, Ed Greshko wrote:
It isn't clear if you are saying this is a problem or just informational.
In my case, I get only 12 and they all seem rational.
My reply was intended to give some evidence about what the rpmfusion packages provided, and to hint, in response to the preceding post, that it might give a complete version of the cuda set. I posted a list of installed nvidia packages earlier, but have not yet found a way of pointing straight to that post in the HyperKitty archive. So here's another block of text. A few of the packages are, I suppose, not really relevant to the precise topic. The 'command line' ones are products of the akmod process.
{{{
[root@HPFed john]# dnf erase *nvidia*
No match for argument: nvidia-bug-report.log.gz
No packages marked for removal.
Dependencies resolved.
Nothing to do.
Complete!
[root@HPFed john]# dnf erase *nvidia*
Dependencies resolved.
====================================================================================================
 Package                                    Arch    Version          Repository                  Size
====================================================================================================
Removing:
 akmod-nvidia-470xx                         x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  22 k
 kmod-nvidia-470xx                          x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  0
 kmod-nvidia-470xx-5.15.10-100.fc34.x86_64  x86_64  3:470.94-1.fc34  @@commandline               44 M
 kmod-nvidia-470xx-5.15.11-100.fc34.x86_64  x86_64  3:470.94-1.fc34  @@commandline               44 M
 kmod-nvidia-470xx-5.15.12-100.fc34.x86_64  x86_64  3:470.94-1.fc34  @@commandline               44 M
 nvidia-persistenced                        x86_64  3:495.46-1.fc34  @rpmfusion-nonfree-updates  50 k
 nvidia-settings-470xx                      x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  4.6 M
 nvidia-texture-tools                       x86_64  2.1.2-1.fc34     @fedora                     1.2 M
 xorg-x11-drv-nvidia-470xx                  x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  55 M
 xorg-x11-drv-nvidia-470xx-cuda             x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  4.7 M
 xorg-x11-drv-nvidia-470xx-cuda-libs        x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  138 M
 xorg-x11-drv-nvidia-470xx-devel            x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  46
 xorg-x11-drv-nvidia-470xx-kmodsrc          x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  24 M
 xorg-x11-drv-nvidia-470xx-libs             x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  322 M
 xorg-x11-drv-nvidia-470xx-power            x86_64  3:470.94-1.fc34  @rpmfusion-nonfree-updates  2.3 k
Removing unused dependencies:
 egl-wayland                                x86_64  1.1.7-1.fc34     @updates                    58 k

Transaction Summary
====================================================================================================
Remove  16 Packages

Freed space: 681 M
Is this ok [y/N]: n
Operation aborted.
[root@HPFed john]# exit
}}}
On 02/01/2022 20:15, John Pilkington wrote:
My reply was intended to give some evidence about what the rpmfusion packages provided, and to hint, in response to the preceding post, that it might give a complete version of the cuda set. I posted a list of installed nvidia packages earlier, but have not yet found a way of pointing straight to that post in the HyperKitty archive. So here's another block of text. A few of the packages are, I suppose, not really relevant to the precise topic. The 'command line' ones are products of the akmod process.
Thanks for that explanation. I must have missed the list you posted earlier.
I don't have nvidia-texture-tools, xorg-x11-drv-nvidia-470xx-power or xorg-x11-drv-nvidia-470xx-devel installed.
In checking them out, they seem unnecessary in my use case. But it is good to know of the existence of nvidia-texture-tools and xorg-x11-drv-nvidia-470xx-power.
-- Did 황준호 die?
On 02/01/2022 14:43, Ed Greshko wrote:
I don't have nvidia-texture-tools, xorg-x11-drv-nvidia-470xx-power or xorg-x11-drv-nvidia-470xx-devel installed.
In checking them out, they seem unnecessary in my use case. But it is good to know of the existence of nvidia-texture-tools and xorg-x11-drv-nvidia-470xx-power.
I have no idea how useful they might be. I think I installed them via dnfdragora, the graphical UI for dnf, after a search for nvidia-related packages. But it needs a working graphical environment, and creating its cache feels very slow...