People,
I have been having fairly regular hangs of the whole system. Sometimes I get a little warning from a rapidly slowing system and can systematically exit from all the apps and reboot cleanly, but mostly not, and I have to hit the reset button. I used to think I was overloading the (fast) system (with 32GB RAM) with too many Chrome windows and tabs, but it occasionally happens when the system is very lightly loaded too. I have new RAM and have checked it with the Pro version of MemTest86 a couple of times, so I don't think it is the RAM. Below are the last lines of the last six /var/log/messages files from just prior to each hang - there is a consistent "SERVICE_STOP pid=1 uid=0" command that occurs but I don't know if that is normal or not . .
Any suggestions about further debugging would be greatly appreciated!
Regards,
Phil.
Aug 9 22:00:45 localhost nm-dispatcher[7265]: req:3 'down' [virbr0-nic]: new request (3 scripts)
Aug 9 22:00:45 localhost nm-dispatcher[7265]: req:3 'down' [virbr0-nic]: start running ordered scripts...
Aug 9 22:00:48 localhost systemd[1]: systemd-hostnamed.service: Succeeded.
Aug 9 22:00:48 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? a
Aug 9 22:00:54 localhost dhclient[1199]: XMT: Solicit on enp0s31f6, interval 35050ms.
Aug 9 22:00:54 localhost dhclient[1199]: RCV: Advertise message on enp0s31f6 from fe80::7e8b:caff:fece:f620.
Aug 9 22:00:54 localhost systemd[1]: fprintd.service: Succeeded.
Aug 9 22:00:54 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? term
Aug 9 22:00:55 localhost systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Aug 9 22:00:55 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" host
1:messages_hang1 unix 5466,1 Bot
Aug 12 07:30:12 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? add
Aug 12 07:30:12 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr
Aug 12 07:28:54 localhost systemd[1]: Started Check PMIE instances are running.
Aug 12 07:28:54 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
Aug 12 07:28:54 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? t
Aug 12 07:30:12 localhost systemd[1]: Starting Poll log processing...
Aug 12 07:30:12 localhost systemd[1]: pmlogger_daily-poll.service: Succeeded.
Aug 12 07:30:12 localhost systemd[1]: Started Poll log processing.
Aug 12 07:30:12 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=
Aug 12 07:30:12 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=?
2:messages_hang2 unix 1408,1 Bot
Aug 13 03:03:35 localhost dnf[32442]: RPM Fusion for Fedora 30 - Free 12 kB/s | 16 kB 00:01
Aug 13 03:03:36 localhost dnf[32442]: RPM Fusion for Fedora 30 - Nonfree - Updates 14 kB/s | 14 kB 00:01
Aug 13 03:03:37 localhost dnf[32442]: RPM Fusion for Fedora 30 - Nonfree - Updates 32 kB/s | 42 kB 00:01
Aug 13 03:03:38 localhost dnf[32442]: RPM Fusion for Fedora 30 - Nonfree 12 kB/s | 15 kB 00:01
Aug 13 03:03:38 localhost dnf[32442]: vivaldi 55 kB/s | 2.9 kB 00:00
Aug 13 03:03:39 localhost dnf[32442]: Metadata cache created.
Aug 13 03:03:39 localhost systemd[1]: dnf-makecache.service: Succeeded.
Aug 13 03:03:39 localhost systemd[1]: Started dnf makecache.
Aug 13 03:03:39 localhost audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr
Aug 13 03:03:39 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=
3:messages_hang3 unix 2685,1 Bot
Aug 25 13:55:35 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? te
Aug 25 13:58:34 phil systemd[1]: Starting Check PMIE instances are running...
Aug 25 13:58:34 phil systemd[1]: pmie_check.service: Succeeded.
Aug 25 13:58:34 phil systemd[1]: Started Check PMIE instances are running.
Aug 25 13:58:34 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
Aug 25 13:58:34 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termin
Aug 25 13:58:34 phil systemd[1]: pmie_check.service: Succeeded.
Aug 25 13:58:34 phil systemd[1]: Started Check PMIE instances are running.
Aug 25 13:58:34 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
Aug 25 13:58:34 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termin
4:messages_hang4 unix 572,1 Bot
Aug 27 06:30:16 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? add
Aug 27 06:30:16 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmlogger_daily-poll comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr
Aug 27 06:48:36 phil systemd[1]: Starting dnf makecache...
Aug 27 06:48:36 phil dnf[27178]: Metadata cache refreshed recently.
Aug 27 06:48:36 phil systemd[1]: dnf-makecache.service: Succeeded.
Aug 27 06:48:36 phil systemd[1]: Started dnf makecache.
Aug 27 06:48:36 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? te
Aug 27 06:48:36 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
Aug 27 06:48:36 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=dnf-makecache comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? ter
5:messages_hang5 unix 3551,1 Bot
Aug 27 21:28:04 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
Aug 27 21:28:04 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termin
Aug 27 21:28:13 phil systemd[1]: systemd-hostnamed.service: Succeeded.
Aug 27 21:28:13 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
Aug 27 21:28:04 phil systemd[1]: Started Check PMIE instances are running.
Aug 27 21:28:04 phil audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termi
Aug 27 21:28:04 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=pmie_check comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? termin
Aug 27 21:28:13 phil systemd[1]: systemd-hostnamed.service: Succeeded.
Aug 27 21:28:13 phil audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
6:messages_hang6 unix 2643,1 Bot
"messages_hang6" 2643L, 316449C
On 19-08-27 08:07:17, Philip Rhoades wrote:
People,
...there is a consistent "SERVICE_STOP pid=1 uid=0" command that occurs but I don't know if that is normal or not . .
...
That just means that systemd (pid 1) did something that auditd reported. There are lots of them. E.g., your:
Aug 9 22:00:48 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? a
is about systemd doing something with unit=systemd-hostnamed.
I have a System Load Monitor applet and a Disk Load Monitor in my taskbar, so I can see if memory is filling up or the disk is busy. Still, I think you have some other problem if the system completely hangs instead of just getting very very slow.
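To check that these audit records really are routine, one could tally them per unit across a whole messages file. The following is only a sketch: the regular expression is an assumption based on the line format shown in the excerpts above, and `count_service_events` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import Counter

# Assumed format, taken from the excerpts in this thread:
#   ... audit[1]: SERVICE_STOP pid=1 ... msg='unit=systemd-hostnamed comm=...
AUDIT_RE = re.compile(
    r"audit\[1\]: (SERVICE_START|SERVICE_STOP) .*?unit=([^\s']+)"
)

def count_service_events(lines):
    """Return a Counter keyed by (event, unit) for systemd audit records."""
    counts = Counter()
    for line in lines:
        match = AUDIT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

def print_summary(path="/var/log/messages", top=20):
    """Print the most frequent (event, unit) pairs found in the given log."""
    with open(path, errors="replace") as handle:
        for (event, unit), n in count_service_events(handle).most_common(top):
            print(f"{n:6d}  {event:13s}  {unit}")
```

Calling `print_summary()` as root on a healthy machine should show the same units (pmie_check, dnf-makecache and so on) starting and stopping all day, which would support reading them as normal timer activity rather than a symptom.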
Tony,
On 2019-08-28 07:09, Tony Nelson wrote:
That just means that systemd (pid 1) did something that auditd reported. There are lots of them. E.g., your:
Aug 9 22:00:48 localhost audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? a
is about systemd doing something with unit=systemd-hostnamed.
Right - I thought that was probably the case - thanks.
I have a System Load Monitor applet and a Disk Load Monitor in my taskbar, so I can see if memory is filling up or the disk is busy.
Right - I have 32GB RAM and the same of swap and I hardly ever see the swap being used . .
Still, I think you have some other problem if the system completely hangs instead of just getting very very slow.
Yes, me too . . but how to determine what the problem is . . it looks like whatever it is, messages related to it are not being recorded . .
Regards,
Phil.
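Since nothing useful reaches /var/log/messages before the hang, one option is to record load and memory figures independently and force each sample to disk, so the last line written before a reset survives. This is a rough sketch under stated assumptions: it reads the standard Linux /proc/loadavg and /proc/meminfo files, and the function names and output format are made up for illustration.

```python
import os
import time

def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info

def snapshot(proc_root="/proc"):
    """Return one log line with the 1-minute load, available RAM and free swap."""
    with open(os.path.join(proc_root, "loadavg")) as f:
        load1 = f.read().split()[0]
    with open(os.path.join(proc_root, "meminfo")) as f:
        mem = parse_meminfo(f.read())
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return (f"{stamp} load1={load1} "
            f"MemAvailable={mem.get('MemAvailable', -1)}kB "
            f"SwapFree={mem.get('SwapFree', -1)}kB")

def log_forever(path, interval=10.0):
    """Append a snapshot every `interval` seconds, syncing each one to disk."""
    with open(path, "a") as out:
        while True:
            out.write(snapshot() + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the sample to disk before sleeping
            time.sleep(interval)
```

Running `log_forever("/var/tmp/hangwatch.log")` in the background and reading the last few lines after the next hang would show whether memory or load was climbing beforehand. If the figures look ordinary right up to the end, that points away from resource exhaustion.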
On Tue, 27 Aug 2019 at 23:40, Philip Rhoades phil@pricom.com.au wrote:
Yes, me too . . but how to determine what the problem is . . it looks
like whatever it is, messages related to it are not being recorded . .
A rule of thumb is that random failures indicate hardware problems while reproducible failures indicate software. These days software problems often generate log entries, so the absence of log entries points to hardware. Also, the same hardware failure often affects many units of a particular model, so it is worth searching for problem reports for your model.
I ended up with a "free" ThinkPad that was behaving much as you describe. The cooling system was full of dust, but that model was easy to open up to expose the heatsink, fan, and ducts so I just cleaned it up and used it for many years.
Visual inspection sometimes reveals damage that is hard to detect with software. I once had a system where networking seemed fine except NFS was failing -- visual inspection of the network card revealed fried components. I've had a number of systems with failed capacitors (image) http://www.fixya.com/fullimage.html?src=http://i.fixya.net/uploads/images/25980851-y0it05ineeq3t1lxjrw0uiu3-1-0.jpg or failed connectors (easily disconnected, some even fell off as system was being moved).
Use a utility to display temperatures from various sensors, e.g., lm_sensors https://www.ostechnix.com/view-cpu-temperature-linux/
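If you would rather log temperatures over time than watch them, the kernel's hwmon sysfs tree can be read directly. A minimal sketch, assuming the relevant sensor drivers are loaded and expose `temp*_input` files (in millidegrees Celsius) under /sys/class/hwmon; the function names are hypothetical:

```python
import glob
import os

def _read(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

def read_temperatures(hwmon_root="/sys/class/hwmon"):
    """Return a list of (chip, label, degrees_celsius) tuples."""
    readings = []
    for chip_dir in sorted(glob.glob(os.path.join(hwmon_root, "hwmon*"))):
        chip = _read(os.path.join(chip_dir, "name")) or os.path.basename(chip_dir)
        for input_path in sorted(glob.glob(os.path.join(chip_dir, "temp*_input"))):
            raw = _read(input_path)
            if raw is None:
                continue
            stem = input_path[:-len("_input")]          # e.g. .../temp1
            label = _read(stem + "_label") or os.path.basename(stem)
            try:
                readings.append((chip, label, int(raw) / 1000.0))
            except ValueError:
                continue  # skip sensors that report something non-numeric
    return readings
```

Printing `read_temperatures()` every few seconds to a file that is synced to disk would show whether the CPU was running hot just before a hang.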
Sniff the power supply for burnt smells. You may have to take the cover off the power supply to look for damaged components. Inexpensive power supply testers are available.