On Mon, Aug 12, 2019 at 6:31 PM David Airlie airlied@redhat.com wrote:
On Sun, Aug 11, 2019 at 2:57 AM Georg Sauthoff mail@georg.so wrote:
On Fri, Aug 09, 2019 at 03:50:43PM -0600, Chris Murphy wrote: [..]
Problem and thesis statement: Certain workloads, such as building WebKitGTK from source, result in heavy swap usage, eventually leading to the system becoming totally unresponsive. Look into switching from disk-based swap to swap on a ZRAM device.
Summary of findings (restated, but basically the same as found at [2]): Test system: MacBook Pro, Intel Core i7-2820QM (4/8 cores), 8GiB RAM, Samsung SSD 840 EVO, Fedora Rawhide Workstation. Test case: build WebKitGTK from source.
[..]
To avoid such issues I disable swap on my machines. I really don't see the point of having a swap partition if you have 16 or 32 GiB RAM. Even with 8 GiB I disable swap.
Disabling swap doesn't avoid the issues, it can in fact make them worse.
If you have apps allocating memory, they don't always OOM before the kernel tries to evict text pages, but since SSDs are fast it then tries to pull those text pages back in before realising it is out of memory (that is what most of the latest rounds of articles have been about). Something like firefox runs with no swap, starts to need more memory than the system has, parts of the firefox executable get paged out, but then are needed for firefox to use the RAM, and round in circles it goes.
Having swap is still, in this day and age, better for your system than not having it.
I agree that it's better to have swap for incidental swap purposes, rather than random things just getting abruptly hit with oom. I say random, because I see the oom_score_adj is the same for every process other than systemd-udev, auditd, sshd, and dbus. Plausibly the shell could get oom killed without warning, taking out the entire user session, all apps, and all the build processes.
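For anyone who wants to compare scores on their own machine, here is a minimal sketch that reads the per-process oom_score, oom_score_adj and comm files under /proc and sorts by score. The function name read_oom_scores and its proc_root parameter are just illustrative names, not part of any existing tool, and the sketch assumes a Linux /proc layout.

```python
import os


def read_oom_scores(proc_root="/proc"):
    """Return (oom_score, oom_score_adj, pid, comm) tuples, highest score first.

    read_oom_scores and proc_root are hypothetical names for this sketch.
    """
    rows = []
    if not os.path.isdir(proc_root):
        return rows  # no procfs here (not Linux), nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as f:
                score = int(f.read().strip())
            with open(os.path.join(base, "oom_score_adj")) as f:
                adj = int(f.read().strip())
            with open(os.path.join(base, "comm")) as f:
                comm = f.read().strip()
        except (OSError, ValueError):
            continue  # process exited meanwhile, or the file was unreadable
        rows.append((score, adj, int(entry), comm))
    # Highest oom_score first; ties are broken by PID for a stable listing.
    rows.sort(key=lambda r: (-r[0], r[2]))
    return rows


if __name__ == "__main__":
    print(f"{'score':>6} {'adj':>6} {'pid':>7} comm")
    for score, adj, pid, comm in read_oom_scores()[:15]:
        print(f"{score:>6} {adj:>6} {pid:>7} {comm}")
```

Run during a build, this should show whether something like a shell or a monitoring tool sits close to the compiler processes in score. Processes with oom_score_adj set to -1000 are exempt from the oom killer.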
I just discovered in the log from yesterday that iotop was subject to the oom killer, rather than one of the large cc1plus processes, which is what I'd previously consistently witnessed. So iotop and cc1plus must be in the same ballpark oom score wise, and the oom killer just so happens to pick one or the other. iotop going away relieved just enough memory that nothing else was subject to the oom killer, and yet processes were clearly resource starved nevertheless: the GUI was frozen, and other processes had already been dying due to timeouts, for example:
Aug 11 18:26:57 fmac.local systemd[1]: sssd-kcm.service: Control process exited, code=killed, status=15/TERM
Aug 11 18:26:57 fmac.local systemd[1]: sssd-kcm.service: Failed with result 'timeout'.
Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service: State 'stop-sigterm' timed out. Killing.
Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service: Killing process 31010 (systemd-journal) with signal SIGKILL.
Aug 11 18:27:00 fmac.local systemd[1]: systemd-journald.service: Main process exited, code=killed, status=9/KILL
This is like a train wreck where there are all sorts of interesting sub failures happening. At one point I think, well, we need better oom scores so the truly least important process is killed off. But upon big picture scrutiny, the system is failing before the oom killer has been triggered. Processes are dying with timeouts. The GUI, including the mouse pointer, is frozen, even when swap is only half full. Practically speaking, it's a goner the moment the mouse pointer froze the very first time. I might tolerate some stuttering here and there, but minutes of frozen state? Nah - not interested in seeing if this is another 5 minutes of choke, or 5 days.
And that's the bad side of swap: when the system is more than incidentally using it, and is depending on it. And apparently nothing is on a deadline timer if things can just start timing out on their own, including the system journal! That was a surprise to see. If it was that hung up, maybe I can't trust the journal entry times or order, and maybe important entries were lost.