(Oops, sorry, re-post because I messed up the threading.)
I'm not a developer, nor do I pretend to understand the nuances of memory management. But I signed up for this list just to say "thanks" to all the devs and others that are finally discussing what I consider to be one of the biggest problems with Linux on the desktop.
My experience with desktop Linux distros on SSDs is that when a few processes start to leak memory, or I launch a new program while the system is already at its memory limit, I get a full system hang: only the mouse occasionally moves, jerkily, and I can't even switch to a virtual terminal. I recently learned the SysRq trick to invoke the OOM killer, but I personally think the kernel should deal with that, not the user. As unfortunate as it is for the OOM killer to have to kill something more or less at random, I am of the opinion that the OS should *never* lock up, period. I would strongly prefer that one application get killed instead of losing all my applications and working data to a forced hard reboot.
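For anyone who hasn't run across it: the key combination I mean is Alt+SysRq+F, and the same OOM kill can be triggered from a root shell by writing to /proc/sysrq-trigger (assuming the kernel.sysrq sysctl allows it). A minimal sketch of what I mean, in Python:

    # Manually invoke the kernel's OOM killer, the same action as Alt+SysRq+F.
    # Needs root, and the magic SysRq interface must be enabled via kernel.sysrq.
    with open("/proc/sysrq-trigger", "w") as f:
        f.write("f")  # 'f' asks the kernel to OOM-kill a memory-hogging task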
I don't know if this helps or not, but anecdotally I only started seeing this issue *after* SSDs became more common, i.e. I don't think I ever experienced it with spinning rust. Maybe it's something to do with the vastly faster I/O of an SSD, which lets RAM fill up much more quickly, before the OOM killer has time to react?
Also, I've had relatively low-memory KVM guests running on a VPS under very high load, and they never lock up. The OOM killer does occasionally kick in, but the affected daemon or systemd service just restarts and the whole thing is amazingly undramatic. This issue seems to occur only with Xorg (and, I imagine, Wayland) and "desktop" usage.
As for the problem of the randomness of the OOM killer, couldn't it be made to take into account the PID and/or how long the process has been running? Normally Xorg (and I assume Wayland stuff) gets started before the other desktop programs that tend to consume a lot of memory. So if it's a higher PID and/or has been running for less time, give it a higher score for killability.
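To make that concrete, here's a rough user-space sketch of the kind of heuristic I mean, using the existing /proc/<pid>/oom_score_adj knob rather than a kernel change; the 10-minute age threshold and the +500 adjustment are arbitrary numbers I picked purely for illustration:

    # Rough sketch: make recently started processes more attractive OOM victims
    # by raising their oom_score_adj. Needs root. The 600-second threshold and
    # the value 500 are placeholders, not a tested policy.
    import os

    def uptime_seconds():
        # First field of /proc/uptime is seconds since boot.
        with open("/proc/uptime") as f:
            return float(f.readline().split()[0])

    def process_age_seconds(pid):
        # Field 22 of /proc/<pid>/stat is the process start time in clock ticks
        # since boot; everything after the closing ')' is whitespace-separated.
        with open(f"/proc/{pid}/stat") as f:
            fields = f.read().rsplit(")", 1)[1].split()
        start_ticks = int(fields[19])
        return uptime_seconds() - start_ticks / os.sysconf("SC_CLK_TCK")

    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            if process_age_seconds(pid) < 600:  # started in the last 10 minutes
                with open(f"/proc/{pid}/oom_score_adj", "w") as f:
                    f.write("500")  # bias the OOM killer toward this process
        except (FileNotFoundError, PermissionError):
            pass  # process exited between listing and reading, or no permission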
In my experience on a system with 8GB of RAM and an SSD, the amount of swap space makes no difference. I've tried no swap, 2GB, 8GB, etc., and it still hangs under heavy memory usage. I've also tried tuning a number of sysctl parameters, such as vm.swappiness, vm.vfs_cache_pressure, and vm.min_free_kbytes, to no avail.
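In case the specifics matter, the tuning was along these lines; each sysctl is just a file under /proc/sys, and the values below are only examples of the kind of thing I tried, not a recommendation:

    # Examples of the VM tuning I experimented with (values are illustrative).
    # Each sysctl is exposed as a file under /proc/sys; needs root.
    settings = {
        "vm/swappiness": "10",           # bias reclaim toward page cache, not swap
        "vm/vfs_cache_pressure": "50",   # hold on to dentry/inode caches longer
        "vm/min_free_kbytes": "131072",  # keep ~128 MiB free for the kernel
    }
    for name, value in settings.items():
        with open(f"/proc/sys/{name}", "w") as f:
            f.write(value)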
Don't know if this helps, but here are some additional discussions of Linux unresponsiveness under low-memory situations from a layman's perspective:
- osnews.com/story/130117/kde-usability-and-productivity-are-we-there-yet/ (in the comments)
- unix.stackexchange.com/questions/373312/oom-killer-doesnt-work-properly-leads-to-a-frozen-os
- bbs.archlinux.org/viewtopic.php?id=233843
- askubuntu.com/questions/432809/why-is-kswapd0-running-on-a-computer-with-no-swap/432827#432827
- unix.stackexchange.com/questions/24625/how-to-completely-disable-swap/24646#24646
Thanks again to everyone for looking into this!