John M. Harris Jr wrote:
Userspace isn't dead when a system is thrashing. Your software is still
running. If it gets killed, you're most likely going to lose your data.
The thing is, there are various levels of thrashing. In some cases, the
system is so busy that you have no chance to bring it back to responsiveness
for many minutes, up to hours. (Other than hitting the Reset or Power
button, of course.) I have had cases where not even sshd would respond. (The
fact that login has been blocking on D-Bus since the introduction of
systemd-logind does not help either. Login timeouts simply never happened
in the past; now they are common under heavy load.)
That said, I do not see how the EarlyOOM heuristic, which allows, depending
on the exact settings, something like 80-90% of swap to be used IN ADDITION
to 90+% RAM (and will only start doing anything if BOTH RAM and swap are
full) can prevent thrashing in any reliable way. My thrashing scenarios have
involved much less swap usage than that. (I have twice as much swap as RAM, so
by the time the EarlyOOM heuristics trigger, my programs are already trying to
use almost 3 times as much RAM as is actually available!)
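
To make the arithmetic concrete, here is a rough Python sketch of the trigger condition as I describe it above. The function names and the 10% thresholds are my own assumptions for illustration, derived from the "90% used" figures in this mail; this is not EarlyOOM's actual code, and the real thresholds depend on the configured settings.

```python
# Rough model of a "both RAM and swap nearly full" trigger.
# All names and default thresholds here are hypothetical.

def earlyoom_would_act(mem_avail_pct, swap_free_pct,
                       mem_min_pct=10.0, swap_min_pct=10.0):
    """Return True only if BOTH available RAM and free swap are at or
    below their minimum percentages (assumed 10% each)."""
    return mem_avail_pct <= mem_min_pct and swap_free_pct <= swap_min_pct


def demand_ratio(ram_gib, swap_gib, ram_used_frac, swap_used_frac):
    """Total memory in use (RAM + swap) as a multiple of physical RAM."""
    used = ram_gib * ram_used_frac + swap_gib * swap_used_frac
    return used / ram_gib


if __name__ == "__main__":
    ram, swap = 8.0, 16.0  # example machine: twice as much swap as RAM

    # RAM nearly exhausted, but swap only half used: no action is taken,
    # even though the system may already be thrashing.
    print(earlyoom_would_act(mem_avail_pct=5.0, swap_free_pct=50.0))

    # Memory demand relative to physical RAM at the moment of triggering,
    # for 80% and 90% swap usage on top of 90% RAM usage.
    for swap_used in (0.8, 0.9):
        ratio = demand_ratio(ram, swap, 0.9, swap_used)
        print(f"swap {swap_used:.0%} used: demand is {ratio:.1f}x RAM")
```

With twice as much swap as RAM, this model puts the demand at roughly 2.5 to 2.7 times the physical RAM before anything gets killed, which is where the "almost 3 times" above comes from.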
Kevin Kofler