On Sun, Jun 07, 2020 at 05:25:15PM -0600, Chris Murphy wrote:
> This is not generally true, only if RAM gets so tight that applications
> start competing for swap.
> This is why I've proposed test cases testing exactly that, as for
> the case of persistent swap I'd expect the outcome to be a clear win for
> disk swap. (Although this can in some cases also be seen as a bug, as this
> would be applications not really using the allocated space.)
I don't follow this. Where are the proposed test cases? And also in
what case are you saying disk swap is a clear win?
I was referencing the testcases from the email before that, but your
webkitgtk compile might also work for that.
What I described as persistent swap is stuff that gets swapped out and
not swapped back in for hours or days.
> Until about 95% mem usage I'd expect the disk swap case to win, as it
> should behave the same as no swap (with matching swappiness values)
Why would disk based swap win? In this example, where there's been no
page outs, the zram device isn't using any memory. Again, it is not a
preallocation.
Yes, it's quite a boring example, but I've included it for completeness
as a border case. This is just the few megabytes it needs preallocated,
whilst swap is not in use at all.
> At 150% memory usage assuming a 2:1 compression ratio this would mean:
> - disk swap:
> has to write 4G to disk initially, and for reading swap another 4G
> (12G total traffic - 4G initial, 4G swapping out and 4G swapping in)
> - zram, assuming 4G zram swap:
> has to write 8G to zram initially, and for reading the data swap 16G
> (24G total traffic - 8G initial, 8G swapping out and 8G swapping in)
Swap contains anonymous pages, so I'm not sure what you mean by
initial. Whether these pages come from the internet, are typed in, or
come from persistent storage - it's a wash between disk and zram swap,
so it can be ignored.
I was calculating it from the viewpoint of data, e.g. paging out a
certain amount of data, and paging it in again. "Initial" would be the
amount of data when paging in.
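For reference, the accounting behind the quoted 12G and 24G figures can be reproduced with a short Python sketch. It only mirrors that arithmetic (initial load, page-out, page-in, each counted once). The function name, and the assumption that every byte is paged out and back in exactly once, are illustrative and not a model of real swap behaviour.

```python
def total_traffic_gib(data_gib: float) -> float:
    """Total data moved when `data_gib` is loaded, paged out once, and paged in once."""
    initial = data_gib    # data brought into memory in the first place
    swap_out = data_gib   # the same data written to the swap device
    swap_in = data_gib    # the same data read back from the swap device
    return initial + swap_out + swap_in


# Figures from the quoted estimate: 4G of data for disk swap, 8G for zram.
disk_total = total_traffic_gib(4)
zram_total = total_traffic_gib(8)
print(disk_total, zram_total)
```

Under this accounting the zram case moves twice as much data simply because it is assumed to hold twice as much; it says nothing about how fast each byte moves.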
What is definitely different is that I thought of 1 or 2 processes
eating away memory, but not of many thrashing swap. For those it is
definitely not possible to recover from it once thrashing has started.
Also I don't understand any of your math, how you start with a 4G zram
swap but have 8G. I think you're confused. The cap of 4GiB is the
device size. The actual amount of RAM it uses will be less due to
compression. The zram device size is not the amount of memory used.
And in no case is there a preallocation of memory unless the zram
device is used. It is easy to get confused, by the way. That was my
default state for days upon first stumbling on this.
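To make the device-size versus memory-used distinction concrete, here is a small Python sketch of the relationship described above. It assumes a uniform compression ratio and ignores zram's own metadata overhead; the function name and parameters are made up for illustration.

```python
def zram_ram_used_gib(swapped_out_gib: float, device_size_gib: float,
                      compression_ratio: float) -> float:
    """Approximate RAM consumed by a zram device holding `swapped_out_gib` of pages."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # The device size caps the uncompressed data the device accepts.
    stored = min(swapped_out_gib, device_size_gib)
    # RAM is only consumed for what is actually stored, after compression.
    return stored / compression_ratio


print(zram_ram_used_gib(0, 4, 2))  # nothing paged out yet
print(zram_ram_used_gib(4, 4, 2))  # device full at an assumed 2:1 ratio
```

With nothing paged out the result is zero, which is the "not a preallocation" point; a full 4G device at an assumed 2:1 ratio comes out at roughly 2G of RAM.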
I assumed a 2:1 compression rate, so the zram swap holds 8G of data in a
4G zram device. I've calculated with filling the zram device to the
max, so it will use the full 4G. (the 4G limit was arbitrarily chosen)
This task only succeeds with ~12+G of disk based swap. Which is just
not realistic. It's a clearly overcommitted and thus contrived test.
This sounds like it's just failing earlier. But it's still a test case.
But I love it and hate it at the same time. More realistic is to not
use defaults, and set the number of jobs manually to 6. And in this
case, zram based swap consistently beats disk based swap.
Which makes sense because pretty much all of the inactive pages are
going to be needed at some point by the compile or they are dropped.
Following the compile there aren't a lot of inactive pages left, and
I'm not sure they're even related to the compile at all.
Especially for a compile those pages are needed quite soon, so thrashing
occurs earlier too. For this it makes a lot of sense that zram is a big
benefit for it.
When I reached the memory limit my usecase was usually having Chrome and
Firefox open, with Firefox having about 500 open tabs, so most of the
data could stay in swap until I triggered swap in, which is very
different from compiling.
Even under manual control we've got examples of the GUI becoming
completely stuck. Long threads in devel@ based on this Workstation
working group issue - with the same name. So just search archives for
interactivity. Or maybe webkitgtk.
I'm afraid I've read most of those, I usually read all mails to devel@.
So far these seemed mostly like exceptions to me, but it may be that my
systems have a specific configuration and the issue is more widespread.
earlyoom will kill in such a case even if you can't. It's configurable
and intentionally simplistic, based on memory and swap free percentage.
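As a rough illustration of what a check "based on memory and swap free percentage" can look like, here is a minimal Python sketch. It is not earlyoom's code: the function names, the 10% thresholds, and the rule that both values must be low at the same time are assumptions made for the example. It works on /proc/meminfo-style text, so it can be tried without touching a live system.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value_in_kib}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def should_kill(meminfo: dict, mem_min_pct: float = 10.0,
                swap_min_pct: float = 10.0) -> bool:
    """True when available memory and free swap are both at or below their thresholds."""
    mem_pct = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_total = meminfo.get("SwapTotal", 0)
    # With no swap configured, treat free swap as 0% so only memory decides.
    swap_pct = 100.0 * meminfo.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_pct <= mem_min_pct and swap_pct <= swap_min_pct
```

On a Linux machine this could be fed with `parse_meminfo(open("/proc/meminfo").read())`; what to do when `should_kill` returns true (which process to pick, how to signal it) is deliberately left out.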
I don't have any experience with it, as I use the time from slowdown
until OOM to try to manage the issue myself, usually successfully.
But as mentioned above, I might have a specialized usecase, so my
experience might not reflect the average users' experience.
All the best,
David