The swapon(8) man page says:
The swap file implementation in the kernel expects to be able to write to the file directly, without the assistance of the filesystem. This is a problem on files with holes or on copy-on-write files on filesystems like Btrfs.
As I'm getting OOM errors when I try to run a VM, it looks like I need a swap file or partition. I'd prefer to use part of my (large) SSD for this, but currently it's entirely formatted as BTRFS. Do I need to resize the BTRFS partition rather than using a swap file?
poc
On 11/5/20 9:02 AM, Patrick O'Callaghan wrote:
The swapon(8) man page says:
The swap file implementation in the kernel expects to be able to write to the file directly, without the assistance of the filesystem. This is a problem on files with holes or on copy-on-write files on filesystems like Btrfs.
As I'm getting OOM errors when I try to run a VM, it looks like I need a swap file or partition. I'd prefer to use part of my (large) SSD for this, but currently it's entirely formatted as BTRFS. Do I need to resize the BTRFS partition rather than using a swap file?
Yes. But have you enabled zram yet for swap?
On Thu, 2020-11-05 at 10:47 -0800, Samuel Sieb wrote:
On 11/5/20 9:02 AM, Patrick O'Callaghan wrote:
The swapon(8) man page says:
The swap file implementation in the kernel expects to be able to write to the file directly, without the assistance of the filesystem. This is a problem on files with holes or on copy-on-write files on filesystems like Btrfs.
As I'm getting OOM errors when I try to run a VM, it looks like I need a swap file or partition. I'd prefer to use part of my (large) SSD for this, but currently it's entirely formatted as BTRFS. Do I need to resize the BTRFS partition rather than using a swap file?
Yes. But have you enabled zram yet for swap?
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
poc
On 11/5/20 2:23 PM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 10:47 -0800, Samuel Sieb wrote:
Yes. But have you enabled zram yet for swap?
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Check what's using the memory. zram doesn't *reserve* memory. It doesn't use any memory until you start swapping out. Try increasing the zram size. I would suggest at least 12GB. On my 12GB laptop, I have it set to 12GB.
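A minimal sketch of one way to check where the memory is going, assuming a Linux /proc/meminfo. The function names are made up for illustration. It reports available RAM, free swap, and how much RAM is pinned by preallocated hugepages (which matters here because the VM uses hugepages).

```shell
#!/bin/sh
# Print the numeric value of one /proc/meminfo field
# (kB for sizes, a plain count for HugePages_*).
# The optional second argument is an alternative file, handy for testing.
meminfo_value() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Summarise available RAM, free swap and hugepage reservation, all in MiB.
mem_summary() {
    file="${1:-/proc/meminfo}"
    avail=$(meminfo_value MemAvailable "$file")
    swap_free=$(meminfo_value SwapFree "$file")
    pages=$(meminfo_value HugePages_Total "$file")
    page_kib=$(meminfo_value Hugepagesize "$file")
    echo "available_mib=$(( ${avail:-0} / 1024 ))"
    echo "swap_free_mib=$(( ${swap_free:-0} / 1024 ))"
    echo "hugepages_mib=$(( ${pages:-0} * ${page_kib:-0} / 1024 ))"
}

# Usage: mem_summary
```

Running "swapon --show" and "zramctl" alongside this shows which swap devices are active and how full they are.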
On Thu, 2020-11-05 at 15:23 -0800, Samuel Sieb wrote:
Check what's using the memory. zram doesn't *reserve* memory. It doesn't use any memory until you start swapping out. Try increasing the zram size. I would suggest at least 12GB. On my 12GB laptop, I have it set to 12GB.
This is interesting. I was running F33 with the following configuration:
BTRFS RAID1 with two subvolumes, / and home
The laptop has 32GB of RAM and 4GB of zram
I was experiencing delayed reads, especially when opening a file from any application (gedit, TeXstudio, etc.). For example, when I click File > Open, I have to wait over 30 seconds before Nautilus opens so that I can select the file.
To see whether it was related to the filesystem, I switched to my traditional installation with RAID1, LVM and the XFS filesystem, and I haven't experienced the behaviour I saw with BTRFS. I will try your recommendations and see if that resolves the issue.
On Thu, 2020-11-05 at 15:23 -0800, Samuel Sieb wrote:
On 11/5/20 2:23 PM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 10:47 -0800, Samuel Sieb wrote:
Yes. But have you enabled zram yet for swap?
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Check what's using the memory. zram doesn't *reserve* memory. It doesn't use any memory until you start swapping out. Try increasing the zram size. I would suggest at least 12GB. On my 12GB laptop, I have it set to 12GB.
Trying to get my head around that. You mean all of your RAM is potentially usable as compressed swap? How does that work? Surely it can never reach that limit?
poc
On 11/6/20 3:36 AM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 15:23 -0800, Samuel Sieb wrote:
On 11/5/20 2:23 PM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 10:47 -0800, Samuel Sieb wrote:
Yes. But have you enabled zram yet for swap?
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Check what's using the memory. zram doesn't *reserve* memory. It doesn't use any memory until you start swapping out. Try increasing the zram size. I would suggest at least 12GB. On my 12GB laptop, I have it set to 12GB.
Trying to get my head around that. You mean all of your RAM is potentially usable as compressed swap? How does that work? Surely it can never reach that limit?
No, that's the uncompressed size. In general, the compression is at least 3:1 so the 12GB of swap takes up a maximum of 4GB of RAM. My zram config appears to be a little confused at this point, but here's what one device looks like:
# zramctl
NAME       ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lz4             5G 4.9G  1.3G  1.4G       4
It's currently storing 4.9GB of swap data using 1.4GB of RAM.
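The arithmetic behind those figures can be sketched as follows. The function names are made up for illustration, and the inputs are just the numbers quoted above.

```shell
#!/bin/sh
# Compression ratio: uncompressed data stored divided by RAM actually used.
zram_ratio() {
    awk -v data="$1" -v total="$2" 'BEGIN { printf "%.1f\n", data / total }'
}

# Worst-case RAM needed to hold a full zram device at a given ratio.
zram_ram_needed() {
    awk -v disksize="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", disksize / ratio }'
}

zram_ratio 4.9 1.4        # DATA and TOTAL columns from the zramctl output
zram_ram_needed 12 3      # a full 12GB device at 3:1
```

For live numbers, something like "zramctl --bytes --noheadings --output DATA,TOTAL" should give the two inputs directly, though I have not checked that on every util-linux version.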
On Fri, 2020-11-06 at 10:59 -0800, Samuel Sieb wrote:
On 11/6/20 3:36 AM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 15:23 -0800, Samuel Sieb wrote:
On 11/5/20 2:23 PM, Patrick O'Callaghan wrote:
On Thu, 2020-11-05 at 10:47 -0800, Samuel Sieb wrote:
Yes. But have you enabled zram yet for swap?
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Check what's using the memory. zram doesn't *reserve* memory. It doesn't use any memory until you start swapping out. Try increasing the zram size. I would suggest at least 12GB. On my 12GB laptop, I have it set to 12GB.
Trying to get my head around that. You mean all of your RAM is potentially usable as compressed swap? How does that work? Surely it can never reach that limit?
No, that's the uncompressed size. In general, the compression is at least 3:1 so the 12GB of swap takes up a maximum of 4GB of RAM. My zram config appears to be a little confused at this point, but here's what one device looks like:
# zramctl
NAME       ALGORITHM DISKSIZE DATA COMPR TOTAL STREAMS MOUNTPOINT
/dev/zram1 lz4             5G 4.9G  1.3G  1.4G       4
It's currently storing 4.9GB of swap data using 1.4GB of RAM.
OK.
poc
On 11/5/20 4:23 PM, Patrick O'Callaghan wrote:
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
poc
Maybe it is:
https://fedoraproject.org/wiki/Changes/KDEEarlyOOM
try:
sudo systemctl disable earlyoom
and
restart
On Thu, 2020-11-05 at 17:46 -0600, Gabriel Ramirez wrote:
On 11/5/20 4:23 PM, Patrick O'Callaghan wrote:
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
poc
Maybe it is:
https://fedoraproject.org/wiki/Changes/KDEEarlyOOM
try:
sudo systemctl disable earlyoom
Yes, I may do that. However I'd also like to tune the system so it's not needed. It was never a problem on F32.
poc
On Thu, Nov 5, 2020, 3:24 PM Patrick O'Callaghan pocallaghan@gmail.com wrote:
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Oof!
One 8G VM , and how many FF tabs? What VM guest?
You can certainly experiment with a custom zram-generator.conf and change the limit to 16G. You'll get an 8G zram device in this case, if you make no other changes to the configuration.
I'm curious at what minimum zram allocation this problem doesn't happen. 6G, 8G, 10G?
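As a rough sketch of why a 16G limit gives an 8G device: the device size is the RAM fraction (50% by default) capped by the configured limit. The function below is only an illustration of that rule, not code from zram-generator.

```shell
#!/bin/sh
# zram_size_mib RAM_MIB FRACTION_PERCENT LIMIT_MIB
# Device size = RAM * fraction, capped at the configured limit.
zram_size_mib() {
    want=$(( $1 * $2 / 100 ))
    if [ "$want" -gt "$3" ]; then
        want=$3
    fi
    echo "$want"
}

zram_size_mib 16384 50 4096     # 16G RAM, 4G limit
zram_size_mib 16384 50 16384    # 16G RAM, limit raised to 16G
```

With the 4G limit the cap wins and the device is 4096 MiB. With the limit raised to 16G the 50% fraction wins and the device is 8192 MiB.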
-- Chris Murphy
On Wed, 2020-11-11 at 01:12 -0700, Chris Murphy wrote:
On Thu, Nov 5, 2020, 3:24 PM Patrick O'Callaghan pocallaghan@gmail.com wrote:
This is a clean install, so it's enabled by default and has reserved 4GB. I think that's what's actually causing the OOMs. I have 16GB of RAM. On F32 I could run an 8GB VM with hugepages, i.e. dedicated memory, plus normal stuff including multiple browser tabs etc. and never had a problem. Now as soon as I try to start the VM it gets OOM errors, and multiple Firefox tabs are failing and have to be restarted.
Oof!
One 8G VM , and how many FF tabs? What VM guest?
The guest is Windows 10. I have since configured a large swapfile on my SSD (with BTRFS) and the improvement has been dramatic, both in performance and in OOMs (which have vanished completely). There's still a zram device as well (4GB) but I've just left it untouched for now.
You can certainly experiment with a custom zram-generator.conf and change the limit to 16G. You'll get an 8G zram device in this case, if you make no other changes to the configuration.
I'm curious at what minimum zram allocation this problem doesn't happen. 6G, 8G, 10G?
Maybe I'll get round to testing that, but for now it works and I'm not keen to mess with it.
poc
You can use swapfile on btrfs with nocow
See: https://btrfs.wiki.kernel.org/index.php/FAQ#Does_btrfs_support_swap_files.3F
And
https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
On 11/5/20 9:43 PM, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow
See: https://btrfs.wiki.kernel.org/index.php/FAQ#Does_btrfs_support_swap_files.3F
So it's something new with 5.x kernels. I wonder how that works with the checksumming.
Yes, it is only available on 5.x+ kernels, and only on single-device btrfs volumes.
And for btrfs, nocow means no checksumming, because you can't keep checksums atomic with the data when not using cow.
On Fri, 2020-11-06 at 05:43 +0000, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow
See: https://btrfs.wiki.kernel.org/index.php/FAQ#Does_btrfs_support_swap_files.3F
And
https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
Great. I was actually wondering about precisely that, so I'm glad to see it's supported with the right incantations.
poc
On Fri, 2020-11-06 at 11:18 +0000, Patrick O'Callaghan wrote:
On Fri, 2020-11-06 at 05:43 +0000, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow [...] https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
Great. I was actually wondering about precisely that, so I'm glad to see it's supported with the right incantations.
Trying to follow the instructions in the above URL, but clearly I'm misunderstanding something:
# btrfs sub create /SWAP
Create subvolume '//SWAP'
# btrfs property set /SWAP compression no
# btrfs sub show /SWAP
root/SWAP
    Name: SWAP
    UUID: 1d4d839a-fd4e-d345-8d3f-ea858af71982
    Parent UUID: -
    Received UUID: -
    Creation time: 2020-11-06 11:41:37 +0000
    Subvolume ID: 1311
    Generation: 14657
    Gen at creation: 14547
    Parent ID: 257
    Top level ID: 257
    Flags: -
    Snapshot(s):
# cd /SWAP
# truncate -s 0 ./swapfile
# chattr +C /SWAP
chattr: Invalid argument while setting flags on /SWAP
What am I missing?
poc
On Fri, Nov 6, 2020 at 8:14 PM, Patrick O'Callaghan pocallaghan@gmail.com wrote:
On Fri, 2020-11-06 at 11:18 +0000, Patrick O'Callaghan wrote:
On Fri, 2020-11-06 at 05:43 +0000, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow [...] https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
Great. I was actually wondering about precisely that, so I'm glad to see it's supported with the right incantations.
Trying to follow the instructions in the above URL, but clearly I'm misunderstanding something:
# btrfs sub create /SWAP
Create subvolume '//SWAP'
# btrfs property set /SWAP compression no
# btrfs sub show /SWAP
root/SWAP
    Name: SWAP
    UUID: 1d4d839a-fd4e-d345-8d3f-ea858af71982
    Parent UUID: -
    Received UUID: -
    Creation time: 2020-11-06 11:41:37 +0000
    Subvolume ID: 1311
    Generation: 14657
    Gen at creation: 14547
    Parent ID: 257
    Top level ID: 257
    Flags: -
    Snapshot(s):
# cd /SWAP
# truncate -s 0 ./swapfile
# chattr +C /SWAP
chattr: Invalid argument while setting flags on /SWAP
Try setting +C flag on the file rather than the subvolume itself
What am I missing?
poc
On Fri, 2020-11-06 at 20:27 +0800, Qiyu Yan wrote:
On Fri, Nov 6, 2020 at 8:14 PM, Patrick O'Callaghan pocallaghan@gmail.com wrote:
On Fri, 2020-11-06 at 11:18 +0000, Patrick O'Callaghan wrote:
On Fri, 2020-11-06 at 05:43 +0000, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow [...] https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
Great. I was actually wondering about precisely that, so I'm glad to see it's supported with the right incantations.
Trying to follow the instructions in the above URL, but clearly I'm misunderstanding something:
# btrfs sub create /SWAP
Create subvolume '//SWAP'
# btrfs property set /SWAP compression no
# btrfs sub show /SWAP
root/SWAP
    Name: SWAP
    UUID: 1d4d839a-fd4e-d345-8d3f-ea858af71982
    Parent UUID: -
    Received UUID: -
    Creation time: 2020-11-06 11:41:37 +0000
    Subvolume ID: 1311
    Generation: 14657
    Gen at creation: 14547
    Parent ID: 257
    Top level ID: 257
    Flags: -
    Snapshot(s):
# cd /SWAP
# truncate -s 0 ./swapfile
# chattr +C /SWAP
chattr: Invalid argument while setting flags on /SWAP
Try setting +C flag on the file rather than the subvolume itself
# chattr +C swapfile
chattr: Invalid argument while setting flags on swapfile
poc
Okay, I think I get it now: just don't run "btrfs property set /SWAP compression no". It seems that this leads to something strange. I can't tell why, but it is totally fine to skip it, since nocow already means no datacsum and no compression.
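Putting the pieces of this sub-thread together, the sequence can be sketched as below. This is an untested outline rather than an authoritative recipe: /SWAP is the subvolume name used above, the size is a placeholder (no size was given in the thread), and the run and make_btrfs_swapfile helpers are made up for illustration. By default it only prints the commands. Set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch: create a NOCOW swap file on Btrfs, without the compression property step.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

make_btrfs_swapfile() {
    subvol="$1"       # dedicated subvolume, e.g. /SWAP
    size_mib="$2"     # placeholder size in MiB
    file="$subvol/swapfile"

    run btrfs subvolume create "$subvol"
    run truncate -s 0 "$file"       # start from an empty file
    run chattr +C "$file"           # set NOCOW on the file while it is still empty
    run dd if=/dev/zero of="$file" bs=1M count="$size_mib"   # allocate with no holes
    run chmod 600 "$file"
    run mkswap "$file"
    run swapon "$file"
}

make_btrfs_swapfile /SWAP 8192
```

Note that chattr +C goes on the (empty) file, not on the subvolume, and that the "btrfs property set ... compression" step is left out, as suggested above.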
On Fri, 2020-11-06 at 23:03 +0800, Qiyu Yan wrote:
Okay, I think I get it now: just don't run "btrfs property set /SWAP compression no". It seems that this leads to something strange. I can't tell why, but it is totally fine to skip it, since nocow already means no datacsum and no compression.
OK, that seems to have worked. Trying it now.
poc
On Fri, 6 Nov 2020 at 08:42, Patrick O'Callaghan pocallaghan@gmail.com wrote:
On Fri, 2020-11-06 at 20:27 +0800, Qiyu Yan wrote:
On Fri, Nov 6, 2020 at 8:14 PM, Patrick O'Callaghan pocallaghan@gmail.com wrote:
On Fri, 2020-11-06 at 11:18 +0000, Patrick O'Callaghan wrote:
On Fri, 2020-11-06 at 05:43 +0000, Qiyu Yan wrote:
You can use swapfile on btrfs with nocow [...] https://wiki.archlinux.org/index.php/btrfs#Swap_file for instructions
Great. I was actually wondering about precisely that, so I'm glad to see it's supported with the right incantations.
Trying to follow the instructions in the above URL, but clearly I'm misunderstanding something:
# btrfs sub create /SWAP
Create subvolume '//SWAP'
# btrfs property set /SWAP compression no
# btrfs sub show /SWAP
root/SWAP
    Name: SWAP
    UUID: 1d4d839a-fd4e-d345-8d3f-ea858af71982
    Parent UUID: -
    Received UUID: -
    Creation time: 2020-11-06 11:41:37 +0000
    Subvolume ID: 1311
    Generation: 14657
    Gen at creation: 14547
    Parent ID: 257
    Top level ID: 257
    Flags: -
    Snapshot(s):
# cd /SWAP
# truncate -s 0 ./swapfile
# chattr +C /SWAP
chattr: Invalid argument while setting flags on /SWAP
Try setting +C flag on the file rather than the subvolume itself
# chattr +C swapfile
chattr: Invalid argument while setting flags on swapfile
Maybe we are no longer meant to use chattr. The btrfs-property(8) manual page says:
"btrfs property provides an unified and user-friendly method to tune different btrfs properties instead of using the traditional method like chattr(1) or lsattr(1)."
On Fri, 2020-11-06 at 12:50 -0400, George N. White III wrote:
# chattr +C swapfile
chattr: Invalid argument while setting flags on swapfile
Maybe we are no longer meant to use chattr. The btrfs-property(8) manual page says:
"btrfs property provides an unified and user-friendly method to tune different btrfs properties instead of using the traditional method like chattr(1) or lsattr(1)."
I had already looked at btrfs-property when originally trying to set this up. Unfortunately it doesn't have options relating to COW (or if it does they aren't documented). Seems strange to me, but there it is. I don't know of any other way to do this apart from chattr.
poc
On Thu, Nov 5, 2020, 10:03 AM Patrick O'Callaghan pocallaghan@gmail.com wrote:
The swapon(8) man page says:
The swap file implementation in the kernel expects to be able to write to the file directly, without the assistance of the filesystem. This is a problem on files with holes or on copy-on-write files on filesystems like Btrfs.
As I'm getting OOM errors when I try to run a VM, it looks like I need a swap file or partition. I'd prefer to use part of my (large) SSD for this, but currently it's entirely formatted as BTRFS. Do I need to resize the BTRFS partition rather than using a swap file?
There are several options, depending on the workload.
A swapfile on Btrfs performs the same as a swap partition in my testing. The main difference, other than the limitations in 'man 5 btrfs', is that an additional set of steps is needed to make hibernation possible.
The swapfile must not be snapshotted. If it is, COW applies and the swapfile can no longer be activated. There are several ways to avoid this if you want to snapshot root (but not the swapfile): create a /swap subvolume or a /var/swap subvolume. Since btrfs snapshots are not recursive, either of these prevents a snapshot of root from snapshotting the swapfile.
If you aren't doing any snapshotting at all then it doesn't matter, and you can probably make the swapfile almost anywhere.
I've got some work to do to figure out if there's a more elegant way to do it. And of course, encryption and SELinux implications.
The longer story is to automatically create the swap files on demand dynamically in the proper location. And in that case possibly use zswap, instead of swap on zram.
It's a bit esoteric, but there is a way to track sysfs memory.stat for page faults. And the zram driver tracks some statistics that could be helpful in figuring out situations where zram is getting full of seldom-used dirty pages that are just taking up memory. In those kinds of workloads it's probably more beneficial to use conventional disk-based swap. That's because disk-based swap fully evicts the page, freeing up all of that memory, whereas with zram-based swap we are still consuming some memory. In effect it's a partial eviction.
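As a small sketch of the zram side of this, assuming the device is zram0 and that the first three fields of its mm_stat file are orig_data_size, compr_data_size and mem_used_total in bytes (as in the kernel's zram documentation). The function name is made up for illustration.

```shell
#!/bin/sh
# Report how much data a zram device holds, how much RAM it costs,
# and the resulting ratio. The optional argument is the mm_stat path.
zram_mm_stat() {
    awk '{ printf "stored_mib=%d used_mib=%d ratio=%.1f\n", $1 / 1048576, $3 / 1048576, ($3 > 0 ? $1 / $3 : 0) }' \
        "${1:-/sys/block/zram0/mm_stat}"
}

# Usage: zram_mm_stat
# Page-fault counters (assuming cgroup v2 is mounted at /sys/fs/cgroup):
#   grep -E '^pg(maj)?fault ' /sys/fs/cgroup/memory.stat
```

A device that stays nearly full while major faults stay low would fit the "seldom-used pages just taking up memory" case described above.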
One thing to watch out for with new installs: it's possible we will see cases where there's average or below-average RAM and heavy Firefox usage. It could then be easier to go below the low-water threshold on zram-based swap, leading to earlyoom issuing SIGTERM to a Firefox tab. This does get logged.
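One way to look for those log entries, sketched under the assumption that the service unit is named earlyoom and that its messages contain the word SIGTERM. The helper name is made up.

```shell
#!/bin/sh
# Count log lines mentioning SIGTERM; reads log text on standard input.
count_sigterm() {
    grep -ci 'sigterm' || true    # grep exits non-zero when the count is 0
}

# Typical use on a systemd machine (current boot only):
#   systemctl is-active earlyoom
#   journalctl -u earlyoom -b --no-pager | count_sigterm
```

A non-zero count for the current boot suggests the killed tabs were earlyoom's doing rather than the kernel OOM killer's.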
I think the first recommendation is to create a custom zram-generator configuration and bump the size of the zram device from 50% ram to 75% ram. While 100% is normally OK, it might be a use case better off with disk based swap. It just depends on the workload.
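A sketch of what such a custom configuration might look like. The key names (zram-fraction, max-zram-size) are the ones I believe the zram-generator shipped around Fedora 33 uses; newer releases use a single zram-size setting instead, so check zram-generator.conf(5) on your system. The helper name is made up, and the file is written to a scratch directory so it can be reviewed before being copied to /etc/systemd/zram-generator.conf.

```shell
#!/bin/sh
# write_zram_conf DIR FRACTION MAX_MIB
# Writes a zram-generator.conf into DIR for review.
write_zram_conf() {
    dir="$1"
    fraction="$2"    # share of RAM, e.g. 0.75
    max_mib="$3"     # upper limit in MiB
    mkdir -p "$dir"
    cat > "$dir/zram-generator.conf" <<EOF
[zram0]
zram-fraction = $fraction
max-zram-size = $max_mib
EOF
}

# 75% of RAM, with the limit raised so a 16G machine can reach 12G.
write_zram_conf "${TMPDIR:-/tmp}/zram-conf-sketch" 0.75 12288
```

The change should take effect after a reboot.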
Most users should be ok with the defaults. It's what I'm using most of the time for over one year now. But I'm always on the lookout for any issues.
-- Chris Murphy
On Wed, 2020-11-11 at 01:04 -0700, Chris Murphy wrote:
On Thu, Nov 5, 2020, 10:03 AM Patrick O'Callaghan pocallaghan@gmail.com wrote:
The swapon(8) man page says:
The swap file implementation in the kernel expects to be able to write to the file directly, without the assistance of the filesystem. This is a problem on files with holes or on copy-on-write files on filesystems like Btrfs.
As I'm getting OOM errors when I try to run a VM, it looks like I need a swap file or partition. I'd prefer to use part of my (large) SSD for this, but currently it's entirely formatted as BTRFS. Do I need to resize the BTRFS partition rather than using a swap file?
There are several options, depending on the workload.
A swapfile on Btrfs performs the same as a swap partition in my testing. The main difference, other than the limitations in 'man 5 btrfs', is that an additional set of steps is needed to make hibernation possible.
The swapfile must not be snapshotted. If it is, COW applies and the swapfile can no longer be activated. There are several ways to avoid this if you want to snapshot root (but not the swapfile): create a /swap subvolume or a /var/swap subvolume. Since btrfs snapshots are not recursive, either of these prevents a snapshot of root from snapshotting the swapfile.
If you aren't doing any snapshotting at all then it doesn't matter, and you can probably make the swapfile almost anywhere.
I've got some work to do to figure out if there's a more elegant way to do it. And of course, encryption and SELinux implications.
The longer story is to automatically create the swap files on demand dynamically in the proper location. And in that case possibly use zswap, instead of swap on zram.
It's a bit esoteric, but there is a way to track sysfs memory.stat for page faults. And the zram driver tracks some statistics that could be helpful in figuring out situations where zram is getting full of seldom-used dirty pages that are just taking up memory. In those kinds of workloads it's probably more beneficial to use conventional disk-based swap. That's because disk-based swap fully evicts the page, freeing up all of that memory, whereas with zram-based swap we are still consuming some memory. In effect it's a partial eviction.
One thing to watch out for with new installs: it's possible we will see cases where there's average or below-average RAM and heavy Firefox usage. It could then be easier to go below the low-water threshold on zram-based swap, leading to earlyoom issuing SIGTERM to a Firefox tab. This does get logged.
I think the first recommendation is to create a custom zram-generator configuration and bump the size of the zram device from 50% ram to 75% ram. While 100% is normally OK, it might be a use case better off with disk based swap. It just depends on the workload.
Most users should be ok with the defaults. It's what I'm using most of the time for over one year now. But I'm always on the lookout for any issues.
Thanks. For the moment I'm not doing hibernation (it doesn't work properly when you have a VM guest with GPU passthrough, basically because the GPU state is not preserved) but I'll keep your notes on file if I ever change my mind.
poc