Hi,
Fedora uses systemd-journald for system logging. By default it is a persistent log kept on /var, and uses up to 4G disk space, although in certain circumstances it can go a bit higher. See 'man journald.conf' for details.
Example:
Sep 27 07:26:05 fovo.local systemd-journald[602]: System Journal (/var/log/journal/$machine_id) is 385.9M, max 4.0G, 3.6G free.
On this example Fedora 37 Workstation system, logging has been happening since August 20, which works out to about 10M/day of journal accumulation, or roughly 1.12 years of journals before garbage collection begins.
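The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the year 2022 is assumed from the date of this thread (the log line itself carries no year), sizes are treated as binary units, and the function name is made up for illustration.

```python
from datetime import date

MIB = 1024 ** 2
GIB = 1024 ** 3


def days_until_cap(journal_bytes, first_entry, today, cap_bytes):
    """Return (bytes logged per day so far, days of history the cap holds at that rate)."""
    elapsed_days = (today - first_entry).days
    rate = journal_bytes / elapsed_days
    return rate, cap_bytes / rate


# Figures from the journald status line: 385.9M used, 4.0G max.
rate, days = days_until_cap(385.9 * MIB, date(2022, 8, 20), date(2022, 9, 27), 4 * GIB)
print(f"{rate / MIB:.1f} MiB/day, roughly {days / 365:.2f} years until the cap is reached")
```

Rounding the rate to an even 10M/day, as the text does, gives 4096M / 10M = 409.6 days, which is the 1.12 years quoted.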
Exactly what will trigger garbage collection depends on the system. There are quite a few knobs for adjusting various aspects of retention and how granular the garbage collection will be. E.g., it's common to see 64M system journal files that contain weeks of entries. It's also possible to limit the journal file size, which makes garbage collection more granular, so that what is retained ends up only a bit more or less than the ideal amount.
Some folks use services with verbose or debug logging. 4G might only be a few months of logs in such a case. Other folks have a small root device, on which even the smaller of 10% or 4G can be quite a lot, and in certain cases that is not a hard limit.
Also note that on Btrfs with compression enabled, the stored amount is quite a bit less. Like all of user space, systemd-journald sees the uncompressed file sizes, so its retention behavior hasn't changed as a result of btrfs compression. What has changed is we're only (physically) storing about 1/3 of whatever the max retention is on a given system.
The obvious bike-shedding questions are: Is 4G too much or too little? If so, what amount should it be? Is size still the correct approach? Or should we consider a max retention time? And if so, what would it be and how granular should it be?
Also, what's the scope? Is a change needed Fedora-wide, in a manner that's upstreamable? That could prove difficult because any change will negatively impact other use cases, not least of which is what the upgrade behavior should be if it'll involve trimming journals. Are the current defaults optimal for most use cases most of the time? There will be a higher burden of persuasion to get a Fedora-wide change, rather than optimizing for just desktops.
But that isn't intended to limit the discussion to just the desktop case. Just to be aware that the broader and grander the change, the more consideration of the consequences there needs to be, i.e. less bike shedding.
More background and discussion are in the upstream and Workstation working group issues. [1]
[1] https://pagure.io/fedora-workstation/issue/213 https://github.com/systemd/systemd/issues/17382
-- Chris Murphy
FWIW (probably not much), I have run into an issue with regard to the default journal size being too large on Fedora Server when running a bunch of systemd-nspawn containers each with sshd and fail2ban enabled. When I reboot a bunch of the containers at once (or the whole hypervisor), fail2ban really seemed to bog things down and use a lot of CPU time (re)scanning the journals for failed ssh attempts to (re)ban the IP addresses. In my case, I worked around the issue with the following. The real problem might be with my fail2ban configuration or something else. But it might be something to consider when thinking about what would be a good size/time limit for the journal.
# cat /etc/systemd/system/fail2ban.service.d/override.conf
[Service]
ExecStartPre=/usr/bin/journalctl --vacuum-time=1months
On Tue, Sep 27, 2022, at 10:38 AM, Gregory Bartholomew wrote:
snip
What about modifying /etc/systemd/journald.conf:
MaxFileSec=1week
MaxRetentionSec=5week
This should result in at least 4 weeks of journal entries, i.e. it would delete a journal file once entries reach 5 weeks old, but since the journal files are rotated weekly, it should mean a given journal file won't have more than a week's worth of entries. So you'd have between 4-5 weeks worth of entries at any given time.
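The 4-5 week window can be checked with a toy model. The sketch below is a simplification and not journald's actual vacuuming code: it assumes one entry per day, a new journal file every 7 days, and that a file is deleted once its first entry is older than 35 days. The function name is hypothetical.

```python
WEEK = 7  # days


def oldest_entry_age_days(day, max_file_days=WEEK, max_retention_days=5 * WEEK):
    """Age in days of the oldest entry still on disk at the end of `day`.

    Simplified model (an assumption): logging starts on day 0, a new file is
    started every `max_file_days`, and a file is removed as soon as its first
    entry is older than `max_retention_days`.
    """
    file_starts = range(0, day + 1, max_file_days)
    kept = [start for start in file_starts if day - start <= max_retention_days]
    return day - kept[0]


# Once the journal has been running for a while, look at the retained span.
spans = [oldest_entry_age_days(day) for day in range(70, 140)]
print(min(spans), max(spans))  # prints: 29 35
```

In this model the retained history always spans more than 4 weeks and at most 5 weeks, because whole weekly files are dropped at once.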
Thanks for the tip. That does look like a better solution and I'll do that for my containers. Although, since I don't want it to hinder future updates of /etc/systemd/journald.conf, I'll put those lines in /etc/systemd/journald.conf.d/override.conf.
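A drop-in like that could be generated with a small script, for example when provisioning many containers. This is a minimal sketch: the helper names are hypothetical, the settings are the two values suggested above, and the `[Journal]` section header is the one journald.conf uses.

```python
from pathlib import Path


def render_dropin(settings):
    """Render journald settings as the text of a [Journal] drop-in file."""
    lines = ["[Journal]"] + [f"{key}={value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


def write_dropin(conf_dir, settings, name="override.conf"):
    """Write the drop-in into conf_dir, creating the directory if needed."""
    conf_dir = Path(conf_dir)
    conf_dir.mkdir(parents=True, exist_ok=True)
    path = conf_dir / name
    path.write_text(render_dropin(settings))
    return path


print(render_dropin({"MaxFileSec": "1week", "MaxRetentionSec": "5week"}), end="")
```

On a real system the target directory would be /etc/systemd/journald.conf.d (written as root), followed by a restart of systemd-journald so the new settings are read.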
On Tue, Sep 27, 2022, at 10:59 AM, Gregory Bartholomew wrote:
snip
I hadn't considered the container case at all, that containers running systemd-journald would have their own journals and retention policy. I wonder if the container default should have volatile journals? Or forward the journals to the host by default? But yes I can see how many containers each thinking they have a 4G cap could quickly become a problem.
On Tue, Sep 27, 2022, at 10:59 AM, Chris Murphy wrote:
snip
To be fair, my situation might be a bit unorthodox. These systemd-nspawn containers were created using `dnf --installroot ...` (the sort of method documented here: http://0pointer.net/blog/systemd-for-administrators-part-xxi.html). And some of them were created by copying the root filesystems of what were formerly qemu virtual machines. I'm not sure how common that sort of thing is for people to do. As long as it is easy for admins to tweak things for their use case, it might not be something that really needs a special default.
Hi,
On September 27, 2022 6:13:48 PM UTC, Chris Murphy lists@colorremedies.com wrote:
snip
Note that the majority of containers are not running journald. Only init-type containers under podman or nspawn containers have their own journal. All others will simply log to the container runtime's log (which can be journald, but needn't be).
On Tue, Sep 27, 2022 at 10:12:57AM -0600, Chris Murphy wrote:
Hi,
Fedora uses systemd-journald for system logging. By default it is a persistent log kept on /var, and uses up to 4G disk space, although in certain circumstances it can go a bit higher. See 'man journald.conf' for details.
snip
The obvious bike-shedding questions are: Is 4G too much or too little? If so, what amount should it be? Is size still the correct approach? Or should we consider a max retention time? And if so, what would it be and how granular should it be?
In the context of modern physical machines, 4G is probably barely noticeable for most people, given common physical disks measure 100's of GBs as a starting point.
Some people run Fedora on pretty old hardware where disk sizes may be more limited.
Virtual machines are probably the place with the biggest disk usage constraints, where 4GB could be pretty impactful when a VM may only have a few 10's of GB of storage purchased.
You mentioned '10%' earlier; is that another existing limit that's already applied, in addition to the 4GB absolute size limit? If so, that'd obviously benefit the small disk scenarios. A relative limit is going to be way oversized for large disk scenarios though.
Both absolute and relative size limits look complementary and necessary.
I wonder if max retention is actually useful at all though, at least for generic out-of-the-box usage.
For systems with a low rate of logging, the size of the journal will grow slowly enough that max retention won't have a notable impact for a long time.
For systems with a high rate of logging, a generic max retention probably won't be aggressive enough to constrain the disk usage quickly enough to stop problems arising.
Max retention time doesn't take into account available disk storage in any way.
While there might be a sweet spot, its effectiveness looks to be somewhat limited, narrow in scope & unlikely to please a broad enough userbase. IOW, a combination of abs+rel size limits looks like a more generally effective OOTB setting to avoid storage overuse.
Max retention time looks most relevant/useful as a mechanism for implementing organizational policies on data record keeping times, and quite site specific.
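The low-rate versus high-rate argument can be put into numbers. The sketch below is illustrative only: the 5-week retention is borrowed from the suggestion earlier in the thread, the 1 GiB/day rate is a made-up value for a noisy system, and the function name is hypothetical.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def steady_state_usage(rate_bytes_per_day, retention_days, size_cap_bytes=None):
    """Disk space the journal settles at under a time-based retention policy.

    With only a retention time, usage is simply rate * retention, regardless of
    disk size. An optional size cap bounds it.
    """
    usage = rate_bytes_per_day * retention_days
    return usage if size_cap_bytes is None else min(usage, size_cap_bytes)


quiet = steady_state_usage(10 * MIB, 35)           # the ~10M/day desktop from the first mail
noisy = steady_state_usage(1 * GIB, 35)            # hypothetical verbose system
noisy_capped = steady_state_usage(1 * GIB, 35, 4 * GIB)

print(quiet // MIB, "MiB;", noisy // GIB, "GiB;", noisy_capped // GIB, "GiB")
```

The quiet system never comes near a 4G cap within 5 weeks, while the noisy one would need 35G under retention alone, so only the size limit actually protects the disk.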
With regards, Daniel
On Tue, Sep 27, 2022, at 12:13 PM, Daniel P. Berrangé wrote:
In the context of modern physical machines, 4G is probably barely noticeable for most people, given common physical disks measure 100's of GBs as a starting point.
Dual-boot is pretty common, and so are 128G NVMe drives in new laptops. So it's "sufficiently not rare" that Fedora is being installed into less than 50G that it needs to be accounted for.
Some people run Fedora on pretty old hardware where disk sizes may be more limited.
Virtual machines are probably the place with the biggest disk usage constraints, where 4GB could be pretty impactful when a VM may only have a few 10's of GB of storage purchased.
Agreed. It is possibly similar to the small-storage, cheap dual-boot bare-metal case.
You mentioned '10%' earlier; is that another existing limit that's already applied, in addition to the 4GB absolute size limit?
Yes. SystemMaxUse= defaults to 10% of the file system size, capped at 4G. I'm not certain this is a hard limit, i.e. I think if journals take up just under 4G and a new journal file can be created, that file is allowed to grow to the max file size, which I think is 128M (I've only ever seen 128M journals, so this is anecdotal evidence, not based on the man page or code). So it could plausibly grow to ~4.1GiB?
If so that'd obviously benefit the small disk scenarios. A relative limit is going to be way oversized for large disk scenarios though.
Both absolute and relative size limits look complementary and necessary.
Currently SystemKeepFree= is 15% of file system size. Once free space goes below that limit, systemd will stop growing its usage, but won't reduce its usage.
I wonder if max retention is actually useful at all though, at least for generic out-of-the-box usage.
For systems with a low rate of logging, the size of the journal will grow slowly enough that max retention won't have a notable impact for a long time.
For systems with a high rate of logging, a generic max retention probably won't be aggressive enough to constrain the disk usage quickly enough to stop problems arising.
Max retention time doesn't take into account available disk storage in any way.
Correct. It allows significant float based on usage, and treats space consumption as relatively less important than the declining value of journal entries over time. This is what rsyslogd does; I think its default retention is two weeks (?), and it is configurable. If you think most users most of the time have no need or expectation of needing journal entries beyond X weeks, then you'd presumably be a proponent of a relatively more dominant retention time policy (while still allowing for the max use limit).
While there might be a sweet spot, its effectiveness looks to be somewhat limited, narrow in scope & unlikely to please a broad enough userbase. IOW, a combination of abs+rel size limits looks like a more generally effective OOTB setting to avoid storage overuse.
Max retention time looks most relevant/useful as a mechanism for implementing organizational policies on data record keeping times, and quite site specific.
True, but it was also pretty common in the era before systemd-journald, when retention really was predominantly time-based.
On Tue, 27 Sep 2022 12:31:11 -0600 "Chris Murphy" lists@colorremedies.com wrote:
[..]
Dual-boot is pretty common, and so are 128G NVMe drives in new laptops. So it's "sufficiently not rare" that Fedora is being installed into less than 50G that it needs to be accounted for.
The original PinePhone only comes with a 16GB eMMC. Using 4GB for journal on that would for sure be insane.
Allan.
On Tue, Sep 27, 2022, at 7:14 PM, Allan via devel wrote:
The original PinePhone only comes with a 16GB eMMC. Using 4GB for journal on that would for sure be insane.
The root file system for this device might be around 15G, so the max journal size is 1.5G. But journald also stops growing its usage once free space on the file system drops below 15%. The 4G cap is the high-end cap, which applies when the file system is larger than 40G.
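Those defaults can be written down as a small calculation. This is a sketch of the rules as stated in this thread (10% of the file system capped at 4G for SystemMaxUse=, 15% for SystemKeepFree=), not journald's real accounting; the function names and the 12G "other data" figure are hypothetical.

```python
GIB = 1024 ** 3


def default_max_use(fs_bytes):
    """SystemMaxUse= default as described here: 10% of the file system, capped at 4G."""
    return min(fs_bytes // 10, 4 * GIB)


def default_keep_free(fs_bytes):
    """SystemKeepFree= default as described here: 15% of the file system."""
    return fs_bytes * 15 // 100


def journal_budget(fs_bytes, other_used_bytes):
    """Space the journal may occupy: bounded by max-use and by keeping 15% free.

    Simplified model (an assumption): everything that is not journal data is
    counted in other_used_bytes.
    """
    room = fs_bytes - other_used_bytes - default_keep_free(fs_bytes)
    return max(0, min(default_max_use(fs_bytes), room))


print(default_max_use(15 * GIB) / GIB)            # 15G file system -> 1.5
print(default_max_use(40 * GIB) / GIB)            # 40G file system -> 4.0
print(journal_budget(15 * GIB, 12 * GIB) / GIB)   # 15G with 12G of other data -> 0.75
```

The last line shows the keep-free rule biting before the 1.5G limit does: on a nearly full 15G device the journal would stop growing at about 0.75G.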
If disk space were unlimited, I'd love to keep the journal forever. Since I don't have unlimited storage, I prefer to be space limited rather than time limited.
IOW the journal entries are typically useless even before they are recorded, but when there is some troubleshooting required, the more historical entries are available, the better.
Vít
On 27. 09. 22 at 18:12, Chris Murphy wrote:
snip
-- Chris Murphy
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Heads up that I'm trying to get https://github.com/systemd/systemd/pull/22998 in before the next systemd release, which should reduce the journal size by roughly 50% in a way that will be taken into account by journald's retention logic (unlike the Btrfs compression).
Also, as soon as there's a kernel API to query compressed file size I'll update journald's retention logic to use that so we can take the actual file size into account when making retention decisions.
Cheers,
Daan De Meyer
From: Chris Murphy lists@colorremedies.com
Sent: 27 September 2022 17:12
To: fedora devel
Subject: limiting the (systemd) journal size
snip
On Tue, 27.09.22 10:12, Chris Murphy (lists@colorremedies.com) wrote:
The obvious bike-shedding questions are: Is 4G too much or too little? If so, what amount should it be? Is size still the correct approach? Or should we consider a max retention time? And if so, what would it be and how granular should it be?
Also, what's the scope? Is a change needed Fedora-wide, in a manner that's upstreamable? That could prove difficult because any change will negatively impact other use cases, not least of which is what the upgrade behavior should be if it'll involve trimming journals. Are the current defaults optimal for most use cases most of the time? There will be a higher burden of persuasion to get a Fedora-wide change, rather than optimizing for just desktops.
But that isn't intended to limit the discussion to just the desktop case. Just to be aware that the broader and grander the change, the more consideration of the consequences there needs to be, i.e. less bike shedding.
More background and discussion are in the upstream and Workstation working group issues. [1]
BTW, if you can make a good case for this, consider submitting this change upstream instead of keeping it specific to Fedora. The values we default to are just some values I came up with 10 years or so ago that made rough sense to me, but the idea was always to tweak them as we learn how things behave in real life. So, if there's a good case to be made for tweaking them in some way, we can certainly consider making that change upstream.
(I.e., if you compile a good list of pros/cons and have some specific values to propose, please submit a GitHub issue upstream about this, and we can look into it.)
Lennart
-- Lennart Poettering, Berlin