Hello,
I wanted to share this with you. If you can point me to the right section of the wiki for something like this, I will add it.
After running my system for a while, it kept slowing down, and at some point it got so bad that a simple copy operation made the mouse pointer unmovable. Note that this is an i7 with 8 GB of RAM, an SSD, etc.
Switching to the BFQ scheduler, tuning BFQ parameters, disabling swap, compiling the kernel with the MuQSS scheduler: none of it helped. It is not a hardware issue!
It seems that the systemd fstrim unit is disabled by default, and that by default nothing mounts the file systems with the discard option.
Running fstrim / && fstrim /home && systemctl enable fstrim.service improved performance roughly a dozenfold.
Regards, Damian
On Thu, Dec 12, 2019 at 2:07 PM Damian Ivanov damianatorrpm@gmail.com wrote:
> It seems that the systemd fstrim unit is disabled by default, and that by default nothing mounts the file systems with the discard option.
> Running fstrim / && fstrim /home && systemctl enable fstrim.service improved performance roughly a dozenfold.
This is a balancing act, because there are devices that perform better with an occasional fstrim, and other devices that misbehave. Most devices don't benefit from it, but then an even larger pool of devices isn't hurt by it either.
For the near term, enough devices lack support for queued trim that using the discard mount option by default is probably not a good idea. A drive without queued trim support basically stalls under workloads with a lot of file system changes. Newer drives support queued trim.
And still other drives have firmware bugs where trim results in data corruption or loss. Which is the better default: exposing a significant minority of users to slowing devices, or exposing a small group to data loss? These days, though, most of those devices have been identified and have either received firmware fixes, been removed from production use, or been blacklisted in the kernel for trim (i.e. trim is not passed down to those devices).
It's reasonable to consider enabling fstrim.timer by default, so that fstrim.service runs and trim is issued once per week. But I think it should be proposed as a system wide change so that it gets the necessary visibility. Ubuntu has enabled it by default for a few releases now; I don't know for sure whether openSUSE enables it by default.
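As a rough way to see whether a given drive advertises trim/discard support at all, the kernel exposes per-device limits in sysfs. The sketch below is an illustration, not something from this thread: the function name discard_supported and the SYSFS_ROOT override are hypothetical, and it assumes only the standard queue/discard_max_bytes attribute, where 0 means the device does not support discard. It says nothing about queued trim or firmware bugs.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): report whether a block device
# advertises discard (trim) support, based on its sysfs queue limits.
# SYSFS_ROOT is an illustrative override so the function can be tried
# against a fake tree; it defaults to the real /sys.
discard_supported() {
    dev=$1
    attr="${SYSFS_ROOT:-/sys}/block/$dev/queue/discard_max_bytes"
    # Missing attribute: unknown device, treat as unsupported.
    [ -r "$attr" ] || return 1
    max_bytes=$(cat "$attr")
    # A value of 0 means the device does not accept discard requests.
    [ "$max_bytes" -gt 0 ] 2>/dev/null
}

# Example (the device name sda is an assumption):
#   discard_supported sda && echo "sda advertises discard support"
```

lsblk --discard shows the same limits in its DISC-GRAN and DISC-MAX columns.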
Hello Chris,
Thanks for the response.
What counts as data loss or data corruption has to be carefully defined (whole partition, file based, etc.). As an indirect result of fstrim not being enabled, I have experienced data loss. Performance was so degraded that I tried setting commit=120 on all partitions. A simple copy operation could still freeze the system completely or end in a kernel oops (I have only seen the oops with the MuQSS scheduler, which I tried once), and after pulling the plug on a frozen system, up to 2 minutes of changes may be lost.
(Yes, copying files was so slow that it resulted in a kernel oops or a permanently frozen system. At most I left the computer for hours, and it was still intensively doing something: the temperature kept rising and the system stayed under heavy load, to say nothing of hardware wear. This is still a simple, short copy operation, which now takes less than five minutes. The last messages in journalctl for that boot said that libinput couldn't process events from the Razer mouse.)
So, as an indirect result of fstrim not being enabled, I have experienced data loss.
There can also be a distinction between the defaults in RHEL and Fedora, though I don't think enterprise SSDs or VMs are affected by fstrim causing data issues.
Regards, Damian
On Thu, Dec 12, 2019 at 11:30 PM Chris Murphy lists@colorremedies.com wrote:
-- Chris Murphy
_______________________________________________
desktop mailing list -- desktop@lists.fedoraproject.org
To unsubscribe send an email to desktop-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/desktop@lists.fedoraproject.or...
On Thu, Dec 12, 2019 at 11:05 PM Damian Ivanov damianatorrpm@gmail.com wrote:
> So, as an indirect result of fstrim not being enabled, I have experienced data loss.
> There can also be a distinction between the defaults in RHEL and Fedora, though I don't think enterprise SSDs or VMs are affected by fstrim causing data issues.
I consider all of these examples to be buggy device behavior. TRIM is supposed to be an optimization, not a requirement, and the optimization shouldn't cause any corruption or data loss. The manufacturer should offer a firmware update to fix all of these problems, including yours. What you're describing in your case is that TRIM is effectively mandatory, because without it the experience is so bad that the system is essentially unusable.
The problem with some devices hanging on TRIM does kinda bug me. That's in something of a gray area: whether it's a bug, bad design, or just a consequence of the technology at the time. I'd say the discard mount option should not be considered, but fstrim.timer (once per week) could be. That would mean that instead of frequent small hangs, there'd be a bigger one once a week. I have no way of assessing how noticeable that would be; it would be make/model and workload specific.
It might be worth doing a system wide change proposal to enable fstrim.timer by default so it has the proper visibility. A few releases ago there was a feature to enable discard pass down with LUKS encrypted volumes. It was approved and is in place, but it's not taken advantage of because neither fstrim.timer nor the discard mount option is set by default.
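One common way discard pass down is configured for LUKS volumes is the discard option in the fourth field of /etc/crypttab. As a sketch only (the helper name crypttab_has_discard, and the assumption that a given system uses crypttab for this, are illustrative and not taken from the thread), the following checks whether a named mapping carries that option:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the crypttab entry NAME lists "discard"
# among its comma-separated options (4th field).
# Usage: crypttab_has_discard NAME [FILE]   (FILE defaults to /etc/crypttab)
crypttab_has_discard() {
    name=$1
    file=${2:-/etc/crypttab}
    awk -v name="$name" '
        $1 ~ /^#/ { next }                  # skip comment lines
        $1 == name {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "discard") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$file"
}

# Example (the mapping name luks-root is an assumption):
#   crypttab_has_discard luks-root && echo "discard is passed down"
```

Even with pass down in place, nothing is trimmed until fstrim runs or the file system is mounted with discard, which is the gap described above.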
OK I went ahead and did this. Thanks for the suggestion.
https://fedoraproject.org/wiki/Changes/EnableFSTrimTimer
-- Chris Murphy
Thank you. Is this something that could and/or is supposed to be done by "community members"?
On Thu, Dec 19, 2019 at 7:54 PM Chris Murphy lists@colorremedies.com wrote:
> OK I went ahead and did this. Thanks for the suggestion.
> https://fedoraproject.org/wiki/Changes/EnableFSTrimTimer
On Fri, Dec 20, 2019 at 1:09 PM Damian Ivanov damianatorrpm@gmail.com wrote:
> Thank you. Is this something that could and/or is supposed to be done by "community members"?
I'm not sure I completely understand the question.
The fstrim.timer unit has been provided by util-linux for a while now, so you can just enable it on Fedora 30/31:
sudo systemctl enable --now fstrim.timer
It'll then run fstrim.service once per week on all supporting file systems listed in /etc/fstab.
The proposal is still being discussed on the devel@ list: whether it will happen in new F32 installations, and whether it would apply to upgrades from F30/F31. So there's some uncertainty.
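To check what state the timer is in before or after enabling it, something like the following could be used. This is a sketch, not part of the original message: the function name fstrim_timer_state is made up for illustration, and it relies only on systemctl is-enabled and systemctl is-active.

```shell
#!/bin/sh
# Hypothetical helper: print a one-line summary of fstrim.timer.
# "enabled" means it is started at boot; "active" means it is currently
# loaded and waiting for its next run.
fstrim_timer_state() {
    if systemctl is-enabled --quiet fstrim.timer; then
        enabled="enabled"
    else
        enabled="not enabled"
    fi
    if systemctl is-active --quiet fstrim.timer; then
        active="active"
    else
        active="inactive"
    fi
    echo "fstrim.timer: $enabled, $active"
}

# Example:
#   fstrim_timer_state
#   systemctl list-timers fstrim.timer   # shows the next scheduled run
```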