Hi,
On 5/2/22 22:53, Chris Murphy wrote:
> On Mon, May 2, 2022 at 5:29 PM Jeremy Linton <jeremy.linton@arm.com> wrote:
>>
>> On 4/6/22 12:57, Neal Gompa wrote:
>> (trimming)
>>> * NVIDIA graphics
>>> * Broadcom wireless
>>>
>>> The former case is excessively common, and the latter case is fairly
>>> common with HP and Dell machines as well as some smaller OEMs. I
>>> literally helped someone this past week with both[1][2][3]. The
>>> Workstation WG has been tracking both issues for years now[4][5]. This
>>> situation is *worse* now because we have Fedora Linux preloaded on
>>> computers, and OEMs basically have to disable Secure Boot to make
>>> things "work". How's that for improving security?
>>
>> I too have been a bit surprised at some of the difficulties of
>> hibernate/secure boot on recent fedora releases. It seems people are
>> entirely unaware that ACPI/S3 standby is gone from most consumer
>> laptops, and the modern standby replacement implementations tend to work
>> very poorly WRT conserving battery with the lid closed in Linux.
>
> It's a kernel problem. I'm not sure to what degree upstream is aware
> of it. But there's not a lot we can do about it except file bugs and
> ask for improvement.
>
>
>>
>> So, on a recent fedora machine, it took me more than 4 hours to get a
>> hibernation file on btrfs plus LUKS encrypted partition working. The
>> documentation for that wasn't to be found anywhere on the fedora/RH
>> sites and required compiling a tool to do the block offset calculations
>> and manually adding the resume_offset options to grub/etc. All while
>> avoiding the mass of incorrect information found on the internet. And of
>> course it also requires disabling swap on zram (which was nonsense on
>> the machine anyway, given the disks are faster than it can
>> compress/decompress pages).
>
> I don't think it requires disabling swap on zram per se - from what
> I've been told the hibernation code knows it can't use it for the
> hibernation image, not least of which is it's not big enough for a
> contiguous write of the image. The issue might be that so much needs
> to be swapped out, to free ~50% RAM, which is used to create the
> hibernation image in memory before it's written out. We need a clear
> reproducer with logs and get it posted to the Linux memory management
> mailing list to see what's going wrong. Since zram is threaded, it's
> pretty unlikely drive writes are faster than memory writes with
> compression. LZO+RLE is computationally pretty cheap.
DMA is computationally free, and runs at >3GB/sec on modern hw given
sufficient queue depth or contiguous I/O. And the RLE improvements only
help when the page is basically empty. AKA it's a great chrome benchmark
tool, less so for real workloads.
Quite a lot of folks in the Fedora community do not have NVMe drives. If
you have a real workload that's straightforward to reproduce and that
performs better with swap on a plain NVMe partition than on zram, I'd
like to give it a go. There are always tradeoffs; the goal is to do a
good job for most use cases, not optimize for a few.
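
For anyone putting together such a reproducer, it helps to record which
swap areas were actually active during the run. A minimal sketch,
assuming the usual five-column layout of /proc/swaps (Filename, Type,
Size, Used, Priority, sizes in KiB); the names SwapArea,
parse_proc_swaps, read_swaps and uses_zram are only for illustration:

```python
from typing import List, NamedTuple


class SwapArea(NamedTuple):
    name: str       # device or file path as printed by the kernel
    kind: str       # "partition" or "file"
    size_kib: int
    used_kib: int
    priority: int


def parse_proc_swaps(text: str) -> List[SwapArea]:
    """Parse the contents of /proc/swaps, skipping the header line."""
    areas = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) != 5:
            continue  # ignore blank or unexpected lines
        name, kind, size, used, prio = fields
        areas.append(SwapArea(name, kind, int(size), int(used), int(prio)))
    return areas


def read_swaps(path: str = "/proc/swaps") -> List[SwapArea]:
    """Read and parse the live swap table (Linux only)."""
    with open(path) as f:
        return parse_proc_swaps(f.read())


def uses_zram(areas: List[SwapArea]) -> bool:
    """True if any active swap area is a zram device."""
    return any(a.name.startswith("/dev/zram") for a in areas)
```

Attaching the output of read_swaps() before and after the workload to a
report would at least show whether zram, the disk, or both were in play.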
>
>
>>
>> And of course the lockdown patches in the kernel still aren't smart
>> enough to be able to detect that the swapfile is actually encrypted, so
>> it also requires disabling secure boot (this IMHO is frankly
>> unacceptable, that one can't have both options enabled at the same time).
>
> Encryption isn't enough to ensure the image is valid. It needs to be
> signed. But in any case this is also upstream effort required,
> including discovering the offset via a standard API for all file
> systems.
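
For reference, the offset arithmetic itself is small once the physical
byte offset of the swapfile's first extent is known. A rough sketch,
assuming that offset has already been obtained from a tool that
understands btrfs's logical-to-physical mapping; the function names, the
device path and the numbers are made up for illustration:

```python
import os


def resume_offset(physical_byte_offset: int, page_size: int = 0) -> int:
    """Convert a physical byte offset into the page-unit value
    expected by the resume_offset= kernel parameter."""
    if page_size <= 0:
        # Fall back to the running system's page size.
        page_size = os.sysconf("SC_PAGE_SIZE")
    if physical_byte_offset % page_size:
        raise ValueError("offset is not page aligned")
    return physical_byte_offset // page_size


def resume_args(device: str, offset_pages: int) -> str:
    """Format the two kernel command-line options together."""
    return f"resume={device} resume_offset={offset_pages}"


# Hypothetical values: first extent at byte 4096000, 4 KiB pages.
example = resume_args("/dev/mapper/example-root", resume_offset(4096000, 4096))
```

The point of the sketch is only the unit conversion: the kernel wants
pages, while mapping tools typically report bytes.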
Yes, I'm aware much of this is a kernel problem, but the point is
that in the meantime, most random machines people are buying at retail
won't last more than a day or so without being plugged in using Fedora,
vs many days with Windows, because it's hibernating with secure boot
turned on.
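
To see what a given machine is actually doing, the kernel exposes both
the lockdown mode and the suspend variant as a list of choices with the
active one in square brackets (for example "none [integrity]
confidentiality" in /sys/kernel/security/lockdown, or "s2idle [deep]" in
/sys/power/mem_sleep). A small sketch for reading those; the helper
names are mine, and the files may be absent on some kernels:

```python
import re
from typing import Optional


def active_choice(text: str) -> Optional[str]:
    """Return the bracketed entry from a sysfs-style choice list."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_choice(path: str) -> Optional[str]:
    """Read a choice file; None if it is missing or unreadable."""
    try:
        with open(path) as f:
            return active_choice(f.read())
    except OSError:
        return None


def report() -> dict:
    """Collect the two settings discussed in this thread."""
    return {
        "lockdown": read_choice("/sys/kernel/security/lockdown"),
        "mem_sleep": read_choice("/sys/power/mem_sleep"),
    }
```

A lockdown value other than "none" together with a mem_sleep value of
"s2idle" would match the situation described above.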
On a recently purchased Lenovo, s2idle is what's used when the sleep
option in the firmware setup is set to Windows 10. I see about 1%
battery drop per hour. When switching it to Linux sleep, it uses ACPI
S3, and I see maybe 1% battery drop per 8 hours. I don't know why. But
until there are authenticated+encrypted hibernation images, I think
there's not much to be done out of the box. Even once we have such
hibernation images, we can't depend on hibernation working for various
reasons:
It's not good that hibernation doesn't work well right now. But it's possible to give users an even worse experience, including a false sense of security regarding their data when hibernation is used.
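
To put the two drain rates quoted above side by side, here is a
back-of-the-envelope sketch. It assumes a full battery and a constant,
linear drain while suspended, which real hardware only approximates;
the function name is just for illustration:

```python
def hours_to_empty(drain_percent: float, per_hours: float,
                   start_percent: float = 100.0) -> float:
    """Hours until the battery is empty at a constant drain rate."""
    if drain_percent <= 0 or per_hours <= 0:
        raise ValueError("drain rate must be positive")
    percent_per_hour = drain_percent / per_hours
    return start_percent / percent_per_hour


s2idle_hours = hours_to_empty(1, 1)   # about 1% per hour
s3_hours = hours_to_empty(1, 8)       # about 1% per 8 hours

print(f"s2idle: {s2idle_hours:.0f} h (~{s2idle_hours / 24:.1f} days)")
print(f"S3:     {s3_hours:.0f} h (~{s3_hours / 24:.1f} days)")
```

Under those assumptions the same battery lasts roughly eight times
longer in S3 than in s2idle on this particular machine.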