On Fri, May 27, 2022 at 06:05:27PM +0200, Zdenek Kabelac wrote:
On 27. 05. 22 at 17:39, Vivek Goyal wrote:
On Fri, May 27, 2022 at 04:59:38PM +0200, Zdenek Kabelac wrote:
On 27. 05. 22 at 16:50, Vivek Goyal wrote:
On Fri, May 27, 2022 at 04:42:25PM +0200, Zdenek Kabelac wrote:
On 27. 05. 22 at 14:20, Vivek Goyal wrote:
On Fri, May 27, 2022 at 02:45:14PM +0800, Tao Liu wrote:
> If lvm2 thinp is enabled in kdump, lvm2-monitor.service is needed for
> monitor and autoextend the size of thin pool. Otherwise the vmcore
> dumped to a no-enough-space target will be incomplete and unable for
> further analysis.
>
> In this patch, lvm2-monitor.service will be started before kdump-capture
> .service for 2nd kernel, then be stopped in kdump post.d phase. So
> the thin pool monitoring and size-autoextend can be ensured during kdump.
>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
>  dracut-lvm2-monitor.service | 15 +++++++++++++++
>  dracut-module-setup.sh      | 16 ++++++++++++++++
>  kexec-tools.spec            |  2 ++
>  3 files changed, 33 insertions(+)
>  create mode 100644 dracut-lvm2-monitor.service
>
> diff --git a/dracut-lvm2-monitor.service b/dracut-lvm2-monitor.service

This seems to be a copy of /lib/systemd/system/lvm2-monitor.service. I wonder if we can directly include that file in the initramfs when generating the image. But I am fuzzy on the details of the dracut implementation; it has been too long since I played with it. So Bao and the kdump team will be best placed to comment on this.
This is quite interesting - monitoring should in fact never be started within the 'ramdisk', so I'm actually wondering what this service file is doing there.
The design was to start 'monitoring' of devices just after the switch to 'rootfs', since running 'dmeventd' out of the ramdisk does not make any sense at all.
Hi Zdenek,
In the case of kdump, we save the core dump from the initramfs context and reboot back into the primary kernel. That is why dm monitoring (and thin pool auto extension) needs to work from inside the initramfs context.
So IMHO this does not look like the best approach, though. AFAIK the lvm.conf within the ramdisk is also a modified version.
It looks like there should be a better alternative: 'after' activation, check that there is 'enough' room in the thin-pool for use with the thinLV. That should be 'computable', and in case the size is not good enough, try to extend the thin-pool prior to use/mount of the thinLV (the space used in the thin-pool, %DATA & %METADATA, and the %DATA occupancy of the thinLV can be obtained with the 'lvs' tool).
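A rough sketch of that check, assuming the pool is queried with something like `lvs --noheadings --units b --nosuffix -o lv_size,data_percent,metadata_percent vg/pool`. The function names, the sample line, and the 95% metadata threshold are illustrative assumptions, not part of lvm2 or kexec-tools:

```python
def pool_free_bytes(lvs_line):
    """Parse one whitespace-separated 'lv_size data_percent metadata_percent'
    line (size in bytes) and return (free_data_bytes, metadata_percent)."""
    size_s, data_s, meta_s = lvs_line.split()
    size = int(size_s)
    data_pct = float(data_s)
    meta_pct = float(meta_s)
    # Free data space is the part of the pool not yet allocated to thin LVs.
    free = int(size * (100.0 - data_pct) / 100.0)
    return free, meta_pct


def needs_extend(lvs_line, required_bytes, max_meta_pct=95.0):
    """True if the pool should be extended before the thinLV is used.

    max_meta_pct is an assumed safety threshold for metadata usage.
    """
    free, meta_pct = pool_free_bytes(lvs_line)
    return free < required_bytes or meta_pct > max_meta_pct


# Hypothetical lvs output for a 10 GiB pool, 25% data and 10.5% metadata used.
sample = "  10737418240 25.00 10.50"
print(pool_free_bytes(sample))
print(needs_extend(sample, 9 * 1024**3))
```

The catch raised in the reply below still applies: `required_bytes` is only known if the vmcore size can be estimated up front.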
One potential problem here is that we don't know the size of the vmcore in advance. It gets filtered and saved, and we don't know in advance how many kernel pages there will be.
Is that still right, Bao?
Technically speaking, one could first run makedumpfile just to determine what the size of the vmcore will be, and then actually save the vmcore in a second round. But that will double the filtering time.
You could likely 'stream/buffer' this kdump data in the form of e.g. '4MiB ~ 128MiB' chunks (or any other suitable size which will be 'quick enough'), and before each new write of such a chunk just check with lvs that there is enough free space in the thin-pool. That should still be better than running 'dmeventd' in the background,
and it also gives you the best control over the deadlock in case you run completely out of space (i.e. leaving enough room in the thin-pool and avoiding a full dump, so the user could still 'boot').
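A minimal sketch of such a chunked writer, assuming the dump data arrives on a readable stream (for example a pipe from the collector). `copy_in_chunks`, the `free_bytes` callback, and the `reserve` parameter are hypothetical names for illustration; in practice the callback would wrap an lvs query and could try to extend the pool before giving up:

```python
import io


def copy_in_chunks(src, dst, free_bytes, chunk_size=4 * 1024 * 1024, reserve=0):
    """Copy src to dst chunk by chunk, checking free space before each write.

    free_bytes: callable returning the bytes currently free in the pool.
    reserve:    bytes to always leave unused, so the pool never fills up.
    Returns (bytes_written, complete).
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written, True
        # Stop before the write that would eat into the reserved space.
        if free_bytes() - len(chunk) < reserve:
            return written, False
        dst.write(chunk)
        written += len(chunk)


# Toy run with in-memory streams standing in for the vmcore and the target.
src = io.BytesIO(b"x" * 10)
dst = io.BytesIO()
print(copy_in_chunks(src, dst, lambda: 100 - dst.tell(), chunk_size=4))
```

When the copy stops early, the caller knows exactly how much was written and can decide whether to extend the pool and resume or to leave a truncated dump.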
So if we fill up the thin pool completely, it might fail to activate after reboot? I do remember there were issues w.r.t. filling up the thin pool completely, and it was not desired.
So the above does not involve growing the thin pool at all? It just says: query the currently available space in the thin pool and, when it is about to be full, stop writing to it? This is suboptimal if there is free space in the underlying volume group.
Ok, this is going to be ugly given how kdump works right now. We have this config option core_collector where the user can specify how the vmcore should be saved (dd, cp, makedumpfile, .....)
None of these tools know about streaming and thin pool extension etc.
I guess one could think of making makedumpfile aware of the thin pool. But given there can be so many dump targets, it will be really ugly from a design point of view: embedding knowledge of a target in a generic filtering tool.
Alternatively, we could probably write a tool of our own and pipe makedumpfile output to it. But then the user will have to specify it in core_collector for thin pool targets only.
None of the solutions look clean or fit well into the current design.
Thanks
Vivek
Since you will be the only user of the thinLV in the initramfs, this should be reasonably straightforward to achieve.
Regards
Zdenek