On Fri, Mar 28, 2014 at 11:19:13AM +0100, HATAYAMA Daisuke wrote:
[..]
> Multiple vmcores are not such a big problem if they are explained
> well enough and users understand that. The current design discussed
> here assumes users specify the --split option explicitly in
> kdump.conf, so they know multiple vmcores are generated. They never
> get surprised.
Managing multiple files is comparatively harder, and that will be
justified only if there is a huge benefit to creating multiple files
instead of one.
If you are trying to save files across different adaptors, it means
bringing up additional hardware in the second kernel. If the adaptors
are of different types, then different drivers are used, and that
contributes to increased unreliability of the dump operation.
We don't have any support for specifying that multiple storage devices
be brought up. So you must be carrying your own patches to make sure
the different devices can be brought up. All the code has been written
with the assumption that there is a single device to dump to and a
single "path" within that device. The kdump.conf syntax and backend
implementation will get really complex if we try to support split files.
So there has to be really huge benefit to justify the need of supporting
split file mode.
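For reference, this is roughly what the single-path configuration and
the split invocation look like today (the makedumpfile flags are real
options, but the paths and the exact kdump.conf wiring below are only
illustrative):

```
# kdump.conf today: one device, one path, one dumpfile
core_collector makedumpfile -l --message-level 1 -d 31

# Direct --split invocation (what would run in the second kernel);
# the output file names here are only an example:
#   makedumpfile --split -l -d 31 /proc/vmcore /var/crash/dump1 /var/crash/dump2
```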
[..]
>> The problem is that we now don't have a way to specify the degree of
>> parallelism in core_collector, since we specify it in --split as the
>> number of vmcore arguments.
In the new mode where the destination is a single file, this new option
can be used.
In fact, even with --split one should be able to use this option. For
example, if you are bringing up 6 cpus but writing 2 split files, then
you could use 4 threads for filtering and compression while 2 threads
do the writing.
So --split really specifies I/O-level parallelism and only provides a
weak hint about cpu-level parallelism.
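To make that split concrete, here is a toy sketch (not makedumpfile
code, just an illustration of the "4 compressors, 2 writers" idea
above): worker threads do the CPU-heavy compression while each writer
thread owns one output stream, standing in for one split dumpfile, so
the number of split files bounds write parallelism independently of
compression parallelism.

```python
import queue
import threading
import zlib

def parallel_split_compress(chunks, n_compress=4, n_writers=2):
    """Toy model of --split-style parallelism: n_compress threads do the
    CPU work (zlib here), n_writers threads each own one output stream.
    Returns n_writers lists of (chunk_index, compressed_bytes)."""
    work = queue.Queue()
    out_queues = [queue.Queue() for _ in range(n_writers)]
    outputs = [[] for _ in range(n_writers)]

    def compressor():
        while True:
            item = work.get()
            if item is None:          # sentinel: no more chunks
                break
            idx, data = item
            # Assign each chunk to a writer round robin, standing in for
            # the fixed address ranges --split carves out per dumpfile.
            out_queues[idx % n_writers].put((idx, zlib.compress(data)))

    def writer(w):
        while True:
            item = out_queues[w].get()
            if item is None:
                break
            outputs[w].append(item)
        outputs[w].sort()             # keep blocks ordered within "file" w

    cthreads = [threading.Thread(target=compressor) for _ in range(n_compress)]
    wthreads = [threading.Thread(target=writer, args=(w,)) for w in range(n_writers)]
    for t in cthreads + wthreads:
        t.start()
    for idx, data in enumerate(chunks):
        work.put((idx, data))
    for _ in cthreads:                # one sentinel per compressor
        work.put(None)
    for t in cthreads:
        t.join()
    for q in out_queues:              # compressors done; stop writers
        q.put(None)
    for t in wthreads:
        t.join()
    return outputs
```

The point of the sketch is only that the compressor count and the
writer count are independent knobs, which is the distinction between
cpu-level and I/O-level parallelism being made here.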
>
> We can do two things.
>
> - We can check of number of cpus available in second kernel in
> makedumpfile and makedumpfile can fork off threads accordingly.
>
> - Or we can create a new command line argument which specifies how
> many threads to fork off for compression. A user who will be modifying
> nr_cpus can also modify this command line parameter.
>
> I think we can in fact have both. First will be the default behavior which
> can be overridden with an command line option.
>
> It seems you assume a new feature...
Yep, I am thinking of a new feature where the vmcore is saved to a
single file but filtering and compression happen in parallel, depending
on the number of cpus available.
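A sketch of that single-file mode (my reading of it, not an actual
design): worker threads compress chunks independently, and a single
writer emits them to one stream strictly in order, so only the
CPU-heavy part is parallel while the I/O stays serialized on one file.

```python
import concurrent.futures
import io
import zlib

def compress_to_single_stream(chunks, n_threads=4):
    """Compress chunks in parallel but write one ordered stream. Each
    block is an independent zlib stream with a length prefix, so it can
    later be decompressed block by block (loosely analogous to how the
    kdump-compressed format compresses each page on its own)."""
    out = io.BytesIO()
    with concurrent.futures.ThreadPoolExecutor(n_threads) as pool:
        # map() yields results in input order, so the single writer sees
        # blocks in chunk order even though compression finishes out of
        # order across threads (zlib releases the GIL while compressing).
        for blob in pool.map(zlib.compress, chunks):
            out.write(len(blob).to_bytes(4, "little"))  # simple framing
            out.write(blob)
    return out.getvalue()

def decompress_stream(data):
    """Inverse of the framing above, for checking the round trip."""
    view, chunks = memoryview(data), []
    while view:
        n = int.from_bytes(view[:4], "little")
        chunks.append(zlib.decompress(view[4:4 + n]))
        view = view[4 + n:]
    return chunks
```

Whether this gives a real speedup depends on how expensive filtering
and compression are relative to the single write path, which is exactly
what a benchmark would have to show.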
[..]
>> Also, splitting the dump into multiple vmcores has another merit:
>> it's possible to parallelize even the I/O across multiple disks. This
>> is necessary when we strongly need a full dump.
I am curious: in what cases do you need a full dump? Do you enable it
by default for your customers? If not, when do you recommend they
capture a full dump instead of a filtered one?
Capturing a full dump on large multi-terabyte machines is not
practical. It takes a very long time, and after saving the dump,
sending those terabyte-sized files to support is a big headache.
>
> I can understand need of --split in some cases. But that will be useful
> only in select corner cases.
>
> The use case might be a mere corner case for you, but it's important
> for us. For example, we sometimes want to use it to debug complicated
> bugs that relate to a wide range of kernel components, such as a bug
> in the flow of I/O between qemu/KVM guests and hosts (where we cannot
> filter out user-space memory). In such a case there's merit, as a
> last resort, even if it needs a lot of disk space and time.
Ok, I get it. So nothing has crashed, but if the system is not
performing well, you will ask the customer to dump full memory and send
it for analysis so that you can traverse the whole stack.
I am assuming that you are doing this to analyze performance issues?
Otherwise, if it is a guest crash, then just the guest dump should be
sufficient and one does not have to take a full dump of the host.
So do you enable it by default, or do you recommend it to specific
customers based on their need?
> If we enable writing to single file with multiple threads doing filtering
> and compression, this is going to be more useful, I think,
>
I of course understand it's useful, but we don't have it now.
BTW, this is another topic, but if possible I want to extend
kexec-tools to treat multiple disks and support --split option.
That's a lot of work. Everywhere in the code it is assumed that there
is a single device and a single path. I really don't want to support
all that complication until we have proven that it is a huge win for
most people.
I think supporting the intermediate mode of saving to a single file
while harnessing cpu power for filtering and compression will be much
easier (if it gives us a reasonable speedup).
[..]
> I investigated a little more and found that I have to look further
> into how to manage the buffer for compression and, in detail, how to
> divide the processing among threads. The design might not scale well
> due to lock contention on the buffer.
> It seems to me that writing multiple vmcores is not only good for
> ease of implementation but also rather good as a way to divide the
> processing.
It is only good if one is saving a full dump. The majority of people
might not even require that. They probably want a fully filtered dump,
but want to do it fast on a multi-terabyte machine.
> Anyway, a benchmark will be needed to discuss this topic in detail.
Agreed. First we need to implement that new mode and see if it gives
us good performance or not.
Thanks
Vivek