[PATCH] Build in IPMI on x86

Josh Boyer jwboyer at fedoraproject.org
Wed Feb 19 13:50:46 UTC 2014


On Tue, Feb 18, 2014 at 7:03 PM, Prarit Bhargava <prarit at redhat.com> wrote:
>
>
> On 02/18/2014 02:39 PM, Matthew Garrett wrote:
>> On Tue, Feb 18, 2014 at 02:28:55PM -0500, Prarit Bhargava wrote:
>>
>>> The problem is that we've seen users (especially those using clusters) who do
>>> not want ipmi built in.  Their systems generate a ton of ipmi traffic, which
>>> they want to ignore.  Building IPMI into the kernel results in
>>> situations where processing these messages causes kipmid to climb to 100% CPU
>>> for long periods of time.
>>
>> If the system firmware is sending messages then the default assumption
>> ought to be that it's doing so for a reason.
>
> It is -- it's likely sending health or power information back to the BMC.  But
> that's not the issue.  Clusters (and others) don't care about the BMC on a
> particular system so they disable IPMI.

Is there an option in the firmware to have it not send those reports?
I realize tweaking firmware options is somewhat of a pain, but if
you're doing your initial cluster setup it seems worthwhile to just
turn it off at the source...

>>> Maybe that can be solved through an 'ipmi=off' option, or maybe off should be
>>> the default state for handling of these messages?
>>
>> You can disable the various ipmi_si probings via the tryacpi, trydmi and
>> so on options.
>
> That's not intuitive.  The current options are awful; one has to specify three
> kernel parameters IIRC.  Keep it simple with "ipmi=off", and maybe a /sys
> variable to do it at runtime as well (although ... maybe the ipmi_si module
> parameters are available already?).

There are a number of options already.  force_kipmid and
kipmid_max_busy_us seem pretty relevant in addition to the ones
Matthew mentioned.
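
For checking what a running kernel actually exposes, here is a minimal
sketch in Python.  It assumes the ipmi_si parameters show up under
/sys/module/ipmi_si/parameters; which ones appear there, and whether
they are writable at runtime, depends on the kernel version.  The
helper name read_module_params is made up for illustration.

```python
from pathlib import Path


def read_module_params(module="ipmi_si", sysfs_root="/sys/module"):
    """Return {name: value} for the readable parameters of a module.

    Assumes the usual sysfs layout <sysfs_root>/<module>/parameters/<name>.
    Returns an empty dict if the module exposes no parameters directory.
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    values = {}
    if not params_dir.is_dir():
        return values
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may be unreadable for the current user; skip them.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_module_params().items():
        print(f"{name} = {value}")
```

Parameters that are not exported to sysfs will not be listed, so an
empty or short result does not mean the option is unavailable on the
kernel command line.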

Or if you want a global off switch, I'm sure upstream would be happy
to take a patch for it with sufficient reasoning.

>>> In any case, I think you're going down the right path here by building this into
>>> the kernel but IMO there's still some upstream work to do so that we don't hit
>>> users with 100% kipmi usage and no way of avoiding it.
>>
>> Sending enough traffic to keep kipmid at 100% for extended periods of
>> time implies that there's a *lot* of traffic appearing. What's sending
>> it, and why?
>
> From the reports I've gathered, which are all from users *who don't want IPMI
> active on their systems*, it is some sort of health and power data about the
> system and the cluster.  (I'm sure that's an ELI5 to me specifically BTW ;))
>
>> What kind of responses are expected?
>
> I'm not sure, TBH.  I don't think it really matters at the point that there is a
> huge amount of traffic.  I think the BMC is responding FWIW but the issue is
> that the amount of traffic overwhelms the system.
>
>> Is the fact that we're
>> sending nothing back upsetting it?
>>
>
> I don't get that from the reports I've seen.  I think the issue is that there is
> just a huge volume of traffic on these systems.  Googling for "centos kipmi
> 100%" has a lot of hits; we seemingly made a bad choice when we built in IPMI.
> There is a glimmer of hope we can switch back to modular in RHEL.

Has anyone tested something more recent to see if this is still a
problem?  Upstream is taking the patch to default to =y, so keeping it
=m in Fedora is possible but it seems like we'd diverge for a use case
that is fairly infrequent for Fedora.

josh

