On 01/08/14 12:19, Andrew Beekhof wrote:
> On 8 Jan 2014, at 2:42 pm, Gao,Yan <ygao(a)suse.com> wrote:
>> Hi Andrew,
>> On 01/08/14 05:35, Andrew Beekhof wrote:
>>>
>>> On 8 Jan 2014, at 4:58 am, Gao,Yan <ygao(a)suse.com> wrote:
>>>
>>>> Hi Andrew, David,
>>>>
>>>> This is a scenario from a user:
>>>> Two nodes with 64 DRBD resources, running the latest throttling code.
>>>> The user wants as much concurrency as possible, so they started with
>>>> the following configuration:
>>>>
>>>> LRMD_MAX_CHILDREN=64
>>>
>>> This is almost certainly a horribly inappropriate value to use.
>>> How many cores do these boxes have? I'm guessing less than 16.
>> Each of them has 24 cores.
> That's actually pretty respectable :)
> Alas, it probably makes things worse for the cib - since the updates are
> likely taking longer than the actual operations.
> These guys are going to love the new cib code :)
Really looking forward to that :-)

Regards,
Gao,Yan
>>
>>>
>>>> load-threshold=0%
>>>>
>>>> 1. Start/Promote 64 DRBD resources [PASS].
>>>>
>>>> 2. Shut down one node at 09:32:51. Quite a few failures like the
>>>> following are encountered when the notify actions invoke "crm_master":
>>>>
>>>> Jan 5 09:33:08 liona drbd(happy21-drbdclone)[6714]: ERROR: happy21:
>>>> Called /usr/sbin/crm_master -Q -l reboot -v 10000
>>>> Jan 5 09:33:08 liona drbd(happy21-drbdclone)[6714]: ERROR: happy21:
>>>> Exit code 107
>>>> Jan 5 09:33:08 liona drbd(happy21-drbdclone)[6714]: ERROR: happy21:
>>>> Command output:
>>>> Jan 5 09:33:08 liona lrmd[25655]: notice: operation_finished:
>>>> happy21-drbdclone_notify_0:6714:stderr [ Could not establish cib_rw
>>>> connection: Resource temporarily unavailable (11) ]
>>>> Jan 5 09:33:08 liona lrmd[25655]: notice: operation_finished:
>>>> happy21-drbdclone_notify_0:6714:stderr [ Error signing on to the CIB
>>>> service: Transport endpoint is not connected ]
>>>> ...
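
For reference, error 11 above is EAGAIN: on Linux, a nonblocking
connect() to a unix stream socket whose accept queue is full fails
immediately instead of waiting. A minimal standalone demo of that
behaviour (not the actual libqb code; the socket path and the backlog
of 1 are made up for illustration):

/* Minimal demo (not libqb code): connect() returns EAGAIN once a
 * unix stream socket's accept queue is full. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

int main(void)
{
    struct sockaddr_un addr;
    int srv, i;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/backlog-demo.sock",
            sizeof(addr.sun_path) - 1);
    unlink(addr.sun_path);

    /* server: tiny backlog, and never call accept() */
    srv = socket(AF_UNIX, SOCK_STREAM, 0);
    bind(srv, (struct sockaddr *)&addr, sizeof(addr));
    listen(srv, 1);    /* compare SERVER_BACKLOG in the patch below */

    /* clients: connects succeed until the queue fills, then fail
     * with EAGAIN ("Resource temporarily unavailable") */
    for (i = 0; i < 5; i++) {
        int c = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);

        if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            printf("client %d: %s (%d)\n", i, strerror(errno), errno);
        } else {
            printf("client %d: connected\n", i);
        }
    }
    return 0;
}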
>>>>
>>>>
>>>> More than a minute after the node was told to shut down, the throttle
>>>> code says:
>>>>
>>>> Jan 5 09:34:08 lionb crmd[13212]: notice: throttle_mode: High CIB
>>>> load detected: 0.960333
>>>>
>>>> According to the code, cib_max_cpu is 0.95 here. Apparently, by the
>>>> time the cib's load is found to have exceeded 0.95, the cib has
>>>> already been overloaded.
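
If I read the throttle code right, it estimates the cib's load by
periodically sampling utime+stime from /proc/<pid>/stat and dividing
by the interval. A simplified sketch of that idea (not the actual
throttle.c; cpu_ticks and the 5s interval are just for illustration):

/* Simplified sketch of per-process load sampling, roughly the idea
 * behind the crmd throttle code; not the actual throttle.c. */
#include <stdio.h>
#include <unistd.h>

/* Return utime+stime (clock ticks) for pid from /proc/<pid>/stat. */
static unsigned long cpu_ticks(pid_t pid)
{
    char path[64];
    unsigned long utime = 0, stime = 0;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (f == NULL) {
        return 0;
    }
    /* utime and stime are fields 14 and 15; skip the first 13
     * (assumes the comm field contains no spaces) */
    fscanf(f, "%*d %*s %*s %*d %*d %*d %*d %*d %*u %*u %*u %*u %*u %lu %lu",
           &utime, &stime);
    fclose(f);
    return utime + stime;
}

int main(void)
{
    pid_t pid = getpid();   /* in the cluster this would be the cib's pid */
    long hz = sysconf(_SC_CLK_TCK);
    unsigned long before = cpu_ticks(pid);

    sleep(5);               /* sampling interval */

    /* Fraction of one core used over the interval; a value near 1.0
     * (like the 0.96 above) means the process is cpu-bound. */
    double load = (double)(cpu_ticks(pid) - before) / (5.0 * hz);
    printf("load: %f\n", load);
    return 0;
}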
>>>
>>> The throttling code can only do so much.
>>> By setting LRMD_MAX_CHILDREN=64, there are still around 128 updates
>>> queued for the cib to process.
>> Indeed.
>>>
>>>>
>>>> Of course, one of the options here is to tune down "load-threshold" to
>>>> find an appropriate value for the deployment -- it might not be very
>>>> easy to find an optimal value for all the possible scenarios.
>>>
>>> I'll let David judge the patch, but not specifying ridiculous values
>>> for LRMD_MAX_CHILDREN would be the best initial path forward.
>>> It's fine to want "as high as possible concurrency", but subverting
>>> the throttling code by setting unrealistic job limits achieves the
>>> opposite.
>> Yes, agreed. We've been telling them that tuning down the concurrency
>> could actually speed up the overall process.
>>
>> Thanks a lot for your comments and suggestions!
>>
>> Regards,
>> Gao,Yan
>>
>>>
>>>>
>>>> Meanwhile, the user sought a way to prevent such failures --
>>>> lengthening the listen queue of libqb's IPC:
>>>>
>>>> diff -uNr libqb/lib/util_int.h libqbfio/lib/util_int.h
>>>> --- libqb/lib/util_int.h	2013-10-23 08:44:54.000000000 -0600
>>>> +++ libqbfio/lib/util_int.h	2014-01-06 13:12:18.471097320 -0700
>>>> @@ -99,7 +99,7 @@
>>>>   */
>>>>  void qb_socket_nosigpipe(int32_t s);
>>>>  
>>>> -#define SERVER_BACKLOG 5
>>>> +#define SERVER_BACKLOG 128
>>>>  
>>>>  #ifndef UNIX_PATH_MAX
>>>>  #define UNIX_PATH_MAX 108
>>>>
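
One caveat with the patch, assuming SERVER_BACKLOG ends up as the
backlog argument to listen(2) as the name suggests: on Linux the
kernel silently clamps that argument to net.core.somaxconn (commonly
128), so raising the constant beyond that would also need a sysctl
change. A sketch of where the constant would take effect
(start_listening is a made-up name):

#include <sys/socket.h>

#define SERVER_BACKLOG 128

/* Sketch: the backlog bounds how many completed-but-unaccepted
 * connections the kernel will hold. Linux clamps it to
 * net.core.somaxconn (often 128); values above that also need e.g.
 *     sysctl -w net.core.somaxconn=256 */
static int start_listening(int sock)
{
    return listen(sock, SERVER_BACKLOG);
}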
>>>>
>>>> And it did help: the failures are no longer encountered in the tests.
>>>>
>>>> We'd want the user to tune down the thresholds, since we believe the
>>>> cib is being overloaded. Meanwhile, with the larger listen backlog,
>>>> cib requests apparently get a better chance of receiving a response,
>>>> albeit with some delay, instead of being rejected immediately. Do you
>>>> think this change makes sense?
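
A complementary option, purely hypothetical on my part rather than
anything the tools do today, would be for clients to retry the
connect on EAGAIN with a short backoff instead of failing outright.
A sketch (connect_retry is a made-up helper):

#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical sketch (not current crm_master/libqb behaviour):
 * retry a unix-socket connect with backoff while the server's
 * accept queue is full. Returns a connected fd, or -1. */
int connect_retry(const char *path)
{
    struct sockaddr_un addr;
    int attempt;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    for (attempt = 0; attempt < 5; attempt++) {
        /* recreate the socket on each try: its state is
         * unspecified after a failed connect() */
        int sock = socket(AF_UNIX, SOCK_STREAM, 0);
        int err;

        if (sock < 0) {
            return -1;
        }
        if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            return sock;                   /* connected */
        }
        err = errno;
        close(sock);
        if (err != EAGAIN && err != ECONNREFUSED) {
            return -1;                     /* a real failure; give up */
        }
        usleep(50000u << attempt);         /* 50ms, 100ms, 200ms, ... */
    }
    return -1;                             /* server still overloaded */
}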
>>>>
>>>> Regards,
>>>> Gao,Yan
>>>> --
>>>> Gao,Yan <ygao(a)suse.com>
>>>> Software Engineer
>>>> China Server Team, SUSE.
>>
>> --
>> Gao,Yan <ygao(a)suse.com>
>> Software Engineer
>> China Server Team, SUSE.
--
Gao,Yan <ygao(a)suse.com>
Software Engineer
China Server Team, SUSE.

_______________________________________________
Pcmk-devel mailing list
Pcmk-devel(a)oss.clusterlabs.org
http://oss-2.clusterlabs.org/mailman/listinfo/pcmk-devel