On 23 Nov 2021, at 23:40, James Chapman <jachapma(a)redhat.com>
wrote:
I have done some work on 389 ds connection management that I would appreciate the
community's feedback on. You can find a draft patch attached for review.
Problem statement
Currently the connection table (CT) contains an array of established connections that are
monitored for activity by a single process. As the number of established connections
increases, so too does the overhead of monitoring these connections. The single process
that monitors established connections for activity becomes a bottleneck, limiting the
ability of the server to handle new connections.
Solution
One solution to this problem is to segment the CT into smaller portions, with each
portion having a dedicated thread to monitor its connections. But, rather than divide the
CT into smaller portions, the approach I preferred was to add multiple active lists to the
CT, where each active list would have its own dedicated thread.
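
To illustrate the idea of several active lists, each watched by its own thread, here is a
minimal Python sketch. It is not the attached patch: the class and function names are
hypothetical, and the round-robin assignment of connections to lists is an assumption made
only for the example.

```python
import threading


class ConnectionTable:
    """Toy connection table holding several active lists (hypothetical names)."""

    def __init__(self, num_lists=2):
        self.lists = [[] for _ in range(num_lists)]
        # One lock per active list, so monitors do not contend with each other.
        self.locks = [threading.Lock() for _ in range(num_lists)]
        self._next = 0
        self._assign_lock = threading.Lock()

    def add(self, conn):
        # Assumption: round-robin placement spreads connections across lists.
        with self._assign_lock:
            idx = self._next
            self._next = (self._next + 1) % len(self.lists)
        with self.locks[idx]:
            self.lists[idx].append(conn)
        return idx

    def poll_list(self, idx, is_active):
        # Called by the monitor thread that owns list `idx`; touches only that list.
        with self.locks[idx]:
            return [c for c in self.lists[idx] if is_active(c)]


def monitor_all(table, is_active):
    """Run one monitor pass per active list, each in its own thread."""
    results = [None] * len(table.lists)

    def worker(i):
        results[i] = table.poll_list(i, is_active)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(len(table.lists))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each monitor only takes the lock for its own list, a pass over one list does not
block a pass over another.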
Benefit
With a single thread monitoring each CT active list, connections can be monitored in
parallel, removing the bottleneck mentioned above.
Instead of a single CT active list containing all established connections, there will be
multiple CT active lists that share the total number of established connections between
them.
With this change I noticed a ~20% increase in the number of connections per second the
server can handle.
This is good, it really does help us here. It would be better to move to epoll but I think
that would be too invasive and hard for the current connection code, as it would basically
be a rewrite.
But I think the multiple active lists are a much simpler idea, especially given that we can
only have a single accept() anyway.
Could it also be worth changing how we monitor connections? Rather than iterating over the
CT, each connection could, on a state change, issue that update to a channel, and then the
monitor thread would aggregate all of that info to get a snapshot of the current
connection state?
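
A rough Python sketch of that channel idea, under stated assumptions: the channel is a
plain queue, updates are (conn_id, state) pairs, and a "closed" state drops the connection
from the snapshot. All of these names and conventions are hypothetical, not anything in
the server today.

```python
import queue


def aggregate(channel, snapshot):
    """Drain pending (conn_id, state) updates into `snapshot` and return it."""
    while True:
        try:
            conn_id, state = channel.get_nowait()
        except queue.Empty:
            # Nothing left to apply; the snapshot is current as of this pass.
            return snapshot
        if state == "closed":
            # Assumption: a closed connection is removed from the snapshot.
            snapshot.pop(conn_id, None)
        else:
            # Later updates for the same connection overwrite earlier ones.
            snapshot[conn_id] = state
```

The monitor thread would only do work proportional to the number of state changes, rather
than to the size of the CT.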
Opens
I tested this change with 100, 500, 1k, 5k and 10k concurrent connections, and found
that having two CT active lists is the optimal config. I think we should hardcode the CT
active list number to two and have it hidden from the user/sysadmin, or would it be better
as a configurable parameter?
Hardcode. Every single tunable setting is something that we then have to support until the
heat death of the universe, because we have no way to "remove" support for
anything. In most cases no one will ever change it, nor will they know the impact of
changing it to the same level we do.
See also - research that literally says most tunables go unused:
https://experts.illinois.edu/en/publications/hey-you-have-given-me-too-ma...
I'll review the code further later, but is it worth making this a PR instead?
Thanks
Jamie
<connection-table-multi-lists.patch>
--
Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management
SUSE Labs, Australia