Hi Thierry,
I am not sure I agree with your concern:
As I understand things, each listener thread is associated with an active
list, and the active list links are in the middle of the connection slot
(which is IMHO large enough to span several cache lines). So I do not think
we will really have problems with cache lines related to the multiple
listener threads.
Splitting the CT would mean that a connection would be linked forever to a
listener, which may leave one listener overloaded while others are idle.
The round-robin (when opening the connection) solution limits this risk and
tends to spread the load over the CT, at the price of a cache reload when a
slot is reopened (which is IMHO a good compromise).
That said, with James's design it is easy to test the "connection slot
associated with a fixed listener thread" variant by replacing the round
robin with a modulo on the slot index.
A last point about the cache: connection handling is not bound to a
listener but always oscillates between a listener thread and a worker
thread, so I suspect that a fixed listener (or not) will have little impact
on cache behaviour.
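To make the two policies concrete, here is a minimal sketch (plain C, all names hypothetical, not the actual 389-ds code) of round-robin assignment at connection open versus a fixed modulo on the slot index:

```c
#include <stddef.h>

#define NUM_LISTENERS 2 /* hypothetical listener-thread count */

/* Round robin at connection open: the next listener in turn takes the
 * new connection, regardless of which CT slot it lands in.  The same
 * slot can be served by different listeners over its lifetime. */
static size_t next_listener = 0;

static size_t
assign_round_robin(void)
{
    size_t l = next_listener;
    next_listener = (next_listener + 1) % NUM_LISTENERS;
    return l;
}

/* Fixed mapping: the slot index alone picks the listener, so a slot
 * stays bound to the same listener thread for the life of the server. */
static size_t
assign_by_slot_index(size_t slot_index)
{
    return slot_index % NUM_LISTENERS;
}
```

Swapping one call for the other is the whole experiment.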
On Fri, Dec 10, 2021 at 11:40 AM Thierry Bordaz <tbordaz(a)redhat.com> wrote:
On 12/9/21 6:28 PM, James Chapman wrote:
Hi,
I didn't create a PR yet, here is a link to the issue -
https://github.com/389ds/389-ds-base/issues/4812
Thanks
On Wed, Nov 24, 2021 at 10:40 PM William Brown <william.brown(a)suse.com>
wrote:
>
>
> > On 24 Nov 2021, at 22:03, James Chapman <jachapma(a)redhat.com> wrote:
> >
> >
> >
> > On Tue, Nov 23, 2021 at 10:22 PM William Brown <william.brown(a)suse.com>
> wrote:
> >
> >
> > > On 23 Nov 2021, at 23:40, James Chapman <jachapma(a)redhat.com> wrote:
> > >
> > > I have done some work on 389 ds connection management that I would
> appreciate the community's feedback on, you can find attached a draft patch
> for review.
> > >
> > > Problem statement
> > > Currently the connection table (CT) contains an array of established
> connections that are monitored for activity by a single process. As the
> number of established connections increases, so too does the overhead of
> monitoring these connections. The single process that monitors established
> connections for activity becomes a bottleneck, limiting the ability of the
> server to handle new connections.
> > >
> > > Solution
> > > One solution to this problem is to segment the CT into smaller
> portions, with each portion having a dedicated thread to monitor its
> connections. But, rather than divide the CT into smaller portions, the
> approach I preferred was to add multiple active lists to the CT, where each
> active list would have its own dedicated thread.
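For readers following along, the shape of that design is roughly the following (a sketch with hypothetical names, not the structures from the patch): one shared slot array, several active lists layered on top, each with its own lock and monitor thread.

```c
#include <pthread.h>
#include <stddef.h>

#define CT_SIZE   16   /* hypothetical table size */
#define NUM_LISTS 2    /* active lists, one monitor thread each */

typedef struct connection {
    int fd;
    struct connection *next_active; /* link within one active list */
} connection;

typedef struct {
    connection *head;               /* established connections on this list */
    pthread_mutex_t lock;           /* guards this list only */
    size_t count;
} active_list;

typedef struct {
    connection slots[CT_SIZE];      /* the CT stays one shared array... */
    active_list lists[NUM_LISTS];   /* ...but carries several active lists */
    size_t next;                    /* round-robin cursor for new connections */
} connection_table;

/* On accept, place the new connection on the next active list in turn. */
static void
ct_add(connection_table *ct, connection *c)
{
    active_list *al = &ct->lists[ct->next];
    ct->next = (ct->next + 1) % NUM_LISTS;
    pthread_mutex_lock(&al->lock);
    c->next_active = al->head;
    al->head = c;
    al->count++;
    pthread_mutex_unlock(&al->lock);
}
```

Each monitor thread then polls only the connections on its own list.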
>
James, I am really sorry to be back so late, but I have a concern that
popped up with multiple active lists within a shared CT.
The CT will remain shared, so I imagine that, for example, CT[1234] (slot
1234 of the connection table) will contain list links. Let's imagine you
have 10 listeners (of established connections). Will those 10 threads
access the slot CT[1234]?
If they do, then my concern is that when this slot is updated (a lock
taken, for example) the cache lines containing the list links will likely
be invalidated. So a listener accessing CT[1234] may impact another
listener running on another CPU. If this problem exists, I think it could
be mitigated if we cache-line align the list structure, but that would
likely be a waste of space.
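If it turned out to matter, that alignment would look something like this (a sketch assuming 64-byte cache lines and hypothetical names; the space cost shows up directly in sizeof):

```c
#include <stdalign.h>
#include <pthread.h>

#define CACHE_LINE 64 /* assumed cache-line size */

/* Each active-list head starts on its own cache line, so one listener
 * taking the lock does not invalidate the line holding a neighbouring
 * list.  The padding out to a full line is the wasted space in question. */
typedef struct {
    alignas(CACHE_LINE) pthread_mutex_t lock;
    void *head;
} aligned_active_list;
```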
What is the main concern with splitting the CT into chunks and giving a
chunk to each listener? Are multiple lists safer/easier to implement?
regards
thierry
> >
> > > Benefit
> > > With a single thread monitoring each CT active list, connections can
> be monitored in parallel, removing the bottleneck mentioned above.
> > > Instead of a single CT active list containing all established
> connections, there will be multiple CT active lists that share the total
> number of established connections between them.
> > > With this change I noticed a ~20% increase in the number of
> connections per second the server can handle.
> >
> > This is good, it really does help us here. It would be better to move
> to epoll but I think that would be too invasive and hard for the current
> connection code, as it would basically be a rewrite.
> >
> > I did try epoll() a while ago, just to see if it performs better than
> PR_Poll(), but I ran into some issues with permissions of file descriptors,
> so I ditched it.
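For reference, the PR_Poll()-to-epoll() move being discussed boils down to something like this minimal Linux-only sketch (not the patch code; the real loop would register the whole active list, not one fd):

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int
wait_readable(int fd, int timeout_ms)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    struct epoll_event ev = { .events = EPOLLIN };
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, timeout_ms);
    close(epfd);
    return n;
}
```

The win over poll-style interfaces is that the kernel keeps the interest set, so readiness checks stop scaling with the number of registered connections.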
> >
> > But the multiple active lists I think is a much simpler idea,
> especially given we can only have a single accept() anyway.
> >
> > Could it also be worth changing how we monitor connections? Rather than
> iterating over the CT, we could have a connection, on a "state" change,
> issue an update to a channel, and then the monitor thread aggregates all
> that info together to get a snapshot of the current connection state?
> >
> > Yes, I can look into this.
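As a starting point, that "channel" idea could be sketched like this (hypothetical names, a simple mutex-guarded ring; a real design would need overflow handling and monitor wakeups):

```c
#include <pthread.h>
#include <stddef.h>

#define CHAN_CAP 1024 /* hypothetical capacity; overflow handling omitted */

typedef struct {
    int fd;
    int state; /* hypothetical state code */
} conn_event;

typedef struct {
    conn_event buf[CHAN_CAP];
    size_t head, tail;
    pthread_mutex_t lock;
} event_channel;

/* A connection posts its own state change instead of waiting to be
 * discovered by a full iteration over the CT. */
static void
chan_post(event_channel *ch, conn_event e)
{
    pthread_mutex_lock(&ch->lock);
    ch->buf[ch->tail++ % CHAN_CAP] = e;
    pthread_mutex_unlock(&ch->lock);
}

/* The monitor thread drains pending events into a snapshot.
 * Returns the number of events drained. */
static size_t
chan_drain(event_channel *ch, conn_event *out, size_t max)
{
    size_t n = 0;
    pthread_mutex_lock(&ch->lock);
    while (ch->head != ch->tail && n < max)
        out[n++] = ch->buf[ch->head++ % CHAN_CAP];
    pthread_mutex_unlock(&ch->lock);
    return n;
}
```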
>
> Happy to review that too :)
>
> >
> > >
> > > Opens
> > > I tested this change with 100, 500, 1k, 5k and 10k concurrent
> connections, and I found that having two CT active lists is the optimal
> config. I think we should hardcode the CT active list number to two and
> have it hidden from the user/sysadmin, or would it be better as a
> configurable parameter?
> >
> > Hardcode. Every single tunable setting is something that we then have
> to support til the heat death of the universe because we have no way to
> "remove" support for anything. In most cases no one will ever change it, nor
> will they know the impact of changing it to the same level we do.
> >
> > See also - research that literally says most tunables go unused:
> >
> >
>
https://experts.illinois.edu/en/publications/hey-you-have-given-me-too-ma...
> >
> > That makes sense alright.
> >
> > I'll review the code further later, but is it worth making this a PR
> instead?
> > Sure, I will harden the patch a bit and create a PR.
>
> No problem mate, great work :)
>
> >
> > Thanks for your feedback
> >
> > >
> > > Thanks
> > > Jamie
> > >
>
<connection-table-multi-lists.patch>_______________________________________________
> > > 389-devel mailing list -- 389-devel(a)lists.fedoraproject.org
> > > To unsubscribe send an email to
> 389-devel-leave(a)lists.fedoraproject.org
> > > Fedora Code of Conduct:
>
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> > > List Guidelines:
>
https://fedoraproject.org/wiki/Mailing_list_guidelines
> > > List Archives:
>
https://lists.fedoraproject.org/archives/list/389-devel@lists.fedoraproje...
> > > Do not reply to spam on the list, report it:
>
https://pagure.io/fedora-infrastructure
> >
> > --
> > Sincerely,
> >
> > William Brown
> >
> > Senior Software Engineer, Identity and Access Management
> > SUSE Labs, Australia
>
> --
> Sincerely,
>
> William Brown
>
> Senior Software Engineer, Identity and Access Management
> SUSE Labs, Australia
>