Hi,
the attached patch was confirmed to work, so the code review should be easy. But because it adds an index to objectSID, which all AD objects have, there are two catches:
1) How long the upgrade takes
2) How much the database grows
To test, I created an AD instance with 10,000 users and 10,000 groups (adcli is great for this type of testing btw). I fetched all groups and users with the old db, then ran the upgrade.
On a VM running on my local laptop (granted, it has an SSD drive), the upgrade takes 10 seconds. The default systemd startup timeout (DefaultTimeoutStartSec) seems to be 90 seconds, which sounds OK to me.
The size, though, has roughly doubled. This is the backup before the upgrade:
    [root@adclient ~]# du -sh /root/cache_win.trust.test.ldb
    47M /root/cache_win.trust.test.ldb
And this is the cache after I upgraded:
    [root@adclient ~]# du -sh /var/lib/sss/db/cache_win.trust.test.ldb
    97M /var/lib/sss/db/cache_win.trust.test.ldb
Unfortunately, I don't see us having another option than doing the upgrade. For attributes that are indexed, an index miss means that ldb is not going to search sequentially, but just return not found -- so adding indexes for newly added entries is not possible.
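To make the index-miss argument concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not ldb code: the ToyStore class, its methods and the sample SIDs are all made up for illustration. Searches on an indexed attribute are answered from the index alone, so entries stored before the index existed stay invisible unless the upgrade reindexes them.

```python
class ToyStore:
    """Toy model of a store that trusts its attribute index (NOT real ldb)."""

    def __init__(self):
        self.entries = {}           # dn -> dict of attributes
        self.indexed_attrs = set()  # attributes declared as indexed
        self.index = {}             # (attr, value) -> set of dns

    def _index_entry(self, dn, attr, value):
        self.index.setdefault((attr, value), set()).add(dn)

    def add(self, dn, attrs):
        """Store an entry and index only the attributes indexed right now."""
        self.entries[dn] = dict(attrs)
        for attr, value in attrs.items():
            if attr in self.indexed_attrs:
                self._index_entry(dn, attr, value)

    def declare_index(self, attr, reindex):
        """Mark attr as indexed; reindex=True models the cache upgrade."""
        self.indexed_attrs.add(attr)
        if reindex:
            for dn, attrs in self.entries.items():
                if attr in attrs:
                    self._index_entry(dn, attr, attrs[attr])

    def search(self, attr, value):
        """Indexed attribute: index lookup only, no sequential fallback."""
        if attr in self.indexed_attrs:
            return sorted(self.index.get((attr, value), set()))
        # Unindexed attribute: full sequential scan.
        return sorted(dn for dn, attrs in self.entries.items()
                      if attrs.get(attr) == value)


# Without the upgrade: the old entry is never found through the new index.
no_upgrade = ToyStore()
no_upgrade.add("name=old", {"objectSID": "S-1-5-21-1"})
no_upgrade.declare_index("objectSID", reindex=False)
no_upgrade.add("name=new", {"objectSID": "S-1-5-21-2"})
missed = no_upgrade.search("objectSID", "S-1-5-21-1")

# With the upgrade: existing entries are reindexed and found again.
upgraded = ToyStore()
upgraded.add("name=old", {"objectSID": "S-1-5-21-1"})
upgraded.declare_index("objectSID", reindex=True)
found = upgraded.search("objectSID", "S-1-5-21-1")
```

In this model `missed` comes back empty while `found` contains the old entry, which is why indexing only newly added entries would silently hide everything already in the cache.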
I would like to document that for huge databases (tens of thousands of cached entries), the timeout might need to be raised during the upgrade and that for deployments whose cache resides in tmpfs, the cache size might grow.
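For the tmpfs case, an admin could check before upgrading whether the filesystem holding the cache has room for it to grow. The Python sketch below is only an illustration: the function names are hypothetical, and the default growth factor of 2.0 is an assumption taken from the single 47M to 97M measurement above, so real caches may grow more or less.

```python
import math
import os
import shutil


def extra_bytes_needed(cache_path, growth_factor=2.0):
    """Estimate the additional bytes the cache needs after the upgrade.

    growth_factor=2.0 is an assumption based on one test run, not a
    guarantee of how much any given cache grows.
    """
    size = os.path.getsize(cache_path)
    return max(0, math.ceil(size * (growth_factor - 1.0)))


def has_room_for_upgrade(cache_path, growth_factor=2.0):
    """Return True if the filesystem holding the cache has enough free space."""
    directory = os.path.dirname(os.path.abspath(cache_path))
    free = shutil.disk_usage(directory).free
    return free >= extra_bytes_needed(cache_path, growth_factor)
```

It would be called with the path of a domain cache file, for example has_room_for_upgrade("/var/lib/sss/db/cache_win.trust.test.ldb"), before installing the update.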
Is everyone OK with this?
On 08/17/2015 10:35 PM, Jakub Hrozek wrote:
> there are two catches: 1) How long the upgrade takes
My main concern was the heartbeat interval. But I just tested it by adding sleep(50) to your newly added upgrade function and it works fine with the default heartbeat (10 seconds). So I guess the monitor starts pinging the domain backend only after it is fully initialized, not sooner.
> 2) How much the database grows
> [...]
> I would like to document that for huge databases (tens of thousands of cached entries), the timeout might need to be raised during the upgrade and that for deployments whose cache resides in tmpfs, the cache size might grow.
> Is everyone OK with this?
It should be in the release notes, but I am not sure that is a visible enough place, so I would suggest putting it on the wiki as well (https://fedorahosted.org/sssd/wiki/Troubleshooting) for the case when someone hits the systemd timeout.
ACK to the patch.
Michal
On Tue, Aug 18, 2015 at 05:31:43PM +0200, Michal Židek wrote:
> ACK to the patch.
* master: e61b0e41cb44004d2b260ad9d05802995f7bcb2e
On (19/08/15 18:15), Jakub Hrozek wrote:
> * master: e61b0e41cb44004d2b260ad9d05802995f7bcb2e
Shall we push this performance enhancement to the stable branch (1.12) as well?
LS
On Thu, Sep 03, 2015 at 06:24:35AM +0200, Lukas Slebodnik wrote:
> Shall we push this performance enhancement to the stable branch (1.12) as well?
I think so. I will push it unless there is some opposition due to the upgrade potentially taking too long.
On Thu, Sep 03, 2015 at 08:19:58AM +0200, Jakub Hrozek wrote:
> I think so. I will push it unless there is some opposition due to the upgrade potentially taking too long.
Pushed to sssd-1-12: 676e8043930e383f9180d2ac4810655649c103ee
sssd-devel@lists.fedorahosted.org