Hi There,
We tried to dynamically load a new schema using /usr/lib64/dirsrv/slapd-eldapp1/schema-reload.pl. Unfortunately (and unknown to us at the time), the objectClass definition misspelt a couple of the attribute names. The schema reload process should have picked that up and refused it, but it didn't, and so it proceeded to update entries using the new schema. That's when we started getting errors like the following in the error log:
[19/Jun/2020:10:28:08.390882389 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.399523527 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.404890880 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.430284251 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.466371449 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.495859651 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.522007224 -0700] - ERR - libdb - BDB0151 fsync: Input/output error
[19/Jun/2020:10:28:08.546930415 -0700] - ERR - libdb - BDB4519 txn_checkpoint: failed to flush the buffer cache: Input/output error
[19/Jun/2020:10:28:08.569781853 -0700] - CRIT - checkpoint_threadmain - Serious Error---Failed to checkpoint database, err=5 (Input/output error)
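For anyone triaging a similar error log, here is a rough Python sketch that groups lines like these by level, source, and message. The regular expression is only an assumption based on the layout of the lines quoted above, and `summarize` is a hypothetical helper, not part of 389 Directory Server. The last line of the block shows that errno 5 is EIO, the operating system's I/O error code, which matches the "Input/output error" text in the log.

```python
import errno
import re
from collections import Counter

# Assumed layout, taken from the quoted lines:
# [timestamp] - LEVEL - source - message
LOG_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\] - (?P<level>[A-Z]+) - (?P<source>\S+) - (?P<msg>.*)"
)


def summarize(lines):
    """Count log lines by (level, source, message); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["source"], match["msg"])] += 1
    return counts


sample = [
    "[19/Jun/2020:10:28:08.390882389 -0700] - ERR - libdb - BDB0151 fsync: Input/output error",
    "[19/Jun/2020:10:28:08.399523527 -0700] - ERR - libdb - BDB0151 fsync: Input/output error",
    "[19/Jun/2020:10:28:08.569781853 -0700] - CRIT - checkpoint_threadmain - "
    "Serious Error---Failed to checkpoint database, err=5 (Input/output error)",
]

for (level, source, msg), n in summarize(sample).items():
    print(f"{n:3d}  {level:4s}  {source}: {msg}")

# err=5 in the CRIT line is the errno value; 5 is EIO.
print(errno.errorcode[5])
```

In practice the lines would come from the instance's error log file rather than a hard-coded list.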
I tried restarting dirsrv and that's when it started giving errors about the unknown (misspelt) attributes in the new objectClass. I fixed those errors in the schema and restarted dirsrv. I saw the following message in the error log:
NOTICE - dblayer_start - Detected Disorderly Shutdown last time Directory Server was running, recovering database.
There was no further log output, but the CPU utilization for ns-slapd was at 99.9%, so I just let it run overnight hoping that it wasn't stuck in a loop. There was no improvement the next morning, so I ordered a RAM increase from 4 GB to 16 GB hoping that would fix it. I let it run for a while after that with no indication of progress. I also tried to run db2ldif to dump the db to an ldif file, but got the same "recovering database" message. That's where it is now - I'll let it run for a few hours and hope it does something.
Would anyone be able to offer any further advice? Is there any way to see how it's getting along with the database recovery? Is this db well and truly hosed? Unfortunately, this system was spec'd for development, so no backups were running and recovery from backup is not an option.
Thanks, Trev
Trevor,
I have not seen this before, but I also have not seen what happens when you add invalid schema.
But to try and get the server back up and running try removing the /var/lib/dirsrv/slapd-YOUR_INSTANCE/db/__db.00* files. So make sure the ns-slapd process is not running, kill it if you have to, then remove those files and try starting it back up.
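As a cautious variant of the step above, the following Python sketch moves the `__db.00*` files into a backup directory instead of deleting them, so they can be put back if needed. Moving rather than removing is my own substitution, not what Mark describes. The helper names (`slapd_running`, `set_aside_region_files`) and the backup location are hypothetical, and the process check assumes `pgrep` is available on the host.

```python
import shutil
import subprocess
from pathlib import Path


def slapd_running():
    """Return True if a process named exactly ns-slapd exists (assumes pgrep is installed)."""
    result = subprocess.run(["pgrep", "-x", "ns-slapd"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def set_aside_region_files(db_dir, backup_dir, is_running=slapd_running):
    """Move __db.00* files from db_dir to backup_dir; refuse while ns-slapd is running."""
    if is_running():
        raise RuntimeError("ns-slapd is still running; stop it before touching the db files")
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(db_dir.glob("__db.00*")):
        # Only files matching __db.00* are touched; everything else stays in place.
        shutil.move(str(path), str(backup_dir / path.name))
        moved.append(path.name)
    return moved
```

It would be called with the instance's db directory, for example `set_aside_region_files("/var/lib/dirsrv/slapd-YOUR_INSTANCE/db", "/root/db-region-backup")`, where the second path is just a placeholder, and the server would then be started again as described above.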
HTH,
Mark
On 6/25/20 1:42 PM, Fong, Trevor wrote: