A near-realtime fs mirror application (for backup, written in Python, using Linux inotify)

Roc Zhou chowroc.z+l at gmail.com
Wed Oct 31 06:30:53 UTC 2007


I have run into a strange problem.

After the initial sync, fs_mirror enters the realtime replication state. I
deployed it on 3 machines, and it has been running for nearly half a month.
Then one day, on one host, for reasons I don't know, I found that fs_mirror
was getting empty records from its mirrord agent. Under normal conditions,
these records should look like:
"CREATE:/var/www/html"
"FWRITE:/var/www/html/index.php"
"DELETE:/var/www/html/temp"
"MOVE:('/var/www/html/aa', '/var/www/html/bb')"
...
But there should never be empty records. The empty records send fs_mirror
into an infinite loop.
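
To illustrate the kind of guard that would turn an empty record into an
error instead of a loop, here is a minimal sketch. It assumes the records are
plain strings in the "ACTION:payload" form shown above; parse_record is a
hypothetical helper name, not a function from fs_mirror itself.

```python
import ast


def parse_record(record):
    """Split an 'ACTION:payload' record into (action, payload).

    Hypothetical helper: raises ValueError for empty or malformed records
    so the caller can stop or skip instead of retrying forever.
    """
    if not record:
        raise ValueError("empty record")
    action, sep, payload = record.partition(":")
    if not sep or not action or not payload:
        raise ValueError("malformed record: %r" % (record,))
    if action == "MOVE":
        # The MOVE payload is the repr of a (source, destination) tuple.
        src, dst = ast.literal_eval(payload)
        return action, (src, dst)
    return action, payload
```

The caller can catch ValueError, log the serial number, and abort or skip
that record, which at least makes the failure visible.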

I restarted fs_mirror from the point where it broke, but the problem
remained. After debugging I found that the problem occurs at the same serial
number every time (I use Berkeley DB for the log records (wmLog), and the
serial numbers are the keys), so I suspect the problem is in BDB, but I don't
know how to test this and locate the fault.
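
One way to narrow this down is to walk a range of serial numbers and list
the ones that are missing or hold an empty value. The following is only a
sketch: it assumes the log behaves like a mapping keyed by the serial number
as a string (as in the lookups below), and find_bad_serials is a
hypothetical name.

```python
def find_bad_serials(log, first, last):
    """Return (serial, problem) pairs for serials in [first, last].

    `log` is any mapping keyed by the serial number as a string; a plain
    dict is enough to try it out.
    """
    bad = []
    for serial in range(first, last + 1):
        try:
            value = log[str(serial)]
        except KeyError:
            bad.append((serial, "missing"))
            continue
        if not value:
            bad.append((serial, "empty"))
    return bad
```

For example, find_bad_serials({"1": "CREATE:/a", "2": ""}, 1, 3) reports
serial 2 as empty and serial 3 as missing.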

I tried to open the original db file in Python:
>>> import bsddb
>>> x = bsddb.btopen("/var/mirrord/wmlog")
>>> len(x)
623748
>>> x["6854"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: '6854'
>>> x[str(6854)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in __getitem__
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
  File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "/usr/lib/python2.5/bsddb/__init__.py", line 223, in <lambda>
    return _DeadlockWrap(lambda: self.db[key])  # self.db[key]
KeyError: '6854'
>>> x.first()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/bsddb/__init__.py", line 278, in first
    rv = _DeadlockWrap(self.dbc.first)
  File "/usr/lib/python2.5/bsddb/dbutils.py", line 62, in DeadlockWrap
    return function(*_args, **_kwargs)
_bsddb.DBNotFoundError: (-30990, 'DB_NOTFOUND: No matching key/data pair found')

Even after I stopped the mirrord daemon, the errors remained.

Then I copied the database file to another location and opened the copy:
>>> import bsddb
>>> x = bsddb.btopen("/tmp/wmlog")
>>> len(x)
0
The length is 0, and item lookups raise the same errors as above.

Why does this happen when I copy the db file? And in particular, why is
len() 0?!

All I could do was restart mirrord to rebuild the BDB data file, and so far
the problem has not occurred again.

I don't know why an occasional problem like this occurs. Could anyone
familiar with BDB give me some advice?

Thanks.

-- 
------------------------------------------
My Projects:
 http://sourceforge.net/projects/crablfs
http://crablfs.sourceforge.net/
http://crablfs.sourceforge.net/#ru_data_man
http://crablfs.sourceforge.net/tree.html
http://cralbfs.sourceforge.net/sysadm_zh_CN.html
My Blog:
http://chowroc.blogspot.com/
http://hi.baidu.com/chowroc_z/
Looking for a space and platform to exert my originalities (for my
projects)...