How to find a needle in a haystack?

aragonx at dcsnow.com
Tue May 18 20:49:34 UTC 2010


Hello all,

I need some ideas.

I have a backup server that contains 10 ext3 file systems, each with 12
million files scattered randomly over 4000 directories.  The average file
size is 1MB.  Every day I expect to get 20 or so requests for files from
this archive.  The files were not stored in any logical structure that I
can use to narrow down the search.  This will be different going forward,
but that does not help me with the old data.  Additionally, every day new
data is added and old data is removed to make space.

So, now that you know a little about the environment, I need ideas on how
to quickly find the file I want to restore.

Using find on the partition is slow.

I tried running find and redirecting the output to a file.  I started it
50 minutes ago and it still hasn't finished a single partition.  The
output file is already about 1.3GB, and I don't see how I would maintain
such a file.

Would putting the file names and paths in a database be faster?
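
To make the database idea concrete, here is a rough sketch of what I have
in mind, using Python and SQLite.  The table layout and the function names
(build_index, find_file) are just placeholders I made up, and it assumes
file names are valid UTF-8.  It rebuilds the whole index on every run, so
it does not yet deal with the daily additions and removals.

```python
import os
import sqlite3


def build_index(db_path, roots):
    """Walk each root directory and record every file name and its directory.

    Hypothetical helper: rebuilds the whole table from scratch on each call.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("DROP TABLE IF EXISTS files")
    conn.execute("CREATE TABLE files (name TEXT NOT NULL, dir TEXT NOT NULL)")
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            # One batch of inserts per directory keeps memory use small.
            conn.executemany(
                "INSERT INTO files (name, dir) VALUES (?, ?)",
                ((name, dirpath) for name in filenames),
            )
    # Create the index after loading; lookups by exact name then use it
    # instead of scanning every row.
    conn.execute("CREATE INDEX idx_files_name ON files (name)")
    conn.commit()
    return conn


def find_file(conn, name):
    """Return the full path of every indexed file with exactly this name."""
    rows = conn.execute(
        "SELECT dir, name FROM files WHERE name = ?", (name,)
    )
    return [os.path.join(d, n) for d, n in rows]
```

The index only helps when I know the exact file name; a substring search
such as LIKE '%foo%' would still scan the whole table.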

As always, any help would be greatly appreciated.

---
Will Y.




