InstantMirror needs a rethink
chasd
chasd at silveroaks.com
Thu Jan 24 18:21:30 UTC 2008
> Today InstantMirror is pretty useful for home and small office mirrors,
> but its limitations make it unsustainable without manual intervention
> of the sysadmin.
I am using it now so our ~20 systems don't waste T-1 bandwidth.
> - Synchronization/locking of multiple connections downloading the same
> file is awkward and broken.
My usage is low enough in volume that I haven't run into that.
> - There is no good way to clean up aborted tmp files.
Haven't had any.
> - There is no good way to know what are old files that need pruning.
With disk space relatively cheap here in the USA, and a new Fedora
every ~6 months, I just rm -rf the old release directories after I
migrate to the new version. I don't worry about multiple updates to
the same package, except for giant ones like OOo.
Another outgrowth of the Fedora release cycle is I usually only apply
security updates, or updates that fix specific problems I experience.
There is no sense for me to download and apply updates for hardware I
don't use, for example. I figure I'll pick up application updates in
6 months when the next release drops; I usually don't need the update
_right now_.
I don't need an rsync of a mirror, just a cache of the updates I
choose to apply, because those specific updates will be applied across
multiple machines.
> - There is no good way of keeping track of the "Big Picture" of its own
> cache, "least recently used" knowing what files were unpopular locally
> and should be pruned.
I don't have a need for that functionality with my usage.
> Any thoughts?
Ignoring the temp file and multiple connection issues, the
synchronization part could be solved by InstantMirror writing some
type of log file or access popularity file. A separate cron script
could read in that data and prune the unpopular / duplicate files.
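A rough sketch of what such a cron script might look like, in Python.
InstantMirror does not write an access log like this today as far as I
know, so the log format here is purely an assumption: one line per
request, "<unix timestamp> <path relative to the cache root>". The
function names and the 30-day style cutoff are made up for illustration.

```python
import os
import time


def last_access_times(log_lines):
    """Map each cached path to the newest timestamp seen in the log."""
    latest = {}
    for line in log_lines:
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        try:
            stamp = float(parts[0])
        except ValueError:
            continue  # first field is not a timestamp
        path = parts[1]
        if stamp > latest.get(path, 0.0):
            latest[path] = stamp
    return latest


def find_stale(cache_root, latest, max_age_days, now=None):
    """Return cached files not requested within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirs, files in os.walk(cache_root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, cache_root)
            # Files that never appear in the log fall back to their mtime.
            seen = latest.get(rel, os.path.getmtime(full))
            if seen < cutoff:
                stale.append(full)
    return sorted(stale)


def prune(cache_root, log_path, max_age_days, dry_run=True):
    """List stale files, deleting them only when dry_run is False."""
    with open(log_path) as handle:
        latest = last_access_times(handle)
    stale = find_stale(cache_root, latest, max_age_days)
    for path in stale:
        if not dry_run:
            os.remove(path)
    return stale
```

Run from cron with dry_run=True first and check the returned list
before letting it delete anything. It deliberately ignores the temp
file and multiple connection problems.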
From a separate message:
> 1) Origin HTTP mirrors can be configured to serve "Cache-Control:
> max-age=0" in HTTP headers whenever they serve repodata/* files. This
> can become a standard recommendation for all Fedora mirrors. Does
> anyone know how to configure Apache to do this?
<Directory /var/ftp/pub/fedora/linux/releases/8/Everything/x86_64/os/repodata>
Header always set Cache-Control: max-age=0
</Directory>
Probably the best way would be to put the Header line in a .htaccess
file in each repodata directory as that directory is created (perhaps
by createrepo?). A .htaccess file applies to the directory it lives
in, so it needs no <Directory> wrapper or full path, though the server
has to permit it with AllowOverride FileInfo. Otherwise the main Apache
config (or a file in conf.d) would need to be updated or added each
time a release is made (or an arch is added).
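Until something like createrepo does this itself, a small script run
after each sync could drop the file in place. This is only a sketch:
the function name is made up, it assumes every directory named
"repodata" under the mirror root should get the header, and it assumes
mod_headers is loaded and .htaccess overrides are allowed.

```python
import os

# Same directive as in the <Directory> example, minus the wrapper.
HTACCESS_BODY = "Header always set Cache-Control: max-age=0\n"


def write_repodata_htaccess(mirror_root):
    """Create a .htaccess in every 'repodata' directory that lacks one."""
    written = []
    for dirpath, _dirs, _files in os.walk(mirror_root):
        if os.path.basename(dirpath) != "repodata":
            continue
        target = os.path.join(dirpath, ".htaccess")
        if not os.path.exists(target):
            with open(target, "w") as handle:
                handle.write(HTACCESS_BODY)
            written.append(target)
    return sorted(written)
```

It leaves existing .htaccess files alone, so running it repeatedly is
harmless and only new release or arch directories get touched.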
> 2) Squid refresh_pattern can use a regex to override max-age=0 for
> repodata/* files. I haven't figured out exactly what the syntax is for
> this. Anybody know squid.conf?
refresh_pattern \/repodata\/.* 0 0% 0
> <hno> Apache do not have this same abstract internal layer, and writing
> a mod_disk_cache replacement which keeps a mirror type file structure
> should be pretty easy thing to do.
This seems to best leverage existing code / apps, although I am not
in a position to help here.
Charles Dostale
System Admin - Silver Oaks Communications
http://www.silveroaks.com/
824 17th Street, Moline IL 61265