Yum, Proxy Cache Safety, Storage Backend

Tim Lauridsen tim.lauridsen at googlemail.com
Thu Jan 24 14:30:47 UTC 2008


Les Mikesell wrote:
> Warren Togami wrote:
>> Les Mikesell wrote:
>>>
>>> Interesting, but it still requires custom setup for any 
>>> distro/version that the proxy admin would want to support. What I'd 
>>> really like to happen is for yum to just always prefer the same URL 
>>> when working through the same proxy so caching would work by default 
>>> without needing to be aware of the cache content.  This would work 
>>> automatically if the target was a single site, RRDNS, or geo-ip 
>>> managed DNS, but you probably can't arrange that for all the repo 
>>> mirrors. There has to be some clever way to get the same effect even 
>>> when using a mirrorlist - like making sure the mirrorlist itself is 
>>> cached and always picking the same entry so any client will use the 
>>> same URL that the mirrormanager gave to the first one that made a 
>>> request.  Of course you'd need a reasonable retry mechanism to pick 
>>> something else if this choice fails but I'd guess it would be a big 
>>> win in bandwidth use and load on the mirrors if it worked most of the 
>>> time to take advantage of existing local caches with no modifications.
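
A rough sketch of that idea, assuming the shared proxy actually
caches the mirrorlist response (the mirrorlist URL and the repo/arch
values below are hardcoded just for illustration):

# Fetch the mirrorlist through the shared proxy (urllib honors the
# http_proxy environment variable), then walk it in a fixed order
# instead of picking at random, so every client behind the same
# cache requests identical URLs.
import urllib.request

MIRRORLIST = ("http://mirrors.fedoraproject.org/mirrorlist"
              "?repo=fedora-8&arch=i386")

with urllib.request.urlopen(MIRRORLIST) as resp:
    mirrors = [ln.strip() for ln in resp.read().decode().splitlines()
               if ln.strip() and not ln.startswith("#")]

# Same first entry for everyone sharing the cached mirrorlist; the
# rest of the list doubles as the retry order if a mirror is down.
for base in mirrors:
    try:
        urllib.request.urlopen(base + "repodata/repomd.xml", timeout=10)
        print("using mirror:", base)
        break
    except OSError:
        continue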
>>>
>>
>> I just thought of a simple but gross solution for you.
>>
>> http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch 
>>
>>
>> It sounds like you are using a transparent proxy.  Just redirect 
>> mirrors.fedoraproject.org to localhost at another port and serve files 
>> so the mirrorlist URLs hand back a single mirror of your choosing.
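
A minimal sketch of that hack; the mirror URL below is made up, and a
real version would map the repo/arch query string onto matching paths:

# Serve a one-line "mirrorlist" that always names the same mirror.
# Run it wherever the transparent proxy redirects
# mirrors.fedoraproject.org traffic (port 8080 as the example
# "other port"); the mirror URL is purely hypothetical.
from http.server import BaseHTTPRequestHandler, HTTPServer

MIRROR = b"http://mirror.example.com/fedora/linux/releases/8/Everything/i386/os/\n"

class MirrorlistHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Ignore the ?repo=...&arch=... query and always answer
        # with the single pinned mirror.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(MIRROR)))
        self.end_headers()
        self.wfile.write(MIRROR)

HTTPServer(("", 8080), MirrorlistHandler).serve_forever()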
> 
> I think you are missing my point, which is that it would be a huge win 
> if yum automatically used typical existing caching proxies with no extra 
> setup on anyone's part, so that any number of people behind them would 
> get the cached packages without knowing about each other or that they 
> need to do something special to defeat the random URLs.  I used to run a 
> number of centos3 boxes in several locations and it always worked nicely 
> to just:
> http_proxy=http://my_proxy.domain:port  yum update
> pointing at a local squid because the mirrors used RRDNS so the URLs 
> were the same among the machines - and this would have happened 
> automatically with a transparent proxy  or on machines set to use a 
> proxy by default as they must be in many locations.  Since yum started 
> randomizing the requests with a mirrorlist, updates are a lot slower.

This has nothing to do with yum; it is the distro that decides how to
give access to mirrors. Yum just does what the distro's repo config
tells it to do.

"Don't blame the postman for the bad news" :)

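For reference, the knob lives in the repo file the distro ships, e.g.
/etc/yum.repos.d/fedora.repo; pinning a mirror is a local one-line
edit there (the baseurl below is just a placeholder):

[fedora]
name=Fedora $releasever - $basearch
# Distro default: ask mirrormanager for a (randomized) list of mirrors:
#mirrorlist=http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch
# Pinned alternative, so every box behind the same proxy requests
# identical URLs:
baseurl=http://mirror.example.com/fedora/linux/releases/$releasever/Everything/$basearch/os/
enabled=1
gpgcheck=1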
> 
> Maybe yum needs to do some tricks with cache control headers or 
> appending random arguments to ensure the repo data is fresh, but there 
> has to be some way to make it re-use packages already downloaded in a 
> local proxy cache without any local changes.   We have several locations 
> where everyone in a large building has to use the same proxy to get out, 
> but the people who would be installing/updating their own linux boxes 
> would not know what anyone else is doing or be likely to coordinate the 
> choice of a URL if they had to change anything - and I'd guess that's a 
> common situation.
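
For what it's worth, the header trick Les mentions is easy to sketch;
the mirror path below is hypothetical, and urllib picks up http_proxy
from the environment:

# Force the shared cache to revalidate the small, frequently-changing
# repo metadata, while plain package downloads stay fully cacheable.
import urllib.request

req = urllib.request.Request(
    "http://mirror.example.com/fedora/linux/releases/8/Everything/"
    "i386/os/repodata/repomd.xml",     # hypothetical mirror path
    headers={
        "Cache-Control": "max-age=0",  # make the proxy revalidate upstream
        "Pragma": "no-cache",          # same hint for HTTP/1.0 caches
    },
)
repomd = urllib.request.urlopen(req, timeout=10).read()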

This thread is upside down IMHO.
The proxy doesn't do the right thing, so let us redesign yum to fix
the problems with the proxy? That seems to be the wrong approach IMHO.

Tim "Proxies are nothing but trouble" :)
