I've tried to find out some of the technical details of this.
The metalink publishes the current hash of repomd.xml, and also hashes of usually up to
two older repomd.xml files. You can see it here:
It's the <hash> tag in <file> and <alternates> tags.
Here's a nice graph showing how often our mirrors distribute current, or
The time constraints are defined here:
If the current push is older than 2 days, there should be no alternate hashes
older than 3 days. If the current push is younger, there can be one hash
arbitrarily old, but no further hashes older than 3 days. I hope I read the
code correctly (the docstring doesn't seem to match it exactly).
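
To make my reading of the rule concrete, here is a minimal Python sketch. It
is not the mirrormanager code: the function name, the constant names and the
choice of *which* old hash survives (the newest of the stale ones) are my
assumptions, and "older than" is treated as a strict comparison.

```python
from datetime import datetime, timedelta

# Thresholds as described in the text above (constant names are hypothetical).
CURRENT_PUSH_GRACE = timedelta(days=2)
ALTERNATE_MAX_AGE = timedelta(days=3)


def surviving_alternates(current_ts, alternate_ts, now):
    """Return the alternate timestamps that would be kept, newest first."""
    alternates = sorted(alternate_ts, reverse=True)
    fresh = [ts for ts in alternates if now - ts <= ALTERNATE_MAX_AGE]
    stale = [ts for ts in alternates if now - ts > ALTERNATE_MAX_AGE]
    if now - current_ts > CURRENT_PUSH_GRACE:
        # Current push is older than 2 days: no alternate older than 3 days.
        return fresh
    # Current push is younger: one arbitrarily old alternate may remain
    # (assumed here to be the newest of the stale ones).
    return fresh + stale[:1]
```

For example, with a one-day-old current push and alternates that are 2, 5 and
9 days old, this keeps the 2-day-old and the 5-day-old one. With a 4-day-old
current push and alternates that are 5 and 9 days old, nothing is kept.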
However, it also depends on how often the metalink is regenerated; the old
items will not disappear on their own. I learned that all metalinks are
regenerated based on any of these fedmsg events:
So if there is no new push (in any repository), metalinks are not regenerated
and old hashes are not dropped. Theoretically releng could send out one of
those fedmsg events artificially to trigger metalink regeneration, if
Currently, there is no way to generate a new metalink with alternative
hashes disabled, or to remove those alternatives from the metalink
intentionally at some later point in time. That would require patching
our tools. This would of course lead to a larger load on our master mirror
and those mirrors which managed to get synced quickly, because that would
disqualify any other mirrors which are not completely current. But it could
come in handy in some situations; unfortunately, the tools do not allow it at the
The second part of the story is dnf. In dnf, the metadata_expire= option
defines how often the metalink is pulled again; new metadata are downloaded if
the cached metadata hash differs from the current hash in the metalink. However,
if the top-listed repository is not completely up-to-date (it contains
current-1 or current-2 metadata), but its hash is listed among alternate
hashes in the metalink, dnf is fine with that and does not attempt to query
different repos to retrieve the very current metadata. That means that as
long as the metalink contains some older hashes, and some repository offers
that older metadata, some users might not get the latest metadata. The default
value for metadata_expire is 6 hours for stable updates.
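
The mirror selection described above can be modelled roughly like this. It is
only a toy model of the behaviour as I understand it, not dnf or librepo code;
the function name and the data layout are made up for illustration.

```python
def pick_mirror(mirrors, current_hash, alternate_hashes):
    """Model of the selection described above (not actual dnf code).

    mirrors: list of (url, repomd_hash) pairs in metalink order.
    Returns (url, is_current) for the first mirror whose repomd.xml hash
    matches the current hash or any alternate hash, or None if none does.
    """
    for url, repomd_hash in mirrors:
        if repomd_hash == current_hash or repomd_hash in alternate_hashes:
            # An alternate hash is accepted just like the current one, so
            # the search stops here even if this mirror is current-1.
            return url, repomd_hash == current_hash
    return None
```

With mirrors [("a", "old"), ("b", "cur")], current hash "cur" and alternates
{"old"}, the model stops at mirror "a" with stale metadata. With an empty set
of alternates it skips "a" and picks "b".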
So, the outcome of this exercise is:
If we want to be sure the latest updates are available to _all_ our users, we
need to wait until there are no older metadata hashes in the metalink and
then 6 more hours. There will be no older metadata hashes in the metalink
when 3 days pass since the push of the important update, *once* there is a
new push after that time (which will regenerate the metalink), or if releng
sends out a fake event manually.
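
As a worked example of that waiting time, here is a small sketch that computes
the earliest "safe" moment from the numbers above. The function and parameter
names are hypothetical, and it assumes that every metalink regeneration
(a real push or a manually sent event) is known as a timestamp.

```python
from datetime import datetime, timedelta


def earliest_safe_time(update_push, regenerations,
                       alternate_max_age=timedelta(days=3),
                       metadata_expire=timedelta(hours=6)):
    """Earliest time when all users should see the update, or None.

    update_push: when the important update was pushed.
    regenerations: times at which the metalink was (or will be) regenerated.
    """
    cutoff = update_push + alternate_max_age
    # Only a regeneration after the cutoff drops the older hashes.
    later = [t for t in regenerations if t > cutoff]
    if not later:
        return None  # old hashes stay until something regenerates the metalink
    # After that, clients may still use cached metadata for metadata_expire.
    return min(later) + metadata_expire
```

If the update was pushed on Jan 1 at 12:00 and the metalink is regenerated on
Jan 2 at 12:00 and Jan 5 at 08:00, the first regeneration is too early, and
the result is Jan 5 at 14:00.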
This is, uh, a) quite a long time and b) complex. I'll be very glad if you
can point out anything that I've described or understood wrong.
Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block the F-N release but belong to an F-M (M<N)
release must already be pushed stable. If this is not the case and it's the last
blocking issue, selected tasks (like copying compose trees into appropriate places) can be
performed, and Go/No-Go will be rescheduled to the day and time when it is expected that
those updates will have been pushed.
2. We will create a new mirrormanager script which will go through the specified
metalink(s) and remove all metadata hashes which are older than provided timestamp/hash.
3. If there are such updates as mentioned in point 1., RelEng will use this script to
remove old metadata alternatives from the metalink, which means only metadata from the day
this update was pushed or newer will be kept. In order to not increase mirror strain too
much, this doesn't need to be used immediately, but just shortly before the release
announcement (so that mirrors have time to sync latest packages, and the user load is
distributed among more mirrors including those with current-1 or current-2 trees as long
4. Once the script is run in point 3., we can post the release announcement in 6 hours.
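
To illustrate what the script from point 2 could look like, here is a rough
Python sketch. It is not an existing mirrormanager tool. The sample document
is a simplified stand-in for a real metalink, the function names are made up,
and it handles only the timestamp variant, not the hash variant. Tags are
matched by local name because real metalinks may qualify them with XML
namespaces.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical metalink; real ones carry more fields.
SAMPLE = """<metalink><files><file name="repomd.xml">
<timestamp>300</timestamp>
<verification><hash type="sha256">ccc</hash></verification>
<alternates>
<alternate><timestamp>100</timestamp>
<verification><hash type="sha256">aaa</hash></verification></alternate>
<alternate><timestamp>200</timestamp>
<verification><hash type="sha256">bbb</hash></verification></alternate>
</alternates>
</file></files></metalink>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def strip_old_alternates(metalink_xml, cutoff_ts):
    """Remove <alternate> entries whose timestamp is older than cutoff_ts.

    Returns (new_xml, number_of_removed_alternates).
    """
    root = ET.fromstring(metalink_xml)
    # Collect the parents first so the tree is not modified while iterating.
    parents = [el for el in root.iter() if local_name(el.tag) == "alternates"]
    removed = 0
    for parent in parents:
        for alt in list(parent):
            if local_name(alt.tag) != "alternate":
                continue
            stamps = [int(c.text) for c in alt
                      if local_name(c.tag) == "timestamp" and c.text]
            if stamps and stamps[0] < cutoff_ts:
                parent.remove(alt)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Calling strip_old_alternates(SAMPLE, 150) drops the alternate with hash "aaa"
and keeps "bbb" and the current hash "ccc".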
I know there is still one manual step involved (figuring out in which push the blocker update
was included), but I don't know how to better solve it, especially if we don't
want to wait for too long.
I would be interested in Infra/RelEng feedback for the technical part of this (CCing Kevin
and Dennis). Do you think this is a reasonable solution, or am I completely off track
here? Do you see any better options?