I'd like to have a discussion during this week's infrastructure meeting
about whether we need the project that will re-run failed gating tests
to be a service, or whether a library will do. There has been some
lively debate about this in IRC, but it didn't seem like there was
resolution and it would be helpful if we could make a decision together
so we can proceed with implementation. Would this week's infra meeting
be a good time to discuss this? Alternatively, we could discuss it on
this mailing list if desired.
I would like to change the setup of our mirror crawler and just wanted
to mention my planned changes here before working on them.
Currently we have two VMs which are crawling our mirrors. Each of the
machines is responsible for one half of the active mirrors. The crawl
starts every 12 hours on the first crawler and 6 hours later on the
second crawler. So every 6 hours one crawler is accessing the database.
Currently most of the crawling time is not spent crawling but updating
the database with which directories each host has up to date. With a
timeout of 4 hours per host we are regularly hitting that timeout on
some hosts, and most of the time the database access is the problem.
What I would like to change is to crawl each category (Fedora Linux,
Fedora Other, Fedora EPEL, Fedora Secondary Arches, Fedora Archive)
separately and at different times and intervals.
We would not hit the timeout as often as we do now, since only the
information for a single category has to be updated. We could scan
'Fedora Archive' only once per day or every second day. We can scan
'Fedora EPEL' much more often, as it is usually really fast, and get
better data about the EPEL content on the mirrors.
My goal would be to distribute the scanning in such a way as to decrease
the load on the database and to reduce the cases of mirror
auto-deactivation due to slow database accesses.
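
To illustrate the idea, here is a minimal sketch of what a staggered
per-category schedule could look like. The category names come from the
list above, but the intervals and the CRAWL_SCHEDULE/next_run names are
just placeholders I made up, not decided values:

    from datetime import timedelta

    # Hypothetical per-category crawl intervals (placeholder values only).
    # The point is to stagger the categories so that only one category's
    # records are being updated in the database at any given time.
    CRAWL_SCHEDULE = {
        "Fedora Linux":            timedelta(hours=12),
        "Fedora Other":            timedelta(hours=12),
        "Fedora EPEL":             timedelta(hours=6),   # fast to scan, scan more often
        "Fedora Secondary Arches": timedelta(hours=24),
        "Fedora Archive":          timedelta(hours=48),  # rarely changes
    }

    def next_run(last_run, category):
        """Return when the given category is due for its next crawl."""
        return last_run + CRAWL_SCHEDULE[category]
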
Let me know if you think that these planned changes are the wrong
direction or if you have other ideas on how to improve the mirror crawling.
Neal, thanks for the feedback. After taking your comments into
consideration, here's version 2.
Overall file layout:

| ID | Compression type | Index size | Compressed Index | Compressed Dict | Chunk | Chunk | ... ==> More chunks
ID
'\0ZCK1', identifies the file as a zchunk version 1 file
Compression type
This is the type of compression used to compress the dict and chunks:
0 = Uncompressed
2 = zstd
Index size
This is a 64-bit unsigned integer containing the size of the
compressed index.
Compressed Index
This is the index, which is described in the next section. The index
is compressed without a custom dictionary.
Compressed Dict (optional)
This is a custom dictionary used when compressing each chunk.
Because each chunk is compressed completely separately from the
others, the custom dictionary gives us much better overall
compression. The custom dictionary is compressed without a custom
dictionary (for obvious reasons).
Chunk
This is a chunk of data, compressed with the custom dictionary (if one
is present).
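
For illustration, here is a rough Python sketch of reading the start of
such a file. The field widths not spelled out above are my assumptions:
I am guessing the compression type is a single byte and that integers
are little-endian; only the 64-bit index size is stated explicitly, and
read_lead is a hypothetical name.

    import struct

    def read_lead(f):
        # '\0ZCK1' identifies a zchunk version 1 file
        magic = f.read(5)
        if magic != b"\0ZCK1":
            raise ValueError("not a zchunk version 1 file")
        # Assumed one-byte compression type: 0 = uncompressed, 2 = zstd
        (comp_type,) = struct.unpack("<B", f.read(1))
        # 64-bit unsigned size of the compressed index (endianness assumed)
        (index_size,) = struct.unpack("<Q", f.read(8))
        compressed_index = f.read(index_size)
        # The compressed dict (if any) and the chunks follow; their
        # locations come from the index entries described next.
        return comp_type, compressed_index
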
Index

| Checksum type | Checksum of all data | Dict checksum | End of dict | Chunk checksum | End of chunk | ... ==> More chunk entries
Checksum type
This is the type of checksum used to generate the checksums in the
index:
0 = SHA-256
Checksum of all data
This is the checksum of the compressed dict and all the compressed
chunks, used to verify that the file is actually the same, even in
the unlikely event of a hash collision for one of the chunks.
Dict checksum
This is the checksum of the compressed dict, used to detect whether
two dicts are identical. If there is no dict, the checksum must be
all zeros.
End of dict
This is the location of the end of the dict starting from the end of
the index. This gives us the information we need to find and
decompress the dict. If there is no dict, this must be all zeros.
Chunk checksum
This is the checksum of the compressed chunk, used to detect whether
any two chunks are identical.
End of chunk
This is the location of the end of the chunk starting from the end of
the index. This gives us the information we need to find and
decompress each chunk.
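
Continuing the sketch above, walking a decompressed index could look
roughly like this, again under my own assumptions (a one-byte checksum
type, 32-byte SHA-256 checksums, and 64-bit little-endian "end of"
offsets; parse_index is a hypothetical name):

    import struct

    def parse_index(buf):
        pos = 0
        # Checksum type: only 0 = SHA-256 is defined above
        cksum_type = buf[pos]
        pos += 1
        if cksum_type != 0:
            raise ValueError("unknown checksum type")
        cksum_len = 32  # SHA-256 digest length
        total_cksum = buf[pos:pos + cksum_len]
        pos += cksum_len
        dict_cksum = buf[pos:pos + cksum_len]
        pos += cksum_len
        (dict_end,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
        chunks = []
        while pos < len(buf):
            # Each remaining entry describes one chunk
            chunk_cksum = buf[pos:pos + cksum_len]
            pos += cksum_len
            (chunk_end,) = struct.unpack_from("<Q", buf, pos)
            pos += 8
            chunks.append((chunk_cksum, chunk_end))
        return total_cksum, (dict_cksum, dict_end), chunks
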
The index is designed so that it can be extracted from the file on the
server and downloaded separately, to facilitate downloading only the
parts of the file that are needed, but it must then be re-embedded when
assembling the file so the user only needs to keep one file.
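
As a rough illustration of that use case (the names here are
hypothetical), the separately downloaded index could be compared against
the index of a file we already have, so that only the chunks whose
checksums we don't already hold need to be fetched; the byte ranges for
the actual requests would come from the "end of chunk" offsets, which I
omit here:

    def chunks_to_download(remote_chunks, local_chunks):
        # Each entry is a (checksum, end-of-chunk offset) pair as parsed above.
        have = {cksum for cksum, _end in local_chunks}
        return [(cksum, end) for cksum, end in remote_chunks if cksum not in have]
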
One of the (many) things that came out of the infra hackfest last week
was a few changes to our issue tracker. If you have ticket perms on the
fedora-infrastructure pagure instance, please follow these guidelines:
We have added several 'priority' values:
🔥 URGENT 🔥
Needs Review
Next Meeting
Waiting on Assignee
Waiting on Reporter
Waiting on External
By default, when a new issue is filed, it will get the "Needs Review" pri.
Once anyone with ticket privs touches the issue, they should change that
pri to the appropriate state:
🔥 URGENT 🔥 - This should be set only for things that are really
urgent and need immediate attention, like a high SLE service being down
or something preventing a release from happening. Please do not misuse
this for low SLE things or things that don't need "all hands on deck"
to fix.
Needs Review - Default state before anyone has looked at the issue.
Next Meeting - Something that is to be discussed in the next weekly
meeting.
Waiting on Assignee - This means we have accepted the ticket and have all
needed info to work on it, but it's waiting for cycles to actually do
the needed work. If the assignee is not set it's waiting for someone
with cycles to take it on and do it.
Waiting on Reporter - This means that we have asked the reporter for
more info or need something from them to move the task forward. The task
is stalled until the Reporter provides that.
Waiting on External - This means that the task/issue is waiting until
something else is done first. When using this pri we should be as
explicit as possible about what the thing is, ie 'Waiting for Fedora 28
GA release and this can be done the day after' or 'Waiting until FESCo
rules on this and gives the ok'.
If you touch any tickets, please set the pri accordingly; it will help
us out. I am also going to process the existing tickets and get them in
the right states.
Additionally, I created a default template for new issues. It reads:
* Describe what you need us to do:
* When do you need this? (YYYY/MM/DD)
* When is this no longer needed or useful? (YYYY/MM/DD)
* If we cannot complete your request, what is the impact?
These questions will help us understand the issue, but if anyone can
think of a clearer way to word them or things to add, please let me know.
Many thanks to mattdm for the ideas from the Council issue tracker,
hopefully it will help us set expectations better.
Libravatar announced this morning that they are shutting down their
service on September 1, 2018; this affects applications that use
Libravatar, such
as Pagure, Tahrir (Fedora Badges), and possibly others.
In their post, they mentioned they do not know of another FOSS image
hosting service like Libravatar. If hosting images ourselves is not
desired, Gravatar is the best viable alternative I know of.
Since the Infrastructure hackathon is next week, I thought this might be
timely to mention, since it affects most Fedora applications that use a
profile picture for contributors.
I will try and file RFEs on Pagure and Tahrir today, but other affected
services don't immediately come to mind. Feel free to file some too if
there are more.
Justin W. Flory
This week, the Fedora Infrastructure team is convening for a Hackathon
from April 9-13 in Fredericksburg, VA. The hackathon is intended to help
the team leap ahead on several critical Fedora and CentOS initiatives.
Dennis Otugo, a member of the CommOps team, interviewed members of the
Fedora Infrastructure team to ask what the goals for the hackathon are
and why it is needed.
Read more on the Community Blog:
Thanks Dennis for putting the interview together, and thanks to the
interviewees for their answers on short notice. :-)
Justin W. Flory