Our MirrorManager setup exports the current state of all mirrors every
hour at :30 to a protobuf-based file, which the mirrorlist servers then
use to answer requests from yum and dnf.
The Python script that generates this file requires up to 10GB of memory
and takes between 35 and 50 minutes to run. It issues many SQL queries,
including some very large ones that join up to six large MirrorManager
tables.
I have rewritten this Python script in Rust. The new version needs only
around 1 minute instead of 35 to 50 minutes, and only 600MB of memory
instead of 10GB.
I think the biggest difference is that I do almost no joins in my SQL
queries. I download all the tables once and then loop over the
downloaded tables many times, which seems to be massively faster.
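As a rough illustration of the idea only (this is not the actual code;
the struct names, field names, and sample values below are made up, and
using a HashMap index rather than nested loops is just one way to do
it), an in-memory join over two already-downloaded tables could look
like this:

```rust
use std::collections::HashMap;

// Hypothetical, simplified rows. The real MirrorManager tables have
// many more columns; these names are assumptions for illustration.
#[derive(Debug, Clone)]
struct Host {
    id: u32,
    name: String,
}

#[derive(Debug, Clone)]
struct HostCategory {
    host_id: u32,
    category: String,
}

/// Join hosts and host categories in memory instead of in SQL.
/// Returns (host name, category) pairs in the order of `categories`.
fn join_in_memory(hosts: &[Host], categories: &[HostCategory]) -> Vec<(String, String)> {
    // Index the hosts once by primary key so each lookup is cheap.
    let by_id: HashMap<u32, &Host> = hosts.iter().map(|h| (h.id, h)).collect();

    categories
        .iter()
        // Rows whose host is missing are skipped, like an inner join.
        .filter_map(|c| {
            by_id
                .get(&c.host_id)
                .map(|h| (h.name.clone(), c.category.clone()))
        })
        .collect()
}

fn main() {
    // In a real program each vector would be filled by one plain
    // "SELECT ... FROM table" without any joins.
    let hosts = vec![
        Host { id: 1, name: "mirror-a.example.org".to_string() },
        Host { id: 2, name: "mirror-b.example.org".to_string() },
    ];
    let categories = vec![
        HostCategory { host_id: 1, category: "Fedora Linux".to_string() },
        HostCategory { host_id: 2, category: "Fedora EPEL".to_string() },
        HostCategory { host_id: 7, category: "Fedora Linux".to_string() },
    ];

    for (host, category) in join_in_memory(&hosts, &categories) {
        println!("{host} carries {category}");
    }
}
```

The database only has to answer one simple query per table, and all the
combining work happens in the exporting process.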
As the mirrorlist-server in Rust has proven to be extremely stable over
the months we have been using it, I would also like to replace the
mirrorlist protobuf input generation with my new Rust-based code.
I am planning to try out the new protobuf file in staging in the next
few days and would then try to get my new protobuf generation program
into Fedora. Once it is packaged, I would discuss here whether and how
we want to deploy it in Fedora's infrastructure.
Having the possibility to generate the mirrorlist input data in about a
minute would significantly reduce the load on the database server and
enable us to react much faster if broken protobuf data has been synced
to the mirrorlist servers on the proxies.