[RFR #4562] Koschei - continuous integration in Koji

Thu Oct 16 09:14:01 UTC 2014

On 10/15/2014 09:31 PM, Kevin Fenzi wrote:
> ok, some general questions, please excuse me if they are dumb. ;) 
> 
> high level: 
> 
> * How well does it keep up currently? I know you are careful not to
>   overload koji, but I wonder if that means things like perl builds are
>   often behind because there are so many of them? 

Koji has more than enough resources to sustain the current Koschei load
(~3000 packages).  Storage might become a problem if more packages are
added (scratch build results are kept for some period of time), but we
have a solution for that (see [2] or [3]).  If more Koji builders are
ever needed, I think it should be quite easy to add them, as long as
there is budget for that.

> * right now the service is opt-in right? Someone adds a group and
>   packages in that group and then when one of them changes it scratch
>   rebuilds the rest. Do you see a time/case when we could just make it
>   operate on all builds? that is, build foo is made, and it just does
>   all the things that buildrequire foo? 

For now only a subset of all packages is tracked by Koschei, but the
ultimate goal is to track all packages: they would be added
automatically after their first build appears in Koji and removed when
they are blocked. Maintaining package groups would be up to
individuals.  (One package can be in any number of groups.)

> * The notifications of failed builds currently are via fedmsg? We
>   should investigate adding this to FMN if it's not already there, so
>   anyone interested could be notified via that. 

fedmsg publishing is already operational, as can be seen at [1]. An FMN
rule was added recently. The new FMN is not yet in production, but in
the (hopefully near) future users will be able to enable email or IRC
notifications for the buildability status of packages they are
interested in.
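
As a rough illustration of how the feed in [1] is addressed, here is a
minimal Python sketch that rebuilds that datagrepper URL from its parts.
The function name is made up for this example, and the optional
rows_per_page parameter is an assumption about the datagrepper query
interface, not something stated above.

```python
from urllib.parse import urlencode

# Base URL and topic are taken from reference [1].
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"
KOSCHEI_TOPIC = "org.fedoraproject.prod.koschei.package.state.change"


def koschei_state_change_url(rows_per_page=None):
    """Build a datagrepper query URL for Koschei package state changes.

    Hypothetical helper: it only constructs the URL, it does not fetch it.
    """
    params = {"topic": KOSCHEI_TOPIC}
    if rows_per_page is not None:
        # Assumption: datagrepper accepts a rows_per_page query parameter.
        params["rows_per_page"] = rows_per_page
    return DATAGREPPER_RAW + "?" + urlencode(params)


if __name__ == "__main__":
    print(koschei_state_change_url())
```

Called without arguments, the helper yields the same URL as [1].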

> todo's/ideas: 
> 
> * Could this ever be a koji plugin? Or does it do too much on top of
>   that to ever be a plugin? 

Koschei has its own architecture, and converting it to a Koji plugin
would require a substantial amount of work.  In other words, it should
be possible, but I don't see any good reason to do so.

> * Might it be possible to run on all the broken deps packages in
>   rawhide/branched? This would depend I guess on the compose process
>   generating fedmsgs with those package names, but if so it could tell
>   maintainers "hey, your package is broken in rawhide, but a simple
>   rebuild will fix it" (or any other group that just wants to go fix
>   them). 

This is an interesting idea.

A similar feature was planned for the future. The idea was that Koschei
could resolve the runtime dependencies of all packages, not just their
build dependencies. Users would be able to see whether a package is
installable and, if so, see its installation size with dependencies (in
terms of MB to download, MB installed and package count). There would
be graphs showing how this changes over time. (We had a similar POC
service running for a few months, see [4].)

We could extend this and make Koschei resolve the runtime dependencies
of the successful scratch builds it runs.  If a scratch build would fix
a broken package in the official repo, a fedmsg would be emitted.
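
To make the installability and installation-size idea concrete, here is
a minimal Python sketch on toy data. It is not Koschei code: the
function names, the dependency map and the sizes are all invented for
illustration, and real resolution also has to deal with versions,
virtual provides and so on, which this ignores.

```python
def install_closure(package, requires):
    """Return the package plus all of its transitive runtime dependencies."""
    seen = set()
    stack = [package]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        # Packages without an entry are treated as having no dependencies.
        stack.extend(requires.get(name, ()))
    return seen


def install_footprint(package, requires, installed_size):
    """Report installability, package count and total installed bytes."""
    closure = install_closure(package, requires)
    # A dependency with no known size stands in for "not in the repo".
    missing = [name for name in sorted(closure) if name not in installed_size]
    if missing:
        return {"installable": False, "missing": missing}
    return {
        "installable": True,
        "package_count": len(closure),
        "installed_bytes": sum(installed_size[name] for name in closure),
    }


# Toy repository data (hypothetical package names and sizes).
REQUIRES = {
    "app": ["libfoo", "libbar"],
    "libfoo": ["libc"],
    "libbar": ["libc"],
    "broken": ["libmissing"],
}
SIZES = {"app": 100, "libfoo": 200, "libbar": 300, "libc": 1000, "broken": 50}

if __name__ == "__main__":
    print(install_footprint("app", REQUIRES, SIZES))
    print(install_footprint("broken", REQUIRES, SIZES))
```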

> * boost is another group of packages I could see this being useful for.
>   Perhaps it would be worth reaching out to the boost maintainers?

I don't know the specifics of the boost packages, but we'll consider any
feature request.

> * Could this be used to scratch build packages that are
>   ExcludeArch/ExclusiveArch with that removed? ie, to tell maintainers,
>   "hey, you exclude arm, but it builds ok, are you sure thats fine?"

This would require generating a new SRPM with ExcludeArch/ExclusiveArch
removed, which requires installing all build dependencies, so it should
be done by Koji as a buildSRPMFromSCM task. This in turn requires
Koschei either to be able to push to some branch in SCM or to maintain
a separate git repo, plus a change to Koji policy to allow scratch
builds from it. And of course this would have to be implemented in
Koschei. Not impossible, but it looks like a lot of work for something
that could be done manually by running some script from time to time.
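
For the "some script" route, the text-editing part could look roughly
like the Python sketch below. This is only an assumption about how such
a script might start: the function name is hypothetical, it handles only
plain one-line tags, and it does not cover tags hidden behind macros or
conditionals. Building the SRPM and submitting the scratch build would
still have to happen separately.

```python
import re

# Matches plain one-line ExcludeArch:/ExclusiveArch: tags.
_ARCH_TAG = re.compile(r"^\s*(ExcludeArch|ExclusiveArch)\s*:", re.IGNORECASE)


def strip_arch_restrictions(spec_text):
    """Return the spec text with ExcludeArch/ExclusiveArch lines removed."""
    kept = [
        line
        for line in spec_text.splitlines(keepends=True)
        if not _ARCH_TAG.match(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical spec fragment used only for demonstration.
    sample = (
        "Name: foo\n"
        "ExcludeArch: %{arm}\n"
        "ExclusiveArch: x86_64\n"
        "BuildRequires: gcc\n"
    )
    print(strip_arch_restrictions(sample), end="")
```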

> 
> technical: 
> 
> * Can this application be load balanced any? Ie, if we have two of them
>   could they operate against the same db at the same time? 

To answer this question I need to elaborate on Koschei's architecture.
tl;dr: yes, it can be load balanced well.

Koschei consists of four systemd services, a WSGI webapp and a database.
The separate services don't need to communicate with each other - they
just need access to the database and to the services they integrate
with (like Koji or fedmsg). They can be on separate machines and there
can be multiple instances of some of them running concurrently.

scheduler - schedules scratch builds and submits them to Koji.
Theoretically there could be many schedulers running concurrently, but
this is not needed as a single scheduler should be able to handle many
thousands of packages easily.

watcher - listens to fedmsg and updates database accordingly. It makes
sense to have only one watcher.

polling - periodically asks Koji about the statuses of running scratch
builds and package statuses (this is a fallback mechanism needed in case
fedmsg message delivery fails). Only one polling service makes sense, as
this is only a fallback mechanism and can be run every hour or even less
often.

resolver - resolves build dependencies of all packages when a new repo
is generated. Dep resolution is a CPU-intensive task. Depending on the
number of packages tracked this may take up to a few hours of CPU time
(estimate made for 100,000 pkgs, I'm thinking about the future here).
The resolver service can be configured to use multiple threads and it
should scale linearly.

reporter (WSGI webapp running in Apache httpd with mod_wsgi) - provides
the web UI for users. There can be multiple webapps running behind an
HTTP balancer if needed.

Database itself can be load balanced too (we are using PostgreSQL).

To sum up, all components either don't need load balancing or can be
load-balanced.

> * Are there any common sysadmin tasks we need to know about with the
>   instance? Is there any special process to start/stop/reinstall it? 

Installing Koschei is done by installing the RPM package from Fedora or
EPEL repositories and copying a single config file.
Managing all services (starting, stopping, viewing logs etc.) is done
using standard system tools (systemctl, journalctl and so on).
There is nothing special to be done besides standard sysadmin stuff
(updating packages, viewing logs, backing up the database and so on).

> * When there is koji maint, should we stop this service? How do we
>   gracefully do that and start it again? 

In this case you can stop the Koschei services that communicate with
Koji (using systemctl stop koschei-<servicename>) and start them again
when maintenance is over. The web UI will remain functional during that
time, but no new builds will be scheduled.
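
A small Python sketch of how the stop/start commands for a maintenance
window could be generated. The unit names follow the
koschei-<servicename> pattern above, but which services to include is an
assumption on the reader's part: scheduler and polling are used here
because they are described as talking to Koji, and the list should be
checked against the actual deployment. The helper only builds the
commands, it does not run them.

```python
# Assumption: these are the services that talk to Koji; adjust as needed.
KOJI_FACING_SERVICES = ("scheduler", "polling")


def systemctl_commands(action, services=KOJI_FACING_SERVICES):
    """Build one systemctl command per service for the given action."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    # Unit names follow the koschei-<servicename> pattern.
    return [["systemctl", action, "koschei-" + name] for name in services]


if __name__ == "__main__":
    for command in systemctl_commands("stop"):
        print(" ".join(command))
```

Each generated command could then be passed to subprocess.run(command,
check=True) by a user with sufficient privileges, first with "stop"
before the outage and then with "start" afterwards.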

> Thats all I can think of right now. :) 

I hope this pretty long email answers your questions.

[1]
https://apps.fedoraproject.org/datagrepper/raw?topic=org.fedoraproject.prod.koschei.package.state.change
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1130233
[3] https://fedorahosted.org/koji/ticket/284
[4] https://sochotni.fedorapeople.org/min_install/main.html

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk