Justin Payne wrote on Thu, 22 Jul 2010 at 17:48 -0400:
On 07/22/2010 05:44 AM, Dan Horák wrote:
> Hi all,
>
> there are some changes that need to be done on the Fedora/s390x
> infrastructure for various reasons and we should prepare a plan how and
> when to do them.
>
> Tasks
> -----
> 1. upgrade builders to something with kernel 2.6.32+ like EL-6 beta
> Yesterday a new glibc was built in primary Fedora, it drops some
> compatibility stuff for old kernels and now requires kernel 2.6.32+ on
> the builders.
I would like to do this on Tue or Wed, before starting a new
koji-shadow run.
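
A minimal sketch of how a builder's kernel could be checked against the
2.6.32 minimum mentioned above. The function names are hypothetical, and it
only compares the version numbers in the release string reported by the
running kernel; it says nothing about what the new glibc actually probes.

```python
import platform
import re

# Minimum kernel required by the new glibc, per the message above.
MIN_KERNEL = (2, 6, 32)


def parse_kernel_release(release):
    """Return (major, minor, patch) from a string like '2.6.32-71.el6.s390x'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    # A missing patch level (e.g. '4.4') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def kernel_is_new_enough(release, minimum=MIN_KERNEL):
    """True if the given release string is at least the minimum version."""
    return parse_kernel_release(release) >= minimum


def running_kernel_ok():
    """Check the kernel of the machine this script runs on."""
    return kernel_is_new_enough(platform.release())
```

Running `running_kernel_ok()` on each builder before the koji-shadow run
would flag any guest that still needs the upgrade.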
> 2. redesign resource allocation for the builders
> The builders (and the squid cache) don't behave very well when under
> full load. Timeouts when downloading the buildroots through the cache are
> quite rare (but they happen every day). Other observed behaviour looks
> like a completely swapped out guest, it doesn't respond to ping, koji
> doesn't update its status on the hub and it can take minutes, maybe even
> tens of minutes, before the machine starts responding again.
> Unfortunately, this behaviour often means stuck builds and a manual
> restart of the builder daemon.
> I think part of this issue could be solved on the z/VM side (size of
> RAM, number of CPUs per builder), part could be tuning the koji
> configuration (max jobs per builder, max load, parallel make, ...). Also
> interesting would be to see performance/resource usage statistics from
> z/VM.
>
I'm open to suggestions. I believe everyone has the specs on the build
lpar. If by some chance we need more memory/storage, we will need to
plead our case with Arlinton Bourne. If we are in need of disks for VM
paging volumes (I just added two more not long ago), I only have a
couple 3390's left available on that lpar.
Well, I would first like to know what the real bottleneck is. I tried
disabling one builder, and later even shutting it down, and I think the
behaviour was better, but still not without problems.
There can (or rather should?) also be room for improvement in the
koji source code, because other apps can survive periods without a
network connection.
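
The point about surviving periods without a network connection can be
illustrated with a generic retry loop with exponential backoff. This is
only a sketch of the general technique, not koji code; `call_with_retry`
and its parameters are hypothetical names.

```python
import time


def call_with_retry(func, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on network-style errors with growing delays.

    OSError covers ConnectionError, timeouts and similar socket failures.
    The delay doubles after every failed attempt: base_delay, 2x, 4x, ...
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            # Give up and re-raise once the last attempt has failed.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With a long enough total delay, a client wrapped this way would ride out
an unresponsive period of a few minutes instead of ending up stuck.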
> 3. rebuild the storage on the hub
> The sub-optimal storage configuration has been known for some time, but there
> is still room on the disks for a few months of work (my guess).
>
During my backups of the hub, I have been contacted. I will forward the
message to relevant parties off-list.
Hm, things are going to be more complicated than previously thought, but
let's wait for Dennis or Mike.
Dan