Hi all,
there are some changes that need to be made to the Fedora/s390x infrastructure for various reasons, and we should prepare a plan for how and when to do them.

Tasks
-----
1. upgrade builders to something with kernel 2.6.32+, like EL-6 beta
Yesterday a new glibc was built in primary Fedora. It drops some compatibility code for old kernels and now requires kernel 2.6.32+ on the builders.

2. redesign resource allocation for the builders
The builders (and the squid cache) don't behave very well under full load. Timeouts when downloading the buildroots through the cache are fairly rare, but they do happen every day. The other observed behaviour looks like a completely swapped-out guest: it doesn't respond to ping, koji doesn't update its status on the hub, and it can take minutes, maybe even tens of minutes, before the machine starts responding again. Unfortunately, this often means stuck builds and manually restarting the builder daemon. I think part of this issue could be solved on the z/VM side (size of RAM, number of CPUs per builder), and part by tuning the koji configuration (max jobs per builder, max load, parallel make, ...). It would also be interesting to see performance/resource usage statistics from z/VM.

3. rebuild the storage on the hub
The sub-optimal storage configuration has been known for some time, but there is still room on the disks for a few months of work (my guess).

4. upgrade Koji to 1.4
Here we should be on par with the primary Fedora buildsystem.

Timing
------
1. ASAP, let's say by the end of next week, because it blocks building packages that were built after July 21 in primary koji
2. together with 1.
3. by the end of August; an earlier date means less data to sync
4. together with 3.

Outages
-------
1. none, can be done one builder at a time
2. yes
3. yes
4. yes
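As a rough illustration of the kernel requirement in task 1, a builder could be checked with something like the following Python sketch. The helper names and the example release strings are assumptions made for illustration; they are not taken from koji or from the actual builders.

```python
import re

# Minimum kernel version required by the new glibc, as stated in task 1.
MIN_KERNEL = (2, 6, 32)


def parse_kernel_release(release):
    """Return the leading numeric version of a kernel release string.

    For example, a string shaped like '2.6.32-71.el6.s390x' gives (2, 6, 32).
    A missing third component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) if part is not None else 0
                 for part in match.groups())


def kernel_is_new_enough(release, minimum=MIN_KERNEL):
    """True if the release string is at or above the minimum version."""
    return parse_kernel_release(release) >= minimum
```

On a builder, the release string could come from `platform.release()` in the standard library (the same value that `uname -r` reports), so `kernel_is_new_enough(platform.release())` would say whether that machine can still build against the new glibc.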
Comments, ideas, corrections, etc. are welcome
Dan
On 07/22/2010 05:44 AM, Dan Horák wrote:
Hi all,
there are some changes that needs to be done on the Fedora/s390x infrastructure for various reasons and we should prepare a plan how and when to do them.
Tasks
- upgrade builders to something with kernel 2.6.32+ like EL-6 beta
Yesterday a new glibc was built in primary Fedora, it drops some compatibility stuff for old kernels and now requires kernel 2.6.32+ on the builders.
- redesign resource allocation for the builders
The builders (and the squid cache) don't behave very well when under full load. Timeouts when downloading the buildroots thru the cache are quite rare (but they happen every day). Other observed behaviour looks like a completely swapped out guest, it doesn't respond to ping, koji doesn't update its status on the hub and it can take minutes, maybe even tens of minutes, before the machine starts responding again. Unfortunately often this behaviour means stuck builds and manually restarting the builder daemon. I think part of this issue could be solved on the z/VM side (size of RAM, number of CPUs per builder), part could be tuning the koji configuration (max jobs per builder, max load, parallel make, ...). Also interesting would be to see performance/resource usage statistics from z/VM.
I'm open to suggestions. I believe everyone has the specs on the build LPAR. If by some chance we need more memory/storage, we will need to plead our case with Arlinton Bourne. If we are in need of disks for VM paging volumes (I just added two more not long ago), I only have a couple of 3390s left available on that LPAR.
- rebuild the storage on the hub
The sub-optimal storage configuration is known for some time, but there is still room on the disks for few months of work (my guess).
During my backups of the hub, I have been contacted. I will forward the message to relevant parties off-list.
- upgrade Koji to 1.4
Here we should be on par with the primary Fedora buildsystem.
Timing
1. ASAP, let's say till end of next week, because it blocks building packages that were built after July 21 in primary koji
2. together with 1.
3. till end of August, earlier date means fewer data to sync
4. together with 3
Outages
- none, can be done one builder at a time
- yes
- yes
- yes
Comments, ideas, corrections, etc. are welcome
Dan
s390x mailing list s390x@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/s390x
Justin Payne wrote on Thu, 22 Jul 2010 at 17:48 -0400:
On 07/22/2010 05:44 AM, Dan Horák wrote:
Hi all,
there are some changes that needs to be done on the Fedora/s390x infrastructure for various reasons and we should prepare a plan how and when to do them.
Tasks
- upgrade builders to something with kernel 2.6.32+ like EL-6 beta
Yesterday a new glibc was built in primary Fedora, it drops some compatibility stuff for old kernels and now requires kernel 2.6.32+ on the builders.
I would like to do this on Tue or Wed, before starting a new koji-shadow run.
- redesign resource allocation for the builders
The builders (and the squid cache) don't behave very well when under full load. Timeouts when downloading the buildroots thru the cache are quite rare (but they happen every day). Other observed behaviour looks like a completely swapped out guest, it doesn't respond to ping, koji doesn't update its status on the hub and it can take minutes, maybe even tens of minutes, before the machine starts responding again. Unfortunately often this behaviour means stuck builds and manually restarting the builder daemon. I think part of this issue could be solved on the z/VM side (size of RAM, number of CPUs per builder), part could be tuning the koji configuration (max jobs per builder, max load, parallel make, ...). Also interesting would be to see performance/resource usage statistics from z/VM.
I'm open to suggestion. I believe everyone has the specs on the build lpar. If by some chance we need more memory/storage, we will need to plead our case with Arlinton Bourne. If we are in need of disks for VM paging volumes (I just added two more not long ago), I only have a couple 3390's left available on that lpar.
Well, I would first like to know what the real bottleneck is. I tried disabling one builder, and later even shutting it down, and I think the behaviour was better, but still not without problems.
There may also be (or rather, should be?) room for improvement in the koji source code, because other apps can survive periods without a network connection.
- rebuild the storage on the hub
The sub-optimal storage configuration is known for some time, but there is still room on the disks for few months of work (my guess).
During my backups of the hub, I have been contacted. I will forward the message to relevant parties off-list.
Hm, things are going to be more complicated than I thought earlier, but let's wait for Dennis or Mike.
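On the bottleneck question, one cheap data point from inside a guest is how much swap is in use around the time it stops responding. Below is a minimal Python sketch, assuming the builders expose the usual Linux /proc/meminfo; the function names and the sample numbers are made up for illustration and say nothing about the real builders.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def swap_used_fraction(meminfo):
    """Fraction of swap in use, from 0.0 (none) to 1.0 (all of it)."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return (total - meminfo.get("SwapFree", 0)) / float(total)


# Hypothetical sample, shaped like /proc/meminfo output.
SAMPLE = """MemTotal:        2048000 kB
MemFree:           51200 kB
SwapTotal:          1000 kB
SwapFree:            250 kB
"""
```

With the sample above, `swap_used_fraction(parse_meminfo(SAMPLE))` gives 0.75. Logging this value every minute or so on each builder, together with the load average, would at least show whether the unresponsive periods line up with heavy swapping or with something else.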
Dan
Hi,
I was wondering whether there is a new repository for the RPMs being built on the Koji build system, and whether these RPMs are actually being pushed out to a repo somewhere. I currently have an FC12 system, upgraded from the original Hercules disk images, and was wondering if there is a process I can use to upgrade to a later release of Fedora for s390x. I am currently pointing to this repo, but it does not look like it has been updated in quite a while (November of 2009).
http://archive.kernel.org/fedora-secondary/development/s390x/os/
Would it be possible to do an in-place upgrade to the latest release of Fedora once that build is complete?
Thanks in advance for the help.
-Phil
-----Original Message-----
From: s390x-bounces@lists.fedoraproject.org [mailto:s390x-bounces@lists.fedoraproject.org] On Behalf Of Dan Horák
Sent: Monday, July 26, 2010 10:40 AM
To: s390x@lists.fedoraproject.org
Subject: Re: changes in the infrastructure
Philip Pinto wrote on Mon, 26 Jul 2010 at 12:10 -0400:
Hi,
I was wondering if there was a new repository for the RPMS that are being built on the Koji build system and are these RPMS actually being pushed out to a repo somewhere? Currently I have a FC12 system upgraded from the original Hercules disk images that have been built and was wondering if there would be a process I can use to upgrade to the later release of Fedora for s390x. I am currently pointing to this repo - but it does not look like if has been updated in quite a while (November of 2009).
http://archive.kernel.org/fedora-secondary/development/s390x/os/
Would it be possible to do an in place upgrade to the latest releases of Fedora once that build is complete.
The rawhide stuff is in quite good shape; broken dependencies are really rare. That means we could probably start preparing the repo for users.
The success rate of an in-place upgrade will largely depend on how many packages are installed (and of what kind: server/desktop/development/...), as the difference between the available pre-F-12 stuff and today's rawhide is large. Personally, I haven't tried an upgrade yet.
Dan
Dan Horák wrote on Thu, 22 Jul 2010 at 11:44 +0200:
Hi all,
there are some changes that needs to be done on the Fedora/s390x infrastructure for various reasons and we should prepare a plan how and when to do them.
Tasks
- upgrade builders to something with kernel 2.6.32+ like EL-6 beta
Yesterday a new glibc was built in primary Fedora, it drops some compatibility stuff for old kernels and now requires kernel 2.6.32+ on the builders.
Done. Due to some changes in Python's urllib2 (or something else), I had to apply a workaround for kojid when it reads an RPM header from the network.
- redesign resource allocation for the builders
The builders (and the squid cache) don't behave very well when under full load. Timeouts when downloading the buildroots thru the cache are quite rare (but they happen every day). Other observed behaviour looks like a completely swapped out guest, it doesn't respond to ping, koji doesn't update its status on the hub and it can take minutes, maybe even tens of minutes, before the machine starts responding again. Unfortunately often this behaviour means stuck builds and manually restarting the builder daemon. I think part of this issue could be solved on the z/VM side (size of RAM, number of CPUs per builder), part could be tuning the koji configuration (max jobs per builder, max load, parallel make, ...). Also interesting would be to see performance/resource usage statistics from z/VM.
Switched to using only 3 builders; let's wait and see how it behaves.
- rebuild the storage on the hub
The sub-optimal storage configuration is known for some time, but there is still room on the disks for few months of work (my guess).
- upgrade Koji to 1.4
Here we should be on par with the primary Fedora buildsystem.
done
5. update builder configs for dist-git
done
With these steps done, we are on par with primary Fedora and can continue building new stuff.
But a new issue has appeared since I started koji-shadow; it could be related to the koji upgrade ...
Dan