On Sat, 21 Jun 2014 13:53:12 -0700 Colin Walters <walters@verbum.org> wrote:
On Sat, Jun 21, 2014, at 10:35 AM, Kevin Fenzi wrote:
Yeah, we can actually map an external IP into it if required, but I agree it isn't great to have the build host also serving external content.
Though, separate from the actual content, it would be nice if it had a webserver to output live compose status. Those would be very low-priority requests.
Well, we do have: https://apps.fedoraproject.org/releng-dash/ which is a simple page that pulls info from our fedmsg database.
If we get your composes emitting fedmsgs we can query for and watch for those too.
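As a rough sketch of what "query for those" could look like: the snippet below only builds a query URL for the datagrepper web frontend to the fedmsg database. The endpoint URL and the topic name are assumptions for illustration (no ostree compose topic exists yet in this thread), so treat both as placeholders.

```python
from urllib.parse import urlencode

# Assumed location of the datagrepper "raw" query endpoint.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"

# Hypothetical topic; the real name depends on how the composes emit fedmsgs.
HYPOTHETICAL_TOPIC = "org.fedoraproject.prod.compose.ostree.done"


def compose_status_url(topic, delta_seconds=86400, rows=20):
    """Build a URL asking for recent messages on one topic, newest first."""
    params = {
        "topic": topic,            # fedmsg topic to filter on
        "delta": delta_seconds,    # look back this many seconds
        "rows_per_page": rows,     # cap the number of returned messages
        "order": "desc",           # newest messages first
    }
    return DATAGREPPER_RAW + "?" + urlencode(params)


url = compose_status_url(HYPOTHETICAL_TOPIC)
print(url)
```

A status page could poll a URL like this and render the latest compose result, without the build host serving anything itself.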
I think this is pretty much already in place with our /pub/alt/ space (that the kernel nodebug uses above).
This space is on a netapp. It is mirrored, but only by a few larger mirrors; most don't carry it. It's served by our 5 download servers in phx2 and 3 on I2 in rdu2.
Cool. Would it make sense to sync COPR there as well?
Well, we have talked about it... but all of copr is a pretty large thing and only getting larger over time, so syncing could lead to slow propagation.
So, we have some dedicated storage coming in for copr (a Dell EqualLogic unit), which we can look at making a bit more HA with multiple frontends, etc.
So, we could sync the ostree stuff there as well... it's a large number of small files? Or can you expand on the size and content it will have over time, and how often it will be updated?
Yep, lots of small (almost all immutable) files. At least until https://bugzilla.gnome.org/show_bug.cgi?id=721799 lands.
Here's some quick stats on the old repository:
# du -shc repo.old
46G repo.old
srv/rpm-ostree <uid=0> # find repo.old | wc -l
560164
This however includes binary-level history over months for many packages. The base/core tree is 92 commits, from 2014-02-22 23:27:03 +0000 to 2014-06-04 18:48:16 +0000.
A major source of bloat is having to regenerate the initramfs each time: https://bugzilla.redhat.com/show_bug.cgi?id=1098457
92 commits * 20 MiB is almost 2G per tree; with 6 trees, the initramfs images account for roughly 25% of the size.
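For anyone who wants to check that estimate, here is the arithmetic as a small sketch. It assumes every one of the 6 trees has the same 92 commits as base/core, that each initramfs is about 20 MiB, and that the 46G reported by du is in GiB.

```python
# Back-of-the-envelope check of the initramfs overhead in the old repository.
COMMITS_PER_TREE = 92   # commits in the base/core tree (assumed equal for all trees)
INITRAMFS_MIB = 20      # approximate size of one regenerated initramfs
TREES = 6               # number of trees in the repository
REPO_GIB = 46           # du total for repo.old, treated as GiB (assumption)


def initramfs_overhead(commits, image_mib, trees, repo_gib):
    """Return (GiB per tree, GiB total, fraction of the repository)."""
    per_tree_gib = commits * image_mib / 1024
    total_gib = per_tree_gib * trees
    return per_tree_gib, total_gib, total_gib / repo_gib


per_tree, total, share = initramfs_overhead(
    COMMITS_PER_TREE, INITRAMFS_MIB, TREES, REPO_GIB
)
print(f"{per_tree:.2f} GiB per tree, {total:.2f} GiB total, {share:.0%} of repo")
```

Under those assumptions the share comes out a little under a quarter, which matches the rough figure above.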
gnome-continuous has an optimization for this: if you look at https://git.gnome.org/browse/gnome-continuous/log/manifest.json you'll see "initramfs-depends": true on some components, and we only regenerate the initramfs when one of those changes.
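To illustrate the idea (this is not the gnome-continuous code, just a minimal sketch): the check amounts to asking whether any changed component carries the flag. The manifest layout and the component names below are made up for the example.

```python
def needs_initramfs_rebuild(components, changed_names):
    """Return True if any changed component is flagged "initramfs-depends"."""
    return any(
        component.get("initramfs-depends", False)
        for component in components
        if component["name"] in changed_names
    )


# Hypothetical manifest fragment; only the flag name comes from the real file.
example_components = [
    {"name": "example-kernel", "initramfs-depends": True},
    {"name": "example-init-tools", "initramfs-depends": True},
    {"name": "example-app"},
]

# An application-only change should not trigger a rebuild.
print(needs_initramfs_rebuild(example_components, {"example-app"}))
# A change to a flagged component should.
print(needs_initramfs_rebuild(example_components, {"example-app", "example-kernel"}))
```

With something like this, most commits would reuse the previous initramfs object instead of storing a fresh 20 MiB image each time.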
The *new* repository I just composed has just 2 commits for each tree so far:
# du -sh repo
3.5G repo
# find repo | wc -l
168744
Also, the way I'd like this to work is to have separate "release" versus "integration" repositories. The release repository wouldn't have the full binary history, it gets recomposed
The amount of space used though is flexible because we can also prune/rebase the history.
How often it is updated would come down to the rate of change of the RPM repositories it tracks; the goal is that when an RPM is updated, the tree regenerates.
Yeah, my only worry here is that we have too much churn, but it sounds like there's some mitigation there and you have already been working on reducing that ;)
kevin