mirroring for alternative content

Kevin Fenzi kevin at scrye.com
Thu Jun 26 16:52:02 UTC 2014


On Sat, 21 Jun 2014 13:53:12 -0700
Colin Walters <walters at verbum.org> wrote:

> On Sat, Jun 21, 2014, at 10:35 AM, Kevin Fenzi wrote:
> >
> > Yeah, we can actually map an external IP into it if required, but I
> > agree it isn't great to have the build host also serving external
> > content. 
> 
> Though, separate from the actual content, it would be nice if it had a
> webserver to output live compose status.  That would be very low
> priority requests.

Well, we do have: 
https://apps.fedoraproject.org/releng-dash/
which is a simple page that pulls info from our fedmsg database. 

If we get your composes emitting fedmsgs we can query for and watch for
those too.
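
A rough sketch of what watching for those could look like. It assumes the
fedmsg database is queried over HTTP through datagrepper's /raw endpoint. The
topic name and the "tree" field in the message body are made up for
illustration, since the composes don't emit anything yet.

```python
from urllib.parse import urlencode

# Assumption: datagrepper is the HTTP frontend used to query the fedmsg database.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def compose_query_url(topic, delta_seconds=86400, rows=20):
    """Build a query URL for messages on `topic` from the last `delta_seconds`."""
    params = {"topic": topic, "delta": delta_seconds, "rows_per_page": rows}
    return DATAGREPPER_RAW + "?" + urlencode(params)


def latest_by_tree(messages):
    """Keep only the newest message per tree.

    `messages` is a list of dicts with a "timestamp" and a "msg" body.
    The "tree" key inside "msg" is hypothetical.
    """
    latest = {}
    for m in messages:
        tree = m["msg"]["tree"]
        if tree not in latest or m["timestamp"] > latest[tree]["timestamp"]:
            latest[tree] = m
    return latest


if __name__ == "__main__":
    # Hypothetical topic name, for illustration only.
    print(compose_query_url("org.fedoraproject.prod.compose.ostree.done"))
```

A status page would fetch that URL periodically and show the result of
latest_by_tree() on the returned messages.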

> > I think this is pretty much already in place with our /pub/alt/
> > space (that the kernel nodebug uses above). 
> > 
> > This space is on a netapp. 
> > It is mirrored, but only by a few larger mirrors, most don't. 
> > It's served by our 5 download servers in phx2 and 3 on I2 in rdu2. 
> 
> Cool.  Would it make sense to sync COPR there as well?

Well, we have talked about it... but all of copr is a pretty large
thing and only getting larger over time, so syncing could lead to slow
propagation.

So, we have some dedicated storage coming in for copr (a Dell EqualLogic
unit), which we can look at making a bit more HA with multiple
frontends, etc. 

> > So, we could sync the ostree stuff there as well... it's a large
> > number of small files? Or can you expand on the size and content it
> > will have over time and how often updated?
> 
> Yep, lots of small (almost all immutable) files.  At least until
> https://bugzilla.gnome.org/show_bug.cgi?id=721799 lands.
> 
> Here's some quick stats on the old repository:
> 
> # du -shc repo.old
> 46G     repo.old
> srv/rpm-ostree <uid=0>
> # find repo.old | wc -l
> 560164
> 
> This however includes binary-level history over months for many
> packages.  The base/core tree is 92 commits, from 2014-02-22 23:27:03
> +0000 to 2014-06-04 18:48:16 +0000.
> 
> A major source of bloat is having to regenerate the initramfs each
> time: https://bugzilla.redhat.com/show_bug.cgi?id=1098457
> 92 commits * 20 MiB is almost 2G per tree, with 6 trees the initramfs
> images account for 25% of the size.
> 
> gnome-continuous has an optimization for this, if you look at
> https://git.gnome.org/browse/gnome-continuous/log/manifest.json you'll
> see the "initramfs-depends": true, we only regenerate it for that.  
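
A minimal sketch of that check, as I understand it: rebuild the initramfs only
when a changed component is flagged "initramfs-depends". The component list
shape (dicts with "name" and the flag, or bare strings) and the sample names
are assumptions for illustration, not the actual gnome-continuous code.

```python
def needs_initramfs_rebuild(components, changed_names):
    """Return True if any changed component is flagged "initramfs-depends".

    `components` is a list of manifest entries. Entries that are not dicts
    (e.g. bare source strings) carry no flag and are ignored here.
    """
    changed = set(changed_names)
    return any(
        isinstance(c, dict)
        and c.get("initramfs-depends", False)
        and c.get("name") in changed
        for c in components
    )


if __name__ == "__main__":
    # Hypothetical manifest fragment.
    components = [
        {"name": "linux", "initramfs-depends": True},
        {"name": "dracut", "initramfs-depends": True},
        {"name": "gedit"},
    ]
    print(needs_initramfs_rebuild(components, ["gedit"]))           # False
    print(needs_initramfs_rebuild(components, ["gedit", "linux"]))  # True
```

With something like this, a commit that only touches unflagged packages would
reuse the previous initramfs instead of adding another ~20 MiB object.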
> 
> The *new* repository I just composed has just 2 commits for each tree
> so far:
> 
> # du -sh repo
> 3.5G    repo
> # find repo | wc -l
> 168744
> 
> Also, the way I'd like this to work is to have separate "release"
> versus "integration" repositories.   The release repository wouldn't
> have the full binary history, it gets recomposed 
> 
> The amount of space used though is flexible because we can also
> prune/rebase the history.
> 
> The goal for how often updated would come down to the rate of change
> of the RPM repositories it tracks; the goal is when an RPM is
> updated, the tree regenerates.

Yeah, my only worry here is that we have too much churn, but it sounds
like there's some mitigation there and you have already been working on
reducing that ;)
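
For reference, the initramfs numbers quoted above can be sanity-checked with
a few lines. This assumes du's "G" means GiB and that every commit carries
one ~20 MiB initramfs. The function name is just for illustration.

```python
MIB_PER_GIB = 1024


def initramfs_share(commits, initramfs_mib, trees, repo_gib):
    """Return (GiB per tree, GiB over all trees, fraction of the repo)."""
    per_tree_gib = commits * initramfs_mib / MIB_PER_GIB
    total_gib = per_tree_gib * trees
    return per_tree_gib, total_gib, total_gib / repo_gib


if __name__ == "__main__":
    # Figures from the old repository stats quoted above.
    per_tree, total, share = initramfs_share(
        commits=92, initramfs_mib=20, trees=6, repo_gib=46
    )
    print(f"{per_tree:.2f} GiB per tree, {total:.2f} GiB total, {share:.1%} of repo")
```

That works out to roughly 1.8 GiB per tree and a bit under a quarter of the
46G repository, which matches the "almost 2G" and "25%" estimates.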

kevin