#fedora-meeting: Infrastructure (2014-02-27)
Meeting started by nirik at 19:00:04 UTC. The full logs are available at
* Greetings starfighter! (nirik, 19:00:04)
* New folks introductions and Apprentice tasks (nirik, 19:01:48)
* Applications status / discussion (nirik, 19:04:02)
* LINK: https://apps.fedoraproject.org/nuancier
* LINK: https://fedorahosted.org/reviewboard/dashboard/
* Sysadmin status / discussion (nirik, 19:27:30)
* download servers and netapp i/o have been a big issue this week.
Work is ongoing. (nirik, 19:36:50)
* more puppet -> ansible conversions are ready to go (nirik, 19:37:09)
* LINK: http://fedorapeople.org/~kevin/ansible-20140224.odp
* LINK: http://skvidal.fedorapeople.org/misc/rbac-playbook I find just this (mirek, 19:43:03)
* Upcoming Tasks/Items (nirik, 19:48:40)
* LINK: https://apps.fedoraproject.org/calendar/list/infrastructure/
* Open Floor (nirik, 19:50:03)
for example (pingou, 19:53:20)
Meeting ended at 20:05:11 UTC.
Action Items, by person
People Present (lines said)
* nirik (124)
* pingou (54)
* mirek (17)
* smooge (17)
* threebean (15)
* willo (13)
* sgallagh (11)
* danofsatx-work (7)
* adimania (5)
* misc (4)
* zodbot (4)
* abadger1999 (3)
* lmacken (3)
* docent (1)
* lbazan (1)
* relrod (1)
* janeznemanic (1)
* kushalk124 (1)
* ausmarton (1)
* fchiulli (1)
* mdomsch (0)
* puiterwijk (0)
* dgilmore (0)
19:00:04 <nirik> #startmeeting Infrastructure (2014-02-27)
19:00:04 <zodbot> Meeting started Thu Feb 27 19:00:04 2014 UTC. The chair is nirik.
Information about MeetBot at http://wiki.debian.org/MeetBot
19:00:04 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:04 <nirik> #meetingname infrastructure
19:00:04 <nirik> #topic Greetings starfighter!
19:00:04 <nirik> #chair smooge relrod nirik abadger1999 lmacken dgilmore mdomsch
threebean pingou puiterwijk
19:00:04 <zodbot> The meeting name has been set to 'infrastructure'
19:00:04 <zodbot> Current chairs: abadger1999 dgilmore lmacken mdomsch nirik pingou
puiterwijk relrod smooge threebean
19:00:10 <abadger1999> howdy
19:00:34 <janeznemanic> hi
19:00:35 * adimania is here
19:00:41 * pingou
19:00:44 * lmacken
19:01:03 * threebean is here
19:01:05 <relrod> here
19:01:08 * ausmarton is here
19:01:19 * docent is here
19:01:25 * willo is here
19:01:32 <danofsatx-work> here
19:01:33 <nirik> morning everyone. ;)
19:01:43 <danofsatx-work> but not for long, need to reload F20 :(
19:01:48 <nirik> #topic New folks introductions and Apprentice tasks
19:01:54 <nirik> danofsatx-work: fun times. ;)
19:01:58 * kushalk124 is here
19:02:09 <nirik> any new folks like to introduce themselves? or apprentices with
questions or comments?
19:02:22 <danofsatx-work> new system - old load isn't optimized for it, so I get
to redo it, yet again....
19:03:55 <nirik> ok, moving along then... as always feel free to chime in with
questions or comments anytime.
19:04:02 <nirik> #topic Applications status / discussion
19:04:14 <nirik> any application side news this week?
19:04:26 * mirek is here
19:04:26 <pingou> new fedocal in prod
19:04:32 <pingou> new nuancier in prod
19:04:33 * fchiulli is here. Sorry for being late.
19:04:46 <pingou> (and new(er) nuancier in stg w/ fedmsg integration -- need to test
19:04:58 <nirik> pingou: cool. ;)
19:05:08 <mirek> those problems with copr (caused by createrepo_c) are hopefully
19:05:09 <abadger1999> Cool.
19:05:13 <nirik> pingou: this is full nuancier now right? not lite?
19:05:23 <pingou> nirik: yup :)
19:05:24 <pingou> all features included
19:05:30 <pingou> https://apps.fedoraproject.org/nuancier
19:05:39 <pingou> and with a nicer frontpage :)
19:05:48 <nirik> mirek: cool. So it was createrepo_c sucking up all memory? or ?
19:05:50 <pingou> mirek: nice!
19:06:21 <mirek> nirik: yes, I even saw a process using 10GB RAM.
19:06:29 <pingou> and with threebean we pushed some commits to summershum (support
.gem, more info in the logs and on fedmsg)
19:06:42 <nirik> thats a pile. ;(
19:06:43 <mirek> I found the cause and reported it upstream with a reproducer
19:07:16 <lmacken> once createrepo_c is semi-stable, getting mash to use it could be
a great easyfix task
19:07:16 <nirik> mirek: I should have arm socs for you before too long... need to
get dhcpd to not mess up the cloud dhcp and setup a pxe server to install them, etc.
19:07:42 * mirek is happy
19:07:51 <nirik> lmacken: theres still stuff missing tho I think... no deltarpms?
19:08:11 <pingou> (no deltarpms was mentioned at devconf)
19:08:16 <lmacken> nirik: ah, yeah I haven't looked at it too closely, but
that's a blocker for sure :)
19:09:02 <nirik> mirek: those ansible module issues you ran into are really weird.
it's like something is modifying your pythonpath, but only sometimes?
19:10:24 <mirek> yes, it really puzzles me. I want to spend some time on it, but today it
happened on prod, so I was in a hurry to get it back online
19:10:34 <nirik> sure, understand
19:11:32 <nirik> oh, I had one thing to note...
19:11:56 <nirik> a while back puiterwijk got our reviewboard on fedorahosted back up
19:12:14 <nirik> I keep not having time to poke around on it more... but we should
see if it's usable for us for any needs...
19:12:18 <nirik> https://fedorahosted.org/reviewboard/dashboard/
19:12:29 <nirik> it is much faster than before...
19:12:57 <pingou> not accessible w/o fas account?
19:13:03 <sgallagh> nirik: I can assist with administration if there are questions
19:13:09 <pingou> ah, it's the dashboard link
19:13:10 <nirik> pingou: openid
19:13:15 <threebean> nice
19:13:18 <sgallagh> pingou: https://fedorahosted.org/reviewboard/r/
19:13:28 <sgallagh> So you can read but not edit.
19:13:45 <sgallagh> puiterwijk elected to make the login page automatically bounce
19:13:49 <nirik> oh reminds me I need to file a bug on the openid part...
19:14:30 <nirik> it makes a local FirstnameLastname user for reviewboard.
19:14:47 <nirik> but... it can't handle users with neat utf8 stuff in name. ;)
19:15:06 <pingou> ^^
19:15:23 <nirik> I'm sure we are shocked. ;)
19:15:35 * pingou looks at abadger1999
19:15:49 <sgallagh> I personally wish he'd just elected to mangle the openid for
19:15:56 <nirik> yeah, seems easier.
19:16:00 <sgallagh> sgallagh-id-fedoraproject-org would have worked better.
19:16:20 <abadger1999> <nod>
19:16:34 <sgallagh> And guaranteed not to be overloaded if we have two John Smiths
19:17:00 <nirik> anyhow, I know we have github for many application reviewing needs,
but if it's nice enough we could look at it for ansible changes during freeze or the
19:17:03 <sgallagh> That's fixable. Please CC me on the bug report.
19:17:17 <nirik> sgallagh: where's the best place to file?
19:17:24 <sgallagh> nirik: FWIW, I'm working on Git hooks to be able to manage
pull requests through Review Board.
19:17:42 <sgallagh> So you get the nice review UI of RB alongside the process
management of github
19:17:52 <nirik> cool.
19:17:57 <sgallagh> nirik: Just use the Infra trac for instance-specific ones
19:18:02 <nirik> k
19:18:39 <nirik> ok, any other application news?
19:18:47 <threebean> kinda application-y:
19:19:04 <threebean> pushed out a nice error logging config for fedmsg this morning
19:19:17 <pingou> it's *nice*!
19:19:21 <threebean> so, we'll get error emails from the badges awarder and the
notifications daemon. from summershum too.
19:19:39 <nirik> oh nice. these are when it can't send? or ?
19:19:52 <threebean> well, whenever log.error('blah blah') is called.
19:19:59 <threebean> so its up to each app to catch its own problems and log them.
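The behaviour threebean describes (an email whenever `log.error(...)` is called, with each app responsible for logging its own problems) can be sketched with the Python standard library. This is only an illustration, not the actual fedmsg logging config: the helper names `make_error_mailer` and `configure`, the mail host, and the example.org addresses are all hypothetical.

```python
import logging
import logging.handlers


def make_error_mailer(app_name, mailhost="localhost",
                      fromaddr="nobody@example.org",
                      toaddrs=("app-logs@example.org",)):
    """Return a handler that mails only ERROR-and-above records.

    All addresses here are placeholders, not real Fedora aliases.
    """
    handler = logging.handlers.SMTPHandler(
        mailhost=mailhost,
        fromaddr=fromaddr,
        toaddrs=list(toaddrs),
        subject="[%s] error" % app_name,
    )
    # Records below ERROR are ignored by this handler.
    handler.setLevel(logging.ERROR)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s\n%(message)s"))
    return handler


def configure(app_name, handler=None):
    """Attach the mail handler to the app's logger and return the logger."""
    log = logging.getLogger(app_name)
    log.setLevel(logging.INFO)
    log.addHandler(handler or make_error_mailer(app_name))
    return log
```

With this in place, `configure("myapp").error("blah blah")` would attempt to send one mail, while `info()` and `warning()` calls would not. Using one recipient address per application is one way to avoid the "firehose or nothing" choice raised later in the discussion.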
19:20:16 <pingou> pkgdb2, fedocal and nuancier also send emails
19:20:31 <pingou> I was wondering if we should create an alias to receive these
19:20:40 <nirik> yeah, where do they go now?
19:20:51 <threebean> (the fedmsg ones go to sysadmin-datanommer-members(a)fp.o
19:20:53 <pingou> pkgdb2, fedocal and nuancier to me (only)
19:20:56 <nirik> we do have sysadmin-logs, but thats more sysadminy than
19:21:14 <pingou> sysapp-logs? :D
19:21:40 * pingou doesn't dare to propose appy-logs
19:21:48 <threebean> we had exceptions from fedora-packages coming to lmacken and I
for a while.. but there were just too many.
19:22:05 <nirik> a group is often nice because it's easy to manage who's in
19:22:14 <nirik> no aliases to change, etc
19:22:18 <pingou> +1
19:23:01 <pingou> on the app side, I've been working a little on FAS3 today
19:23:03 <threebean> could we, create an alias for each app so you don't have to
choose the firehose or nothing?
19:23:28 <nirik> threebean: we could, but if they are aliases, that means updating
them via puppet (or ansible) and more pain in freezes, etc.
19:23:35 <pingou> maybe use gitproject as for fedorahosted?
19:23:36 * threebean nods
19:24:35 <nirik> how about: fedmsglogs-applicationname? just tracking groups
19:24:57 <pingou> wfm too
19:25:18 <nirik> the git ones might be folks who dont want our specific error logs
19:25:32 <threebean> oo, wait. I'm not sure how to distinguish fedmsglogs
between applications. :/
19:26:10 <pingou> then just <app>-logs, fedmsg-logs being just one of them
19:26:14 * threebean nods
19:26:29 <nirik> sure.
19:27:06 <nirik> ok, any other apps news? ;)
19:27:30 <nirik> #topic Sysadmin status / discussion
19:27:43 <nirik> on the sysadmin side, smooge and I have been having fun with
19:27:57 <nirik> turns out the load on them has been at least part of the thing
slowing our netapp storage down. ;(
19:27:59 <smooge> download download
19:28:29 <nirik> we tried cachefilesd the other day, but it made the machines
19:28:42 <nirik> so, now we are limiting rsyncs per download server
19:28:57 <nirik> we also have a iptables hashlimit to limit ips that hit rsync too
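The idea behind a per-IP limit like the iptables hashlimit rule mentioned here can be sketched as a token bucket kept per source address. This is a conceptual illustration in Python only, not the rule actually deployed on the download servers; the class name `PerKeyTokenBucket` and the rate and burst numbers are assumptions made up for the example.

```python
import time


class PerKeyTokenBucket:
    """Allow up to `burst` hits at once per key, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = float(rate_per_sec)
        self.burst = float(burst)
        self.clock = clock
        self.buckets = {}  # key -> (tokens left, time of last check)

    def allow(self, key):
        now = self.clock()
        # A key seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(key, (self.burst, now))
        # Refill for the elapsed time, never above the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[key] = (tokens - 1.0, now)
            return True
        self.buckets[key] = (tokens, now)
        return False
```

A caller would do something like `limiter.allow(client_ip)` for each new connection and reject the connection when it returns `False`. Each address has its own bucket, so one client hitting rsync hundreds of times does not use up the allowance of the others.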
19:29:00 <mirek> how big is the traffic (or data transfers)
19:29:02 <mirek> ?
19:29:19 * danofsatx-work makes a note to alter his rsync scripts
19:29:37 <nirik> in bytes/packets? a lot. ;)
19:29:53 <nirik> we did have some ip's hitting 100's of times a day
19:30:20 <danofsatx-work> for the record, that wasn't me ;) I hit it once every
19:30:31 <nirik> we are likely going to be moving storage for them next week.
19:31:07 <willo> i'll be making progress on migration of those servers to
ansible this weekend
19:31:10 <nirik> looks like around 10TB a day or so as a ballpark
19:31:43 <nirik> perhaps 15
19:31:47 <nirik> willo: great. ;)
19:32:03 <adimania> migration of paste module to ansible should be good to go.
19:32:21 <adimania> I'll pick up another one this weekend probably.
19:32:22 <nirik> adimania: thanks for working on it. ;)
19:32:40 <adimania> nirik, thanks for all the help :)
19:32:52 <pingou> I wonder if we should track a list of remaining module to port to
19:33:27 <adimania> pingou, that would be really helpful.
19:33:33 <nirik> pingou: we could start doing that yeah... it's a bit of a mess
tho due to puppet having old junk in it that we arent actually using anymore.
19:33:47 <nirik> like for example I think talk.fedoraproject.org/asterisk
19:34:07 <nirik> but we could perhaps list machines in puppet only and extrapolate
19:34:28 <willo> track on a wiki page maybe?
19:34:54 <smooge> I would go with a trac wiki page :)
19:35:12 <smooge> sorry my humour is off. rebooting
19:35:16 <nirik> we could. would someone like to write up at least part of such a
thing? I'd be happy to edit it and add info, etc.
19:35:22 <nirik> smooge: :)
19:35:55 <willo> i'll take a stab
19:36:18 <nirik> willo: cool!
19:36:33 <nirik> lets see...
19:36:34 <willo> i'll email list when outline is done for input
19:36:50 <nirik> #info download servers and netapp i/o has been a big issue this
19:36:53 <nirik> willo: sounds great.
19:37:09 <nirik> #info more puppet -> ansible conversions are ready to go
19:37:31 <nirik> I have one of our arm chassis up in the cloud network, just need to
get dhcpd working and pxe server to install them...
19:38:01 <nirik> Oh, I gave a talk to boulder devops monday night on ansible. My
slides are at:
19:38:25 <nirik> http://fedorapeople.org/~kevin/ansible-20140224.odp
19:38:42 <nirik> for anyone who wants them. Not sure how much sense they make
without me gibbering over them, but there they are. ;)
19:38:49 <smooge> cool
19:39:10 <willo> so no vid of the gibbering for posting to youtube ;)
19:39:11 <nirik> we have some new machines arriving tomorrow (I think)
19:39:24 <nirik> willo: sadly no, they are looking for a a/v person, but didn't
19:39:51 <mirek> I have one idea... write something like rbac-playbook but for
cloud, so sysadmin-cloud would be able to run euca-* and nova commands. Can someone send
me the current source of rbac-playbook so I can base it on that, please? I just found some old
version on seth's site
19:40:40 <nirik> mirek: not a bad idea...
19:41:04 <nirik> we aren't setup to run nova commands from there... but the euca
ones would work after you source a eucarc...
19:41:20 <nirik> wonder if that is possible to just do in sudo?
19:41:41 <nirik> ie "source this, then run command" ?
19:41:57 <mirek> source is not a command
19:42:02 <mirek> it is a bash builtin
19:42:32 <nirik> yeah, but it looks like sudo you can pass a env_file for this.
19:42:33 <smooge> I normally write a shell script if I have to source stuff
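The "source this, then run command" idea can also be sketched as a small wrapper that reads the variables from an rc file and passes them to the child process. This is a minimal sketch and not the current rbac-playbook code: `load_rc` and `run_with_rc` are hypothetical names, and the parser assumes plain `export NAME=value` or `NAME=value` lines. It does no shell expansion, so lines whose value contains `$` or backticks are skipped and would still need a real shell.

```python
import os
import shlex
import subprocess


def load_rc(path):
    """Collect literal NAME=value assignments from a shell rc file."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            if "$" in line or "`" in line:
                # Needs shell expansion, which this sketch does not do.
                continue
            parts = shlex.split(line)
            if parts and parts[0] == "export":
                parts = parts[1:]
            if not parts:
                continue
            name, sep, value = parts[0].partition("=")
            if sep and name.isidentifier():
                env[name] = value
    return env


def run_with_rc(rc_path, argv):
    """Run argv with the rc file's variables added to the environment."""
    env = dict(os.environ)
    env.update(load_rc(rc_path))
    result = subprocess.run(argv, env=env, check=True,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    return result.stdout
```

Since `argv` is passed as a list and no shell is involved, command line arguments go through unchanged. Whoever runs the wrapper still needs read access to the rc file, which is the permissions question discussed below.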
19:42:33 <mirek> but it can be very similar to rbac-playbook, and easy, I just want
to reuse some recent code
19:43:03 <mirek> http://skvidal.fedorapeople.org/misc/rbac-playbook I find just this
19:43:15 <nirik> right, I can send you the current one...
19:43:24 <nirik> it's pretty primitive tho.
19:43:29 <mirek> thanks
19:43:31 <nirik> for example, command line args aren't supported.
19:43:41 <nirik> which may break it for ec2 stuff.
19:43:47 <mirek> I will keep it primitive for sure :)
19:44:24 <nirik> ok, feel free to look, but I think env_file with the ec2rc and
allowing euca* might be easier...
19:44:43 <nirik> thats just a change to sudoers
19:45:04 <nirik> or... hum.
19:45:38 <nirik> what if we add a acl to the ec2rc file to allow sysadmin-cloud to
read it. Then you can just source it and run commands as you. They shouldn't need any
19:46:07 <nirik> it would mean all sysadmin-cloud folks would have the credentials
19:46:42 <mirek> ahh chacl(1) yes, that should work
19:47:01 <nirik> anyhow, will ponder on it and try and get something that works. ;)
19:47:28 <nirik> anything else sysadmin related?
19:47:31 * lbazan here late..
19:48:40 <nirik> #topic Upcoming Tasks/Items
19:48:41 <nirik> https://apps.fedoraproject.org/calendar/list/infrastructure/
19:48:55 <nirik> anything upcoming folks would like to note or schedule?
19:49:14 <nirik> I still haven't done much on FAD organizing. Hopefully more
news by next week
19:49:24 <threebean> nirik: same here
19:50:03 <nirik> #topic Open Floor
19:50:17 <pingou> I have been playing with the lookaside cache
19:50:19 <nirik> anyone have anything for open floor? questions, comments, ideas,
19:50:29 <pingou> I was wondering how often we have 1 tarball with multiple md5
19:50:36 <pingou> the results are interesting:
19:50:57 <nirik> wow.
19:51:07 <pingou> but that on all the current tree, so there are some old versions
19:51:10 <nirik> I wonder if thats indicative of uploads that fail...
19:51:25 <nirik> or if it's upstreams that change stuff and re-release.
19:51:34 <pingou> I'm afraid it's the latter
19:51:36 <threebean> holy..
19:51:53 <pingou> I'm not sure yet what to do with this, mail on devel, blog
19:52:13 <pingou> maybe it might be worth asking people to watch out for this
19:52:18 <nirik> I wonder if we could find out more by looking at commits on those
19:52:21 <pingou> accidents happen but...
19:52:24 <smooge> how do they get 2 different md5s?
19:52:28 <mirek> What does that mean? E.tgz has 10 md5 sums -- does that mean that
10 packages have the same tar.gz?
19:52:35 <pingou> smooge: two different tarballs with the same name
19:52:44 <pingou> mirek: yup
19:52:47 <smooge> ah ok.
19:52:51 <nirik> it means you upload foo-1.0.tar.gz
19:53:00 <nirik> then upload it again, but with a different md5
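The kind of scan pingou describes, finding tarball names that were uploaded more than once with different checksums, can be sketched as a directory walk. This is not the script pingou ran; it assumes a lookaside tree laid out as `<root>/<package>/<filename>/<md5>/<filename>`, and the function name is hypothetical.

```python
import os
from collections import defaultdict


def tarballs_with_multiple_checksums(root):
    """Map (package, filename) to its checksums when there is more than one.

    Assumed layout: root/<package>/<filename>/<checksum>/<filename>
    """
    seen = defaultdict(set)
    for package in sorted(os.listdir(root)):
        pkg_dir = os.path.join(root, package)
        if not os.path.isdir(pkg_dir):
            continue
        for filename in os.listdir(pkg_dir):
            file_dir = os.path.join(pkg_dir, filename)
            if not os.path.isdir(file_dir):
                continue
            for checksum in os.listdir(file_dir):
                # Count a checksum only if the tarball is really there.
                if os.path.isfile(os.path.join(file_dir, checksum, filename)):
                    seen[(package, filename)].add(checksum)
    return {key: sorted(sums) for key, sums in seen.items() if len(sums) > 1}
```

Each key in the result is one case of "upload foo-1.0.tar.gz, then upload it again with a different md5". The scan only shows that it happened, not whether a failed upload, upstream, or the packager was the cause.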
19:53:05 <smooge> ahhhhh
19:53:23 <misc> either upstream did it, which is bad
19:53:30 <smooge> so I guess a timestamp,md5sum would be needed
19:53:42 <misc> or someone did regenerate the tarball from git, this kind of stuff
19:53:42 <nirik> misc: yeah, but does sadly happen
19:53:49 <pingou> smooge: we kinda have the timestamp on the apache page ;-)
19:53:57 <misc> or someone modified the tarball, cause patch is too mainstream :)
19:54:03 <pingou> misc: regenerate the tarball w/o renaming it
19:54:25 <nirik> I wonder, could we grab all those, then unpack and diff -Nur on
them to see how they are different? I guess so, but might take a long time to figure out
all of them.
19:54:38 <pingou> 5569 packages had multiple md5 for at least 1 of their versions
19:54:43 <pingou> might be a little much :)
19:54:58 <misc> nirik: skip texlive, this will reduce the time to see :)
19:55:01 <pingou> nirik: smooge but that's the output from my request from
yesterday (install tree on pkgs01)
19:55:04 <pingou> :)
19:55:15 <smooge> so what is the problem? the build system grabs the wrong one? we
are worried people are uploading different ones
19:55:36 <pingou> smooge: the build system will grab whatever is in the source file,
so we should be fine there
19:55:36 <nirik> pingou: I'd say devel list I guess. Ask people if they are
hitting upload issues (which we could try and fix) or other?
19:55:49 <pingou> it's more about packager/upstream behavior
19:55:52 <nirik> well, if it's a upload issue, we should try and fix it.
19:56:01 <nirik> if it's a upstream issue, we should be very sad, but ok.
19:56:09 <nirik> if it's a packager issue, we should tell them not to do that.
19:56:21 * pingou is on the list
19:56:43 <smooge> pingou, but the source file says 389-admin-1.1.12.tar.bz2 and the
lookaside cache has 3 of them
19:56:58 <nirik> smooge: the sources file has md5 too
19:57:03 <pingou> smooge: 3 different md5, and the source file has the md5
19:57:07 <smooge> duh
19:57:09 <smooge> thanks
19:57:11 <pingou> :)
19:57:19 <smooge> I deal with budgets and PPC for a week
19:57:31 <pingou> you're in pretty good shape then! :D
19:57:32 <smooge> well mostly budgets
19:57:34 <nirik> anyhow, perhaps devel list and hope we can get folks interested in
investigating more so we don't have to?
19:57:46 <pingou> nirik: ok we'll do that :)
19:58:04 <nirik> crowdsource all the things! :)
19:58:26 <pingou> nirik: but I doubt texlive are bad upload, I'm pretty sure
they are small sources :)
19:58:46 <nirik> oh... I wonder if that data would be nice too... size ?
19:58:53 <pingou> ?
19:59:02 <nirik> because if there are 4 of them and 3 of them are really small, it
sounds like an upload problem?
19:59:19 <smooge> i was thinking spec file
19:59:22 <nirik> if all are close to the same size, it sounds more like upstream
re-released or packager messed up
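nirik's size heuristic can be written down as a small classifier over the sizes of the same-named tarballs. This is a rough sketch only: the function name, the result labels, and the 0.5 threshold are assumptions chosen for the example, and the result is a hint for where to look, not a verdict.

```python
def classify_by_size(sizes, small_ratio=0.5):
    """Guess why one tarball name has several checksums, from sizes in bytes.

    small_ratio is an arbitrary cut-off: a copy smaller than that fraction
    of the largest copy is treated as a possibly truncated upload.
    """
    if len(sizes) < 2:
        return "single"
    largest = max(sizes)
    if largest == 0:
        return "possible-failed-upload"
    truncated = [s for s in sizes if s < largest * small_ratio]
    if truncated:
        return "possible-failed-upload"
    # All copies are of similar size: more likely a re-release or repack.
    return "possible-rerelease-or-repack"
```

For example, `classify_by_size([1000, 10, 12, 8])` returns `"possible-failed-upload"`, while `classify_by_size([1000, 990, 1005])` returns `"possible-rerelease-or-repack"`. The sizes would come from something like the `ls -lR` listing suggested in the discussion.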
19:59:56 <nirik> pingou: so, for each of those on your list, a 'ls -l' of
the same md5sum one?
20:00:09 <nirik> ls -lR
20:00:13 <smooge> actually even then it could be a bad upload. We had someone
complaining a while back and it turned out to be a bad proxy in front
20:00:31 <nirik> sure, but it might give some more indications.
20:00:38 <pingou> nirik: good idea
20:01:38 <nirik> ok, if nothing else, will close out in a minute.
20:02:03 <willo> quick update from me
20:02:28 <nirik> willo: sure, whats up?
20:02:42 <willo> I'm about half way through collating a list of networks before
I start on the diagrams
20:03:10 <danofsatx-work> willo: sorry I dropped off on helping you with this -
school became harder than I anticipated for this semester.
20:03:22 <nirik> cool. please do ask me if you have questions.
20:03:29 <willo> i'll have something shortly to get input on assumptions
20:03:42 <willo> assumptions about purpose etc
20:03:46 <danofsatx-work> but things are stabilizing, so feel free to ping me for
20:03:47 <willo> nirik: no prob
20:04:00 <willo> danofsatx-work: no probs will do
20:04:18 <nirik> great. ;)
20:04:36 <willo> listing it out in spreadsheet and i'll stick it up on
fedorapeople and ping mailing list
20:04:48 <nirik> ok, thanks for coming everyone! Lets get back to it in
#fedora-admin, #fedora-apps, #fedora-noc.
20:04:55 <nirik> willo: sounds goodly.
20:05:11 <nirik> #endmeeting