pkgdb2 post-mortem and strategy for future deployments

Dennis Gilmore dennis at ausil.us
Wed Jun 4 22:27:44 UTC 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 4 Jun 2014 11:44:54 -0700
Toshio Kuratomi <a.badger at gmail.com> wrote:

> 
> This came up in a different venue and pingou and I have continued to
> talk about it.  Seemed that this was the right place to bring the
> discussion though.
> 
> Some observations:
> 
> * Pkgdb2 and a call for testing in staging was announced well in
> advance of the deployment to production (good) but not everyone
> understood that we were going to be breaking API (bad).
> 
> * There were people inside of fedora infrastructure and outside of
>   infrastructure who were surprised by the API break.  There were
> also some community members and infrastructure members who heeded the
> call for testing and both gave feedback and ported before the
> deployment.
> 
> * There was a FAS2 update that pkgdb2 depended upon.  That was also
> pending in stg for a long time and also had some minor API changes
> (IIRC, all unintentional.  I hotfixed one of them that was simply a
> bug last week). These also caused issues for some scripts.
> 
> * Unexpected problems: we had things that we didn't know used the
> pkgdb API, things that weren't tested in stg because stg couldn't
> replicate that part of production, and things that were ported but
> mistakes caused the ported scripts to not be deployed or to point at
> stg instead of production. I saw that we had the right people on IRC
> throughout the day working on analyzing and patching all of the
> broken things. However, this was somewhat by accident and some of
> those people were surprised that they spent their day doing this.
> 
> 
> Some ideas for doing major deployments in the future:
> 
> 1: We have to make people aware when a new deployment means API
> breaks.
>   * Be clear that the new deployment means API breaks in every call
> for testing.  Send announcements to infrastructure list and depending
> on the service to devel list.
>   * Have a separate announcement besides the standard outage
> notification that says that an API breaking update is planned for
> $date
>   * When we set a date for the new deployment, discuss it at least
> once in a weekly infrastructure meeting.
>   * See also the solution in #3 below

I have a suggestion: make everything provide and validate API info.

This is something koji does. To date we have not broken its API, but
we expect we will when we start on koji-2.0. Each app would provide a
getAPIVersion() function, and the consumers would then validate that
they know how to talk to that API version.
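
A rough sketch of what the consumer-side check could look like. Only
getAPIVersion() comes from the suggestion above; FakeService,
SUPPORTED_API_VERSIONS, UnsupportedAPIVersion and validate_api_version
are made-up names for illustration, and a real consumer would call the
service over its actual transport instead of a local object.

```python
# Versions this hypothetical consumer knows how to talk to.
SUPPORTED_API_VERSIONS = {1}


class UnsupportedAPIVersion(Exception):
    """Raised when the service reports a version the consumer cannot handle."""


class FakeService:
    """Local stand-in for an app such as pkgdb or koji; not a real client."""

    def __init__(self, api_version):
        self._api_version = api_version

    def getAPIVersion(self):
        # The one call every app would provide under this proposal.
        return self._api_version


def validate_api_version(service, supported=SUPPORTED_API_VERSIONS):
    """Return the service's API version, or raise if we cannot speak it."""
    version = service.getAPIVersion()
    if version not in supported:
        raise UnsupportedAPIVersion(
            "service speaks API version %r, this client supports %s"
            % (version, sorted(supported))
        )
    return version
```

With a check like this at startup, a script fails right away with a
clear message on an API break, rather than partway through a run.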

For bodhi, for instance, clients could check for the API version and,
if that check fails, assume version 1. They could then support both
versions 1 and 2, so that when we deploy bodhi2 live, clients like
fedpkg update will transparently switch over.
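
The fallback could be sketched roughly as below. This is only an
illustration: LegacyService, NewService, their update/create_update
methods, detect_api_version and submit_update are all hypothetical
names, not the bodhi or fedpkg API. A real client would catch whatever
"no such method" error its transport raises, rather than
AttributeError.

```python
class LegacyService:
    """Hypothetical old deployment: has no getAPIVersion() call at all."""

    def update(self, package):
        return "v1 update for %s" % package


class NewService:
    """Hypothetical new deployment with a changed call signature."""

    def getAPIVersion(self):
        return 2

    def create_update(self, builds):
        return "v2 update for %s" % ", ".join(builds)


def detect_api_version(service):
    """Ask for the version; treat a missing call as version 1."""
    try:
        return service.getAPIVersion()
    except AttributeError:
        # Assumption: the old service simply lacks the call.
        return 1


def submit_update(service, package):
    """Dispatch to the code path matching the detected API version."""
    version = detect_api_version(service)
    if version == 1:
        return service.update(package)
    if version == 2:
        return service.create_update([package])
    raise RuntimeError("unsupported API version: %r" % version)
```

The same submit_update() call works against either deployment, which
is what lets the client switch over without a coordinated release.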

Dennis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJTj51hAAoJEH7ltONmPFDRZB4P/1R3cdAlSMPatOVoGw8QYgzq
khMZKCaNjQT0V3KI5b4yl0MA6VqUXjlhT9SuscWMuQD8eADkizXbVzyHUQxMAz/F
rlAeezgHss/LRt51TmrHhpBupuzDQKHtzh1wrlz1b6KQgnR7ZuyLF0pgdgN4EaiY
Ydtpj0pg7Bm+4x98TgmfoCcugiLuKNSFVyxYCc4Erkr4YB1BEF6VhGZJ1yvlkq23
hugsNbLx6/oGKg18U5QDKPmwl/AuOSOK7hVz+nHDQmjn9x24l2x/yq9KAN8++RMN
2kfqwM3QEJ+qqMoauZc8JiLxJYCATLk3xmrfUKlKF29D1Wy13riburt+VXmmCxyt
PP8QSRWgTesJzecN7YS58Q8HG8Fu209K9mpVLGExg9KEseAAU+Ccm+B2er02gDTw
VJKI/fP341IerEVtF2BFKd6FHhe3yPpzKpAe7pbOcTOi1rDmZSEI9Z7kuLucK/EE
tQSm5ZyXqgOdAvC3nhvTwm1OSRvjDb4zcg+p+Uwgxw/vs6V9c8cxcmQlGQGv6HDd
0VBKsyzhu1cVWJy0pCJHUa4qeztxTkBN167lISxtivTIGhdaIl/HrnEJjBDmXWCO
FMaM/sWVKxMPTetkFFJ96AoxV1E8qzSAXdq2+CqfVQpgPETMMnUsUO7sfTHXGX2i
UAVScMl0M1PBZos1upa6
=0xdj
-----END PGP SIGNATURE-----

