[top-posting to retain the quoted mail in its entirety]
Thank you Toshio for the detailed information. This does provide some clarity
about the events in the background that eventually led to the decision. Since we
are already into the transition phase, further debate on the 'hows' and 'whys'
and 'so that's what it really was' would not serve much purpose besides adding to
the noise.
However, since we do expect to review the decision and explore the possibility
of returning to a Fedora hosted infrastructure, we may as well ensure that we do
not repeat the mistakes, especially those of communication. Unlike individual
members of the FLP, the FLSCo, an elected group of representatives from the
FLP, did not seem to figure anywhere in all this (unreported) discussion.
As a result, an established communication channel within the FLP was ignored
or left unused.
The 'threat board' is perhaps another good thing to have in the future. With red
signs flashing all over, the FLSCo and FLP could have pushed for discussion much
earlier. It would also allow individual project teams to check back on the
general health of the tools that they use.
Thank you for your patience, and hopefully we can see a much more stable
translation infrastructure in place for the FLP.
regards
Runa
On Friday 25 February 2011 06:14, Toshio Kuratomi wrote:
On Thu, Feb 24, 2011 at 02:08:40PM -0800, Karsten Wade wrote:
>
> Definitely there was some process breakdown and some lack-of-process.
>
> It seems that it wasn't an active open community discussion between
> Infrastructure and affected teams about problems with maintaining
> translate.fp.org and how to fix them. This all could have been discussed
> over the preceding months, presuming that were possible ... and I am going
> to presume it wasn't, that we are here based on a series of best-efforts
> and best-will from people involved. Does that make sense?
>
So... you'll need to contact mmcgrath for exactly where, with whom, and when
previous discussions took place, but discussion had been going on for
quite a while. A lot of people realized there was an issue when this ticket
took 9 months to resolve:
https://fedorahosted.org/fedora-infrastructure/ticket/1455
Enough that gomix, beckerde, other translators, and I discussed the situation
at FUDCon Chile last year and I opened this ticket:
https://fedorahosted.org/fedora-infrastructure/ticket/2277
At the first meeting where we (infra) discussed that ticket, mmcgrath said
that he'd tried but failed to find someone from the l10n team to maintain
transifex in infra. He and I tried to think of someone we knew personally
who could do the job and, via email, got in touch with beckerde, who
agreed to work on packaging and migration of the new transifex.
Since beckerde was new to Fedora packaging and transifex dependencies had
changed, it took a while to get things working, but with the help of other
latam packagers we eventually got packaging sorted out and started to test
upgrades on the infrastructure staging servers. The version that's being
worked on is an older version because newer versions of transifex would
change the translation workflow (the recent talks have made clear that the
change in workflow is mainly for the developers of the translated software
rather than the translators.)
{{A side thread to this story: In the F14 time frame (ticket 1455 time
frame), the translation teams made some valid noise about the problems that
the version of transifex we were running was causing them. At that time,
glezos, stickster, some people from l10n and infra analyzed the possibility
of moving to transifex.net and decided that it was possible but were divided
about the desirability. We were able to upgrade the FI instance of
transifex, which deferred the question.}}
Sometime between the tests on stg and FUDCon, some of infra starts worrying
about how we're going to continue maintaining transifex. beckerde is signing
up to work on packaging the software and deploying it, but we're still running
several versions behind and we don't have in-house knowledge on how to fix
issues in the code. Moving to the newest version is caught in the workflow
change problem: since infra isn't involved in using transifex, none of
us knows what is changing or how to approach l10n about changing their
workflow to accommodate it.
{{Next side thread: mmcgrath leaves as infrastructure leader. This leaves us
shorthanded as we were already running close to capacity and mmcgrath was
also a good community leader who was able to entice and induct volunteers.}}
At FUDCon, smooge, skvidal, ricky, CodeBlock meet with igorps and some other
translators that are present and talk about moving away from a Fedora-hosted
transifex if glezos still wants to host us. We talk about a gradual shift at
that time. Later, infrastructure members talk to glezos on IRC and more with
each other.
{{Third side note: At FUDCon, infrastructure realizes how deeply
overcommitted we are. People at "The Next Big Infrastructure Project" were
excited about a bunch of cool ideas for new services but we realized that to
do any of them, we needed to either get rid of some of our present
responsibilities or get new long-term sysadmins to help out.}}
On IRC, post-FUDCon, we talk with glezos and firm up plans a bit more.
Someone convinces everyone else that moving a little at a time doesn't help
anyone because there's a lot more confusion if we have two separate places to
translate at, two procedures to integrate the translations with software,
etc. Fedora Infrastructure has a meeting where we figure out how we feel
about migrating transifex service to tx.net. We're in favour. glezos to
carry the idea for coordination with the l10n team.
At this point things start moving fast. Content is synced to transifex.net
for a Proof of Concept, meeting with l10n is set up to talk about migrating.
Decision is made to migrate pre-F15 because the FI-hosted transifex is too
painful, coordination with developers who will need to pull the translations
on newer versions of transifex is talked about, duties are assigned for
migrating, setting up teams, packaging the client-side tools for developers,
etc. That brings us pretty much to the present.
What I draw from all this is that:
1) Although communication was present, it didn't involve everyone touched by
the change. That's pretty hard to achieve in any circumstance but we could
try to make it better. From infra's point of view, transifex was being
provided by us to the Fedora l10n team, which is not entirely accurate. So
there needs to be a way to figure out the chain of dependent people.
2) In some ways, people were just waiting on a decision. Fedora
Infrastructure was not able to update to the newest transifex because of
workflow changes. However, once it was decided that Fedora Infrastructure,
therefore, couldn't upgrade (or fix) transifex at all, it was easy and quick
for people on l10n to make a decision about changing workflow to get more
reliable service.
3) It seems that it is harder to recruit people to work on maintenance than
on a new direction. That new direction can even be a continuation of the
current direction (move to a newer transifex on transifex.net) as long as
it's seen as a decision needing certain definite actions.
> Jared and Paul said in the meeting log multiple times that we needed to
> avoid people being surprised by the situation, but unfortunately that
> wasn't very likely. The timing involved made it a certainty that some
> people were going to be squeezed, surprised, and upset.
>
> I perceive the following items as fait accompli before that meeting
> happened:
>
> * Fedora Infrastructure decided they could no longer provide translation
> services.
>
> * To meet their service obligations, FI put in place a plan to move
> translation services to Transifex.net.
>
> * Fedora Project leadership were essentially in agreement with that plan.
>
> I didn't see the following items as fait accompli before that meeting:
>
> * Future of hosting our own translation tool of any kind on
> fedoraproject.org. Clearly we can revisit this decision, clearly people
> wanted to.
>
> * Ability for resourceful people to step forward and make it possible to
> either not move for F15 or move back for F16. For example, you seem to have
> responded to this by seeking new, on-going resource commitments.
>
> * Commitment of L10n leadership toward any particular plan. People came in
> to that meeting with different opinions and agendas, but seemed to come to
> consensus by the end.
>
One note here -- in the meeting skvidal made a statement about what would
happen if we didn't move to transifex.net which was both harsh and true. But
it shows that there's another thing that wasn't quite a fait accompli:
* Infrastructure could have continued to host a transifex instance but we
could not continue to support it (with updates, with fixes for any of the
numerous open bugs, with the ability to make it more reliable, etc).
skvidal's projection was that such a strategy, while possible, would
eventually lead to our instance of transifex preventing work from being done
at a time when nothing could be done about it, making everybody upset.
> Community supported infrastructure doesn't have to be of a lesser service
> level overall, but things tend to be less formal across the board about
> change impact on community participants.
>
> One lesson to pull from this situation is for *every* service provider
> (which includes L10n, Docs, Infrastructure, et al) to have a change
> management process in place that:
>
> 1. Exposes problems (needs for change) early and often.
> 2. Encourages discussion and work so decisions (why, how) are visible to
>    all affected.
> 3. Makes sure affected teams are not surprised by developments.
>
I'd like to see an infrastructure threat board. Here's an idea of how that
would work: The threat board would have services that have no dedicated
person working on them as level red, things that have one or two people
working on them as level amber, fewer than five in yellow, and green at five
or more. A standing policy would be that things in amber or less are in
danger of being dropped and we need to work on how we're going to gracefully
drop, outsource, or reassign or drum up help for them (for instance, we can't
very well drop koji support, so if it were in amber, someone would have to come
across from something else that they were doing to work on that). Perhaps
once a month, we'd post services that are not green to fedora-devel-announce,
and if we didn't get their levels to rise in X amount of time, we'd be free to
enact our contingency plans if we chose.
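
To make the levels concrete, here is a rough Python sketch of how the
classification could work. Only the thresholds come from the description
above; the function names, the sample services, and their head counts are
all made up for illustration, and the real thing could look quite different.

```python
from enum import IntEnum


class Level(IntEnum):
    """Threat levels, ordered so that a lower value means more danger."""
    RED = 0      # nobody dedicated to the service
    AMBER = 1    # one or two people
    YELLOW = 2   # three or four people (fewer than five)
    GREEN = 3    # five or more people


def threat_level(maintainers: int) -> Level:
    """Map a count of dedicated people to a threat level."""
    if maintainers < 0:
        raise ValueError("maintainer count cannot be negative")
    if maintainers == 0:
        return Level.RED
    if maintainers <= 2:
        return Level.AMBER
    if maintainers < 5:
        return Level.YELLOW
    return Level.GREEN


def in_danger(services: dict[str, int]) -> list[str]:
    """Services at amber or less, i.e. candidates for a contingency plan."""
    return sorted(name for name, count in services.items()
                  if threat_level(count) <= Level.AMBER)


def monthly_report(services: dict[str, int]) -> list[str]:
    """One line per service that is not green, most endangered first."""
    ordered = sorted(services.items(),
                     key=lambda item: (threat_level(item[1]), item[0]))
    lines = []
    for name, count in ordered:
        level = threat_level(count)
        if level is not Level.GREEN:
            lines.append(f"{level.name:<6} {name} ({count} dedicated)")
    return lines


if __name__ == "__main__":
    # Hypothetical head counts, not real Fedora Infrastructure numbers.
    sample = {"transifex": 1, "koji": 6, "wiki": 3, "pastebin": 0}
    print("\n".join(monthly_report(sample)))
    print("needs a contingency plan:", ", ".join(in_danger(sample)))
```

The output of monthly_report() is the sort of thing that could be pasted
into the monthly mail. The "X amount of time" deadline is left out because
it hasn't been decided.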
-Toshio