Thanks for your comments.
----- "Dimitris Glezos" <dimitris(a)glezos.com> wrote:
> 2008/9/17 Asgeir Frimannsson <asgeirf(a)redhat.com>:
> > On Tuesday 16 September 2008 23:29:32 Mike McGrath wrote:
> >> > >
> >> > > Please correct me if I'm reading this wrong but I see
> "transifex is
> >> > > great or close to it" and "here's how we're going to build our
> >> > > solution anyway" ?
> >> >
> >> > Yes, "Transifex is great and will continue to serve us".
> >> >
> >> > BUT:
> >> >
> >> > If you look at the state of the art in L10N outside the typical
> >> > projects where PO and Gettext rule, you'll notice we are very
> short on
> >> > areas like:
> >> > - Translation Reuse
> >> > - Terminology Management
> >> > - Translation Workflow and Project Management
> >> > - Integration with CMSs.
> >> > - Richer Translation Tools
> >> >
> >> > This is an effort in narrowing that gap, and I can't see that
> effort work
> >> > by evolving an existing tool from this 'cultural background'.
> Yes, we can
> >> > get some of the way by developing custom solutions for e.g.
> linking wikis
> >> > to Transifex for CMS integration, or using e.g. Pootle for
> >> > translation. But we would still be limited to the core
> architecture of
> >> > the intent of the original developers, which is something that
> >> > would radically slow the project down.
> For the record, I believe these are some fine ideas, which I would
> like to see added to Transifex as features (eg. through plugins). I
> have been discussing most of them with people around conferences for
> the past year. An example: Tx already downloads all the translation
> files from upstream projects, so if someone requests a translation
> file, why not be able to pre-populate it using existing translations
> from all the other projects (translation reuse)?
> Also, I should mention that Transifex isn't (and will never be)
> specific to a particular translation file format (eg. PO) or any
> translation repository. I'd like to support translation of both PO and
> XLIFF files. And also support not only VCSs, but CMSs, wiki pages and
> even arbitrary chunks of text. Transifex's goal is to be a platform to
> help you manage your translations.
For the record (since XLIFF is mentioned and since I'm part of the OASIS XLIFF Technical Committee), I am not aiming to design anything around XLIFF in this project, other than perhaps supporting XLIFF as an import/export format for resources in the same way as we support PO (we do have the odd XLIFF file coming through for translation). I don't think XLIFF (1.2) is mature enough yet as an L10N resource format.
I know there are some big ideas in transifex. In fact, when transifex is mentioned, people often refer to the *goal/idea* of transifex, rather than the actual current implementation. Take plugins, for example: transifex doesn't currently have a plugin system, nor does it have workflow, project management, or any concept of translation resources internally. Transifex today is a simple 'file submission system' with a growing community aiming to build it into something more. With this in mind, 'building on top of transifex' really means redefining what transifex is. For example, 'file submission' should really be a plugin, not a core feature. That means all of transifex today (excluding maybe the login UI) should really be plugins to a core model of projects, people, etc., that currently doesn't exist.
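To make the 'file submission as a plugin' idea a bit more concrete, here is a minimal sketch in Python. It assumes a hypothetical core that only knows about projects; `Core`, `register`, `run` and `submit_file` are names invented for this illustration and are not part of Transifex.

```python
from typing import Any, Callable, Dict


class Core:
    """Hypothetical core model: it knows projects, not what is done to them."""

    def __init__(self) -> None:
        self.projects: Dict[str, Dict[str, Any]] = {}
        self._plugins: Dict[str, Callable[..., Any]] = {}

    def add_project(self, name: str) -> None:
        self.projects[name] = {"name": name, "submissions": []}

    def register(self, action: str, handler: Callable[..., Any]) -> None:
        """Attach a plugin under an action name (e.g. 'submit')."""
        self._plugins[action] = handler

    def run(self, action: str, project: str, **kwargs: Any) -> Any:
        """Dispatch an action on a project to whichever plugin handles it."""
        if action not in self._plugins:
            raise KeyError(f"no plugin registered for action {action!r}")
        return self._plugins[action](self.projects[project], **kwargs)


def submit_file(project: Dict[str, Any], filename: str, content: str) -> int:
    """File submission as a plugin: record the file, return the submission count."""
    project["submissions"].append((filename, content))
    return len(project["submissions"])
```

With something like this, `core.register("submit", submit_file)` followed by `core.run("submit", "demo", filename="fr.po", content="...")` would route a submission through the plugin, and workflow or statistics could be added the same way without touching the core.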
Defining this 'model' of a repository doesn't really depend much on the implementation, and in fact having several implementations might help push this faster and ensure a better solution (if it was on the tx roadmap in the first place). And it's not as if it is impossible for e.g. a Java-based repository to communicate with Transifex for file submissions; isn't that exactly what the remote interface of TX (on the roadmap) is supposed to provide? What I'm hearing is "Don't build something new, continue building on the python/tg/transifex architecture", which is fully understandable. However, considering the cost of developing this on top of tx (re-architecture, convincing everyone that it is the right path to take, immaturity/stability of libraries for e.g. ajax, limited workflow support), I honestly think it's better to have two projects that 'complement' each other. There are more than enough tasks for everyone in the existing Tx roadmap, and the idea is bigger than what a combined development team could accomplish. Diversifying and pulling in good people from e.g. the Java side of things might even help speed things up.
> >> Correct me if I'm wrong though, instead of forking or adapting or
> >> with upstream, you are talking about doing your own thing right?
> > We have a goal of where we want to see L10N infrastructure go, to
> enable us in
> > the future to provide internal (translators paid by Red Hat) and
> > translators with tools to increase their productivity as well as
> better tools
> > to manage the overall L10N process. If there is an 'upstream' that
> > this, or a platform on to which we could develop this, then yes, we
> > consider 'working with upstream' or (in a worst-case-scenario)
> > upstream.
> The Translate Toolkit folks are a very friendly bunch, actively
> maintaining and extending the rich library, and always open to
> suggestions. Maybe some (if not all) of the features could be done in
> TT, and the rest that might not fit there, as Python libraries to
> maximize interoperability and community involvement.
Yes, I know TT very well, and have discussed the library with Dwayne Bailey (the main visionary behind the project) in the past, even before tx was born. In fact, a django-migration of Pootle (built on top of the TT) has been on the agenda for a while, and combining forces with TT is one of the other options I have been strongly considering for a repository (TT e.g. has a file submission library, and there is a lot of duplication between tt and tx). Looking at the svn activity of TT (in my rss reader), it is definitely a project with a 'dangerous' future.
> I also think that Transifex could serve as the "UI" for a lot of
> translation-specific tasks. If there's a library that does X, that
> would help people manage their translations or leverage Transifex's
> strong points of "I read a lot of repositories" and "I write to some
> repositories", then we could provide a web wrapper around it. (eg.
> search for string "X" in all translation files of language "Y", or
> "mark <this> file as a downstream of <that> and send me an msgmerged
> file whenever <that> changes").
> > So to answer your question bluntly, YES - after 4 years involvement in
> > industry and community L10N processes - I believe we can do better.
> > While holding that thought, remember that this is in many ways
> 'middleware', and
> > making use of e.g. the vast amount of knowledge invested in
> Translate Toolkit
> > (file format conversions, build tools, QA) makes sense, and I'm not saying
> > 'forget about all that we have invested in tools so far'.
> It might be my poor English or the fact that I usually read long
> at night, but despite the lengthy descriptions I still don't have a
> clear picture of exactly what problem you'd like to solve, and the
> reasoning behind the decisions being made.
I do understand there is a 'semantic gap' here, and that we do need to provide a better description and demonstration of why a new project is necessary. I do believe everything is theoretically possible to build on top of python/tg and through reuse of concepts in e.g. tx and TT, but I honestly believe if we are going to manage and drive the development effort in this, it is more worthwhile to expand beyond the fedora/python community, and use tools that the core developers would be more comfortable and productive with. This is not a 'we think you guys should develop this' request, we are taking ownership of the project, as well as inviting anyone that is interested in the community to participate and take ownership.
> Don't take me wrong -- I think there are some good ideas. But I feel
> it would be too bad if you guys didn't invest on top of existing
> (TT for file formats, Transifex for file operations and UI, OmegaT
> translation memory) or just isolate specific solutions that don't fit
> into other projects in well-defined libraries (do one thing, do it
> right). Sure, it takes a lot more effort to work *with* other people,
> but it is usually worth it. :-)
This is *not* about an effort to avoid working with people. It is an effort to get more people working on this. I know more people in the Java community who are or might be interested in an open source solution for these problems than in the Python/Fedora/TG community. Add to this a portion of my natural bias towards Java, and the fact that the people who would be working on this would initially be much more productive in Java than in Python (TG2 or django).
Please take the fact that we are throwing this idea out to the fedora/tx community early as a sign that we are trying to work with the community, rather than simply developing something on our own. I for one will continue being involved with Tx to some degree, and help out where I can. L10N is an area with a lot of room for improvement, and one that has sadly been to some extent 'neglected' except for Dimitris' recent work. We still have a long way to go before we have what I would call an L10N infrastructure that serves translators well.
Hi infrastructure wranglers,
Over the last few months, a few of us involved in Red Hat L10N engineering
have discussed how to best ensure we have Localisation Infrastructure and
Tools that can serve the needs of Red Hat, JBoss, Fedora and 'upstream'
communities in years to come. Let me first describe some of the background and
requirements behind this project:
Up until now, we have managed translations through version control systems
such as CVS, Svn and Git. This has ensured that all contributions are pushed
upstream, as we always store translations within the upstream repositories and
projects. 'Damned Lies' further gave us a tool to view language-specific
translation statistics for modules, branches and releases, as well as
convenient information about people, teams and projects. This has been a great
help for translators in their work. Dimitris' (and others') work on Transifex
has in addition given the translation community a way to submit translations
upstream without ever touching a developer-centric version control system,
which has been of great help to translators.
Some of the immediate needs that could be addressed within the existing
framework (some of which are on the Transifex roadmap) are:
- Consolidation of Damned Lies and Transifex, allowing retrieving and
submitting translations through the same interface
- Allowing retrieving and submitting multiple-files at once (e.g. for
translating a publican document with many PO files)
- Simple workflow on top of Transifex (porting features from Vertimus)
- Better usability and easier user registration process (Fedora specific)
Transifex is gaining some traction upstream (e.g. within Gnome), and we hope
development will continue strong, serving Fedora and potentially other
Looking at the bigger picture, some of the core requirements we have identified
for Red Hat and community L10N going forward are:
- Customizable Translation Workflows and integration with e.g. Content
- Infrastructure easily adaptable to support new File formats and project
types (e.g. OpenOffice formats, CMS formats, DTP formats, Wiki, Dita, Java
formats), rather than relying on 'upstream' projects to fit a certain L10N
- Managing the life-cycle of a translation project across releases and
- Translation Reuse and Terminology Management across projects and iterations
- Job management, scoping, tracking and resourcing
- Managing and/or Tracking upstream translation projects, pushing changes back
These requirements call for a system where the translation lifecycle would be
managed within 'Translation Repositories' (similar to e.g. Pootle or Launchpad
Translations), rather than directly through e.g. upstream version control
systems. With a repository-based approach, we would be able to track and
manage changes to a project on a translation unit level, and manage e.g.
translation reuse and terminology within and across projects. We could still
retain a link with upstream repositories (like with Transifex/Damned Lies).
However, this would not be the 'core datamodel', but on a different layer
through plug-ins. This link to external repositories could also go beyond
traditional version control systems, communicating with external sources like
wikis and CMSs.
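As a rough illustration of what tracking at the translation unit level could enable, here is a small Python sketch of exact-match translation reuse across projects. Everything in it is an assumption made for the example: `TranslationRepository`, `store`, `suggest` and `prepopulate` are invented names, and a real system would need fuzzy matching, context and terminology handling on top of this.

```python
from typing import Dict, Iterable, List, Tuple


class TranslationRepository:
    """Hypothetical store keyed by translation unit rather than by file."""

    def __init__(self) -> None:
        # (project, source text, language) -> translated text
        self._units: Dict[Tuple[str, str, str], str] = {}

    def store(self, project: str, source: str, lang: str, target: str) -> None:
        self._units[(project, source, lang)] = target

    def suggest(self, source: str, lang: str) -> List[str]:
        """Exact-match translations of `source` from every project, in insertion order."""
        return [
            target
            for (_, src, lng), target in self._units.items()
            if src == source and lng == lang
        ]

    def prepopulate(
        self, project: str, sources: Iterable[str], lang: str
    ) -> Dict[str, str]:
        """Fill a project's units from existing work; unknown strings stay empty."""
        filled: Dict[str, str] = {}
        for source in sources:
            own = self._units.get((project, source, lang))
            if own is not None:
                filled[source] = own
                continue
            candidates = self.suggest(source, lang)
            filled[source] = candidates[0] if candidates else ""
        return filled
```

For example, once "Cancel" has been translated to French in one project, `prepopulate` for a second project would return that translation for "Cancel" and an empty string for any source text not yet seen anywhere.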
We have evaluated a number of existing open source L10N frameworks and
systems, but haven't found any (yet) that stands out or satisfies our needs or
requirements as a development platform. Technology-wise, we are aiming to
develop a Java-based(!) system, using technology such as JBoss Seam,
Hibernate, jBPM and RichFaces. A Java-based platform will enable us to make
best use of internal expertise in these technologies, as well as making use of
technology we are developing (as open source) through collaboration with
partners in the L10N industry.
We hope some of these requirements and ideas will excite some of you, and
ultimately lead to something that can be of use to open source communities.
While we have certain requirements and goals for this internally within the
company, there is no need for this to be an 'internal' Red Hat project, and
most of the requirements and needs overlap with those of community projects
like Fedora. In other words, by developing this in collaboration with the
community from a very early stage, we are more likely to develop something
that may be of use to the greater community.
Thoughts and comments, all sorts of comments, are very welcome.
(Senior Software Engineer, I18N Engineering, Red Hat APAC)
Some of you have seen the disk alerts on app2. Looking more closely, it
seems the host was not built with enough disk space (as app1 was). So
after the freeze is over I'll rebuild it.
It does raise a point about storage for transifex though. Basically each
host running transifex (or damned lies, I can't quite remember which)
keeps a local copy of every scm as part of its usage. For performance
reasons I don't think that will change but it's something we'll want to
figure out long term. I haven't done the research but in my brain it
seems like running something like git/hg/svn/bzr over nfs will cause
On the other hand, these aren't upstream repos but local cache so I'm also
curious what the harm would be, if they get borked one could just delete
the cache and it would repopulate. Thoughts?
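For what it's worth, the "delete the cache and let it repopulate" idea could look roughly like the Python sketch below. It is only an illustration under assumptions: `ensure_cache`, `is_healthy` and `populate` are made-up names, and in practice `populate` would be whatever checkout or clone command the given SCM needs.

```python
import shutil
from pathlib import Path
from typing import Callable


def ensure_cache(
    path: Path,
    is_healthy: Callable[[Path], bool],
    populate: Callable[[Path], None],
) -> bool:
    """Make sure a local SCM cache exists and is usable.

    Returns True if the cache had to be (re)built, False if it was fine.
    """
    if path.exists() and is_healthy(path):
        return False
    if path.exists():
        # The cache is not an upstream repo, so throwing it away is safe.
        shutil.rmtree(path)
    populate(path)
    return True
```

The point of passing `is_healthy` and `populate` in as functions is that the same logic would work whether the cache sits on local disk or on NFS, and regardless of which SCM is behind it.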
Josh Boyer wrote:
> On Sat, Sep 20, 2008 at 12:15:46AM +0530, Rahul Sundaram wrote:
>> I have. My point still stands. If a new Fedora meeting is being held, it
>> should not be planned and organized as a secret. It should be announced
> So those Red Hat budget meetings that control the financial fate of Fedora...
> those should be planned in public too right?
No, but the meeting minutes are not published publicly for such meetings
either. The Fedora specific budget details are however published at
This I consider to be a very good thing.
> Or the meetings that were held
> to discuss the outage crisis?
Here there might have been reasons for secrecy and it would be better to
let people know that such a meeting is being held even then. I don't see
any such requirement for release planning.
> Should Paul publish his work calendar, given
> that he's the Fedora project lead and has Fedora meetings all the time?
Since all the other public Fedora meetings are well known, I don't need
his work calendar.
> Seriously, everything is not black and white. You're making entirely too
> big of a deal about this because you didn't get invited and someone pissed
> in your lemonade
Whether I get invited or not is irrelevant and I don't drink lemonade
much ;-) I never said I should be invited even once. There is no need to
be so petty. You might want to read what I said completely. My
issue is with the way in which this meeting was organized and planned in
secret and just that.
Josh Boyer wrote:
> On Fri, Sep 19, 2008 at 10:13:32PM +0530, Rahul Sundaram wrote:
>> Josh Boyer wrote:
>>> On Fri, Sep 19, 2008 at 09:07:16PM +0530, Rahul Sundaram wrote:
>>>>>> Wouldn't it be useful to invite more than one person as part of the
>>>>>> different groups? Currently it seems a number of people have not
>>>>>> attended which leaves that group's voice unheard.
>>>>> I'm pretty sure the meeting was public.
>>>> Was the invitation public?
>>> I don't know. It doesn't really matter though. See the part of my email
>>> that you cut off in your reply.
>> I think it does matter. If new meetings for Fedora are being organized,
>> invitations MUST be sent publicly. I would want to know who is
>> organizing it, where it was held, which date and time, whether others
>> are attending etc. If it is an IRC meeting, where are the logs? You said
>> you are sure the meeting was public. Why do you think that?
> I was wrong in thinking it was public. It was a phone call, not an IRC
> meeting. Still does not matter, and you still apparently haven't read the
> second part of my original reply.
I have. My point still stands. If a new Fedora meeting is being held, it
should not be planned and organized as a secret. It should be announced
ahead of time in public so interested people can participate. Even if
you don't want others to participate, you can still organize it
publicly. The fact that the people someone decided to pick on their own
in private, can pull in others doesn't change this.
On Fri, 2008-09-19 at 12:53 -0400, Josh Boyer wrote:
> I was wrong in thinking it was public. It was a phone call, not an IRC
Just for more fun and confusion, a meeting doesn't have to be on IRC for
it to be "public".
While this meeting wasn't announced on one of the major lists (for good
reason), it was somewhat assumed that if a leader for a group couldn't
make it that they would have somebody else go in their stead. Perhaps
we'll call that out a bit more clearly next time. These people were
sent mail multiple times leading up to the meeting so there was plenty
of chance to find an alternative.
The release readiness meetings are designed to be very high bandwidth
information exchanges between the various groups involved with doing
releases. Obviously the later releases (preview, final) are more
important and have more people involved than the earlier (alpha, beta)
It's an exchange of information that each group should already have
thought through and discussed in lower-bandwidth, higher-visibility
meetings within each group. The readiness meeting is just like a mini
mission control meeting to ensure things go off without a hitch and that
those leading the groups and responsible for the functions are aware of
what's going on.
If nobody from your group showed up, I'm sorry, but we gave them ample
time to find a replacement. You can be prepared as these meetings will
come up for each major milestone during our development cycle. If you
want, we can probably embed this information into the schedule pages
that John creates, it's not like those aren't busy enough as they are (:
Fedora -- Freedom² is a feature!
Josh Boyer wrote:
> On Fri, Sep 19, 2008 at 09:07:16PM +0530, Rahul Sundaram wrote:
>>>> Wouldn't it be useful to invite more than one person as part of the
>>>> different groups? Currently it seems a number of people have not
>>>> attended which leaves that group's voice unheard.
>>> I'm pretty sure the meeting was public.
>> Was the invitation public?
> I don't know. It doesn't really matter though. See the part of my email
> that you cut off in your reply.
I think it does matter. If new meetings for Fedora are being organized,
invitations MUST be sent publicly. I would want to know who is
organizing it, where it was held, which date and time, whether others
are attending etc. If it is an IRC meeting, where are the logs? You said
you are sure the meeting was public. Why do you think that?
Fedora 10 Beta Release Planning Meeting
== Invitees ==
Jonathan Roberts -- Marketing
Karsten Wade -- Docs
Jesse Keating -- Rel Eng (present)
Paul Frields -- FPL (present)
Spot Callaway -- FEM (present)
John Poelstra -- Organizer (present)
Mike McGrath -- Infrastructure (present)
Dimitris Glezos -- Translation (present)
Máirín Duffy -- Art (present)
Will Woods -- QA
James Laska -- QA (present)
Ricky Zhou -- Websites (present)
Bill Nottingham -- Development
== Meeting Goals ==
o not intended to be a lengthy meeting
o opportunity to quickly go around the "virtual room" to make sure we
are all ready for Beta release day
o make it easier for us to communicate in real-time across teams on the
o John Poelstra also trying to flesh out the taskjuggler schedules to
reflect the reality of what needs to get done so that future schedules
o Invited a representative from each Fedora group with a role in getting
the release out the door
== Meeting Notes ==
=== Release Engineering ===
o Release Engineering is expecting the following on release day
1) announcement with link to page with information about the beta
2) landing zone to drive all the users to
--Mike M notes that this page http://get.fedoraproject.org should
be used going forward for all releases
o All tasks and content for the beta release (for all teams) should be
ready the day before release--Monday, September 22, 2008
--need to do a dry run or validate that things (links, content, etc.)
are in working order before announcing
--meet 1 hour before release and try things out
o Future release cycles could be smoother with a buffer between feature
freeze and beta freeze
--considering during Fedora 11 planning
o Consider proposing a set of earlier dates for core packages in the
distro to solidify the release earlier. For example: anaconda, yum,
o Consider moving feature freeze to one week before beta freeze so that
features don't crash land on the beta freeze date
=== Documentation ===
o docs team responsible for content of release notes
o single page release notes
--do not get translated
o will assist release engineering to help with release announcement
o Docs team will need more community participation and help to get the
installation guide ready
--past contributors will not be able to be as involved this release cycle
--waiting for git repo to get online; Mike will help coordinate as needed
=== Art Team ===
o Art team to create banner to point to beta
o Art team is still deciding on a final theme and content will not be
ready in time to package for the beta
--in the future we will target getting new artwork ready and packaged
by feature freeze
o Art team will work on a count-down timer to be ready at the release of
the Preview Release
=== Web Sites Team ===
o websites team responsible for presentation of get.fedoraproject.org
=== Marketing & PR ===
o Paul is working with RHT PR to create press release blog
o Also need to do coordination with Jonathan Roberts and the Fedora
=== Translation ===
o Translation deadline pulled in by one week to provide one week between
the translation deadline and the final development freeze
o Translation of the release notes starts with the preview release
o Dimitris is trying to get Red Hat to commit internal resources to help
translate release notes for languages that RHT supports
--15 languages now supported by Fedora could increase to 30 or 35
o Considered how we could make sure that translations are repacked
before the final devel freeze?
--use tracker bugs?
--Spot is willing to help patrol and look for things that are missing
== Next Meeting ==
o Prepare for Preview Release
o Wednesday, October 22, 2008
o 17:00 UTC