For at least the last year or so, we've been looking for ways to make
docs.fedoraproject.org (d.fp.o) more usable for our readers and less
onerous for our contributors. At the last docs meeting,[0] nb outlined a
proposal to replace the current CVS mechanism, and I presented a brief
demo of what publishing looks like with the forthcoming Publican 2.0.[1]
To further the discussion, I thought I'd present the options a bit more
completely here.
Sorry this is so very wordy! I threw together some graphics to accompany
this post that I hope will make it a bit more digestible.
Please have a read and share your thoughts.
Cheers
Rudi
================================
OPTION 0 -- What we have now -- CVS + PHP
================================
1. Writers[2] pull down the source code of a doc from a repo (mostly
git) and at least some part of the d.fp.o site, which is stored in a CVS
repo.
Unless you're working on multiple docs, you only need the directory that
corresponds to your book, not the entire site.
2. Writers build the book from source on their local machines, and copy
the Publican output into their local copy of the d.fp.o directory for
that book. They also hand-edit PHP files as needed to provide the index
pages.
3. Writers perform CVS voodoo to add and commit the files back up to the
CVS repo, and tag the files LIVE. Within a few hours, the web server
picks up any files tagged LIVE in the CVS repo and publishes them.
PROBLEMS:
1. Editing the PHP index files by hand is onerous and seriously prone to
error. Everyone who has worked on this has routinely made mistakes
keeping track of which book in which format in which language goes where.
2. The CVS process is notoriously temperamental, unforgiving, and prone
to error. Mistakes are frequent, and correcting them is difficult and
frustrating (and in some cases, not even possible).
================================
OPTION 1 -- Proposal from Infra -- git + static HTML
================================
1. Writers pull down the source code of a doc from a repo (mostly git)
and a git repo containing the entire d.fp.o website.
2. Writers build the book from source on their local machines, and copy
the Publican output into the local copy of the git directory for that
book. They also hand-edit HTML files as needed to provide the index pages.
3. Writers commit the files back up to the d.fp.o git repo. Within a few
hours, the web server synchronises with the repo and publishes the new
files.
ADVANTAGES:
1. Replacing CVS with git eliminates the single greatest source of
failure and frustration.
2. The tools available to edit static HTML index files are more powerful
and more user-friendly than editing PHP in a text editor, so mistakes
are less likely to occur.
DISADVANTAGES:
1. The initial git clone is massive -- something like 2 GB -- and this
will only grow bigger over time.
2. Indexes are still built by hand.
================================
OPTION 2 -- Publican
================================
Note: a technology demonstration is available at
http://publictest8.fedoraproject.org/fedoradocs/public_html -- hurried,
and slightly broken in places.
1. Writers ssh into a remote system (let's call it the "build system")
and pull down the source code of a doc from a repo (mostly git) onto
that system.
2. Writers build the book with Publican on the build system, and run
the "publican install_book" command. Publican adds the book to a local
copy of d.fp.o and automatically rebuilds the site index.
3. At regular intervals, any new or changed files in the version of
d.fp.o on the build server are copied to the web server, using scp or
rsync or something similar, initiated either on the build server end or
the web server end.[3]
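
Step 3 could be as small as a single rsync call run at regular
intervals. Here is a minimal sketch in Python, assuming rsync is
installed on the build system. The function names, the paths, and the
host name are hypothetical placeholders, not real Fedora infrastructure.

```python
import subprocess


def build_sync_command(site_dir, destination):
    """Return an rsync command that copies the site to the web server."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = site_dir.rstrip("/") + "/"
    # -a recurses and preserves permissions and timestamps;
    # -z compresses data in transit. By default rsync skips files
    # whose size and modification time already match.
    return ["rsync", "-az", source, destination]


def sync_site(site_dir, destination):
    """Run the sync; raises CalledProcessError if rsync fails."""
    subprocess.run(build_sync_command(site_dir, destination), check=True)


if __name__ == "__main__":
    # Hypothetical locations: print the command instead of running it.
    cmd = build_sync_command("/srv/docs/public_html",
                             "webserver.example.org:/var/www/html/")
    print(" ".join(cmd))
```

The same command would work if the web server initiated the copy
instead, with the remote and local ends swapped.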
ADVANTAGES:
1. Writers are finally completely free from the two most onerous and
error-prone parts of the publication process: using a version-control
system as a publishing tool, and having to rebuild indexes by hand. Even
if a book itself is misconfigured, screwups are contained to a single
book in a single version in a single language, unlike some incidents
we've had with hand-edited PHP.
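
To illustrate why a generated index avoids the "which book in which
format in which language goes where" mistakes, here is a rough sketch
in Python of rebuilding a table of contents from what is on disk. This
is not Publican's implementation. The directory layout
(language/product/version/book) and both function names are
assumptions made for the example.

```python
from pathlib import Path


def collect_books(site_root):
    """Return sorted (language, product, version, book) tuples found on disk."""
    root = Path(site_root)
    entries = []
    # Assumed layout: <root>/<language>/<product>/<version>/<book>/
    for book_dir in root.glob("*/*/*/*"):
        if book_dir.is_dir():
            entries.append(book_dir.relative_to(root).parts)
    return sorted(entries)


def render_index(entries):
    """Render a plain-text table of contents, one line per book."""
    lines = []
    for lang, product, version, book in entries:
        lines.append(f"{lang}: {product} {version} - {book}")
    return "\n".join(lines)
```

Because the index is derived from the directories themselves, a
misplaced book shows up in the wrong spot in the listing but cannot
break the entries for any other book.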
DISADVANTAGES:
None of which I'm aware. However, we'll need some new infrastructure in
place, and we'll all no doubt make mistakes while getting used to a new
way of doing things.
================================
OPTION FOR TOMORROW -- Publican + Koji
================================
1. Writers pull down the source code of a doc from a repo (mostly git).
2. Writers run "publican package" on their local systems with an option
that tells Publican to upload the SRPM package to a Koji instance. Koji
builds the package and places it in a repo.
3. At regular intervals, the web server pulls any new docs packages from
the repo and installs them. Documentation packages built with Publican
automatically install into subdirectories of /var/www/html/ and
regenerate any Publican site index present.
ADVANTAGES:
1. Publishing is completely automated. This is actually the job that
Publican was designed to do and pretty much how it's been running
internally within Red Hat for years. Writers and translators can focus
on writing and translating and leave everything else to the machines.
2. Since the documentation is going to a repo, Fedora users world-wide
could yum install docs packages from this repo. We might also be able to
take advantage of the new langpacks feature in yum to help users find
documentation in their own language.
DISADVANTAGES:
No disadvantages of which I'm aware, but the current stumbling blocks to
implementing such a system right now are:
1. Publican doesn't yet support sending packages to arbitrary Koji
instances! So this would require some code changes to Publican. Since
Publican already supports Red Hat's Brew system, I suspect these are
relatively minor.
2. We would need a Koji instance on which to build docs. Writers and
translators who are trusted to publish docs would need to be approved as
packagers on that Koji instance. In a perfect world, this would be the
same Koji instance that the Fedora Project uses already to build
software packages. However, most writers and translators do not need to
or want to package software. We would need some administrative changes
within the project that would approve certain writers and translators to
build docs packages only. We would also likely need a different package
approval process for docs, as docs packages would quickly overwhelm the
existing system.[4] Failing to get these admin changes, we would need a
separate Koji instance somewhere, and need Infra to maintain it for us.
================================
Notes
================================
[0] minutes --
http://lists.fedoraproject.org/pipermail/docs/2010-April/012130.html
[1] Publican 2.0 is still under heavy development, but I have (unsigned)
packages for Publican 1.99 and its web component available from my
fedorapeople page:
http://rlandmann.fedorapeople.org/publican/ and you
can find instructions for the new web stuff on Jeff Fearn's page:
http://jfearn.fedorapeople.org -- please download and experiment!
However, if you're involved in building or publishing F13 docs, please
do not use Publican 1.99 for this, and in particular, do *not* update
POT or PO files for current work with Publican 1.99!
[2] I say "writers" here, but the same applies to translators, although
to date, translators have not generally been involved in publishing
their work. One advantage of rethinking how we publish is to allow
translators and language teams direct control over their work, without
making them rely on writers as "gatekeepers" for their docs.
[3] Technically, there's no reason why the build server and web server
need to be two separate machines, or why the "publican install_book"
command couldn't write directly to a publicly accessible directory on
the web server. I've suggested separating them mainly out of paranoia,
but maybe I'm being overly cautious?
[4] Hopefully in the context of this whole workflow, Publican's
insistence on packaging languages one-at-a-time and on including version
numbers in package names makes more sense -- it's these distinctions
that produce the table of contents for the site. It also illustrates why
a different package approval process would be necessary for docs -- 15
books in up to 12 regularly-translated languages is up to 180 new
packages for each Fedora release...
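
The arithmetic in note [4] can be checked with a short Python sketch.
The book and language lists are placeholders, and the
"book-release-language" name pattern is only an illustration of one
package per book per language per release, not Publican's actual
naming scheme.

```python
from itertools import product


def docs_package_names(books, languages, release):
    """Return one hypothetical package name per (book, language) pair."""
    return [f"{book}-{release}-{lang}"
            for book, lang in product(books, languages)]


# Placeholder data: 15 books and 12 languages, as in note [4].
books = [f"Book_{n}" for n in range(1, 16)]
languages = [f"lang{n:02d}" for n in range(1, 13)]

names = docs_package_names(books, languages, "13")
print(len(names))  # 15 * 12 = 180
```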