docs.fedoraproject.org redesign -- very long!

Sat Apr 24 11:11:13 UTC 2010

For at least the last year or so, we've been looking for ways to make 
docs.fedoraproject.org (d.fp.o) more usable for our readers and less 
onerous for our contributors. At the last docs meeting,[0] nb outlined a 
proposal to replace the current CVS mechanism, and I presented a brief 
demo of what publishing looks like with the forthcoming Publican 2.0.[1]

To further the discussion, I thought I'd present the options a bit more 
completely here.

Sorry this is so very wordy! I threw together some graphics to accompany 
this post that I hope will make it a bit more digestible.

Please have a read and share your thoughts.

Cheers
Rudi

================================
OPTION 0 -- What we have now -- CVS + PHP
================================

1. Writers[2] pull down the source code of a doc from a repo (mostly 
git) and at least some part of the d.fp.o site, which is stored in a CVS 
repo. Unless you're working on multiple docs, you only need the directory 
that corresponds to your book, not the entire site.

2. Writers build the book from source on their local machines, and copy 
the Publican output into their local copy of the d.fp.o directory for 
that book. They also hand-edit PHP files as needed to provide the index 
pages.

3. Writers perform CVS voodoo to add and commit the files back up to the 
CVS repo, and tag the files LIVE. Within a few hours, the web server 
picks up any files tagged LIVE in the CVS repo and publishes them.

PROBLEMS:
1. Editing the PHP index files by hand is onerous and seriously prone to 
error.  Everyone who has worked on this has routinely made mistakes 
keeping track of which book in which format in which language goes where.

2. The CVS process is notoriously temperamental, unforgiving, and prone 
to error. Mistakes are frequent, and correcting them is difficult and 
frustrating (and in some cases, not even possible).

================================
OPTION 1 -- Proposal from Infra -- git + static HTML
================================

1. Writers pull down the source code of a doc from a repo (mostly git) 
and a git repo containing the entire d.fp.o website.

2. Writers build the book from source on their local machines, and copy 
the Publican output into the local copy of the git directory for that 
book. They also hand-edit HTML files as needed to provide the index pages.

3. Writers commit the files back up to the d.fp.o git repo. Within a few 
hours, the web server synchronises with the repo and publishes the new 
files.

ADVANTAGES:
1. Replacing CVS with git eliminates the single greatest source of 
failure and frustration.

2. The tools available for editing static HTML index files are more 
powerful and more user-friendly than a text editor is for PHP, so 
mistakes are less likely to occur.

DISADVANTAGES:
1. The initial git clone is massive -- something like 2 GB -- and this 
will only grow bigger over time.

2. Indexes are still built by hand.

================================
OPTION 2 -- Publican
================================
Note: a technology demonstration is available at 
http://publictest8.fedoraproject.org/fedoradocs/public_html (hurried, 
and slightly broken in places).

1. Writers ssh into a remote system (let's call it the "build system") 
and pull down the source code of a doc from a repo (mostly git) onto 
that system.

2. Writers build the book with Publican on the build system, and run 
the "publican install_book" command. Publican adds the book to a local 
copy of d.fp.o and automatically rebuilds the site index.

3. At regular intervals, any new or changed files in the version of 
d.fp.o on the build system are copied to the web server, using scp or 
rsync or something similar, initiated either on the build system end or 
the web server end.[3]
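
Step 3 above could be as simple as a scheduled rsync run. Here is a 
minimal sketch in Python of what such a job might look like. The source 
directory, the destination host, and the account name are all 
hypothetical placeholders; the real values would be up to Infra.

```python
import subprocess

# Hypothetical locations; the real paths and host would be chosen by Infra.
SITE_DIR = "/srv/docs-build/public_html/"
DEST = "docs-sync@webserver.example.org:/var/www/html/"


def build_sync_command(src=SITE_DIR, dest=DEST, dry_run=False):
    """Return the rsync argument list that copies new or changed files."""
    cmd = ["rsync", "--archive", "--compress"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying
    return cmd + [src, dest]


def sync_site(dry_run=True):
    """Run rsync; raises CalledProcessError if rsync exits non-zero."""
    return subprocess.run(build_sync_command(dry_run=dry_run), check=True)


if __name__ == "__main__":
    # Only print the command here; call sync_site() to actually run it.
    print(" ".join(build_sync_command(dry_run=True)))
```

Running the same command from cron on either machine would cover the 
"initiated on either end" case; only the direction of src and dest 
changes.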

ADVANTAGES:
1. Writers are finally completely free from the two most onerous and  
error-prone parts of the publication process: using a version-control 
system as a publishing tool, and having to rebuild indexes by hand. Even 
if a book itself is misconfigured, screwups are contained to a single 
book in a single version in a single language, unlike some incidents 
we've had with hand-edited PHP.
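
To illustrate why a generated index removes a whole class of mistakes, 
here is a rough sketch of deriving a table of contents purely from the 
directory layout. The language/product/version/book layout and the 
function name are assumptions made for illustration only; this is not 
Publican's actual code.

```python
from pathlib import Path


def build_index(site_root):
    """Map language -> product -> version -> list of book names.

    Assumes a hypothetical layout: <root>/<lang>/<product>/<version>/<book>/
    """
    index = {}
    root = Path(site_root)
    for book_dir in sorted(root.glob("*/*/*/*")):
        if not book_dir.is_dir():
            continue  # ignore stray files at the book level
        lang, product, version, book = book_dir.relative_to(root).parts
        versions = index.setdefault(lang, {}).setdefault(product, {})
        versions.setdefault(version, []).append(book)
    return index


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("en-US/Fedora/13/Installation_Guide",
                    "de-DE/Fedora/13/Installation_Guide"):
            (Path(tmp) / rel).mkdir(parents=True)
        print(build_index(tmp))
```

Because the index is rebuilt from what is actually installed, a 
misplaced book can only ever affect its own entry.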

DISADVANTAGES:
None of which I'm aware. However, we'll need some new infrastructure in 
place, and we'll all no doubt make mistakes while getting used to a new 
way of doing things.

================================
OPTION FOR TOMORROW -- Publican + Koji
================================

1. Writers pull down the source code of a doc from a repo (mostly git).

2. Writers run "publican package" on their local systems with an option 
that tells Publican to upload the SRPM package to a Koji instance. Koji 
builds the package and places it in a repo.

3. At regular intervals, the web server pulls any new docs packages from 
the repo and installs them. Documentation packages built with Publican 
automatically install into subdirectories of /var/www/html/ and 
regenerate any Publican site index present.

ADVANTAGES:
1. Publishing is completely automated. This is actually the job that 
Publican was designed to do and pretty much how it's been running 
internally within Red Hat for years. Writers and translators can focus 
on writing and translating and leave everything else to the machines.

2. Since the documentation is going to a repo, Fedora users world-wide 
could yum install docs packages from this repo. We might also be able to 
take advantage of the new langpacks feature in yum to help users find 
documentation in their own language.

DISADVANTAGES:
No disadvantages of which I'm aware, but the stumbling blocks to 
implementing such a system right now are:

1. Publican doesn't yet support sending packages to arbitrary Koji 
instances! So this would require some code changes to Publican. Since 
Publican already supports Red Hat's Brew system, I suspect these are 
relatively minor.

2. We would need a Koji instance on which to build docs. Writers and 
translators who are trusted to publish docs would need to be approved as 
packagers on that Koji instance. In a perfect world, this would be the 
same Koji instance that the Fedora Project uses already to build 
software packages. However, most writers and translators neither need 
nor want to package software. We would need some administrative changes 
within the project that would approve certain writers and translators to 
build docs packages only. We would also likely need a different package 
approval process for docs, as docs packages would quickly overwhelm the 
existing system.[4] Failing to get these admin changes, we would need a 
separate Koji instance somewhere, and need Infra to maintain it for us.

================================
Notes
================================

[0] minutes -- 
http://lists.fedoraproject.org/pipermail/docs/2010-April/012130.html

[1] Publican 2.0 is still under heavy development, but I have (unsigned) 
packages for Publican 1.99 and its web component available from my 
fedorapeople page: http://rlandmann.fedorapeople.org/publican/ and you 
can find instructions for the new web stuff on Jeff Fearn's page: 
http://jfearn.fedorapeople.org -- please download and experiment! 
However, if you're involved in building or publishing F13 docs, please 
do not use Publican 1.99 for this, and in particular, do *not* update 
POT or PO files for current work with Publican 1.99!

[2] I say "writers" here, but the same applies to translators, although 
to date, translators have not generally been involved in publishing 
their work. One advantage of rethinking how we publish is to allow 
translators and language teams direct control over their work, without 
making them rely on writers as "gatekeepers" for their docs.

[3] Technically, there's no reason why the build server and web server 
need to be two separate machines, or why the "publican install_book" 
command couldn't write directly to a publicly accessible directory on 
the web server. I've suggested separating them mainly out of paranoia, 
but maybe I'm being overly cautious?

[4] Hopefully in the context of this whole workflow, Publican's 
insistence on packaging languages one-at-a-time and on including version 
numbers in package names makes more sense -- it's these distinctions 
that produce the table of contents for the site. It also illustrates why 
a different package approval process would be necessary for docs -- 15 
books in up to 12 regularly-translated languages is up to 180 new 
packages for each Fedora release...
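
The arithmetic behind that figure, as a quick sketch. The book and 
language names below are placeholders rather than the real Fedora lists, 
and the name format is only illustrative, not Publican's exact naming 
scheme.

```python
from itertools import product

# Placeholder names standing in for 15 books and 12 languages.
books = [f"Book_{n}" for n in range(1, 16)]
languages = [f"lang-{n}" for n in range(1, 13)]


def package_names(books, languages, version="13"):
    """One package per book per language, with the version in the name."""
    return [f"{book}-{version}-{lang}"
            for book, lang in product(books, languages)]


names = package_names(books, languages)
print(len(names))  # 15 books x 12 languages = 180
```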

Attachments (image/png):
optionfuture.png -- http://lists.fedoraproject.org/pipermail/docs/attachments/20100424/7dc9e54e/attachment-0004.png
option2.png -- http://lists.fedoraproject.org/pipermail/docs/attachments/20100424/7dc9e54e/attachment-0005.png
option1.png -- http://lists.fedoraproject.org/pipermail/docs/attachments/20100424/7dc9e54e/attachment-0006.png
option0.png -- http://lists.fedoraproject.org/pipermail/docs/attachments/20100424/7dc9e54e/attachment-0007.png