This evening we opened up dist-git for business. Dist-git is our git-based replacement for dist-cvs, the source control system we were using for our package specs, patches, and source files. This has been a long time coming and a massive effort. I want to take a little time here to outline what we've done and where we are going.
First, a brief outline of how our CVS system worked. CVS is a daemon of sorts, and all repos typically live within a single CVSROOT. Within this CVSROOT we had an 'avail' system to make use of, driven by a file that we could populate with data dumped from the Fedora Account System. Avail worked on path names, relative to the CVSROOT. Since we used directories for each Fedora release as pseudo branches, we could set avail info on each release "branch". CVS also used some filesystem symlink tricks to create a "common" subdir for every package module, and this is where we stuffed common scripts and Makefile content. This was pretty clever on one hand: we could make updates to the make system without touching every single package. On the other hand it was pretty hackish, and we had constant issues where somebody would attempt an action using old common content and stuff would fall over.
Now we look at git. Git is for the most part a daemonless system. Each repository is completely standalone and generally does not require any other infrastructure to be useful. You can interact with a repository directly on the filesystem using /usr/bin/git, or you can interact with it through, say, ssh, again using /usr/bin/git (your local /usr/bin/git will call a remote /usr/bin/git). Generally there is no running daemon to connect to and authenticate with. Basic authorization of who can check out what and commit where with git happens at the filesystem permissions level. One twist is that with git we wanted to use real branches within a package repo to reflect the different Fedora/EPEL releases. In-repo branches are not reflected in path names that we could set filesystem ACLs upon, so this posed a problem for our conversion.
Enter gitolite. Gitolite ( http://github.com/sitaramc/gitolite ) is an add-on system to git that provides ACL functionality, including different rights for different branches within a given repository, and more! By using gitolite as a replacement for /usr/bin/git when a user connects to our git server, we can again utilize the information we have within the Fedora Package Database and properly allow / restrict changes on specific branches for specific developers. The gitolite upstream ( Sitaram Chamarty, sitaram@atc.tcs.com ) has been fantastically responsive to our needs, which are admittedly a little unique. We have a very large set of repositories (over 10.5K) and a largish number of contributors (1050). The combination of the two leads to a very large and complex ACL structure that at first broke the system quite badly. Upstream was very quick to create a "bigconfig" method of compiling the ACLs without crashing the box. Our other unique needs involve having individual accounts for each committer instead of a shared account with a large list of allowed SSH keys. On top of that, some of our accounts need to be able to ssh shell into the git server for administrative duties. Throughout our trials and testing with gitolite, every time we've run into some issue that just didn't fit the mold, Sitaram has been there with a smile and a fix. At this point our production server is a whopping one line different from current gitolite upstream. This is a fantastic win for us, for our sustainability, and for the next large group that wants to make use of gitolite.
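As a rough illustration of the per-branch ACLs described above, here is a minimal Python sketch that renders gitolite-style rules from a package-to-branch-to-committer mapping. It is not the script Fedora actually uses: the function name render_acls, the shape of the input dictionary, and the package, branch, and user names are all assumptions made up for this example. Only the general "repo NAME" / "RW REFEX = USERS" rule layout is taken from gitolite's configuration format.

```python
from typing import Dict, List


def render_acls(packages: Dict[str, Dict[str, List[str]]]) -> str:
    """Render one gitolite-style 'repo' stanza per package.

    `packages` maps a repo name to {branch name: [committers]}.
    This input shape is hypothetical; real data would come from
    something like the Fedora Package Database.
    """
    lines = []
    for repo in sorted(packages):
        lines.append(f"repo {repo}")
        for branch in sorted(packages[repo]):
            users = " ".join(sorted(packages[repo][branch]))
            if users:
                # A gitolite refex is a regular expression; the trailing
                # '$' limits the rule to exactly this branch name.
                lines.append(f"    RW {branch}$ = {users}")
        lines.append("")  # blank line between stanzas
    return "\n".join(lines)


# Hypothetical data: alice may push to both branches, bob only to master.
example = {"bash": {"master": ["alice", "bob"], "f13": ["alice"]}}
print(render_acls(example))
```

With roughly ten thousand repositories and a thousand committers, the generated file gets very large, which is the situation the "bigconfig" work mentioned above was meant to handle.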
Once we had a plan for ACLs and for branches, we had to decide whether we were going to replace the Make system, and with what. I had never been a fan of Make; it was entirely too difficult to modify and innovate with. Since I was the one pushing this transition forward, I decided to write a new tool in my favorite language, Python. The fedpkg tool was born around January 4th, 2010 and has since grown into 1,523 lines of code (via sloccount). While far from complete, it is a great start (if I do say so myself!) at replacing the make system. Because it is written in Python (or maybe just !Make) it seems to be easier to contribute to, and I've already gotten a number of contributions. More will come as it starts to be more widely used. The biggest challenge with fedpkg is removing the need to update something on the end user's system every single time we add a new Fedora release and change what happens when you build for rawhide. I'll spare you the details, but I'm fairly happy with what we have. The end result should be far fewer misfires and much less end-user confusion.
The last major piece of the puzzle was how to actually convert the existing CVS repositories, including the fun pseudo branches, into git repositories. I tried a number of options over the years (I've been working on this off and on for nearly 4 years!) ranging from the built-in git cvsimport to git-svn to parsecvs and a few others. In the end, we took a page from the GNOME project and used parsecvs ( http://cgit.freedesktop.org/~krh/parsecvs/ ) for the vast majority of our repositories. There were a few that gave parsecvs fits, and recent versions of git cvsimport were able to handle them. The git system is fantastic enough that we were able to merge our pseudo CVS branches into actual git branches complete with a real shared history, but again I'll spare you the details of the scripting to do this. All but the kernel repo seem to have converted successfully, which is a pretty good success rate in my book. We may yet get the kernel converted, but in the interest of time we opted to start it fresh in dist-git for now.
Without the help of many others, this project would never have gotten done. Folks helped out with Koji modifications, with fedpkg contributions, with repeated testing of attempted conversions, with logic checking of my plans, with helping me understand more of git internals and decipher error output, and most of all by being patient while we worked through the transition and very positive along the way. Things will be bumpy over the next few weeks as we really start putting dist-git to the test. No amount of staging and testing can really replace production use. There will be many more updates to fedpkg as bugs are found and fixed and features are contributed. Wiki pages will get filled out as knowledge of how to interact with dist-git starts to spread ( https://fedoraproject.org/wiki/Using_Fedora_GIT is a good start ).
Once again I want to thank everybody who helped out and for all the (continued) patience! I'll be available via email and IRC as much as possible the next few days to help anybody with dist-git issues. Look for Oxf13 on freenode. Happy gitting!
-- Jesse Keating Fedora -- Freedom² is a feature! identi.ca: http://identi.ca/jkeating
devel-announce@lists.fedoraproject.org