In response to Jeff Spaleta's recent query about the sorts of projects that could make use of NightLife, I'd like to pose the research in our group as an example of the type of work that could readily use such a resource.
Short version:
Our public protein structure prediction server, Robetta, relies on farming out its computationally intensive steps to NCSA's clusters, via CONDOR, to give the general academic community timely access to our group's methods. But we find that our resources are stretched thin, and at times we are unable to give researchers the sort of quick response that allows their research efforts to proceed.
If we had access to more computing power, even that available from modest periods of inactivity, we could put that power to work on many pressing issues in biomedical research, such as HIV/AIDS vaccine design, the improvement of existing drugs and/or the design of new ones, and the creation of new methods to harness biology for problems such as carbon sequestration.
Overly-long version:
I run the various computing infrastructures for David Baker's computational biophysics group at the University of Washington, http://www.bakerlab.org. The group's primary computational focus is the de novo prediction of the 3-D structure of proteins from the linear sequence of amino acids in the given protein chain. The algorithm under constant development here, Rosetta, is an embarrassingly parallel Monte Carlo application that requires significant amounts of CPU time to discover the "best" protein structures in a statistically significant fashion. This approach has enjoyed modest success over the past few years.
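For anyone unfamiliar with the term, here is a toy sketch of what "embarrassingly parallel Monte Carlo" means in practice. It is not Rosetta code: the scoring function `toy_energy`, the helper names, and all parameters are made up for illustration. The only point is that each trajectory depends on nothing but its own random seed, so every seed could be handed to a separate idle machine and the results compared afterwards.

```python
import math
import random


def toy_energy(x):
    """Hypothetical stand-in for a scoring function (NOT Rosetta's energy)."""
    return sum((xi - 1.0) ** 2 for xi in x)


def run_trajectory(seed, n_steps=2000, dim=5, step=0.5, temperature=1.0):
    """Run one independent Metropolis Monte Carlo trajectory.

    Returns (best_energy, best_point) seen along the way.
    """
    rng = random.Random(seed)  # private generator: no shared state between runs
    x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    energy = toy_energy(x)
    best_energy, best_x = energy, list(x)
    for _ in range(n_steps):
        i = rng.randrange(dim)
        old = x[i]
        x[i] = old + rng.gauss(0.0, step)  # propose a small random move
        new_energy = toy_energy(x)
        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-dE / T).
        if new_energy <= energy or rng.random() < math.exp(
            -(new_energy - energy) / temperature
        ):
            energy = new_energy
            if energy < best_energy:
                best_energy, best_x = energy, list(x)
        else:
            x[i] = old  # reject: undo the move
    return best_energy, best_x


def best_of(seeds):
    """Pick the lowest-energy result over many independent trajectories.

    Here the runs happen one after another; on a cluster each seed
    would simply be its own job, since no run needs another's output.
    """
    return min((run_trajectory(s) for s in seeds), key=lambda r: r[0])
```

Calling `best_of(range(100))` runs a hundred trajectories in sequence. Nothing in `run_trajectory` would have to change to run them on a hundred machines instead, which is why spare cycles of any size are useful for this kind of work.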
The group's success has led to broad interest in making the methods available to the academic community. The code (which is freely available to academic researchers) is challenging to use correctly and the post-production data crunching can be daunting. As a result we created a publicly available, automatic service - Robetta (http://www.robetta.org) - roughly 4-5 years ago to allow anyone to use the methods via the service. We've been victims of our own popularity: the server was soon awash in work that pushed the wait times from a day or two to almost a year.
To gain more horsepower, we began a collaboration with NCSA and the CONDOR group to farm our work to their systems - via CONDOR - and that has proven quite successful at keeping the wait times down to the range of "months".
I'd specifically like to point out that the CONDOR group has been VERY helpful with our CONDOR issues - their goal is your successful use of CONDOR and they're good at it! We've been using CONDOR on our local infrastructure for ~8 years and are quite happy. The transition to CONDOR wasn't as challenging for the scientists as I feared, and its integration with Globus makes using remote resources straightforward.
We have researchers from a wide variety of fields who use this service as an integral portion of their research effort, despite the somewhat slow turn-around time. But this summer is the biennial, world-wide CASP contest (http://www.predictioncenter.org/casp8/, blind testing of methods, runs May through August) and, by popular demand, the automatic service is turned towards the many "targets" of the contest, which some groups use as starting points for their work. This leaves many researchers waiting until the contest is completed for their work to be addressed.
If we could access more computing power, we would be able to keep the service working on the non-CASP work during these contests and improve turn-around time in general.
-KEL