In response to Jeff Spaleta's recent query about the sorts of
projects that could make use of NightLife, I'd like to offer the
research in our group as an example of the type of work that could
readily use such a resource.
Short version:
Our public protein structure prediction server, Robetta, relies upon
farming our computationally intensive steps to NCSA's clusters, via
CONDOR, to provide timely access to our group's methods for the
general academic community. But we find that our resources are
stretched thin, and at times we are unable to provide researchers
with the sort of quick response that allows their research efforts
to proceed.
If we had access to more computing power, even that available from
modest periods of inactivity, we could put it to work on many
pressing issues in biomedical research, such as HIV/AIDS vaccine
design, improvement of existing drugs, design of new ones, and
creation of new methods to harness biology for problems such as
carbon sequestration.
Overly-long version:
I run the various computing infrastructures for David Baker's
computational biophysics group at the University of Washington,
http://www.bakerlab.org. The group's primary computational focus is
the de novo prediction of the 3-D structure of proteins from the
linear sequence of amino acids in a given protein chain. The
algorithm under constant development here, Rosetta, is an
embarrassingly parallel Monte Carlo application that requires
significant amounts of CPU time to discover the "best" protein
structures in a statistically significant fashion. This approach has
enjoyed modest success over the past few years.
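To give a feel for why this parallelizes so cleanly, here is a toy sketch of the pattern (this is not Rosetta itself - the "energy" function, step counts, and trajectory counts are invented for illustration): each trajectory depends only on its random seed, so trajectories can run on any number of machines with no communication, and the best-scoring model is simply the minimum over all results.

```python
import random

def toy_energy(conformation):
    # Hypothetical stand-in for a real scoring function:
    # just the sum of squared "torsion angles".
    return sum(x * x for x in conformation)

def monte_carlo_trajectory(seed, n_steps=1000, n_angles=10):
    """One independent trajectory: random moves, keep those that lower energy."""
    rng = random.Random(seed)
    conf = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
    energy = toy_energy(conf)
    for _ in range(n_steps):
        i = rng.randrange(n_angles)
        old = conf[i]
        conf[i] += rng.uniform(-10.0, 10.0)
        new_energy = toy_energy(conf)
        if new_energy < energy:
            energy = new_energy   # accept the move
        else:
            conf[i] = old         # reject the move, restore the angle
    return energy, conf

# Trajectories share no state, so this loop could be scattered across
# thousands of cluster nodes; only the final min() needs all results.
results = [monte_carlo_trajectory(seed) for seed in range(20)]
best_energy, best_conf = min(results)
```

The real application differs in every detail, but the shape is the same: more CPUs means more trajectories, and more trajectories means better odds of sampling near-native structures.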
The group's success has led to broad interest in the availability of
these methods to the academic community. The code (which is freely
available to academic researchers) is challenging to use correctly,
and the post-production data crunching can be daunting. As a result,
roughly 4-5 years ago we created a publicly available, automatic
service - Robetta (http://www.robetta.org) - to allow anyone to use
the methods. We've been victims of our own popularity: the server
was soon awash in work that pushed the wait times from a day or two
to almost a year.
To gain more horsepower, we began a collaboration with NCSA and the
CONDOR group to farm
our work to their systems - via CONDOR - and that has proven quite
successful at keeping the wait times
down to the range of "months".
I'd specifically like to point out that the CONDOR group has been VERY
helpful with our CONDOR issues - their goal is your successful use of
CONDOR, and they're good at it! We've been using CONDOR on our local
infrastructure for ~8 years and are quite happy. The transition to
CONDOR wasn't as challenging for the scientists as I feared, and its
integration with Globus makes using remote resources straightforward.
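For anyone curious what farming a batch of independent jobs to a
remote Globus-fronted cluster looks like, a CONDOR submit description
is roughly like the following (a sketch only - the gatekeeper host,
jobmanager, executable name, and input file are placeholders, not our
actual configuration):

```
# Grid universe routes jobs through a Globus gatekeeper to a remote scheduler
universe      = grid
grid_resource = gt2 gatekeeper.example.edu/jobmanager-pbs

executable    = rosetta.static
arguments     = -seed $(Process)

output        = run.$(Process).out
error         = run.$(Process).err
log           = run.log

transfer_input_files    = input.fasta
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

# One queue statement submits the whole embarrassingly parallel batch
queue 500
```

CONDOR handles the matchmaking, file transfer, and retries; $(Process)
gives each of the 500 jobs a distinct index to use as its random seed.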
We have researchers from a wide variety of fields who use this
service as an integral part of their research effort, despite the
somewhat slow turn-around time. But this summer is the biennial,
world-wide CASP contest (http://www.predictioncenter.org/casp8/,
blind testing of prediction methods, running May through August)
and, by popular demand, the automatic service is turned towards the
many "targets" of the contest, which some groups use as starting
points for their own work. This leaves many researchers waiting
until the contest is completed for their work to be addressed.
If we could access more computing power, we would be able to keep the
service working on non-CASP related work during these contests and
improve turn-around time in general.
-KEL
--
+> Keith E. Laidig, Ph.D., M.S., M.Phil. laidig(a)u.washington.edu
+> HHMI Affiliate
http://staff.washington.edu/laidig