[Bug 611873] New: Review Request: R-Rsolid - Quantile normalization and base calling for second generation sequencing data

bugzilla at redhat.com bugzilla at redhat.com
Tue Jul 6 18:43:53 UTC 2010


Please do not reply directly to this email. All additional
comments should be made in the comments box of this bug.

https://bugzilla.redhat.com/show_bug.cgi?id=611873

           Summary: Review Request: R-Rsolid - Quantile normalization and
                    base calling for second generation sequencing data
           Product: Fedora
           Version: rawhide
          Platform: All
        OS/Version: Linux
            Status: NEW
          Severity: medium
          Priority: medium
         Component: Package Review
        AssignedTo: nobody at fedoraproject.org
        ReportedBy: bloch at verdurin.com
         QAContact: extras-qa at fedoraproject.org
                CC: notting at redhat.com, fedora-package-review at redhat.com
   Estimated Hours: 0.0
    Classification: Fedora


Spec URL: http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid.spec
SRPM URL:
http://verdurin.fedorapeople.org/reviews/R-Rsolid/R-Rsolid-0.9.2-1.fc12.src.rpm
Description: 

Rsolid is an R package for normalizing fluorescent intensity data from
the ABI/SOLiD second-generation sequencing platform. It has been
observed that the color calls provided by the factory software contain
technical artifacts: the proportions of colors called vary widely
across sequencing cycles. Under the random DNA fragmentation
assumption, these proportions should be equal across sequencing cycles
and proportional to the dinucleotide frequencies of the sample.
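
The variability described above can be inspected by tabulating, for
each cycle, the fraction of reads assigned to each color. The Python
sketch below is only an illustration and is not part of Rsolid: the
function name is hypothetical, and it assumes reads are equal-length,
non-empty strings of the color-space digits 0-3.

```python
from collections import Counter


def color_proportions_by_cycle(reads, colors="0123"):
    """Return one dict per cycle mapping each color to its proportion.

    Assumes (for illustration) that every read has the same length and
    that every cycle contains at least one call from `colors`.
    """
    n_cycles = len(reads[0])
    table = []
    for cycle in range(n_cycles):
        # Count how often each color was called at this cycle.
        counts = Counter(read[cycle] for read in reads)
        total = sum(counts[c] for c in colors)
        table.append({c: counts[c] / total for c in colors})
    return table


if __name__ == "__main__":
    example_reads = ["0123", "0321", "1123", "0000"]  # made-up reads
    for cycle, props in enumerate(color_proportions_by_cycle(example_reads)):
        print(cycle, props)
```

If the proportions for a given color differ strongly from one cycle to
the next, that is the kind of artifact the description refers to.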

Rsolid implements a version of the quantile normalization algorithm
that transforms the intensity values before calling colors. Results
show that after normalization, the total number of mappable reads
increases by around 5%, and the number of perfectly mapped reads
increases by 10%. Moreover, a 2-5% reduction in overall error rates is
observed, with a 2-6% reduction in the rate of valid adjacent color
mismatches. The latter is important, since it leads to a decrease in
false-positive SNP calls.
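
For readers unfamiliar with the technique, the following is a minimal
sketch of generic quantile normalization in Python with NumPy. It is
not Rsolid's implementation, which is described only as "a version of"
the algorithm; the function name is hypothetical, and treating each
column as one set of intensities to be aligned (for example, one
channel or one cycle) is an assumption made for the example.

```python
import numpy as np


def quantile_normalize(x):
    """Force every column of x to share the same empirical distribution.

    The k-th smallest value in each column is replaced by the mean of
    the k-th smallest values across all columns. Ties are broken by
    position (stable sort) rather than averaged.
    """
    x = np.asarray(x, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean of each rank across columns.
    reference = sorted_x.mean(axis=1)
    out = np.empty_like(x)
    # Write the reference values back at each value's original position.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


if __name__ == "__main__":
    intensities = np.array([[5, 4, 3],
                            [2, 1, 4],
                            [3, 4, 6],
                            [4, 2, 8]])  # made-up values
    print(quantile_normalize(intensities))
```

After this transformation every column contains the same set of
values, while the rank order within each column is preserved.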

The normalization algorithm is computationally efficient. In a test, we
were able to process 300 million reads in 2 hours using 10 cluster
nodes. The core functions of the package are written in C for better
performance.
