Fedora 20 Update: t-digest-3.0-1.fc20

updates at fedoraproject.org
Fri Aug 15 02:54:14 UTC 2014


--------------------------------------------------------------------------------
Fedora Update Notification
FEDORA-2014-9081
2014-08-01 05:01:37
--------------------------------------------------------------------------------

Name        : t-digest
Product     : Fedora 20
Version     : 3.0
Release     : 1.fc20
URL         : https://github.com/tdunning/t-digest
Summary     : A new data structure for on-line accumulation of statistics
Description :
A new data structure for accurate on-line accumulation of rank-based statistics
such as quantiles and trimmed means. The t-digest algorithm is also very
parallel-friendly, making it useful in map-reduce and parallel streaming
applications.

--------------------------------------------------------------------------------
Update Information:

A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means. The t-digest algorithm is also very parallel-friendly, making it useful in map-reduce and parallel streaming applications.

The t-digest construction algorithm uses a variant of 1-dimensional k-means clustering to produce a data structure that is related to the Q-digest. This t-digest data structure can be used to estimate quantiles or compute other rank statistics. The advantage of the t-digest over the Q-digest is that the t-digest can handle floating-point values, while the Q-digest is limited to integers. With small changes, the t-digest can handle values from any ordered set that has something akin to a mean. Quantile estimates produced by t-digests can be orders of magnitude more accurate than those produced by Q-digests, even though t-digests are more compact when stored on disk.
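
The clustering idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the packaged Java reference implementation: the class name ToyTDigest, the size limit 4*N*q*(1-q)/compression, and the linear interpolation in quantile() are all assumptions chosen for brevity. Each centroid stores a mean and a count, and centroids near the tails (q close to 0 or 1) are kept small so that extreme quantiles stay accurate.

```python
import bisect
import random


class ToyTDigest:
    """Simplified t-digest-style sketch (hypothetical, for illustration only)."""

    def __init__(self, compression=100.0):
        self.compression = compression  # larger value -> more, smaller centroids
        self.means = []                 # centroid means, kept in sorted order
        self.counts = []                # number of points folded into each centroid
        self.total = 0                  # total number of points seen

    def add(self, x):
        """Fold x into a neighbouring centroid if its size limit allows,
        otherwise start a new single-point centroid."""
        self.total += 1
        i = bisect.bisect_left(self.means, x)
        # The only candidates are the centroids on either side of x.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(self.means)]
        candidates.sort(key=lambda j: abs(self.means[j] - x))
        for j in candidates:
            # Approximate quantile at the centre of centroid j.
            q = (sum(self.counts[:j]) + self.counts[j] / 2) / self.total
            limit = 4 * self.total * q * (1 - q) / self.compression
            if self.counts[j] + 1 <= limit:
                self.counts[j] += 1
                # Incremental update of the centroid mean.
                self.means[j] += (x - self.means[j]) / self.counts[j]
                return
        self.means.insert(i, x)
        self.counts.insert(i, 1)

    def quantile(self, q):
        """Estimate the q-quantile by interpolating between centroid centres."""
        if not self.means:
            raise ValueError("empty digest")
        target = q * self.total
        seen = 0.0
        prev_rank = prev_mean = None
        for mean, count in zip(self.means, self.counts):
            rank = seen + count / 2  # rank assigned to the centroid centre
            if target <= rank:
                if prev_rank is None:
                    return mean
                frac = (target - prev_rank) / (rank - prev_rank)
                return prev_mean + frac * (mean - prev_mean)
            prev_rank, prev_mean = rank, mean
            seen += count
        return self.means[-1]


random.seed(1)
digest = ToyTDigest(compression=100.0)
for _ in range(5000):
    digest.add(random.random())
print(len(digest.means), digest.quantile(0.5), digest.quantile(0.99))
```

The sketch recomputes cumulative counts on every insert and never recompresses, so it is only suitable for small experiments with data arriving in random order.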

In summary, the particularly interesting characteristics of the t-digest are that it

    has smaller summaries than Q-digest
    works on doubles as well as integers
    provides part-per-million accuracy for extreme quantiles and typically <1000 ppm accuracy for middle quantiles
    is fast
    is very simple
    has a reference implementation that has >90% test coverage
    can be used with map-reduce very easily because digests can be merged
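
The mergeability mentioned in the last item can be illustrated with a short Python sketch. It assumes a digest is represented as a plain list of (mean, count) pairs, and it uses a hypothetical helper merge_digests with the size limit 4*N*q*(1-q)/compression; neither is taken from the reference implementation. The centroids of both digests are sorted by mean, and adjacent centroids are combined whenever the combined size stays within the limit, so each mapper can build its own digest and a reducer can combine them.

```python
def merge_digests(a, b, compression=100.0):
    """Merge two digests given as lists of (mean, count) pairs.

    Hypothetical helper for illustration; the size limit is an assumption.
    """
    centroids = sorted(a + b)  # order all centroids by mean
    total = sum(count for _, count in centroids)
    merged = []
    seen = 0  # points in centroids that come before the last merged one
    for mean, count in centroids:
        if merged:
            cur_mean, cur_count = merged[-1]
            new_count = cur_count + count
            # Approximate quantile at the centre of the combined centroid.
            q = (seen + new_count / 2) / total
            limit = 4 * total * q * (1 - q) / compression
            if new_count <= limit:
                # Weighted mean of the two centroids.
                new_mean = cur_mean + (mean - cur_mean) * count / new_count
                merged[-1] = (new_mean, new_count)
                continue
            seen += cur_count
        merged.append((mean, count))
    return merged


left = [(1.0, 2), (5.0, 3)]
right = [(1.5, 1), (9.0, 4)]
print(merge_digests(left, right, compression=1.0))
```

Merging preserves the total count and the overall weighted mean of the input centroids, which is what allows partial digests to be combined in any order.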

--------------------------------------------------------------------------------
References:

  [ 1 ] Bug #1121402 - Review Request: t-digest
        https://bugzilla.redhat.com/show_bug.cgi?id=1121402
--------------------------------------------------------------------------------

This update can be installed with the "yum" update program.  Use
su -c 'yum update t-digest' at the command line.
For more information, refer to "Managing Software with yum",
available at http://docs.fedoraproject.org/yum/.

All packages are signed with the Fedora Project GPG key.  More details on the
GPG keys used by the Fedora Project can be found at
https://fedoraproject.org/keys
--------------------------------------------------------------------------------

