Fw: Improving metrics gathering

Matt Domsch matt at domsch.com
Thu Feb 4 16:24:47 UTC 2010


Please note this thread on advisory-board and voice your concerns
there.

Thanks,
Matt

----- Forwarded message from Matt Domsch <matt at domsch.com> -----

Date: Thu, 4 Feb 2010 10:16:09 -0600
From: Matt Domsch <matt at domsch.com>
To: advisory-board at lists.fedoraproject.org
Subject: Improving metrics gathering

I've spent quite a bit of time over the last week fixing up the
scripts that generate Fedora's worldwide user maps [1] including the
worldwide map for all Fedora versions currently in use [2] as
determined by yum requests for mirrorlists.

One thing that's painfully obvious is that the "Unique IP addresses"
method of counting the number of installations [3] is woefully
under-counting the actual number of installs.  Looking at a single
day's worth of checkins (over 3 million), we see ~40k unique IP
addresses checking in twice a day, another 40k checking in between
4x/day and up to say 20x/day, and then a long tail, fairly evenly
distributed, where a small number of single IPs are checking in up to
2000x/day.  It takes quite a bit of effort to cause yum to make that
many mirrorlist requests using a single machine and a single IP
address, so it's far more likely that there are 1000-2000 machines
behind a NAT making those requests.
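
As a rough illustration of the breakdown described above, here is a
minimal sketch that tallies requests per IP address and groups the
addresses by how often they checked in.  The function name, the bucket
boundaries and the input format (one IP address per mirrorlist request)
are assumptions made for this example; it is not the script that
generates the maps.

```python
from collections import Counter


def checkin_histogram(ips):
    """Group IP addresses by how many mirrorlist requests each made.

    `ips` holds one entry per request seen in a day (hypothetical
    input format).  Returns (number of unique IPs, Counter of buckets).
    """
    per_ip = Counter(ips)  # IP address -> number of requests
    buckets = Counter()
    for count in per_ip.values():
        # Bucket edges are illustrative, loosely following the
        # "twice a day", "up to 20x/day" and "long tail" groups.
        if count <= 2:
            buckets["1-2/day"] += 1
        elif count <= 20:
            buckets["3-20/day"] += 1
        else:
            buckets[">20/day"] += 1
    return len(per_ip), buckets
```

An address that lands in the ">20/day" bucket still counts as a single
installation under the unique-IP method, which is where the
under-counting comes from.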

[snip]

To this end, I would like to see yum enhanced to provide information
which we can use to more accurately count the number of installed
Fedora systems.  This has been discussed before, and documented on the
wiki [5], but for various reasons has never been acted upon.  While I'll
leave the implementation details to the appropriate teams, I think
including some form of UUID in yum mirrorlist queries would be both
appropriate and safe.
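
To make the idea concrete, here is a minimal sketch of how an
identifier could ride along on a mirrorlist query and be counted on the
server side.  The "uuid" parameter name, the use of a random UUID and
both helper functions are hypothetical; the actual design is left to
the appropriate teams, as noted above.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlparse

# Base URL used for illustration only.
MIRRORLIST = "https://mirrors.fedoraproject.org/mirrorlist"


def mirrorlist_url(repo, arch, install_id):
    """Build a mirrorlist query carrying a per-install identifier.

    The "uuid" query parameter is an assumption for this sketch.
    """
    params = {"repo": repo, "arch": arch, "uuid": str(install_id)}
    return MIRRORLIST + "?" + urlencode(params)


def count_installs(urls):
    """Count distinct identifiers across a list of request URLs."""
    return len({parse_qs(urlparse(u).query)["uuid"][0] for u in urls})


# Each machine would generate its identifier once and reuse it.
install_id = uuid.uuid4()
print(mirrorlist_url("fedora-12", "x86_64", install_id))
```

Counting distinct identifiers instead of distinct source addresses
means that many machines behind one NAT are counted separately, while
one machine checking in many times a day is counted once.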


[1] http://fedoraproject.org/maps/
[2] http://fedoraproject.org/maps/all.png
[3] http://fedoraproject.org/wiki/Statistics
[4] http://fedoraproject.org/wiki/Infrastructure/Metrics#Metrics_are_actually_important
[5] http://fedoraproject.org/wiki/Infrastructure/Metrics#Unique_Identifiers

