Improving metrics gathering

Bruno Wolff III bruno at wolff.to
Thu Feb 4 16:37:14 UTC 2010


On Thu, Feb 04, 2010 at 10:16:09 -0600,
  Matt Domsch <matt at domsch.com> wrote:
> 
> The biggest concern people have with using any UUID in any form is the
> "trackability" that comes inherent with it.  Given enough log data
> that includes UUIDs, one could potentially use it to understand
> something about a user that they otherwise wouldn't want you to know.

A possible concern: if the main goal is to determine whether more than one
machine is using a single IP address to get yum updates, do you really need
to track those updates across multiple IP addresses? Tracking across
addresses could be more revealing, as it can show travel patterns. If that
isn't a requirement, perhaps yum could create a UUID per external IP address.
Doing that behind NAT is a bit problematic though, since a machine behind NAT
doesn't see its external address directly.
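
As a rough illustration of the per-address idea, here is a minimal Python
sketch. It assumes yum keeps a random secret on disk that is never sent, and
that it somehow learns its external address (which is the hard part behind
NAT). The names per_ip_uuid and machine_secret are made up for this example
and are not part of yum.

```python
import uuid


def per_ip_uuid(machine_secret: uuid.UUID, external_ip: str) -> uuid.UUID:
    """Derive a stable UUID for this machine as seen from one external IP."""
    # uuid5 hashes the namespace and the name together with SHA-1, so the
    # same (secret, IP) pair always yields the same UUID, while a different
    # IP yields a different UUID.
    return uuid.uuid5(machine_secret, external_ip)


# Hypothetical: generated once at install time, stored locally, never sent.
machine_secret = uuid.uuid4()

# Documentation-range addresses standing in for two networks.
home = per_ip_uuid(machine_secret, "192.0.2.10")
cafe = per_ip_uuid(machine_secret, "198.51.100.7")
```

With this approach the server could still tell two machines behind one
address apart, but the values sent from two different networks would not
match each other, so the logs alone would not link them.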

> For implementation details, I suggest yum create and persist a single
> UUID for each installed system.  This UUID would be separate from any
> smolt UUID.  Yum would include this UUID in HTTP requests.  Yum would
> only provide this UUID when making mirrorlist requests, not when
> downloading content (from mirrors or other).  All yumlib-using
> applications such as PackageKit would then inherit this capability.
> On the back end, Fedora Infrastructure would add capability to log
> this UUID for each request, just as it logs mirrorlist requests
> today.  FI scripts would then use this UUID to accurately count the
> number of installed instances over time, recognizing that systems can
> get re-installed (and thus get new yum UUIDs), but over time can
> provide more accurate trending than we can get today.
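
The back-end counting described in the quoted text could look roughly like
the Python sketch below. The log format (date, UUID pairs) and the function
name count_instances are assumptions for illustration only; this is not the
actual Fedora Infrastructure tooling. Note that it does not use IP addresses
at all.

```python
from collections import defaultdict


def count_instances(log_entries):
    """Count distinct UUIDs seen per day from (date, uuid) pairs."""
    seen = defaultdict(set)
    for day, client_uuid in log_entries:
        # A set ignores repeated requests from the same UUID on one day.
        seen[day].add(client_uuid)
    return {day: len(uuids) for day, uuids in seen.items()}


# Made-up sample entries; real logs would carry full UUID strings.
entries = [
    ("2010-02-03", "a"),
    ("2010-02-03", "a"),
    ("2010-02-03", "b"),
    ("2010-02-04", "a"),
]
counts = count_instances(entries)
```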

Are you planning on logging UUID/IP pairs, or logging IP addresses
independently of UUIDs?

