Metrics and your privacy

James Wilkinson fedora at aprilcottage.co.uk
Wed Nov 22 17:44:51 UTC 2006


http://fedoraproject.org/wiki/Infrastructure/Metrics says:
> Perhaps the most promising technology is adding a phone home into
> anaconda. 
<snip>
> This could also be considered something to be done one time on first
> boot.
<snip>
> Cons:
<snip>
> * Cannot track current number of installs, only total successful
>   installs

Would it be possible to ask the question near the *start* of Anaconda,
and get the install to try to "phone home" twice with a GUID -- once at
the start of the install, and once after a successful install when the
box gets Internet connectivity? (The machine could then forget the
GUID.)

That way, you'll get:
 1) computers that started the install online but never finished it;

 2) computers that started the install online and finished successfully;

 3) computers that started the install offline and then went online.
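
A rough sketch of how the collecting side might sort reports into these
three groups, assuming each report is nothing more than a (GUID, phase)
pair. The phase names "start" and "done" and the function names are made
up for illustration; nothing like them exists in Anaconda today.

```python
import uuid
from collections import defaultdict

# Hypothetical phase names; the proposal does not define a wire format.
START, DONE = "start", "done"


def new_install_id() -> str:
    """Random one-off identifier the installer would drop after its second report."""
    return str(uuid.uuid4())


def classify(reports):
    """Count installs by which phases were reported for each GUID.

    reports: iterable of (install_id, phase) pairs as received by the collector.
    Duplicate reports for the same GUID and phase are counted once.
    """
    phases = defaultdict(set)
    for install_id, phase in reports:
        phases[install_id].add(phase)

    counts = {"started_only": 0, "started_and_finished": 0, "finished_only": 0}
    for seen in phases.values():
        if seen == {START}:
            counts["started_only"] += 1          # case 1
        elif seen == {START, DONE}:
            counts["started_and_finished"] += 1  # case 2
        elif seen == {DONE}:
            counts["finished_only"] += 1         # case 3
    return counts


if __name__ == "__main__":
    a, b, c = new_install_id(), new_install_id(), new_install_id()
    print(classify([(a, START), (b, START), (b, DONE), (c, DONE)]))
```

A machine that started offline can only ever send the "done" report,
which is what puts it in the third group.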

By comparing 1 and 2, you'll have an idea of the proportion of
unsuccessful installs. By adding 2 and 3, you'll have an idea of the
total number of successful installs. So you can estimate the total number
of unsuccessful installs, including those attempted offline. (This
assumes that online users are no more likely to be able to finish an
install than offline users -- this assumption is not necessarily
correct.)
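
To make the arithmetic concrete, here is a small sketch under the same
assumption (offline installs fail at the same rate as online ones). The
function and argument names are mine, and the example counts are invented
purely for illustration.

```python
def estimate_installs(started_only, started_and_finished, finished_only):
    """Estimate failure rate and failed installs from the three report counts.

    started_only          -- case 1: start report seen, no completion report
    started_and_finished  -- case 2: both reports seen
    finished_only         -- case 3: only the completion report seen
    """
    if started_and_finished <= 0:
        raise ValueError("need at least one install that reported both phases")

    # Proportion of online-started installs that never reported success.
    failure_rate = started_only / (started_only + started_and_finished)

    # Successful installs that eventually went online.
    successful = started_and_finished + finished_only

    # If every install fails at the online rate p, then
    # failed = successful * p / (1 - p) = successful * case1 / case2.
    estimated_failed = successful * started_only / started_and_finished

    return failure_rate, successful, estimated_failed


if __name__ == "__main__":
    rate, ok, failed = estimate_installs(200, 800, 400)
    print(f"failure rate {rate:.0%}, successful {ok}, estimated failed {failed:.0f}")
```

With 200 started-only, 800 started-and-finished and 400 finished-only
reports, that works out to a 20% failure rate, 1200 successful installs
and an estimated 300 failed ones.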

(I know it's possible for someone to start an install online, then unplug
the machine so that it never goes online again. But how often will that
happen?)

You'd also have an idea of how useful making repos available at install
time really is -- if 75% of users aren't online when they install, then
the answer is "not very" and we need to put more effort into making ISOs
available of what's currently in Extras.

On the advisory board list, Christopher Blizzard commented:
> If we are able to collect a set of hardware profiles for people, just
> after an install, and tie that to a unique machine identifier, we could
> make that really useful for people. The reason being that having access
> to information about what hardware people are really using allows us to
> know where we need to concentrate our efforts.

If you could send that at the *beginning* of the install process, you
could also look for hardware on which Fedora just does not install.

No-one seems to have taken Xen into account in this. Do you actually
*want* Xen clients to appear as separate installs or as one?

I understand you want to know which packages are the most popular. I
don't think install-time popularity contests are going to buy you much --
you'll mostly get variations on the standard options Anaconda gives. Many
(most?) people will change what's installed after install, and too many
people will change what they actually use over time. And very few users
will actively remove software they're not using, especially if they're
not sure what it does.

I was going to suggest that logs from even one or two mirrors would give
you a good idea of the proportions of downloaded software, but (e.g.)
South American mirrors wouldn't see much demand for Far Eastern input
methods.

Hope this helps,

James.

-- 
E-mail:     james@ | a11y: There's a sense of irony in a term defining
aprilcottage.co.uk | accessability which makes non tech savvy people go
                   | "Whaa?".
                   |     -- Dave Jones



