How many users does Fedora have?

Ben Cotton bcotton at fedoraproject.org
Tue Dec 2 14:42:24 UTC 2014


On Mon, Dec 1, 2014 at 11:11 PM, Matthew Miller
<mattdm at fedoraproject.org> wrote:
> Okay, this seems like a good start. What _are_ the right questions?

Oof, I walked right into this. :-)

On Tue, Dec 2, 2014 at 6:46 AM, Alec Leamas <leamas.alec at gmail.com> wrote:
> Hm... and backtracking this another step, an even more basic issue is "why
> would one ask these question(s)?"
>
> I started with the need for packagers to get some feedback, the feeling that
> there are users out there using my work. Although trivial,  this is still
> essential for me. Rahul adds the negative feedback that my package is
> actively removed (don't want that).
>
I wouldn't call it trivial. When I wrote a small plugin for a Twitter
client I use, I was thrilled to find out that people used it. I almost
fell over when the developer asked if he could include it upstream.
Knowing people use what you make is a very powerful motivator.

> So, a revised/enhanced/even worse set of questions:
>
> - How many users have installed product X/spin Y?
> - How many users have installed package X?
> - How many users have actively removed package X?
> - What packages are installed from non-Fedora repos?
> - Count on non-packaged specific applications.
> - How often are updates performed? Which packages are excluded?
> - How often is package X updated from updates-testing?

This is a good start, though I'm still not sure it addresses the "why
are we asking?" component. The answers may be of value to individual
maintainers/developers, but are they of use to Fedora as a whole?
(Aside: what does "Fedora as a whole" even mean?)

For reasons others have pointed out, I'm not sure straight counts are
very useful. Here's the sort of question that might give more reliable
answers:

- What percentage of installs have package X explicitly installed
versus pulled in as a dependency?

In this case, each system that reports in can say "explicit" or
"dependency", and we can aggregate that. Granted, we still have a
problem with excluded machines (e.g. those behind a local mirror), but
I _suspect_ that would vary more on a package level than a machine
level.
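
A minimal sketch of that aggregation, assuming each reporting system
sends a mapping of package name to "explicit" or "dependency". The
report format, the function name install_reason_percentages, and the
sample data are all hypothetical; no such reporting mechanism is
described in this thread.

```python
from collections import Counter


def install_reason_percentages(reports, package):
    """Percentage of reporting systems per install reason for one package.

    `reports` is a hypothetical list of dicts, one per system, mapping a
    package name to "explicit" or "dependency". Systems that do not have
    the package installed are left out of the denominator.
    """
    counts = Counter(r[package] for r in reports if package in r)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {reason: 100.0 * n / total for reason, n in counts.items()}


# Made-up sample data: four systems reporting in.
reports = [
    {"vim-enhanced": "explicit", "perl": "dependency"},
    {"vim-enhanced": "explicit"},
    {"vim-enhanced": "dependency", "perl": "dependency"},
    {"perl": "explicit"},
]

vim_result = install_reason_percentages(reports, "vim-enhanced")
perl_result = install_reason_percentages(reports, "perl")
```

Because the result is a ratio among systems that do report, machines
that never report (e.g. those behind a local mirror) shrink the sample
but only skew the answer if their explicit/dependency mix differs.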

For packages that have a lot of subpackages (for example, texlive and
its nearly 5k friends), queries on the raw counts could be
interesting:

- What percentage of installs have texlive-apacite and also texlive-beamer?
- What percentage of installs have texlive-graphics but not texlive-hyphennat?
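
Both queries reduce to set tests over each system's installed-package
list. A rough sketch, assuming each install reports its package names
as a set; the helper percent_matching and the sample data are
hypothetical, and the package names are spelled as in the questions
above.

```python
def percent_matching(installs, required, excluded=()):
    """Percentage of installs having every `required` package and none
    of the `excluded` ones. `installs` is a list of sets of package names."""
    required, excluded = set(required), set(excluded)
    if not installs:
        return 0.0
    hits = sum(
        1
        for pkgs in installs
        if required <= pkgs and not (excluded & pkgs)
    )
    return 100.0 * hits / len(installs)


# Made-up sample data: four installs.
installs = [
    {"texlive-apacite", "texlive-beamer", "texlive-graphics"},
    {"texlive-graphics", "texlive-hyphennat"},
    {"texlive-graphics"},
    {"texlive-beamer"},
]

both = percent_matching(installs, ["texlive-apacite", "texlive-beamer"])
graphics_only = percent_matching(
    installs, ["texlive-graphics"], excluded=["texlive-hyphennat"]
)
```

Here the denominator is every reporting install, so the same caveat
about unreported machines applies.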

But again, it comes down to "what question are we trying to answer?"
I'm not trying to be argumentative here; I agree that we need better
information about our users in order to best serve them, and I'm
trying to help us figure out what that information is. Package
installation doesn't give us a very clear picture. Changes from
defaults might be more interesting.

Here are some questions that could give us valuable information (most
of which are too nose-wipey to actually implement, but they're good
for discussion):

- What is the default browser on the system?
- How many local users have been created? How many of them have real
shells (as opposed to nologin)?
- What is the default desktop on the system?
- What daemons are running on the system?
- How many monitors are attached?
- What's the hardware platform (bare metal, AWS, KVM, etc)?

Some of these are very workstation-focused, and I feel like we've
tried to ask all of them before, but I think the more interesting
questions revolve around how people use Fedora, which isn't
necessarily represented by what packages they install.


Thanks,
BC

-- 
Ben Cotton

