PROPOSAL: Fedora user survey

Wed Mar 10 19:08:51 UTC 2010

On Tue, Mar 9, 2010 at 10:48 PM, Jon Masters <jonathan at jonmasters.org> wrote:
> On Tue, 2010-03-09 at 17:30 -0500, Chris Ball wrote:
>
>> Now, there's a reasonable argument that says that Fedora users without
>> FAS accounts didn't vote for FESCo, so it's still legitimate to ask
>> *those* users what they think.  The impossibility of reaching such a
>> group of users without incorporating selection bias would turn me off
>> from trying to do that -- it would be nice if we could find out how
>> the entire Fedora-using world feels about updates, but that's not
>> actually plausible.  Even just the set of Fedora users who visit
>> http://fedoraproject.org/ is significantly selection-biased already,
>> in my opinion.
>
> You could argue that only certain really interested parties will look at
> fedoraproject.org, but that's why I was suggesting the start page or
> firstboot (bad idea, I know), or something people will see when they
> first open a browser or use a system. Not just the die hards.
>
> My personal opinion is that user feedback in general is a good thing,
> even if you ultimately choose to disregard it. And I think putting a
> survey up for 6 months (e.g. all of F13) would give plenty of time to do
> reasonably useful statistical analysis of opinions. Shove in a few other
> more generic questions if there are any others - e.g. "what do you use
> Fedora for anyway?". Wouldn't that be good to know :)
>
> Jon.
>
> P.S. The government is elected, doesn't mean they don't still hold a
> census every decade to find out who their users are.
>

OK, just basic statistics here (I will defer to Jeff Spaleta or Diana
or other gurus)... but your analogy and survey need improvement to
have statistical validity. In this sort of survey you need to either
survey everyone OR survey a random sample that is not self-selected.

Valid census data has close to 100% coverage, within some statistical
deltas. That probably makes it the most valid data; however, it can be
gamed if the people taking it don't take it seriously. [The
statistical deltas are the things that everyone argues about when
saying you can't trust the data... but well, we are all comrade
scientists here (that's supposed to be funny)]

Surveys are valid within some margin of error and 'confidence level'
as long as the data can be shown not to be self-selected, the
questions are properly phrased, and the makeup of the groups being
polled is known (by reference to census material, so you know that if
you randomly call X people in Y region you will get Z% of a given
sub-group). You have to work out how many people you tried to survey,
how many completed the survey, how much confidence you have in the
population, etc.

For a group as small as Fedora (less than 50,000 registered users who
you could survey) you would need to survey a fraction of the group
that is actually much larger than standard. An initial survey would
probably want to have a confidence level of 95% and a confidence
interval of +/- 5%. If after the survey your percentages are within
that error bar of each other (47% for / 48% against, say) you need to
resurvey with a larger number of people. So for a first step, we would
need to get a survey list made, have the questions reworded/etc. to
meet various survey tests, and then randomly pick a population and
survey them (I am oversimplifying here... there are various steps
required that I am not sure of).
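
As a rough sketch of that "resurvey if the answers are inside the
error bar" rule (my own helper names, nothing Fedora-specific; it
assumes a simple random sample, the worst-case 50/50 split, and
ignores the small-population correction, so treat it as a rule of
thumb rather than a proper significance test):

```python
import math
from statistics import NormalDist


def margin_of_error(completed, confidence=0.95, proportion=0.5):
    """Approximate margin of error for `completed` random responses."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # proportion=0.5 is the worst case (widest margin).
    return z * math.sqrt(proportion * (1 - proportion) / completed)


def needs_resurvey(share_for, share_against, margin):
    """True when the gap between the two answers is inside the margin."""
    return abs(share_for - share_against) <= margin


if __name__ == "__main__":
    margin = margin_of_error(400)
    print(round(margin, 3), needs_resurvey(0.47, 0.48, margin))
```

With 400 completed responses the margin comes out at roughly 0.049,
so a 47%/48% split is too close to call and you would go back out
with a bigger sample.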

That looks to be about 400 people who need to be randomly selected and
complete the survey (for +/- 5%). To get down to +/- 1% you would need
to get about 6500 people.
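
For anyone who wants to check those numbers, here is a minimal sketch
using the usual textbook formula (Cochran's estimate plus a finite
population correction). The function name and the population figures
are my assumptions for illustration, and it assumes a simple random
sample with the worst-case 50/50 split:

```python
import math
from statistics import NormalDist


def sample_size(population, margin, confidence=0.95, proportion=0.5):
    """Completed responses needed from a simple random sample."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's estimate for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Finite population correction shrinks it for small groups.
    return math.ceil(n0 / (1 + (n0 - 1) / population))


if __name__ == "__main__":
    # Population sizes are illustrative guesses, not real FAS counts.
    for population in (50_000, 20_000):
        for margin in (0.05, 0.01):
            print(population, margin, sample_size(population, margin))
```

Under these assumptions a population of 50,000 needs 382 completed
responses for +/- 5% and 8057 for +/- 1%; a population of 20,000
needs 6489 for +/- 1%, which is close to the 6500 figure. Note these
are completed surveys, so the number of people invited has to be
larger to allow for non-response.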

-- 
Stephen J Smoogen.

Ah, but a man's reach should exceed his grasp. Or what's a heaven for?
-- Robert Browning