On Mon, Jan 07, 2019 at 11:09:48PM +0100, Kevin Kofler wrote:
Please no! This is an inherent privacy violation. I hate software doing this
and I always opt out of it. I find it especially worrying that Free Software
is now doing this more and more often, this used to be something only
privacy-violating proprietary software would do.
Since there is no personal information attached, I don't see how on the face
of it this is a privacy violation. I want to take this concern seriously,
but I need more to go on than "this is inherent". Can you elaborate?
You will never be able to reliably count all Fedora installations. Any UUID
you introduce can be opted out of, bypassed, etc. Installations using local
mirrors for updates will never send you a UUID to begin with. All numbers
will always be estimates, no matter how deeply you invade our privacy in an
attempt to get a supposedly better count.
It's true that it will always be an estimate. I think this scheme gives a
reasonably better estimate.
I also don't see why it is so important to have an absolute count of Fedora
users. IMHO, data like the relative download frequency of the different
Fedora deliverables is much more interesting (though you have to keep in
mind that the download count does not necessarily reflect the true user
preferences because deliverables that you advertise more prominently will
necessarily get downloaded more often than those hidden behind several
clicks from the download page).
The download count is *really* noisy. There are an order of magnitude more
bot and automatic downloads than there are ones that seem initiated by a
human. Maybe this is due to automated systems, but I suspect it is basically
just the horrible nature of the internet. Unless we were to gate downloads
with a captcha or registration (which, uh, we don't want, just to be clear),
I don't see any way to make those numbers useful.
But sending a UUID inherently also allows to track the machine. There is no
way for the user to be sure that the UUID will not be used to track them.
Even if the software on the Fedora infrastructure is completely open and
audited, there might still be some proxy in the middle, some mirror
operator, etc. abusing the UUID for tracking purposes. And besides, the user
would in all cases have to trust that Fedora really runs the published code
and only the published code on the infrastructure servers.
Like I said, tracking is a non-goal. And, we want a design that is resistant
to tracking -- but I don't think we need to go overboard.
Such a tracking feature must be opt-in, not opt-out! See also the EU GDPR.
This will be reviewed by lawyers. And, I do note that what I am proposing is
nothing more than what openSUSE already does.
> * We need to be able to distinguish between short-lived instances
> (like temporary containers or test machines) and actual installations.
And how would you accomplish that? Other than an "I am a test installation"
checkbox in the installer, I don't see at all how it could be done.
One method: separate UUIDs which only show up on a single day. (This is why
a UUID is better than just a ping.)
[...]
The installation would also only end up recognized as permanent after the 24
hours pass. And who says a test installation cannot last more than 24 hours?
I think it can last at least a week, but that also means that it would take
a whole week until you can reasonably assume that an installation is
probably permanent.
Sure, it's a threshold and we'd have to set a balance.
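
The single-day idea and the threshold trade-off discussed above can be
sketched roughly as follows. This is only an illustration, not the proposed
implementation: the record format (a UUID string plus a check-in date), the
function name count_persistent, and the min_span_days parameter are all
assumptions made up for this example.

```python
from collections import defaultdict
from datetime import date


def count_persistent(checkins, min_span_days=1):
    """Count UUIDs whose first and last check-in are at least
    min_span_days apart. UUIDs seen on a single day have a span of 0
    and are treated as short-lived (hypothetical rule for illustration).
    """
    days_seen = defaultdict(set)
    for uuid, day in checkins:
        days_seen[uuid].add(day)  # repeat check-ins on one day collapse

    return sum(
        1
        for days in days_seen.values()
        if (max(days) - min(days)).days >= min_span_days
    )


# Made-up sample data: (uuid, check-in date)
checkins = [
    ("a", date(2019, 1, 7)),
    ("a", date(2019, 1, 8)),   # span of 1 day
    ("b", date(2019, 1, 7)),   # single day only
    ("c", date(2019, 1, 7)),
    ("c", date(2019, 1, 7)),   # duplicate check-in, same day
    ("c", date(2019, 1, 15)),  # span of 8 days
]

print(count_persistent(checkins))                   # 2 ("a" and "c")
print(count_persistent(checkins, min_span_days=7))  # 1 (only "c")
```

Raising min_span_days filters out longer-lived test machines, but it also
delays the point at which a real installation starts being counted, which is
the balance mentioned above.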
--
Matthew Miller
<mattdm(a)fedoraproject.org>
Fedora Project Leader