On Tue, Jan 8, 2019 at 7:17 AM Benjamin Berg <bberg(a)redhat.com> wrote:
On Tue, 2019-01-08 at 12:33 +0100, Miroslav Suchý wrote:
> On 08. 01. 19 at 11:35, Nicolas Mailhot wrote:
> > *which* *do* *not* *permit* *or* *no* *longer* *permit* *the*
> > *identification* *of* *data* *subjects*
>
> How do you identify a data subject solely from a UUID?
You also inherently collect information such as the IP address and the
timestamp of the request, which in principle permits identification. You
could, for example, match the IP address from a Fedora account login
against one of these pings. This way you can de-anonymise the data
collected for the UUID.

We can certainly implement a setup that does not collect or store the
UUID together with the IP address or timestamp: send the UUID as an
HTTP header, don't log it, and pass the UUID on to a counting service
(*). If we make sure the UUID is protected in transit, sent only to
our own servers (or servers configured by the user), and not collected
or stored in a personally identifiable way, I suspect that we'd be
meeting our obligations under the GDPR, though we'd need to
double-check any selected solution carefully.
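
To make the separation concrete, here is a minimal sketch of a server-side
handler that keeps the UUID out of the access log and hands only the UUID
to the counter. The header name X-Client-UUID and the function
handle_request are hypothetical, chosen only for illustration; nothing in
this thread fixes them.

```python
from typing import Callable, Dict

# Hypothetical header name (an assumption, not an agreed convention).
UUID_HEADER = "X-Client-UUID"


def handle_request(
    headers: Dict[str, str],
    client_ip: str,
    count: Callable[[str], None],
    log: Callable[[str], None],
) -> None:
    """Split one request into two records that share no common field."""
    uuid = None
    safe_headers = {}
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == UUID_HEADER.lower():
            uuid = value
        else:
            safe_headers[name] = value

    # Access log: IP address and the remaining headers, never the UUID.
    log(f"{client_ip} {sorted(safe_headers.items())}")

    # Counting service: the UUID only, with no IP address or timestamp.
    if uuid is not None:
        count(uuid)
```

Whether the two records are truly unlinkable also depends on things this
sketch does not cover, such as timing correlation between the log and the
counter.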
That being said, certainly some users might still have an issue with
having a UUID sent to Fedora servers even if we are meeting our legal
obligations. What we say we are doing with the data might not
correspond to reality in case of a security breach or court order. For
this reason, the first_time_this_week=1 option that Lennart and
Benjamin mentioned has some appeal to me - it would avoid the need for
extra opt-in/out screens, confusing text, etc. It would also allow any
yum repository to do counting the same way - not just our own
repositories.
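
A minimal sketch of how a client might decide when to send
first_time_this_week=1. It assumes the client remembers locally the date
it last sent the flag and that a "week" means an ISO calendar week; both
are assumptions, and the helper names are hypothetical.

```python
import datetime as dt
from typing import Optional, Tuple


def week_id(day: dt.date) -> Tuple[int, int]:
    """Return (ISO year, ISO week number) for a date."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def first_time_this_week(today: dt.date, last_reported: Optional[dt.date]) -> bool:
    """True if the flag has not yet been sent during the current ISO week."""
    return last_reported is None or week_id(today) != week_id(last_reported)


def metadata_url(base_url: str, today: dt.date,
                 last_reported: Optional[dt.date]) -> str:
    """Append the flag to the repository URL only on the first request of the week."""
    if first_time_this_week(today, last_reported):
        separator = "&" if "?" in base_url else "?"
        return f"{base_url}{separator}first_time_this_week=1"
    return base_url
```

The server then only has to count requests carrying the flag; no
identifier leaves the machine, and the last-reported date stays on the
client.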
Owen
(*) Implementation left to your imagination. Store a hash of the UUID
for a week, then discard it. Use HyperLogLog. Etc.
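
One possible reading of the "store a hash for a week, then discard" idea,
as a rough sketch. The class WeeklyCounter is hypothetical, and the
per-week random salt is an extra assumption that the footnote does not
mention; it is there so hashes from different weeks cannot be compared.

```python
import hashlib
import secrets
from typing import Hashable, Optional, Set


class WeeklyCounter:
    """Count distinct UUIDs per week, keeping only salted hashes in memory."""

    def __init__(self) -> None:
        self._week: Optional[Hashable] = None
        self._salt = b""
        self._seen: Set[bytes] = set()

    def _rotate(self, week: Hashable) -> None:
        # Discard last week's hashes and draw a fresh salt (an assumption).
        self._week = week
        self._salt = secrets.token_bytes(16)
        self._seen = set()

    def add(self, uuid: str, week: Hashable) -> None:
        """Record a UUID for the given week label, e.g. (2019, 2)."""
        if week != self._week:
            self._rotate(week)
        digest = hashlib.sha256(self._salt + uuid.encode("utf-8")).digest()
        self._seen.add(digest)

    def count(self) -> int:
        """Number of distinct UUIDs seen in the current week."""
        return len(self._seen)
```

The count has to be read out before the week rolls over, since rotation
throws the set away. HyperLogLog would replace the set with a fixed-size
estimator, trading an exact count for bounded memory.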