[Fedora-legal-list] Making Infrastructure httpd logs public
Ricky Zhou
ricky at fedoraproject.org
Wed Apr 18 16:27:32 UTC 2012
On 2012-04-18 09:56:44 AM, Kevin Fenzi wrote:
> http://stackoverflow.com/questions/4552566/logging-ip-address-for-uniqueness-without-storing-the-ip-address-itself-for-priv
>
> has some ideas, but no great clear answer.
>
> http://bug.st/mod_anonstats seems to use md5.
>
> I'm assuming the consumer of these logs will process them after they
> are hashed? In which case we do need to make sure the same IP hashes to
> the same hash. Or could we process them first, then hash the IP before
> making the data public?
I think something like an HMAC is the correct way to hide IPs.
Unfortunately, there is still information other than IP address that can
potentially leak some privacy information, such as:
* rare/unique user agent strings
* URLs that can be linked to the person who's visiting them (a lot
of Mailman links contain email addresses, for example)
* potentially still-valid CSRF tokens
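For the IP part only, here is a rough sketch of what I mean by an HMAC.
It assumes Python, a secret key that we keep private, and log lines that
start with the client IP (as in Common Log Format). The names
SECRET_KEY, pseudonymize_ip and scrub_line are made up for illustration.
The key is what matters: a plain unkeyed hash such as md5 can be
reversed by simply hashing every possible IPv4 address. This does
nothing about the user agent, URL, or token issues listed above.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret key. In practice it would be generated once and kept
# private; anyone who learns it can recompute the mapping from IP to token.
SECRET_KEY = secrets.token_bytes(32)


def pseudonymize_ip(ip: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of the IP. The same IP and key give the same token."""
    return hmac.new(key, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def scrub_line(line: str, key: bytes = SECRET_KEY) -> str:
    """Replace the leading client IP of a log line with its keyed hash."""
    ip, sep, rest = line.partition(" ")
    return pseudonymize_ip(ip, key) + sep + rest
```

Since the same IP always maps to the same token under one key, unique
visitor counts still work on the scrubbed logs. Changing the key
periodically would limit how long a visitor stays linkable.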
I think a lot more thought and user notification should happen before we
can consider making logs public. Alternatively, what do you think about
a system where somebody who wants to run statistics either gets access
to the logs or gives us a script that we verify and then run in a
cron job? I don't think we'll get enough requests for handling them
manually like this to become a burden.
Maybe we can also take a look at how organizations like Wikipedia handle
these sorts of things.
Thanks,
Ricky