On Wed, Feb 12, 2014 at 12:46 AM, Pierre-Yves Chibon <pingou@pingoured.fr> wrote:

The idea originates from a discussion between Mickael Scherrer, Ralph and me on
Friday evening. Could we track all the files in every package in the

Ideally, this would allow us to investigate questions like:
 - How many copies of the GPL license are shipped?
 - How many GPL licenses still ship the old FSF address?
 - How many copies of jquery or md5.c?
 - How many files changed between two releases?
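
Questions like these mostly reduce to comparing file checksums. A rough sketch
of that idea, assuming the packages are already unpacked under one directory
per package; the function names (index_tree, copies_by_digest, changed_files)
and the choice of SHA-256 are illustrative assumptions, not an existing tool:

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map each regular file under `root` (relative path) to its digest."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so the same content is not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                index[os.path.relpath(full, root)] = file_digest(full)
    return index


def copies_by_digest(index):
    """Invert a path -> digest index into digest -> sorted list of paths."""
    copies = defaultdict(list)
    for path, digest in index.items():
        copies[digest].append(path)
    return {digest: sorted(paths) for digest, paths in copies.items()}


def changed_files(old_index, new_index):
    """Paths added, removed, or modified between two path -> digest indexes."""
    paths = set(old_index) | set(new_index)
    return sorted(p for p in paths if old_index.get(p) != new_index.get(p))
```

Counting copies of a given license text or of md5.c would then be a lookup of
the reference file's digest in copies_by_digest(), and comparing two releases
would be changed_files() on their indexes. Exact checksums only catch
byte-identical copies, so a license file that differs only in the FSF address
would need a looser comparison.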

Cool idea, and sounds a lot like what FOSSology could do for you already (http://www.fossology.org/projects/fossology).  Have you checked that out?
