Request for test data based off of obfuscated live data

Toshio Kuratomi a.badger at
Wed Nov 19 18:29:07 UTC 2008

John Palmieri wrote:
> ----- "Toshio Kuratomi" <a.badger at> wrote:
>> If we can munge the data enough to be comfortable releasing it to the
>> public, it seems like that would cost us fewer man-hours.  However, it
>> isn't entirely free.  We'd still have to make new dumps of data, modify
>> it for changes in the data model, etc.  Then the developer would become
>> responsible for downloading the sanitised data and running it on their
>> network.  Which is good because it isn't us, but bad because it's not
>> trivial to set all this up.
> I would be willing to write scripts and a kickstart file to make this
> trivial to get a qemu image or test machine up and running in a couple
> of hours (mostly waiting for downloads and installs to happen).  What I
> was thinking of was an environment that sets up a stable Fedora
> infrastructure environment, complete with puppet scripts to configure
> the services to work with one another, plus a set of scripts for
> pulling fresh data, modifying common pieces of the various dbs (like
> changing dates to stay current, or setting up one of the users as your
> test user), and pulling down code from the various source trees for
> hacking on particular pieces of the infrastructure while integrating
> them into the environment.
I think this is a bit too ambitious, but if you are willing to write it
and maintain it, that's fine.  My concerns are: keeping things synced
(for instance, we're slowly migrating the way we organize puppet
configs, so those will have to change), pulling apart bits that are
interdependent, generating the database dumps, and packaging all of
this so it's still light enough on resources for someone to actually
run it on their workstation.

> A note on the code drop: by requiring the author to modify a spec file
> if needed in order to deploy their changes into their environment
> (revision numbers would be automated), patches would include spec file
> changes instead of the maintainer having to sync by hand.  This would
> also make sure the build files are kept up to date, as the author
> would have to make their changes work in an RPM environment just to
> test them, as opposed to just installing from their source tree, which
> often leads to annoying bugs (like missing files in a distributed
> tarball).  Also, by making it easy to generate a patch and submit it
> to trac, we will get more consistently formatted patches (such as
> using the VCS's patch format) and most likely more people getting
> involved as the overhead shrinks a bit (how many people want to go to
> individual trac instances to file a patch?).
I'm not sure that this is a logical outcome of having an
infrastructure-apps image.  It seems more like a set of guidelines that
would need to be established per project.  For instance, there is
absolutely nothing stopping someone from installing the packagedb from
the rpm, checking out the development source to their home directory,
and then changing the config file to run that instead.
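That workflow could be as simple as the following.  (The package name,
branch location, config path, and option name below are illustrative
only -- check the packagedb's actual docs for the real ones.)

```
# Install the packaged app so dependencies and service glue are set up:
yum install <packagedb-package>

# Check out the development source to your home directory:
bzr branch <packagedb-dev-branch> ~/packagedb-devel

# Then point the installed config at the checkout, e.g. some
# hypothetical option in the app's config file:
#   app_root = /home/<you>/packagedb-devel
```

So the RPM-only discipline would have to come from project policy, not
from the image itself.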



More information about the infrastructure mailing list