Log Data Retention

Kamil Paral kparal at redhat.com
Fri Oct 9 11:58:28 UTC 2015


> > 
> > 1. How long is long enough to keep log and execution data?
> 
> 6-12 months should be more than enough but it might be worth trying
> to keep a release-lifetime of logs (~18 months, including pre-release)

Those numbers are higher than I expected to be realistic. I see 3 months as the required minimum, because we sometimes need to go back and debug older issues. 6 months would be great. Anything above that doesn't hurt, of course, but I wouldn't mind losing it.

This question concerns "log and execution data". Where is the corresponding question about task artifacts? For those, I think we should try to keep at least 6 months of results.
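
To make the numbers concrete, here is a minimal sketch of an age-based cleanup rule. It is only an illustration: the names RETENTION_DAYS, is_expired and find_expired are hypothetical, the 90 and 180 day values are the 3 and 6 month figures from above rounded to days, and it assumes the files sit in a plain directory tree where the modification time reflects when the task ran.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400

# Hypothetical retention periods, taken from the numbers discussed above:
# roughly 3 months for logs/execution data, roughly 6 months for artifacts.
RETENTION_DAYS = {"logs": 90, "artifacts": 180}


def is_expired(mtime, now, retention_days):
    """Return True if a file last modified at mtime is past the retention period."""
    return (now - mtime) > retention_days * SECONDS_PER_DAY


def find_expired(root, retention_days, now=None):
    """List regular files under root that are older than the retention period."""
    now = time.time() if now is None else now
    return [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and is_expired(path.stat().st_mtime, now, retention_days)
    ]
```

This only lists candidates; the actual deletion is left out on purpose, so the list can be reviewed before anything is removed.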

> 
> > 2. Should we be cleaning up anything that references builds/artifacts
> > (like links in resultsdb) before we delete them?
> 
> Ideally, yes but I don't think it's worth more than a day's effort
> for one person if we have proper 404 processing on the machine hosting
> the artifacts.

What's the benefit of removing links from the database? Does it free up storage space or speed up the database? If not, I see it the other way round. If the resultsdb page contains a link to an artifact and that link leads to a 404 page saying "sorry, this is probably too old and already deleted, we usually keep files around for XYZ time", the user has learned everything important. If the resultsdb page does not contain any link, the user will wonder "where's the check log? why is it missing?", which is a more confusing scenario.
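
As a rough sketch of the 404 behaviour described here, the artifact host could answer requests for deleted files along these lines. This is an assumption about how it might look, not how our artifact server works: missing_artifact_message and respond are hypothetical names, the set of existing paths stands in for a real filesystem lookup, and 180 days is just the 6 month figure from earlier.

```python
from http import HTTPStatus


def missing_artifact_message(retention_days):
    """Build the explanation shown when a requested artifact no longer exists."""
    return (
        "Sorry, this file is probably too old and has already been deleted. "
        f"We usually keep files around for {retention_days} days."
    )


def respond(path, existing_paths, retention_days=180):
    """Return (status, body) for a request; existing_paths stands in for the filesystem."""
    if path in existing_paths:
        return HTTPStatus.OK, ""
    return HTTPStatus.NOT_FOUND, missing_artifact_message(retention_days)
```

With something like this in place, the links in resultsdb can stay untouched and the 404 page itself explains why the log is gone.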

> > 3. Do we want to put resources into figuring out whether the result
> > was a PASS or FAIL before deleting it?
> 
> No, it's not worth the effort - I'd rather just store more logs than
> put much dev time into deciding which logs to delete

Agreed.

> 
> > 4. Should fesco be involved in this decision?
> 
> Either way - I suspect that they're not going to have much of an
> opinion and it adds bureaucracy to the process but I suppose that the
> decision would be a bit more "official" if we asked them.

I don't think it's important enough to bother them. The decision mainly depends on our storage capacity anyway, not on a "management decision".

