Now that we've had a second flavor of this issue (running out of
inodes on a buildmaster) hit us, it's probably time to address log data
retention.
At the moment, we don't have a log data retention policy, which has led
to disks filling up with logs. We need some policy for how long we're
going to keep this data, but I don't want to just decide something
without some form of discussion/documentation.
When we had this problem with AutoQA, we implemented a cronjob that
would delete logs older than 30 days but we also had a lot less disk to
work with back then.
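For reference, an age-based cleanup of that kind could look roughly like
the sketch below. This is not the original AutoQA cronjob (which isn't
reproduced here); the function name, the dry_run flag and the example
path are made up for illustration, and the 30-day default only mirrors
the number mentioned above. It also prunes directories left empty, on
the assumption that inode usage matters as much as disk space.

```python
import os
import time


def remove_old_files(root, max_age_days=30, now=None, dry_run=True):
    """Remove files under root whose mtime is older than max_age_days.

    Returns the paths that were removed (or would be, when dry_run is True).
    Hypothetical helper for illustration, not an existing tool.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Walk bottom-up so a directory is visited after its contents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: judge a symlink by its own age, not its target's.
                if os.lstat(path).st_mtime < cutoff:
                    if not dry_run:
                        os.remove(path)
                    removed.append(path)
            except FileNotFoundError:
                # File vanished between listing and stat; nothing to do.
                continue
        # Empty directories still consume inodes, so prune them as well.
        if not dry_run and dirpath != root:
            try:
                os.rmdir(dirpath)  # fails with OSError unless empty
            except OSError:
                pass
    return removed


if __name__ == "__main__":
    # Assumed path, for illustration only; dry run by default.
    for path in remove_old_files("/srv/example/artifacts", max_age_days=30):
        print("would remove:", path)
```

Running it in dry-run mode first gives a list of what would go away,
which might also help answer the "how long is long enough" question
with real numbers before anything is actually deleted.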
There are two forms of log data that this new policy would affect: the
artifacts created by task execution and the build logs/data stored by
the buildmaster. Both are relatively simple file-based data which can
be removed with no consequence other than the data no longer being
available.
The questions raised so far are:
1. How long is long enough to keep log and execution data?
2. Should we be cleaning up anything that references builds/artifacts
(like links in resultsdb) before we delete them?
3. Do we want to put resources into figuring out whether the result was
a PASS or FAIL before deleting it?
4. Should FESCo be involved in this decision?
Thoughts or suggestions? I really don't want to spend much time on this,
but that statement does seem to come out of me when we're about to
spend too much time on a topic (at least some of which ends up being my
fault) :)
Tim