On 04/02/2012 09:49 AM, Matt Wagner wrote:
On Fri, Mar 30, 2012 at 11:59:11AM -0400, Tzu-Mainn Chen wrote:
> https://www.aeolusproject.org/redmine/projects/aeolus/wiki/Allow_Conducto...
So I wrote the below email and was about to send it, when I realized
that I'd focused on the technical implementation details without ever
fully understanding the overall use case. But it's not abundantly clear
to me what that entails. What information, exactly, are we intending to
keep? I wonder if we should look at stashing core information in the
Events table?
I'd imagine Hugh or someone else with a better feel for customer
requirements may need to weigh in, but in general part of what we may be
dealing with are audit requirements, especially if we ever start
modeling cost issues. We may need to be able to produce a report on a
user's history/activities, when instances were up, which ones, who
changed properties, etc.
The "soft delete" vs. "archiving" is only part of this, though. We need
it so that references in events, etc. to deleted items are still valid.
However we also need to fully spec out the event/auditing requirements
(not part of this task it seems). What changes do we need to record?
What history do we need for what object types? Is there some process for
purging in-db events/archives to some "long-term" storage/report format?
> The document talks about two possible approaches to keeping
> deployment activity; one is to use an archive table, and another is to
> use a soft_delete flag. I'd be interested in hearing people's thoughts
> on both.
I was going to mention that I'm still fond of acts_as_paranoid, but I
just saw your mention on the wiki page that Tomas sent a patch adding
archivist six months ago. I'm all for just using that in this case. I
think we're likely to need to stitch together associations anyway when
we're dealing with deleted objects.
Though, FWIW, using soft delete wouldn't necessarily require any major
changes. acts_as_paranoid would override the finder methods of AR to add
a "WHERE deleted_at IS NULL" to all queries unless you ran
"find_with_deleted". If we did this ourselves, we could probably just
use a default scope and a with_deleted method to override it.
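As a rough sketch of the default-scope idea (plain Ruby rather than
ActiveRecord, so it runs without a database; SoftDeletable, Deployment,
and the in-memory "table" are hypothetical names for illustration, not
anything in Conductor or acts_as_paranoid):

```ruby
# Plain-Ruby stand-in for a model with a soft-delete default scope.
# All names here are hypothetical and exist only for illustration.
module SoftDeletable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # In-memory replacement for a database table.
    def records
      @records ||= []
    end

    def create(attrs = {})
      record = new(attrs)
      records << record
      record
    end

    # Default scope: behaves like "WHERE deleted_at IS NULL".
    def all
      records.select { |r| r.deleted_at.nil? }
    end

    # Explicit opt-out of the default scope.
    def with_deleted
      records.dup
    end
  end

  attr_reader :deleted_at

  # Mark the record as deleted instead of removing it.
  def destroy
    @deleted_at = Time.now
    self
  end

  def deleted?
    !@deleted_at.nil?
  end
end

class Deployment
  include SoftDeletable
  attr_reader :name

  def initialize(attrs = {})
    @name = attrs[:name]
  end
end

Deployment.create(:name => "web")
Deployment.create(:name => "db").destroy

puts Deployment.all.map(&:name).inspect          # ["web"]
puts Deployment.with_deleted.map(&:name).inspect # ["web", "db"]
```

In a real model the same shape would presumably be a default scope on
deleted_at plus a class method that drops the scope; the point is only
that ordinary finders hide deleted rows and with_deleted is the one
explicit way to see them.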
Yeah -- the document didn't make it clear whether you'd looked into the
existing Rails plugins for this, but when I looked into this a couple
years ago there were options for both soft delete (e.g.
acts_as_paranoid) and archive tables (I forget what I found here). Also,
while the finder overrides will make _most_ instances of deleted objects
disappear, you've still got to be careful with joins, custom queries,
etc. I'd imagine that even with acts_as_paranoid we'll need to be
careful to make sure that our code doesn't find deleted stuff when we
don't want it -- and that we've got decent automated test coverage to
confirm that deleted things don't start appearing with future code changes.
The real pain with deletion isn't the object itself but the
associations. Not just for query filtering (w/ soft-delete) but also w/
foreign key constraints (for 'archive tables').
> Oh, and the document also talks about potentially using dbomatic
> for periodic history purges. I know there's been some discussion around
> dbomatic, so if someone had an opinion about that as well. . .
>
I think I talked to you on IRC about this and didn't note this
objection, so I apologize for suddenly raising it. ;) But it occurs to
me now that cleaning up old history is something that we want to do
perhaps once a month, not every couple of minutes, so maybe we should
just use cron. What if we added a rake task / little script that did
this, using a configurable time period? Then we could ship with a
reasonable default (do a monthly cleanup of entries older than 6
months?), but allow administrators to change it by just editing crontab.
(I'm also not entirely convinced that deleting archived entries by
default is a good thing... Some users might want to preserve history
indefinitely. This way they could at least disable it easily.)
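A minimal sketch of the configurable-period idea, assuming a
hypothetical HISTORY_RETENTION_MONTHS setting and entries represented
as plain hashes with an :archived_on date (none of these names exist in
Conductor today). Leaving the setting unset gives the six-month
default; setting it to 0 disables purging entirely:

```ruby
require 'date'

DEFAULT_RETENTION_MONTHS = 6 # assumed default, per the suggestion above

# Read the (hypothetical) retention setting. Returns nil when purging
# is disabled, i.e. history should be kept indefinitely.
def retention_months(env = ENV)
  raw = env['HISTORY_RETENTION_MONTHS'].to_s.strip
  return DEFAULT_RETENTION_MONTHS if raw.empty?
  months = Integer(raw, 10)
  months > 0 ? months : nil
end

# Entries archived strictly before this date are eligible for purging.
def purge_cutoff(today, months)
  months.nil? ? nil : today << months
end

# Select the entries a purge run would remove; removes nothing itself.
def expired_entries(entries, today, months)
  cutoff = purge_cutoff(today, months)
  return [] if cutoff.nil?
  entries.select { |entry| entry[:archived_on] < cutoff }
end

today   = Date.new(2012, 4, 2)
entries = [
  { :id => 1, :archived_on => Date.new(2011, 9, 1) },
  { :id => 2, :archived_on => Date.new(2012, 1, 15) }
]

puts expired_entries(entries, today, retention_months({})).map { |e| e[:id] }.inspect
```

A rake task wrapping something like this could then be run monthly from
cron, e.g. with a schedule of "0 3 1 * *" (03:00 on the first of each
month); the task name and path would be whatever we end up shipping.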
In addition, there may be other actions that need to occur in
conjunction with purging history. Something as simple as a log file
logging each deleted item (so there's still some possibility to manually
recover old archive data if needed), or perhaps something more complex.
Also the 'deleted item' purging and old events purging should probably
be done in conjunction with each other.
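To make that concrete, here is a rough sketch (plain Ruby over
in-memory arrays; purge_history and the 'id'/'at' row layout are
hypothetical, not existing Conductor code) of one pass that purges
archived items and events against the same cutoff and writes one JSON
line per purged row to a log first:

```ruby
require 'json'
require 'stringio'
require 'time'

# Purge rows older than +cutoff+ from both collections in one pass.
# Each purged row is written to +log+ as a JSON line before it is
# dropped, so it can still be recovered by hand later.
# Returns the number of rows purged per kind.
def purge_history(archived_items, events, cutoff, log)
  counts = {}
  { 'archived_item' => archived_items, 'event' => events }.each do |kind, rows|
    expired, kept = rows.partition { |row| row['at'] < cutoff }
    expired.each do |row|
      line = JSON.generate(row.merge('at' => row['at'].utc.iso8601))
      log.puts("purged #{kind} #{line}")
    end
    rows.replace(kept) # mutate in place, like deleting from a table
    counts[kind] = expired.size
  end
  counts
end

cutoff = Time.utc(2011, 10, 2)
items  = [{ 'id' => 1, 'at' => Time.utc(2011, 9, 1) },
          { 'id' => 2, 'at' => Time.utc(2012, 1, 15) }]
events = [{ 'id' => 10, 'item_id' => 1, 'at' => Time.utc(2011, 8, 30) }]

log = StringIO.new # a real run would open a log file instead
purge_history(items, events, cutoff, log)
puts log.string
```

Using one cutoff for both collections is the simplest way to keep them
in step; whether events should instead follow the lifetime of the item
they reference is one of the things the spec would need to decide.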
One additional note -- the last sentence of your "Summary" section
includes this interesting tidbit: "Finally, historical event data will
need to be searched efficiently. In order to do so, we will evaluate
Katello's search implementation and see if it is suitable."
With soft-delete, we've got search -- but perhaps even the standard list
UI will help us -- we can provide an 'include deleted' flag for admins
to see deleted stuff too.
That sounds like a really good idea, although I suspect it'll be a good
bit of work, enough that I wonder if it should be its own feature. I
think we'd need to work with Katello to make sure it was appropriately
configured for multi-tenancy, and make sure that Katello and Aeolus both
included a dependency on it, plus make sure that we had consistent
configuration so that if one project set it up the other project could
use it without problems. It's definitely achievable and likely
worthwhile; I just think it deserves scoping as a bigger task. (IMHO.)
-- Matt