On 1/9/2013 2:42 PM, Michael Foley wrote:
I am not sure I understand the debate on metric data migration with the server up or down.  

Isn't the data migration occurring in the context of an RHQ upgrade from one RHQ version to another?  And won't the RHQ server already be down for that RHQ upgrade?
I think the fear is that a huge data migration from the RDB to Cassandra may require an unacceptably long amount of downtime.  The question, therefore, was whether the migration could be done while the RHQ server was up, offloading data while new data came in, so that a large percentage of the data could move before having to take the system down.  Disabling the data purge job might make this reasonably safe for most of the data tables.  But no matter how you slice it, it will complicate things.

I think the hope is that a quiet environment, with only reads from the DB and the usual fast writes to Cassandra, will make it feasible to migrate with the system down in a reasonable amount of time.


If the data is going to be migrated while the server is up ... are we saying that an existing version of RHQ can switch to a Cassandra back end?
To be clear, no.  This would just be in preparation for the impending upgrade, no matter how it is done.

Additionally ... I have some concerns about whether a data migration on a running server that is collecting metrics is something that can be verified easily.




From: "John Sanda" <jsanda@redhat.com>
To: rhq-users@lists.fedorahosted.org
Cc: rhq-devel@lists.fedorahosted.org
Sent: Wednesday, January 9, 2013 2:32:24 PM
Subject: Re: Metrics Migration Tool - Cassandra


On Jan 9, 2013, at 1:37 PM, Charles Crouch <ccrouch@redhat.com> wrote:

>
>
> ----- Original Message -----
>> After some lengthy discussion with Stefan, I think the migration tool
>> needs to be run while the server is offline or possibly in
>> maintenance mode. The best time to run the tool would be prior to
>> running the RHQ installer. A problem with doing the data migration
>> while the server is running is that we could wind up with skews in
>> the aggregate data. The easiest and fastest way to ensure data is
>> consistent for both pre- and post-upgrade is to run the data
>> migration while the server is down.
>
> Whether this approach is feasible or not could very well depend on the speed of the migration
> -How long does our normal upgrade take today? I'm not aware of folks saying that was too long
> -If it takes hours/days to move data during upgrade then that may simply not be an option. Could a portion of the data be moved while the server is live?

If we do the data migration while the RHQ server is up, we potentially have to deal with the following:

* the migration is going to take longer than if we do it while the server is down
* there is a good possibility of data skew with aggregate data
* we may have to disable baseline alerting until the migration is finished
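
As a rough illustration of how skew in the aggregate data could arise: the thread does not spell out the mechanism, so the scenario below is an assumption. If an hourly aggregate is copied while raw data for that hour is still arriving, the copied value and the value later recomputed on the source can disagree. The function name and sample values are hypothetical and are not taken from RHQ.

```python
from statistics import mean


def hourly_aggregate(raw_values):
    """Return (min, avg, max) for one hour of raw metric values."""
    return (min(raw_values), mean(raw_values), max(raw_values))


# Hypothetical raw values present when the migration copies this hour.
raw_at_copy_time = [10.0, 12.0, 14.0]
migrated = hourly_aggregate(raw_at_copy_time)

# The live server keeps collecting for the same hour after the copy.
raw_after_copy = raw_at_copy_time + [30.0]
recomputed = hourly_aggregate(raw_after_copy)

# True when the migrated aggregate no longer matches the source.
skewed = migrated != recomputed
print(migrated, recomputed, skewed)
```

With the server down, no new raw values arrive between the copy and any later recomputation, so the two aggregates in this sketch cannot diverge.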

If we do the data migration while the server is down, then:

* the migration will run faster than it would if run while the server is up
* there is no possibility of data skew
* the upgrade process, i.e., the amount of time the server is down, might be longer than if we do the migration while the server is up

At this point, all we can do is speculate about how long the migration will actually take until we do some load testing. If we find that the migration is taking longer than we would like, another option could be to explore using the bulk import/export utilities provided by each of the databases.
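
Until load-test numbers exist, a back-of-the-envelope calculation may help frame what "too long" means. The sketch below assumes a steady copy rate. The row count and the rates are placeholders for illustration only, not measurements from any RHQ installation, and the function name is hypothetical.

```python
def estimate_migration_hours(row_count, rows_per_second):
    """Estimate wall-clock hours to copy row_count rows at a steady rate."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return row_count / rows_per_second / 3600.0


# Placeholder inputs (assumptions, not measured values).
rows = 500_000_000
for rate in (5_000, 20_000, 100_000):
    hours = estimate_migration_hours(rows, rate)
    print(f"{rate:>7} rows/s -> {hours:.1f} h")
```

Once load testing produces real throughput figures, substituting them here would show whether an offline migration fits within an acceptable upgrade window.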

- John
_______________________________________________
rhq-users mailing list
rhq-users@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/rhq-users


