Hello Everybody,
With the new Cassandra-based metrics storage system, existing data will need to be migrated out of the existing SQL storage into Cassandra. I put together a simple design document for the data migration tool that will be delivered with RHQ when the new metrics storage system is completed.
My initial thoughts about the tool are:
1) A restartable migration process; this will be especially useful for users with large amounts of metrics.
2) Favour robustness over performance, because this migration will be done only once.
3) Make the tool external to the product (not part of the installation), so users can do the migration ahead of the upgrade.
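To make the restartable aspect concrete, here is a rough sketch (Python, purely for illustration; the function names and checkpoint format are hypothetical, not actual RHQ code) of a batch loop that persists a checkpoint after every completed batch, so an interrupted run can resume where it left off:

```python
import json
import os

def migrate(read_batch, write_batch, checkpoint_path, batch_size=1000):
    """Resumable batch migration: persist the offset of the last completed
    batch so an interrupted run can pick up where it left off."""
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]
    while True:
        batch = read_batch(offset, batch_size)
        if not batch:
            break
        write_batch(batch)  # insert into the new store first...
        offset += len(batch)
        with open(checkpoint_path, "w") as f:
            json.dump({"offset": offset}, f)  # ...then advance the checkpoint
    return offset
```

The point of the ordering is robustness over performance: the checkpoint only advances after a batch is written, so a crash can at worst re-migrate one batch, never skip one.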
Here is the full design document: https://docs.jboss.org/author/display/RHQ/Metrics+Data+Migration+-+Design
Feedback is more than welcome since this tool is still in the early planning stages.
Thank you, Stefan Negrea
Software Engineer
After some lengthy discussion with Stefan, I think the migration tool needs to be run while the server is offline or possibly in maintenance mode. The best time to run the tool would be prior to running the RHQ installer. A problem with doing the data migration while the server is running is that we could wind up with skews in the aggregate data. The easiest and fastest way to ensure data is consistent for both pre- and post-upgrade is to run the data migration while the server is down.
- John
rhq-devel mailing list rhq-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/rhq-devel
Nice document Stefan!
I am of course interested in testing and software quality. Is it okay if I enter the conversation in this regard?
Primarily, I want to make sure the design process considers testing and verification. So to begin that thought process I penciled in a few thoughts on the Wiki ... as shown below. As you continue with the design and implementation ... is it OK if you add your thoughts to the Testing section as well?
Later on ... this will be very useful information as QE begins engaging with this new feature.
Thanks!
Michael Foley QE Lead, JBoss Operations Network
Testing
1. Use-cases
   1. Oracle --> Cassandra
   2. Postgres --> Cassandra
   3. Edge cases to consider
      1. high volume of metrics
      2. low volume of metrics
   4. Negative test cases to consider
      1. connection failures
2. Verification ... or, "How do we determine that migration was successful?"
   1. Output generated by the migration tool
      1. error messages ... or absence of error messages
      2. number of rows read in Oracle/Postgres and number of rows created in Cassandra ... on a per-resource basis
   2. Visual inspection in RHQ UI
      1. exactly what in RHQ should remain exactly the same after the backend switches from Oracle/Postgres to Cassandra
   3. Unit tests on the data migration tool
_______________________________________________ rhq-users mailing list rhq-users@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/rhq-users
I added a bit more.
Testing
1. Use-cases
   1. Oracle --> Cassandra
   2. Postgres --> Cassandra
   3. Edge cases to consider
      1. high volume of metrics
      2. low volume of metrics
   4. Negative test cases to consider
      1. connection failures
2. Verification ... or, "How do we determine that migration was successful?"
   1. Output generated by the migration tool
      1. the migration tool generates a human-readable log file
      2. the log file includes a message saying the migration was successful ... or that an error occurred
      3. if an error occurs in the migration ... a meaningful human-readable error message is displayed
      4. number of rows read in Oracle/Postgres and number of rows created in Cassandra ... on a per-resource basis
   2. Visual inspection in RHQ UI
      1. exactly what in RHQ should remain exactly the same after the backend switches from Oracle/Postgres to Cassandra
   3. Unit tests on the data migration tool
   4. Performance and baselining
      1. How long should this take? What is an acceptable migration time?
      2. What is the SLA for a large deployment?
      3. If large deployments may take a long time to migrate ...
         1. is it wise to consider a tool that could migrate things incrementally? 1 resource at a time?
         2. will the UI or command line allow the ability to migrate only 1 or select resources at a time?
   5. Risk areas ... if you could predict where a problem may manifest, where would it be? For each risk area, list a possible mitigation approach.
   6. Testing tool? Is there a need for an automated or semi-automated testing or verification tool?
      1. a tool that accesses both backends ... and verifies that metrics on a particular resource are identical
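The per-resource row-count verification mentioned above could be sketched like this (Python, purely illustrative; the row shape `(resource_id, timestamp, value)` and function name are assumptions, not actual tooling):

```python
from collections import Counter

def verify_migration(sql_rows, cass_rows):
    """Compare per-resource row counts between the two backends; each row is
    assumed to be (resource_id, timestamp, value). Returns a dict mapping any
    mismatched resource_id to its (sql_count, cassandra_count) pair."""
    sql_counts = Counter(r[0] for r in sql_rows)
    cass_counts = Counter(r[0] for r in cass_rows)
    return {
        rid: (sql_counts[rid], cass_counts[rid])
        for rid in set(sql_counts) | set(cass_counts)
        if sql_counts[rid] != cass_counts[rid]
    }
```

An empty result would mean the counts line up for every resource; anything else pinpoints exactly which resources lost (or gained) rows during the migration.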
----- Original Message -----
After some lengthy discussion with Stefan, I think the migration tool needs to be run while the server is offline or possibly in maintenance mode. The best time to run the tool would be prior to running the RHQ installer. A problem with doing the data migration while the server is running is that we could wind up with skews in the aggregate data. The easiest and fastest way to ensure data is consistent for both pre- and post-upgrade is to run the data migration while the server is down.
Whether this approach is feasible or not could very well depend on the speed of the migration.
- How long does our normal upgrade take today? I'm not aware of folks saying that was too long.
- If it takes hours/days to move data during the upgrade then that may simply not be an option. Could a portion of the data be moved while the server is live?
On Jan 9, 2013, at 1:37 PM, Charles Crouch ccrouch@redhat.com wrote:
Whether this approach is feasible or not could very well depend on the speed of the migration.
- How long does our normal upgrade take today? I'm not aware of folks saying that was too long.
- If it takes hours/days to move data during the upgrade then that may simply not be an option. Could a portion of the data be moved while the server is live?
If we do the data migration while the RHQ server is up, we potentially have to deal with the following:
* the migration is going to take longer than if we do it while the server is down
* there is a good possibility of data skew with aggregate data
* we may have to disable baseline alerting until the migration is finished
If we do the data migration while the server is down, then:
* the migration will run faster than it would if run while the server is up
* there is no possibility of data skew
* the upgrade process, i.e., the amount of time the server is down, might be longer than if we do the migration while the server is up
At this point, all we can do is speculate about how long the migration will actually take until we do some load testing. If we find that the migration is taking longer than we would like, another option could be to explore using the bulk import/export utilities provided by each of the databases.
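As a rough illustration of the bulk export/import route (the column names `schedule_id, time_stamp, value` are hypothetical, not the actual RHQ schema), a flat CSV export produced by the database's bulk utility could be regrouped into the per-schedule shape a Cassandra wide-row insert would want:

```python
import csv
import io
from collections import defaultdict

def group_export_for_cassandra(csv_text):
    """Regroup a flat CSV export (assumed columns: schedule_id, time_stamp,
    value) into one list of (timestamp, value) pairs per schedule -- roughly
    the shape a wide-row Cassandra insert would consume."""
    grouped = defaultdict(list)
    for schedule_id, ts, value in csv.reader(io.StringIO(csv_text)):
        grouped[int(schedule_id)].append((int(ts), float(value)))
    return dict(grouped)
```

The appeal of this path is that the expensive read side becomes a sequential file scan instead of millions of JDBC round trips; the transform itself is trivial because the measurement tables are so simple.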
- John
I am not sure I understand the debate on metric data migration with the server up or down.
Isn't the data migration occurring in the context of an RHQ upgrade from one RHQ version to another? And won't the RHQ server already be down for that RHQ upgrade?
If the data is going to be migrated while the server is up ....are we saying that an existing version of RHQ can switch to a Cassandra back end?
Additionally ... I have some concerns about whether a data migration on a running server that is collecting metrics can be verified easily.
On Jan 9, 2013, at 2:42 PM, Michael Foley mfoley@redhat.com wrote:
I am not sure I understand the debate on metric data migration with the server up or down.
The issue stems from the possibility that the data migration could take a long time for larger deployments. It is hard to quantify how long until we do some load testing, but it is not unreasonable to think it could be on the order of hours.
Isn't the data migration occurring in the context of an RHQ upgrade from one RHQ version to another? And won't the RHQ server already be down for that RHQ upgrade?
If the data is going to be migrated while the server is up ....are we saying that an existing version of RHQ can switch to a Cassandra back end?
No. Once the RHQ installer finishes the upgrade/installation, regardless of whether or not the data migration tool has been run, all numeric metric data will be getting inserted into Cassandra.
Additionally ... I have some concerns that a data migration on a running server that is collecting metrics is something that can be verified easily.
This is another, valid concern about doing the migration while the server is running, not just for QE but also for end users.
On 1/9/2013 2:42 PM, Michael Foley wrote:
I am not sure I understand the debate on metric data migration with the server up or down.
Isn't the data migration occurring in the context of an RHQ upgrade from one RHQ version to another? And won't the RHQ server already be down for that RHQ upgrade?
I think the fear is that a huge data migration from the RDB to Cassandra may take an unacceptably long amount of downtime. And therefore the question was whether the migration could be done while the RHQ server was up, offloading data while new data came in, such that a large % of the data could move before having to take the system down. By disabling the data purge job this could maybe happen somewhat safely for most of the data tables. But no matter how you slice it, it will complicate things.
I think the hope will be that a quiet environment, only reads from the DB, and the usual fast writes to Cassandra will make it feasible to migrate with the system down in a reasonable amount of time.
If the data is going to be migrated while the server is up ....are we saying that an existing version of RHQ can switch to a Cassandra back end?
To be clear, no. This would just be in preparation for the impending upgrade, no matter how it is done.
Additionally ... I have some concerns that a data migration on a running server that is collecting metrics is something that can be verified easily.
Le 09/01/2013 20:32, John Sanda a écrit :
At this point, all we can do is speculate about how long the migration will actually take until we do some load testing. If we find that the migration is taking longer than we would like, another option could be to explore using the bulk import/export utilities provided by each of the databases.
I think working on bulk export files would be far more efficient. And it shouldn't be too difficult given that the measurement tables have a very simple schema (migrating to Cassandra may not be as simple as migrating these tables' data, though).
So why not have the two mechanisms:
1. batching with Hibernate, which would support a larger number of deployments (Postgres, Oracle, SQL Server)
2. batching with bulk export files for the supported databases (Postgres, Oracle)
I know it means double the code, testing and support, but I really doubt #1 can handle large amounts of data in less than a few hours.
And you're right, we cannot speculate on this and I don't believe we could make a release without actually trying the tool on different workloads.
Thomas
Just the voice of QE on this .... I am hearing lots of discussion on design decisions on this data migration tool ... which is good.
One word of caution (and I hear this in all your emails ... so I know you are thinking about this...this is just a reminder): Avoid premature optimization.
The course of action that makes sense to me:
#1. Baseline the current implementation. Use small and large datasets typical of what a customer would use. Document the baseline. This will be incredibly useful moving forward to compare alternative solutions. You need the baseline.
#2. Determine if any design optimizations need to be made.
#3. Compare alternative solutions against the baseline.
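A baseline harness along the lines of #1 could start out as simple as this (Python sketch; the synthetic row shape and names are made up for illustration, not actual test infrastructure):

```python
import time

def baseline(migrate_fn, dataset_sizes):
    """Time one migration routine over synthetic datasets of increasing size,
    returning {size: seconds}. The row shape is made up for illustration."""
    results = {}
    for n in dataset_sizes:
        data = [(i, i * 60000, float(i)) for i in range(n)]  # (id, ts, value)
        start = time.perf_counter()
        migrate_fn(data)
        results[n] = time.perf_counter() - start
    return results
```

Running the same harness against each alternative implementation, over the same small and large datasets, is what makes the later comparisons meaningful.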
I guess what I am saying ... I think we need a baseline before this conversation can continue in a meaningful way.
How can QE help establish a meaningful baseline? What do we need? Resources in the bladecenter? Determining an SLA or acceptance criteria for performance? Datasets?
----- Original Message -----
From: "Michael Foley" mfoley@redhat.com To: rhq-users@lists.fedorahosted.org Cc: rhq-devel@lists.fedorahosted.org Sent: Thursday, January 10, 2013 11:13:59 AM Subject: Re: Metrics Migration Tool - Cassandra
Just the voice of QE on this .... I am hearing lots of discussion on design decisions on this data migration tool ... which is good.
One word of caution (and I hear this in all your emails ... so I know you are thinking about this...this is just a reminder): Avoid premature optimization.
Totally agree with you, Mike. Premature optimization can be the root of a lot of evil. However, in this case I think the expectation was/is that the amount of data to be migrated is large. The only questions not addressed were "How large? How much data?" Hopefully I shed some light on this with the updates to the design wiki: https://docs.jboss.org/author/display/RHQ/Metrics+Data+Migration+-+Design
The course of action that makes sense to me:
#1. Baseline the current implementation. Use small and large datasets typical of what a customer would use. Document the baseline. This will be incredibly useful moving forward to compare alternative solutions. You need the baseline.
#2. Determine if any design optimizations need to be made.
#3. Compare alternative solutions against the baseline.
I guess what I am saying ... I think we need a baseline before this conversation can continue in a meaningful way.
I am not sure we can ever have exact figures for the amount of data to be migrated. Metrics collection is highly configurable in RHQ. Users can increase/decrease the collection frequency of any metric. Another example is the exact configuration of the resources monitored; a deployment with a single JBoss AS server but with 1000 applications inside will have a lot more metrics than the average JBoss AS install. The number of installed plugins and discovered resources is yet another example.
I hope that the estimates [1] (please see the estimation process) give a stable starting point for creating test scenarios and some sort of timing baseline. However, I caution against treating these estimates as 99.9999% accurate because of the highly configurable nature of metrics collection.
[1] https://docs.jboss.org/author/display/RHQ/Metrics+Data+Migration+-+Design
Thank you, Stefan Negrea
How can QE help establish a meaningful baseline? What do we need? Resources in the bladecenter? Determining an SLA or acceptance criteria for performance? Datasets?
From: "Thomas Segismont" tsegismo@redhat.com To: rhq-devel@lists.fedorahosted.org, rhq-users@lists.fedorahosted.org Sent: Thursday, January 10, 2013 12:02:07 PM Subject: Re: Metrics Migration Tool - Cassandra
Le 09/01/2013 20:32, John Sanda a écrit :
At this point, all we can do is speculate about how long the migration will actually take until we do some load testing. If we find that the migration is taking longer than we would like, another option could be to explore using the bulk import/export utilities provided by each of the databases.
I think working on bulk export files would be far more efficient. And it shouldn't be too difficult given the measurement tables have very simple schema (migrating to Cassandra may not be as simple as migrating these tables data though).
So why not having the two mechanisms:
- batching with Hibernate which would support a larger number of
deployments (Postgres, Oracle, SQLServer) 2. batching with bulk export files for the supported databases (Postgres, Oracle)
I know it's double code, test and support but I really doubt #1 can handle large amounts of data in less than a few hours.
And you're right, we cannot rely on speculation here, and I don't believe we could make a release without actually trying the tool on different workloads.
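A minimal sketch of what the export-file mechanism (#2) might look like, simulated in plain Java: rows are dumped to a flat CSV file first, then read back in fixed-size batches for loading. The (scheduleId, timestamp, value) row shape and all class and method names here are assumptions for illustration, not RHQ code or a real database export.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the bulk-export path: dump measurement rows to a
// flat file, then read the file back in batches for loading into Cassandra.
// The (scheduleId, timestamp, value) schema is an assumption, not RHQ's.
public class ExportFileSketch {

    // Write rows as simple CSV lines: scheduleId,timestamp,value
    static void exportRows(List<double[]> rows, Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double[] r : rows) {
            lines.add((long) r[0] + "," + (long) r[1] + "," + r[2]);
        }
        Files.write(file, lines);
    }

    // Read the export file back in fixed-size batches; each batch would be
    // handed to the Cassandra loader in the real tool. Returns batch count.
    static int importInBatches(Path file, int batchSize) throws IOException {
        List<String> lines = Files.readAllLines(file);
        int batches = 0;
        for (int i = 0; i < lines.size(); i += batchSize) {
            // placeholder for: insert lines [i, i + batchSize) into Cassandra
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("metrics-export", ".csv");
        List<double[]> rows = Arrays.asList(
                new double[]{10001, 1357776000000L, 42.5},
                new double[]{10001, 1357776060000L, 43.1},
                new double[]{10002, 1357776000000L, 0.7});
        exportRows(rows, file);
        System.out.println("batches: " + importInBatches(file, 2));
        Files.delete(file);
    }
}
```

A real export would of course use the database's native facility (e.g. Postgres COPY) rather than hand-rolled CSV; the point of the sketch is only the two-phase read/write shape of the pipeline.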
Thomas

_______________________________________________
rhq-users mailing list
rhq-users@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/rhq-users

rhq-devel mailing list
rhq-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/rhq-devel
Just my 2 cents:
I would like to continue running RHQ with minimal downtime (less than 30 minutes if possible). The main issue with downtime is loss of systems monitoring capability. Historical data is definitely nice to have, but system monitoring trumps this. I would sooner drop the metrics data than have to face an unpredictable migration process.
I would prefer it if the migration could happen concurrently with a running RHQ server.
Hello Everybody,
I updated the design wiki [1] with estimates for the amount of data to be migrated. The estimates show that even for relatively small deployments there is a non-trivial amount of data to be moved from the relational database to Cassandra. For example, on a system with 10 agents (a small deployment) the estimates show about 0.5 GB, or about 16 million rows, of data to be migrated. For a larger deployment with 125 agents, the estimates came to 6 GB, or 197 million rows, of data.
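To put those row counts in perspective, here is a back-of-the-envelope duration calculation. The 10,000 rows/second sustained throughput is purely an assumed figure for illustration, not a measured benchmark of the tool.

```java
// Rough duration estimates for the row counts quoted above, at an assumed
// sustained end-to-end throughput. The throughput figure is hypothetical
// until actual load testing is done.
public class MigrationEstimate {

    static double hoursToMigrate(long rows, long rowsPerSecond) {
        return rows / (double) rowsPerSecond / 3600.0;
    }

    public static void main(String[] args) {
        long assumedThroughput = 10_000; // rows/sec -- an assumption, not a benchmark
        System.out.printf("10 agents  (16M rows):  %.1f hours%n",
                hoursToMigrate(16_000_000L, assumedThroughput));
        System.out.printf("125 agents (197M rows): %.1f hours%n",
                hoursToMigrate(197_000_000L, assumedThroughput));
    }
}
```

Under that assumption the small deployment finishes in well under an hour, but the 125-agent case runs into multiple hours, which is exactly the scenario where restartability matters.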
So far the migration process design is:
1) Read a batch of data
2) Insert the data into Cassandra
3) Delete the data from the relational database
Let's assume the deletion process is optimized and takes a relatively trivial amount of time compared to reading or inserting. That means the amount of data processed by the migration is twice the estimate: for example, 0.5 GB read + 0.5 GB inserted in the case of a small deployment.
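The read/insert/delete loop above can be simulated in a few lines to show why the ordering makes the process restartable. This is an in-memory sketch with illustrative names, not the real tool: the lists stand in for the relational measurement tables and for Cassandra.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// In-memory simulation of the read-batch / insert / delete migration loop.
// Names and structure are illustrative; the real tool would read from the
// RHQ relational tables and write to Cassandra.
public class BatchMigrationSketch {

    final List<String> source;                          // stands in for the relational table
    final List<String> destination = new ArrayList<>(); // stands in for Cassandra

    BatchMigrationSketch(Collection<String> rows) {
        this.source = new ArrayList<>(rows);
    }

    // Deleting only after the insert succeeds is what makes the process
    // restartable: a crash between steps re-migrates at most one batch,
    // and re-inserting the same rows into Cassandra is harmless because
    // writes there are effectively upserts.
    int runToCompletion(int batchSize) {
        int batches = 0;
        while (!source.isEmpty()) {
            int n = Math.min(batchSize, source.size());
            List<String> batch = new ArrayList<>(source.subList(0, n)); // 1) read a batch
            destination.addAll(batch);                                  // 2) insert into Cassandra
            source.subList(0, n).clear();                               // 3) delete from source
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        BatchMigrationSketch m = new BatchMigrationSketch(
                Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        System.out.println("batches: " + m.runToCompletion(2));
        System.out.println("migrated: " + m.destination.size());
    }
}
```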
I am almost done with a random data generator that matches these estimates. It will help with migration benchmarks early in the development process, so we can adjust the design/plan if necessary.
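For readers wondering what such a generator involves, here is a minimal sketch in the same spirit. The (scheduleId, timestamp, value) row shape and the one-minute collection interval are assumptions for illustration; the actual generator matches the real RHQ measurement schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Minimal sketch of a random metric-data generator. A fixed seed keeps
// benchmark runs reproducible across executions.
public class MetricDataGenerator {

    // Generate 'samples' readings per schedule at a fixed one-minute interval.
    static List<double[]> generate(int scheduleCount, int samples,
                                   long startMillis, long seed) {
        Random rnd = new Random(seed);
        List<double[]> rows = new ArrayList<>(scheduleCount * samples);
        for (int s = 1; s <= scheduleCount; s++) {
            for (int i = 0; i < samples; i++) {
                long timestamp = startMillis + i * 60_000L;        // one-minute interval
                double value = 100.0 * rnd.nextDouble();           // arbitrary metric value
                rows.add(new double[]{s, timestamp, value});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<double[]> rows = generate(250, 1_000, 1_357_776_000_000L, 42L);
        System.out.println("generated rows: " + rows.size());
    }
}
```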
How do these estimates look? Do these numbers change the perspective on the complexity of the task?
[1] https://docs.jboss.org/author/display/RHQ/Metrics+Data+Migration+-+Design
----- Original Message -----
From: "Thomas Segismont" tsegismo@redhat.com
To: rhq-devel@lists.fedorahosted.org, rhq-users@lists.fedorahosted.org
Sent: Thursday, January 10, 2013 11:02:07 AM
Subject: Re: Metrics Migration Tool - Cassandra
On 09/01/2013 20:32, John Sanda wrote:
At this point, all we can do is speculate about how long the migration will actually take until we do some load testing. If we find that the migration is taking longer than we would like, another option could be to explore using the bulk import/export utilities provided by each of the databases.
I think working on bulk export files would be far more efficient. And it shouldn't be too difficult, given that the measurement tables have a very simple schema (the Cassandra side of the migration may not be as simple as exporting these tables' data, though).
So why not have both mechanisms:
1. batching with Hibernate, which would support a larger number of deployments (Postgres, Oracle, SQL Server)
The current plan is to primarily support batched operations for data migration. Things will get a bit speedier on the Cassandra side because of asynchronous inserts.
2. batching with bulk export files for the supported databases (Postgres, Oracle)
I know it means duplicate code, testing, and support, but I really doubt #1 can handle large amounts of data in less than a few hours.
I am not sure that is feasible. With export files the data will be processed 4 times: read from the relational database, write to the export file, read from the export file, write to Cassandra. For a deployment with about 2 GB of data (which I think will be closer to the average deployment size) that will become 8 GB of data processed.
And you're right, we cannot rely on speculation here, and I don't believe we could make a release without actually trying the tool on different workloads.
I was hesitant to reply without having numbers. Hopefully with these estimates and a random data generator we will get a better picture for the length and complexity of the process.
Do these estimates change your mind regarding the export files approach?
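As an aside on the async-insert point above, this toy example shows why asynchronous writes help throughput: several batches are in flight concurrently instead of each write waiting for its acknowledgment in turn. The thread pool stands in for the Cassandra driver's async API; nothing here is real driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy illustration of pipelined asynchronous inserts. The executor stands in
// for an async Cassandra client; each submitted task is a stand-in for one
// batched write, and we only wait for acknowledgments at the end.
public class AsyncInsertSketch {

    static int insertAll(List<List<String>> batches, int concurrency) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        List<Future<Integer>> pending = new ArrayList<>();
        for (List<String> batch : batches) {
            pending.add(pool.submit(() -> batch.size())); // stand-in for an async write
        }
        int written = 0;
        for (Future<Integer> f : pending) {
            written += f.get(); // collect acknowledgments after all writes are in flight
        }
        pool.shutdown();
        return written;
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("r1", "r2"), Arrays.asList("r3"), Arrays.asList("r4", "r5"));
        System.out.println("rows written: " + insertAll(batches, 4));
    }
}
```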
Thomas