We are using RHQ on EC2, and a question came up: when the EC2 instances change IP address (but not FQDN; we are using Elastic IPs), is there any chance of RHQ losing the connection to the server? In other words, does RHQ store the IP for a server with an agent anywhere, does it just use the FQDN, or does it get it from the agent?
Thanks, John
John Sanda did a lot of work to get RHQ to run on EC2 and has solutions to these issues. He should reply with links/info to any details he can provide.
On 05/05/2011 11:56 AM, John Hollland wrote:
we are doing some use of RHQ on EC2 - the question came up, when the EC2 instances change IP address (but not FQDN - we are using Elastic IPs) is there any chance of RHQ losing the connection to the server? In other words does RHQ store the IP for a server with an agent anywhere, or does it just use the FQDN- or does it get it from the agent?
Thanks, John
_______________________________________________
rhq-users mailing list
rhq-users@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/rhq-users
There are two issues you have to consider. The first has to do with agents and server(s) maintaining connectivity across machine restarts. When an agent registers with the server, the agent sends its IP address to the server so the server knows how to contact the agent. In return, the agent receives a failover list from the server that contains the server host names/IP addresses. When an agent machine is restarted in EC2, the server no longer has a valid address for the agent (unless you assign the agent machine an elastic IP address); consequently, the server will not be able to contact the agent. Likewise, if the server machine is restarted, the agent will not be able to contact the server. Since Amazon limits the number of elastic IPs that you can have, I would first assign your RHQ server an elastic IP. That way, if/when an agent is restarted, it has a known address for the server. And when the agent reconnects to the server, it will send its new hostname/address. To make sure your agent uses the server's elastic IP address, you can do the following:
export RHQ_AGENT_CMDLINE_OPTS="-Drhq.agent.server.bind-address=<ELASTIC_IP_ADDR>"
The second issue has to do with resource keys. Resource keys are supposed to be static; they should not change. By default, the resource key for platforms and agents is based on host names and/or IP addresses. You can override that default as follows:
export RHQ_AGENT_CMDLINE_OPTS="-Drhq.agent.name=<name_to_use>"
Using elastic IPs for this is not a good idea, though, because they too can change. Granted, an elastic IP does not change across machine restarts, but you can unassign an elastic IP from an instance at any time. I suggest using instance IDs for the resource keys, since those do not change across restarts and should be constant for the lifetime of a machine.
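To illustrate the two settings together: each export shown above replaces the whole variable, so running both in sequence would leave only the second option set. The following is a minimal sketch, not a tested RHQ recipe. It assumes the agent accepts several -D options in RHQ_AGENT_CMDLINE_OPTS separated by spaces, and that the instance can reach the EC2 instance metadata service at its usual link-local address. The function names build_agent_opts and fetch_instance_id are made up for this example.

```shell
# build_agent_opts INSTANCE_ID SERVER_ADDR
# Prints a single option string carrying both settings, so that a
# second export does not overwrite the first. (Hypothetical helper.)
build_agent_opts() {
    printf '%s %s\n' \
        "-Drhq.agent.server.bind-address=$2" \
        "-Drhq.agent.name=$1"
}

# fetch_instance_id
# Asks the EC2 instance metadata service for this machine's instance ID.
# Only works when run on an EC2 instance. (Hypothetical helper.)
fetch_instance_id() {
    curl -s --max-time 5 http://169.254.169.254/latest/meta-data/instance-id
}

# Usage on an EC2 instance (uncomment and fill in the server's elastic IP):
# RHQ_AGENT_CMDLINE_OPTS=$(build_agent_opts "$(fetch_instance_id)" "<ELASTIC_IP_ADDR>")
# export RHQ_AGENT_CMDLINE_OPTS
```

Instances configured to require session tokens for metadata requests would need an extra token step before the curl call, which this sketch leaves out.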
We have some internal documentation around all of this that I need to get posted on the RHQ wiki. I will send out an email when those docs get posted. There is also a server plugin I wrote a while back to address the first issue discussed. It is in master, located at rhq/modules/enterprise/server/plugins/cloud. I will get some documentation written up on that as well. Basically, the plugin executes as a scheduled job that monitors for RHQ server address changes. When it detects that a server has a new address, it pushes an updated failover list down to the agents, so that agents always have a valid list of servers to contact.
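The "detect a change, then act" idea behind that scheduled job can be sketched in a few lines of shell. This is only an illustration of the pattern and is not the plugin's code (the plugin runs inside the RHQ server). The function name check_address_change and the state file are assumptions made up for this example, and pushing the failover list to agents is left out.

```shell
# check_address_change STATE_FILE CURRENT_ADDR
# Compares CURRENT_ADDR with the address recorded in STATE_FILE.
# Prints "changed" and records the new address when they differ
# (including the first run, when no address is recorded yet);
# otherwise prints "unchanged". (Hypothetical helper.)
check_address_change() {
    state_file=$1
    current_addr=$2
    last_addr=""
    if [ -f "$state_file" ]; then
        last_addr=$(cat "$state_file")
    fi
    if [ "$current_addr" != "$last_addr" ]; then
        printf '%s\n' "$current_addr" > "$state_file"
        echo "changed"
    else
        echo "unchanged"
    fi
}
```

A periodic job could call this with the server's current address and trigger whatever update step applies whenever it prints "changed".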
The work we did does not use anything EC2-specific because we wanted to come up with a solution that could be reused across different cloud providers. Since I did this work, Amazon has rolled out a new DNS service. I have not looked into it, but it could very well make things a lot easier too. I'd be interested to hear about any experiences people have had using it.
- John
On 5/5/11 12:55 PM, John Mazzitelli wrote:
John Sanda did a lot of work to get RHQ to run on EC2 and has solutions to these issues. He should reply with links/info to any details he can provide.
I should have pointed out that the rhq.agent.server.bind-address and rhq.agent.name properties are defined in agent-configuration.xml. You could override the defaults directly in that file, but I find it easier to set the RHQ_AGENT_CMDLINE_OPTS environment variable.
On 5/5/11 1:33 PM, John Sanda wrote:
[...]