Fedora Devs,
I just spent the last couple of days fighting with Essex on RHEL6. It's been entertaining, and I'd like to share some of the oddities and experiences.
The system configuration is as follows.
Two nodes on their own /24, connected to each other by a crossover cable on the second interface. The first node is the cloud controller and has plenty of storage (11 TB), 32 GB of RAM, and 16 cores. The second node, which I would like to make an extra compute node, has 24 GB of RAM and 8 cores (still a work in progress).
Originally the cloud controller was running Diablo on RHEL6 and was working fine.
I couldn't find any 'upgrade' instructions for going from Diablo to Essex, and I wasn't too worried because the cloud was only used by a couple of people, so I was satisfied with manually backing up all the data and rebuilding the cluster. I did notice that things stopped working when I did the update, and that following the install instructions blew away all local data in the cloud.
I was following the instructions found at the following URL.
http://fedoraproject.org/wiki/Getting_started_with_OpenStack_EPEL
I got the packages from
http://pbrady.fedorapeople.org/openstack-el6/
First issue: wow, this is long. It's almost long enough that an uber script shipped in a common package could take care of most of the manual commands. I'd suggest first pulling out all the openstack-config-set commands and putting them in a script to run. I'm not sure what to do about the swift documentation bits; that seems like a very manual set of configuration steps, so why aren't they part of the swift RPM? Another suggestion would be to split it into a couple of documents, one describing installation and configuration and the next describing putting data/users into it and starting things up. Thoughts?
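For illustration, here is a minimal sketch of what such a wrapper could look like. It assumes the openstack-config-set helper takes <config file> <section> <key> <value>, and the options shown are just examples in the style of the wiki page, not a complete list:

#!/bin/sh
# Sketch only: gather the wiki's openstack-config-set calls into one script.
set -e
CONF=/etc/nova/nova.conf
openstack-config-set $CONF DEFAULT auth_strategy keystone
openstack-config-set $CONF DEFAULT network_manager nova.network.manager.FlatDHCPManager
openstack-config-set $CONF DEFAULT sql_connection mysql://nova:nova@localhost/nova
# ...and so on for the rest of the settings from the wiki page.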
After I got everything set up and working I noticed an issue with the dashboard: most of the static content wasn't showing up. I had to add a symlink, /usr/share/openstack-dashboard/static -> openstack_dashboard/static, and then the dashboard picked up the right files and worked.
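For anyone hitting the same thing, the workaround amounts to roughly the following (verify the paths on your own install):

cd /usr/share/openstack-dashboard
ln -s openstack_dashboard/static static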
There are some consistency issues, and I'm not sure whether this is an OpenStack issue in general. The euca tools, configured as documented with keystone, only seem to work with your personal instances and configuration, whereas the dashboard shows users everything associated with the project. For example, floating IPs I allocate from the website won't show up when I run euca-describe-addresses, and conversely an IP allocated with euca-allocate-address won't show up in the dashboard. I've looked at the database: project IDs are used when going through the dashboard and user IDs are used when going through the euca tools. I think the euca tools could be set up to see everything the dashboard sees, but the documentation doesn't explain how to do that.
There also seem to be some serious functionality gaps that I can't work around. I can't attach a user to multiple projects, and I'm not sure how to do that. There's also a lot of "huh, that doesn't seem implemented yet." This looks like a general OpenStack issue, though: the documentation says X, but X doesn't work yet or doesn't work anymore.
I'm also having a serious issue getting the second compute node working: `nova-manage service list' doesn't show ':-)' for the compute and network services running on that node. I've followed the instructions to the letter and tried to get things working, but it's not going.
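For reference, a healthy listing looks roughly like this; the hostnames and timestamps here are made up, and a service that has stopped checking in shows 'XXX' instead of ':-)':

# nova-manage service list
Binary           Host        Zone  Status   State  Updated_At
nova-scheduler   controller  nova  enabled  :-)    2012-04-26 07:10:02
nova-network     controller  nova  enabled  :-)    2012-04-26 07:10:03
nova-compute     controller  nova  enabled  :-)    2012-04-26 07:10:05
nova-compute     node2       nova  enabled  XXX    2012-04-25 18:02:11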
nova.conf for the controller.
[DEFAULT]
logdir = /var/log/nova
state_path = /var/lib/nova
lock_path = /var/lib/nova/tmp
dhcpbridge = /usr/bin/nova-dhcpbridge
dhcpbridge_flagfile = /etc/nova/nova.conf
force_dhcp_release = False
injected_network_template = /usr/share/nova/interfaces.template
libvirt_xml_template = /usr/share/nova/libvirt.xml.template
libvirt_nonblocking = True
vpn_client_template = /usr/share/nova/client.ovpn.template
credentials_template = /usr/share/nova/novarc.template
network_manager = nova.network.manager.FlatDHCPManager
iscsi_helper = tgtadm
sql_connection = mysql://nova:nova@localhost/nova
connection_type = libvirt
firewall_driver = nova.virt.libvirt.firewall.IptablesFirewallDriver
rpc_backend = nova.rpc.impl_qpid
root_helper = sudo nova-rootwrap
auth_strategy = keystone
public_interface = eth0
quota_floating_ips = 100
nova.conf on compute node
[DEFAULT]
logdir = /var/log/nova
state_path = /var/lib/nova
lock_path = /var/lib/nova/tmp
dhcpbridge = /usr/bin/nova-dhcpbridge
dhcpbridge_flagfile = /etc/nova/nova.conf
force_dhcp_release = True
injected_network_template = /usr/share/nova/interfaces.template
libvirt_xml_template = /usr/share/nova/libvirt.xml.template
libvirt_nonblocking = True
vpn_client_template = /usr/share/nova/client.ovpn.template
credentials_template = /usr/share/nova/novarc.template
network_manager = nova.network.manager.FlatDHCPManager
iscsi_helper = tgtadm
sql_connection = mysql://nova:nova@CC_NAME/nova
connection_type = libvirt
firewall_driver = nova.virt.libvirt.firewall.IptablesFirewallDriver
rpc_backend = nova.rpc.impl_qpid
root_helper = sudo nova-rootwrap
rabbit_host = CC_NAME
glance_api_servers = CC_NAME:9292
iscsi_ip_prefix = CC_ADDR
public_interface = eth2
verbose = True
s3_host = CC_NAME
ec2_api = CC_NAME
ec2_url = http://CC_NAME:8773/services/Cloud
fixed_range = 10.0.0.0/24
network_size = 256
Any help would be appreciated.
Thanks, - David Brown
On 04/25/2012 11:04 PM, Brown, David M JR wrote:
Fedora Devs,
I just spent the last couple of days fighting with Essex on RHEL6. It's been entertaining, and I'd like to share some of the oddities and experiences.
Thanks a lot for taking the time to write this up, and test these packages.
The system configuration is as follows.
Two nodes on their own /24, connected to each other by a crossover cable on the second interface. The first node is the cloud controller and has plenty of storage (11 TB), 32 GB of RAM, and 16 cores. The second node, which I would like to make an extra compute node, has 24 GB of RAM and 8 cores (still a work in progress).
Originally the cloud controller was running Diablo on RHEL6 and was working fine.
I couldn't find any 'upgrade' instructions for going from Diablo to Essex, and I wasn't too worried because the cloud was only used by a couple of people, so I was satisfied with manually backing up all the data and rebuilding the cluster. I did notice that things stopped working when I did the update, and that following the install instructions blew away all local data in the cloud.
I was following the instructions found at the following URL.
http://fedoraproject.org/wiki/Getting_started_with_OpenStack_EPEL
I got the packages from
http://pbrady.fedorapeople.org/openstack-el6/
First issue: wow, this is long. It's almost long enough that an uber script shipped in a common package could take care of most of the manual commands. I'd suggest first pulling out all the openstack-config-set commands and putting them in a script to run.
Good idea :) We're working on something very much like that: https://github.com/fedora-openstack/openstack-utils/blob/master/utils/openst... That file should already be on your system, but it's at a very early stage and still being finalized.
I'm not sure what to do about the swift documentation bits; that seems like a very manual set of configuration steps, so why aren't they part of the swift RPM? Another suggestion would be to split it into a couple of documents, one describing installation and configuration and the next describing putting data/users into it and starting things up. Thoughts?
After I got everything set up and working I noticed an issue with the dashboard: most of the static content wasn't showing up. I had to add a symlink, /usr/share/openstack-dashboard/static -> openstack_dashboard/static, and then the dashboard picked up the right files and worked.
Were you using Django 1.3 from epel-testing? I think that particular issue was solved with it. I presume you've seen http://pbrady.fedorapeople.org/openstack-el6/README, which contains instructions that should install Django 1.3, though you may have been unlucky and had a slow mirror serve you Django 1.2?
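If it helps, pulling it from epel-testing should be something like the following; the exact package name is an assumption on my part, so check the README:

yum --enablerepo=epel-testing install Django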
There are some consistency issues, and I'm not sure whether this is an OpenStack issue in general. The euca tools, configured as documented with keystone, only seem to work with your personal instances and configuration, whereas the dashboard shows users everything associated with the project. For example, floating IPs I allocate from the website won't show up when I run euca-describe-addresses, and conversely an IP allocated with euca-allocate-address won't show up in the dashboard. I've looked at the database: project IDs are used when going through the dashboard and user IDs are used when going through the euca tools. I think the euca tools could be set up to see everything the dashboard sees, but the documentation doesn't explain how to do that.
Noted. BTW we've been looking at migrating instructions away from euca tools to the equivalent openstack tools (nova-manage etc.)
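As a rough illustration of the kind of mapping involved (the command names are from the Essex-era clients as I remember them, so treat them as assumptions):

euca-describe-addresses   ->  nova floating-ip-list
euca-allocate-address     ->  nova floating-ip-create
euca-describe-instances   ->  nova list
euca-run-instances        ->  nova boot --image <image> --flavor <flavor> <name>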
There also seem to be some serious functionality gaps that I can't work around. I can't attach a user to multiple projects, and I'm not sure how to do that. There's also a lot of "huh, that doesn't seem implemented yet." This looks like a general OpenStack issue, though: the documentation says X, but X doesn't work yet or doesn't work anymore.
I'm also having a serious issue getting the second compute node working: `nova-manage service list' doesn't show ':-)' for the compute and network services running on that node. I've followed the instructions to the letter and tried to get things working, but it's not going.
For debugging issues I find looking in /var/log/nova/*.log informative, especially with verbose = True in nova.conf
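For example, something along these lines usually surfaces the interesting bits quickly:

# after setting verbose = True and restarting the services
grep -iE 'error|trace' /var/log/nova/*.log | tail -n 50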
cheers, Pádraig.
Hi David, see below
It looks to me like you're missing qpid_hostname on the compute node. Because you're using qpid as the rpc backend, I think rabbit_host is ignored; try qpid_hostname = <ipaddr>.
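Something like the following on the compute node should do it, assuming the openstack-config-set helper from the wiki (syntax assumed to be <file> <section> <key> <value>); replace CC_ADDR with the controller's address:

openstack-config-set /etc/nova/nova.conf DEFAULT qpid_hostname CC_ADDR
service openstack-nova-compute restart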
That config param on the wiki needs to be updated; I'll sort that out now. I also noticed it says to start the network service on the compute node, which I don't think is required (in fact it might cause problems). I'll run through this section of the wiki and see if anything else needs updating. If you notice anything else yourself, feel free to update the document or post here.
Hope this helps, Thanks, Derek.
It looks to me like you're missing qpid_hostname on the compute node. Because you're using qpid as the rpc backend, I think rabbit_host is ignored; try qpid_hostname = <ipaddr>.
That config param on the wiki needs to be updated; I'll sort that out now. I also noticed it says to start the network service on the compute node, which I don't think is required (in fact it might cause problems). I'll run through this section of the wiki and see if anything else needs updating. If you notice anything else yourself, feel free to update the document or post here.
Thanks Derek, that seemed to work. However, aren't all AMQP derivatives going to have the same endpoint configuration? Why isn't it amqp_host and amqp_port? Probably something for the OpenStack guys.
So now the compute node and the controller are communicating, but I'm getting a backtrace when trying to launch an instance over there.
2012-04-26 07:11:04 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/rpc/amqp.py", line 252, in _process_data
2012-04-26 07:11:04 TRACE nova.rpc.amqp     rval = node_func(context=ctxt, **node_args)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/exception.py", line 114, in wrapped
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return f(*args, **kw)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 177, in decorated_function
2012-04-26 07:11:04 TRACE nova.rpc.amqp     sys.exc_info())
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-04-26 07:11:04 TRACE nova.rpc.amqp     self.gen.next()
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 171, in decorated_function
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return function(self, context, instance_uuid, *args, **kwargs)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 651, in run_instance
2012-04-26 07:11:04 TRACE nova.rpc.amqp     do_run_instance()
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/utils.py", line 946, in inner
2012-04-26 07:11:04 TRACE nova.rpc.amqp     retval = f(*args, **kwargs)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 650, in do_run_instance
2012-04-26 07:11:04 TRACE nova.rpc.amqp     self._run_instance(context, instance_uuid, **kwargs)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 451, in _run_instance
2012-04-26 07:11:04 TRACE nova.rpc.amqp     self._set_instance_error_state(context, instance_uuid)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib64/python2.6/contextlib.py", line 23, in __exit__
2012-04-26 07:11:04 TRACE nova.rpc.amqp     self.gen.next()
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 420, in _run_instance
2012-04-26 07:11:04 TRACE nova.rpc.amqp     image_meta = self._check_image_size(context, instance)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 498, in _check_image_size
2012-04-26 07:11:04 TRACE nova.rpc.amqp     image_meta = _get_image_meta(context, instance['image_ref'])
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/compute/manager.py", line 184, in _get_image_meta
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return image_service.show(context, image_id)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 241, in show
2012-04-26 07:11:04 TRACE nova.rpc.amqp     _reraise_translated_image_exception(image_id)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 239, in show
2012-04-26 07:11:04 TRACE nova.rpc.amqp     image_id)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/nova/image/glance.py", line 145, in _call_retry
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return getattr(client, name)(*args, **kwargs)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/glance/client.py", line 101, in get_image_meta
2012-04-26 07:11:04 TRACE nova.rpc.amqp     res = self.do_request("HEAD", "/images/%s" % image_id)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/glance/common/client.py", line 61, in wrapped
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return func(self, *args, **kwargs)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/glance/common/client.py", line 420, in do_request
2012-04-26 07:11:04 TRACE nova.rpc.amqp     headers=headers)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/glance/common/client.py", line 75, in wrapped
2012-04-26 07:11:04 TRACE nova.rpc.amqp     return func(self, method, url, body, headers)
2012-04-26 07:11:04 TRACE nova.rpc.amqp   File "/usr/lib/python2.6/site-packages/glance/common/client.py", line 534, in _do_request
2012-04-26 07:11:04 TRACE nova.rpc.amqp     raise exception.NotAuthenticated(res.read())
2012-04-26 07:11:04 TRACE nova.rpc.amqp ImageNotAuthorized: Not authorized for image 74ac10a0-5522-44f7-b677-b2b5060a4b18.
2012-04-26 07:11:04 TRACE nova.rpc.amqp
The last line is the important bit. Is there a way to run a glance client from the command line against the remote controller? Are there credentials for every node in the cluster, and did they get corrupted somehow?
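I believe the Essex-era glance client can be pointed at a remote host, roughly like this; the flags are from memory, so double-check them against glance --help:

# hypothetical check from the compute node against the controller
glance -H CC_NAME -p 9292 -I admin -K <password> -T admin -N http://CC_NAME:5000/v2.0/ index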
Thoughts?
Thanks, - David Brown
On 04/26/2012 10:21 AM, Brown, David M JR wrote:
It looks to me like you're missing qpid_hostname on the compute node. Because you're using qpid as the rpc backend, I think rabbit_host is ignored; try qpid_hostname = <ipaddr>.
That config param on the wiki needs to be updated; I'll sort that out now. I also noticed it says to start the network service on the compute node, which I don't think is required (in fact it might cause problems). I'll run through this section of the wiki and see if anything else needs updating. If you notice anything else yourself, feel free to update the document or post here.
Thanks Derek, that seemed to work. However, aren't all AMQP derivatives going to have the same endpoint configuration? Why isn't it amqp_host and amqp_port? Probably something for the OpenStack guys.
Nova has two different AMQP drivers:
rpc_backend=nova.rpc.impl_kombu (for use with RabbitMQ)
or
rpc_backend=nova.rpc.impl_qpid (for use with Qpid)
impl_kombu came before impl_qpid and has options with a "rabbit" prefix. The drivers have some options that overlap and others that don't. We just opted for having each one maintain its own set of options.
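So in nova.conf terms the two backends carry their own option prefixes, roughly like this (an illustrative sketch, not an exhaustive option list):

# RabbitMQ via kombu
rpc_backend = nova.rpc.impl_kombu
rabbit_host = CC_NAME
rabbit_port = 5672

# Qpid
rpc_backend = nova.rpc.impl_qpid
qpid_hostname = CC_NAME
qpid_port = 5672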
Aha!!!
Figured it out.
The compute node also needs auth_strategy = keystone to work properly and pull the compute image across the wire. That might be something to add to the EPEL page's section on adding a compute node.
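In other words, roughly this on the compute node, using the same assumed helper syntax as above:

openstack-config-set /etc/nova/nova.conf DEFAULT auth_strategy keystone
service openstack-nova-compute restart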
Thanks, - David Brown
Hi David, hi all:
Looking at the openstack-nova package, I suggest building separate sub-packages for api, network, compute, cert, objectstorage, scheduler, docs, python-nova, and any others needed. I'm thinking of running these components on separate machines, as many other users would in an enterprise environment. As it stands, the single package is really geared toward development environments. What do you think about it? I can prepare the sub-packages if you accept this change.
Cheers,
The packages started out with a pile of separate subpackages, each tiny. There was little benefit seen in maintaining such separation, so the subpackages were removed. If there's a compelling reason to split them again, it can be reconsidered.
-- Matt Domsch, Technology Strategist, Dell | Office of the CTO
In my opinion, the reason to use sub-packages is to isolate the nova services in a deployment where they run on different servers. For example, on a nova node (compute + network) it would be unnecessary to have nova-api running, or even installed. By separating them we could have a more secure environment for large deployments such as data centers. The packaging could include meta-packages for a full nova install and a node install, for example.
On 04/28/2012 05:27 PM, Marco Sinhoreli wrote:
In my opinion, the reason to use sub-packages is to isolate the nova services in a deployment where they run on different servers. For example, on a nova node (compute + network) it would be unnecessary to have nova-api running, or even installed. By separating them we could have a more secure environment for large deployments such as data centers. The packaging could include meta-packages for a full nova install and a node install, for example.
Having a single package doesn't imply a single server. You can install the package on each server but only enable each service where required.
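For example, on a dedicated compute node something like the following would be enough; the service names are what I'd expect from the current packaging, so verify with chkconfig --list | grep nova:

chkconfig openstack-nova-compute on
chkconfig openstack-nova-api off
chkconfig openstack-nova-scheduler off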
The main advantages of sub-packages are separating logic and minimizing dependencies, and I don't think either is significant enough to justify splitting the nova package.
cheers, Pádraig.
OK, I accept that. Then, when the package is installed, we need to remove the "chkconfig --add" command in %post and let the installer decide about it. It would also be prudent to change the group of the files owned by the nova user to something other than the root group, such as the nobody group.
Regards!
On Sat, Apr 28, 2012 at 6:51 PM, Marco Sinhoreli msinhore@gmail.com wrote:
remove the "chkconfig -add" command in %post
This is fine; it does not enable the service to start by default, and it is required by the packaging guidelines: http://fedoraproject.org/wiki/Packaging:SysVInitScript#Initscripts_in_spec_f...
and let the installer decide about it
Installer will "chkconfig on" services as required by the selected node function (compute, image etc.)
Alan
There are some problems with the init.d start sequence of the OpenStack services.
They all start at priority 20, which is before mysql and qpid.
#!/bin/sh
#
# openstack-nova-api  OpenStack Nova API Server
#
# chkconfig: - 20 80
# description: At the heart of the cloud framework is an API Server. \
#              This API Server makes command and control of the \
#              hypervisor, storage, and networking programmatically \
#              available to users in realization of the definition \
#              of cloud computing.
On Tue, May 01, 2012 at 04:37:36PM +0800, 彭勇 wrote:
There are some problems with the init.d start sequence of the OpenStack services.
They all start at priority 20, which is before mysql and qpid.
And also before libvirtd, which is 97
#!/bin/sh
#
# openstack-nova-api  OpenStack Nova API Server
#
# chkconfig: - 20 80
# description: At the heart of the cloud framework is an API Server. \
#              This API Server makes command and control of the \
#              hypervisor, storage, and networking programmatically \
#              available to users in realization of the definition \
#              of cloud computing.
I don't think there's a need for any of these to be started early, since nothing depends on them. I'd suggest s/20 80/98 02/
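i.e. the header line in each nova init script would become:

# chkconfig: - 98 02

and applying it across the scripts is roughly (a sketch only; adjust paths and service names as needed):

sed -i 's/^# chkconfig: - 20 80/# chkconfig: - 98 02/' /etc/init.d/openstack-nova-*
# re-register one service as an example so the new priorities take effect
chkconfig --del openstack-nova-api && chkconfig --add openstack-nova-api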
Daniel
On 05/01/2012 09:39 AM, Daniel P. Berrange wrote:
On Tue, May 01, 2012 at 04:37:36PM +0800, 彭勇 wrote:
There are some problems with the init.d start sequence of the OpenStack services.
They all start at priority 20, which is before mysql and qpid.
And also before libvirtd, which is 97
#!/bin/sh
#
# openstack-nova-api  OpenStack Nova API Server
#
# chkconfig: - 20 80
# description: At the heart of the cloud framework is an API Server. \
#              This API Server makes command and control of the \
#              hypervisor, storage, and networking programmatically \
#              available to users in realization of the definition \
#              of cloud computing.
I don't think there's a need for any of these to be started early, since nothing depends on them. I'd suggest s/20 80/98 02/
Note the services should periodically reconnect to qpid etc., but yes the order is incorrect here. I'll change as suggested.
thanks, Pádraig.