Hi,
In my simple experiment, I connected to a SHAREDFS storage server and then created a data domain. However, createStorageDomain failed with code 351, which only says "Error creating a storage domain".
How can I find out the real reason behind the failure?
Surprisingly, the domain directory structure does get created, so it looks like it worked, yet the call still returns a failure. Why?
Sample code...
#!/usr/bin/python
# GPLv2+

import sys
import uuid
import time

sys.path.append('/usr/share/vdsm')

import vdscli
from storage.sd import SHAREDFS_DOMAIN, DATA_DOMAIN, ISO_DOMAIN
from storage.volume import COW_FORMAT, SPARSE_VOL, LEAF_VOL, BLANK_UUID

spUUID = str(uuid.uuid4())
sdUUID = str(uuid.uuid4())
imgUUID = str(uuid.uuid4())
volUUID = str(uuid.uuid4())

print "spUUID = %s" % spUUID
print "sdUUID = %s" % sdUUID
print "imgUUID = %s" % imgUUID
print "volUUID = %s" % volUUID

gluster_conn = "llm65.in.ibm.com:myvol"

s = vdscli.connect()

masterVersion = 1
hostID = 1

def vdsOK(d):
    print d
    if d['status']['code']:
        raise Exception(str(d))
    return d

def waitTask(s, taskid):
    while vdsOK(s.getTaskStatus(taskid))['taskStatus']['taskState'] != 'finished':
        time.sleep(3)
    vdsOK(s.clearTask(taskid))

vdsOK(s.connectStorageServer(SHAREDFS_DOMAIN, "my gluster mount",
                             [dict(id=1, spec=gluster_conn,
                                   vfs_type="glusterfs", mnt_options="")]))

vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain",
                            gluster_conn, DATA_DOMAIN, 0))
Output...
./dpk-sharedfs-vm.py
spUUID = 852110d5-c3d2-456e-ae75-b72e929e9bae
sdUUID = 1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
imgUUID = c29100e7-19cd-4a27-adc6-4c35cc5e690c
volUUID = 1d074f24-8bf0-4b68-8a35-40c3f2c33723
{'status': {'message': 'OK', 'code': 0}, 'statuslist': [{'status': 0, 'id': 1}]}
{'status': {'message': "Error creating a storage domain: ('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my gluster domain, domClass=1, typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code': 351}}
Traceback (most recent call last):
  File "./dpk-sharedfs-vm.py", line 74, in <module>
    vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain", gluster_conn, DATA_DOMAIN, 0))
  File "./dpk-sharedfs-vm.py", line 62, in vdsOK
    raise Exception(str(d))
Exception: {'status': {'message': "Error creating a storage domain: ('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my gluster domain, domClass=1, typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code': 351}}
But it did create the dir structure...
]# find /rhev/data-center/mnt/llm65.in.ibm.com:myvol/
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/leases
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/outbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/inbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/ids
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/images
# mount | grep gluster
llm65.in.ibm.com:myvol on /rhev/data-center/mnt/llm65.in.ibm.com:myvol type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
On 03/02/2012 11:27 PM, Deepak C Shetty wrote:
Attaching the vdsm.log....
Thread-46::INFO::2012-03-03 04:49:16,092::nfsSD::64::Storage.StorageDomain::(create) sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe domainName=my gluster domain remotePath=llm65.in.ibm.com:myvol domClass=1
Thread-46::DEBUG::2012-03-03 04:49:16,111::persistentDict::175::Storage.PersistentDict::(__init__) Created a persistant dict with FileMetadataRW backend
Thread-46::DEBUG::2012-03-03 04:49:16,113::persistentDict::216::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=[]
Thread-46::WARNING::2012-03-03 04:49:16,113::persistentDict::238::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
Thread-46::DEBUG::2012-03-03 04:49:16,113::persistentDict::152::Storage.PersistentDict::(transaction) Starting transaction
Thread-46::DEBUG::2012-03-03 04:49:16,114::persistentDict::158::Storage.PersistentDict::(transaction) Flushing changes
Thread-46::DEBUG::2012-03-03 04:49:16,114::persistentDict::277::Storage.PersistentDict::(flush) about to write lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=my gluster domain', 'IOOPTIMEOUTSEC=1', 'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=llm65.in.ibm.com:myvol', 'ROLE=Regular', 'SDUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'TYPE=SHAREDFS', 'VERSION=0', '_SHA_CKSUM=c8ba67889d4b62ccd9fd368c584501404e8ee84e']
Thread-46::DEBUG::2012-03-03 04:49:16,118::persistentDict::160::Storage.PersistentDict::(transaction) Finished transaction
Thread-46::DEBUG::2012-03-03 04:49:16,120::fileSD::98::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
Thread-46::DEBUG::2012-03-03 04:49:16,120::persistentDict::175::Storage.PersistentDict::(__init__) Created a persistant dict with FileMetadataRW backend
Thread-46::ERROR::2012-03-03 04:49:16,121::task::855::TaskManager.Task::(_setError) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1922, in createStorageDomain
    typeSpecificArg, storageType, domVersion)
  File "/usr/share/vdsm/storage/nfsSD.py", line 87, in create
    fsd = cls(os.path.join(mntPoint, sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 104, in __init__
    sdUUID = metadata[sd.DMDK_SDUUID]
  File "/usr/share/vdsm/storage/persistentDict.py", line 75, in __getitem__
    return dec(self._dict[key])
  File "/usr/share/vdsm/storage/persistentDict.py", line 183, in __getitem__
    with self._accessWrapper():
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/share/vdsm/storage/persistentDict.py", line 137, in _accessWrapper
    self.refresh()
  File "/usr/share/vdsm/storage/persistentDict.py", line 214, in refresh
    lines = self._metaRW.readlines()
  File "/usr/share/vdsm/storage/fileSD.py", line 71, in readlines
    return misc.stripNewLines(self._oop.directReadLines(self._metafile))
  File "/usr/share/vdsm/storage/processPool.py", line 53, in wrapper
    return self.runExternally(func, *args, **kwds)
  File "/usr/share/vdsm/storage/processPool.py", line 64, in runExternally
    return self._procPool.runExternally(*args, **kwargs)
  File "/usr/share/vdsm/storage/processPool.py", line 154, in runExternally
    raise err
OSError: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03 04:49:16,129::task::874::TaskManager.Task::(_run) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._run: 9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0 (6, '1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'my gluster domain', 'llm65.in.ibm.com:myvol', 1, 0) {} failed - stopping task
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::1201::TaskManager.Task::(stop) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::stopping in state preparing (force False)
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::980::TaskManager.Task::(_decref) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 1 aborting True
Thread-46::INFO::2012-03-03 04:49:16,130::task::1159::TaskManager.Task::(prepare) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::aborting: Task is aborted: "[Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'" - code 100
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::1164::TaskManager.Task::(prepare) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Prepare: aborted: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::980::TaskManager.Task::(_decref) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 0 aborting True
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::915::TaskManager.Task::(_doAbort) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._doAbort: force False
Thread-46::DEBUG::2012-03-03 04:49:16,131::resourceManager::841::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::588::TaskManager.Task::(_updateState) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state preparing -> state aborting
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::537::TaskManager.Task::(__state_aborting) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::_aborting: recover policy none
Thread-46::DEBUG::2012-03-03 04:49:16,132::task::588::TaskManager.Task::(_updateState) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state aborting -> state failed
Thread-46::DEBUG::2012-03-03 04:49:16,132::resourceManager::806::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-46::DEBUG::2012-03-03 04:49:16,132::resourceManager::841::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-46::ERROR::2012-03-03 04:49:16,132::dispatcher::93::Storage.Dispatcher.Protect::(run) [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 85, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1166, in prepare
    raise self.error
OSError: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
On 03/02/2012 11:54 PM, Deepak C Shetty wrote:
On 03/02/2012 11:27 PM, Deepak C Shetty wrote:
Hi Saggie, wondering if you could offer some help here...
I did some more debugging and found that the metadata file (for which the above exception is thrown) is opened in O_DIRECT|O_RDONLY mode in fileUtils.py. A sample Python program I tried throws the same exception (Errno 22) when reading any file with os.read(f, 100) after opening it in O_DIRECT mode.
Going deeper into the vdsm code, I see that DirectFile.read()/readall() uses libc.read, not os.read, and libc.read is fed aligned buffers; so I am wondering why Errno 22 still occurs.
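For reference, the alignment behaviour described above can be reproduced outside vdsm. The sketch below is a hypothetical stand-alone test, not vdsm code (the block size and temp-file handling are my assumptions): os.read() on an O_DIRECT fd reads into an ordinary unaligned heap buffer with an unaligned length and typically fails with EINVAL (22), while reading into a page-aligned mmap buffer via ctypes' libc.read, similar to what DirectFile does, can succeed when the length is a multiple of the block size.

```python
import ctypes
import ctypes.util
import errno
import mmap
import os
import tempfile

# Hypothetical reproducer, not vdsm code. O_DIRECT requires the buffer
# address, file offset and transfer length to be aligned (often to 512
# bytes; the real constraint is filesystem-dependent).
BLOCK = 512  # assumed alignment unit

outcomes = []

fd_tmp, path = tempfile.mkstemp()
os.write(fd_tmp, b'x' * 4096)
os.close(fd_tmp)

O_DIRECT = getattr(os, 'O_DIRECT', None)  # Linux-only flag
if O_DIRECT is None:
    outcomes.append('O_DIRECT not available on this platform')
else:
    try:
        fd = os.open(path, os.O_RDONLY | O_DIRECT)
    except OSError as e:
        # Some filesystems (e.g. tmpfs) reject O_DIRECT at open time.
        outcomes.append('open(O_DIRECT) failed: %s' % errno.errorcode[e.errno])
    else:
        try:
            # os.read() uses an unaligned buffer and unaligned length;
            # this is the call that raises "[Errno 22] Invalid argument".
            os.read(fd, 100)
            outcomes.append('os.read succeeded (lenient filesystem)')
        except OSError as e:
            outcomes.append('os.read failed with errno %d' % e.errno)

        # Page-aligned mmap buffer plus block-multiple length: the
        # pattern DirectFile relies on.
        buf = mmap.mmap(-1, BLOCK)
        cbuf = (ctypes.c_char * BLOCK).from_buffer(buf)
        libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)
        n = libc.read(fd, cbuf, BLOCK)
        outcomes.append('libc.read returned %d' % n)
        del cbuf  # release the buffer export before the mmap is collected
        os.close(fd)

os.remove(path)
for line in outcomes:
    print(line)
```

The exact errno depends on the filesystem and kernel, so the sketch only records the outcomes rather than asserting a specific one; on a FUSE mount like glusterfs the alignment rules come from the FUSE layer as well.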
On Mon, Mar 05, 2012 at 12:04:36AM +0530, Deepak C Shetty wrote:
Hi Saggie, wondering if you could offer some help here...
I did some more debugging and figured out that the metadata file (for which the above exception is thrown) is opened in O_DIRECT|O_RDONLY mode in fileUtils.py. A sample Python script I tried throws the same exception (Errno 22) when trying to read any file with os.read(f, 100) if the file was opened in O_DIRECT mode.
Going deeper into the vdsm code, I see that DirectFile.read()/readall() uses libc.read, not os.read, and libc.read is fed aligned buffers, so I'm wondering why Errno 22 still occurs?
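For reference, the os.read() failure by itself is expected: O_DIRECT requires the user buffer and the transfer size to be suitably aligned, and os.read() gives no control over buffer alignment. Below is a minimal sketch of an aligned read through libc, similar in spirit to what DirectFile does; the path, sizes, and error handling are illustrative, not vdsm's actual code. On a filesystem that rejects O_DIRECT (tmpfs, or a FUSE mount without direct-io support), the same EINVAL shows up at open or read time even with a perfectly aligned buffer.

```python
import ctypes
import ctypes.util
import os

# Linux-only sketch: aligned O_DIRECT read via libc (illustrative).
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6",
                   use_errno=True)

def direct_read(path, size=4096, align=4096):
    """Read up to `size` bytes from `path` with O_DIRECT.

    Returns (data, None) on success, or (None, errno) when the open or
    the read fails -- e.g. errno 22 (EINVAL) if the filesystem refuses
    O_DIRECT, as seen on the gluster mount in this thread.
    """
    # posix_memalign gives a buffer whose *address* is aligned;
    # os.read() cannot guarantee that, hence its EINVAL under O_DIRECT.
    buf = ctypes.c_void_p()
    rc = libc.posix_memalign(ctypes.byref(buf), align, size)
    if rc != 0:
        raise MemoryError(os.strerror(rc))
    try:
        try:
            fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        except OSError as e:
            return None, e.errno
        try:
            n = libc.read(fd, buf, size)
            if n < 0:
                return None, ctypes.get_errno()
            return ctypes.string_at(buf, n), None
        finally:
            os.close(fd)
    finally:
        libc.free(buf)
```

If this returns (None, 22) for a file on the gluster mount but real data for a file on local disk, the mount itself is refusing O_DIRECT rather than vdsm misaligning its buffers.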
Could it be that O_DIRECT is simply not supported on your gluster mount? Would the following script explode, too?
import storage.fileUtils
f = storage.fileUtils.open_ex('/gluster/mounted/file', 'dr')
s = f.read()
On 03/06/2012 04:21 AM, Dan Kenigsberg wrote:
Will try to figure out whether the gluster mount supports O_DIRECT. Until then, I worked around it by using readLines instead of directReadLines... that gets me past the issue, and createStorageDomain now seems successful, per the vdsOK print I see below...
{'status': {'message': 'OK', 'code': 0}}
But in vdsm.log, I see this...
Thread-31::DEBUG::2012-03-06 16:21:24,676::safelease::54::Storage.Misc.excCmd::(initLock) FAILED: <err> = 'sudo: sorry, a password is required to run sudo\n'; <rc> = 1
Thread-31::WARNING::2012-03-06 16:21:24,676::safelease::56::ClusterLock::(initLock) could not initialise spm lease (1): []
Thread-31::WARNING::2012-03-06 16:21:24,677::sd::328::Storage.StorageDomain::(initSPMlease) lease did not initialize successfully
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sd.py", line 324, in initSPMlease
    safelease.ClusterLock.initLock(self._getLeasesFilePath())
  File "/usr/share/vdsm/storage/safelease.py", line 57, in initLock
    raise se.ClusterLockInitError()
ClusterLockInitError: Could not initialize cluster lock: ()
Thread-31::INFO::2012-03-06 16:21:24,677::logUtils::39::dispatcher::(wrapper) Run and protect: createStorageDomain, Return response: None
So I'm wondering why the lock init is failing.. in fact, if I try createStoragePool, I get more issues.
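The first line of that log shows the proximate cause: safelease's initLock helper is invoked through sudo, and sudo is prompting for a password. vdsm relies on passwordless sudo for its helper commands, normally granted by a sudoers fragment along these lines; the file name and command path here are illustrative, and the exact command list depends on the vdsm version, so check the rules your vdsm package actually ships:

```
# /etc/sudoers.d/50_vdsm  (illustrative -- not the exact shipped file)
# Let the vdsm user run its storage helpers without a password prompt,
# and without needing a controlling tty.
Defaults:vdsm !requiretty
vdsm ALL=(ALL) NOPASSWD: /usr/libexec/vdsm/spmprotect.sh
```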
Lastly, I figured that the vdsm code does not use os.read but uses libc.read with an aligned buffer, in which case Errno 22 should have been avoided, right?
On Tue, Mar 06, 2012 at 10:57:49AM +0530, Deepak C Shetty wrote:
Taking the SPM lock requires O_DIRECT, too. Let's start by understanding how to enable it over gluster.
O_DIRECT is notoriously fragile. Someone has to debug the issue and understand why it fails for you... hint, hint ;-)
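As a starting point for that debugging, here is a small probe (a hypothetical helper, not part of vdsm) that reports whether a directory's filesystem honors O_DIRECT. It checks both the open and an aligned read, since some FUSE filesystems accept the O_DIRECT open and only fail at read time, which matches the log above:

```python
import errno
import mmap
import os
import tempfile

def supports_o_direct(dirpath, size=4096):
    """Return True if files under `dirpath` can be read with O_DIRECT.

    Probes both open() and an aligned read, because some filesystems
    (notably FUSE-based ones) only reject O_DIRECT at read time.
    """
    fd, path = tempfile.mkstemp(dir=dirpath)
    try:
        os.write(fd, b"\0" * size)
        os.close(fd)
        try:
            dfd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        except OSError as e:
            if e.errno == errno.EINVAL:
                return False   # filesystem refuses O_DIRECT at open
            raise
        try:
            # Anonymous mmap regions are page-aligned, satisfying
            # O_DIRECT's buffer-alignment requirement.
            buf = mmap.mmap(-1, size)
            try:
                os.readv(dfd, [buf])   # aligned buffer, aligned length
            except OSError as e:
                if e.errno == errno.EINVAL:
                    return False   # refuses O_DIRECT at read time
                raise
            finally:
                buf.close()
        finally:
            os.close(dfd)
        return True
    finally:
        os.unlink(path)
```

Running this against the gluster mountpoint under /rhev/data-center/mnt/ versus a local directory would show whether the FUSE mount is the culprit; if it is, the gluster side's direct-io behavior (e.g. its direct-io-mode mount setting) is the place to look.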
On 03/06/2012 01:22 PM, Dan Kenigsberg wrote:
On Tue, Mar 06, 2012 at 10:57:49AM +0530, Deepak C Shetty wrote:
On 03/06/2012 04:21 AM, Dan Kenigsberg wrote:
On Mon, Mar 05, 2012 at 12:04:36AM +0530, Deepak C Shetty wrote:
On 03/02/2012 11:54 PM, Deepak C Shetty wrote:
On 03/02/2012 11:27 PM, Deepak C Shetty wrote:
Hi, In my simple experiment, i connected to a SHAREDFS storage server and then created a data domain But the createStorageDomain failed with code 351, which just says "Error creating a storage domain".
How to find out what the real reason behind the failure.
Surprisingly, the domain dir structure does get created, so looks like it worked, but still it gives failure as the return result, why ?
>> Sample code... #!/usr/bin/python # GPLv2+
import sys import uuid import time
sys.path.append('/usr/share/vdsm')
import vdscli from storage.sd import SHAREDFS_DOMAIN, DATA_DOMAIN, ISO_DOMAIN from storage.volume import COW_FORMAT, SPARSE_VOL, LEAF_VOL, BLANK_UUID spUUID = str(uuid.uuid4()) sdUUID = str(uuid.uuid4()) imgUUID = str(uuid.uuid4()) volUUID = str(uuid.uuid4())
print "spUUID = %s"%spUUID print "sdUUID = %s"%sdUUID print "imgUUID = %s"%imgUUID print "volUUID = %s"%volUUID
gluster_conn = "llm65.in.ibm.com:myvol"
s = vdscli.connect()
masterVersion = 1 hostID = 1
def vdsOK(d): print d if d['status']['code']: raise Exception(str(d)) return d
def waitTask(s, taskid): while vdsOK(s.getTaskStatus(taskid))['taskStatus']['taskState'] != 'finished': time.sleep(3) vdsOK(s.clearTask(taskid))
vdsOK(s.connectStorageServer(SHAREDFS_DOMAIN, "my gluster mount", [dict(id=1, spec=gluster_conn, vfs_type="glusterfs", mnt_options="")]))
vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain", gluster_conn, DATA_DOMAIN, 0))
>> Output... ./dpk-sharedfs-vm.py spUUID = 852110d5-c3d2-456e-ae75-b72e929e9bae sdUUID = 1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe imgUUID = c29100e7-19cd-4a27-adc6-4c35cc5e690c volUUID = 1d074f24-8bf0-4b68-8a35-40c3f2c33723 {'status': {'message': 'OK', 'code': 0}, 'statuslist': [{'status': 0, 'id': 1}]} {'status': {'message': "Error creating a storage domain: ('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my gluster domain, domClass=1, typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code': 351}} Traceback (most recent call last): File "./dpk-sharedfs-vm.py", line 74, in<module> vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain", gluster_conn, DATA_DOMAIN, 0)) File "./dpk-sharedfs-vm.py", line 62, in vdsOK raise Exception(str(d)) Exception: {'status': {'message': "Error creating a storage domain: ('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my gluster domain, domClass=1, typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code': 351}}
>> But it did create the dir structure... ]# find /rhev/data-center/mnt/llm65.in.ibm.com:myvol/ /rhev/data-center/mnt/llm65.in.ibm.com:myvol/ /rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/leases
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/outbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/inbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/ids
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/images
# mount | grep gluster llm65.in.ibm.com:myvol on /rhev/data-center/mnt/llm65.in.ibm.com:myvol type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
Attaching the vdsm.log....
Thread-46::INFO::2012-03-03 04:49:16,092::nfsSD::64::Storage.StorageDomain::(create) sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe domainName=my gluster domain remotePath=llm65.in.ibm.com:myvol domClass=1
Thread-46::DEBUG::2012-03-03 04:49:16,111::persistentDict::175::Storage.PersistentDict::(__init__) Created a persistant dict with FileMetadataRW backend
Thread-46::DEBUG::2012-03-03 04:49:16,113::persistentDict::216::Storage.PersistentDict::(refresh) read lines (FileMetadataRW)=[]
Thread-46::WARNING::2012-03-03 04:49:16,113::persistentDict::238::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
Thread-46::DEBUG::2012-03-03 04:49:16,113::persistentDict::152::Storage.PersistentDict::(transaction) Starting transaction
Thread-46::DEBUG::2012-03-03 04:49:16,114::persistentDict::158::Storage.PersistentDict::(transaction) Flushing changes
Thread-46::DEBUG::2012-03-03 04:49:16,114::persistentDict::277::Storage.PersistentDict::(flush) about to write lines (FileMetadataRW)=['CLASS=Data', 'DESCRIPTION=my gluster domain', 'IOOPTIMEOUTSEC=1', 'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=', 'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=', 'REMOTE_PATH=llm65.in.ibm.com:myvol', 'ROLE=Regular', 'SDUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'TYPE=SHAREDFS', 'VERSION=0', '_SHA_CKSUM=c8ba67889d4b62ccd9fd368c584501404e8ee84e']
Thread-46::DEBUG::2012-03-03 04:49:16,118::persistentDict::160::Storage.PersistentDict::(transaction) Finished transaction
Thread-46::DEBUG::2012-03-03 04:49:16,120::fileSD::98::Storage.StorageDomain::(__init__) Reading domain in path /rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
Thread-46::DEBUG::2012-03-03 04:49:16,120::persistentDict::175::Storage.PersistentDict::(__init__) Created a persistant dict with FileMetadataRW backend
Thread-46::ERROR::2012-03-03 04:49:16,121::task::855::TaskManager.Task::(_setError) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1922, in createStorageDomain
    typeSpecificArg, storageType, domVersion)
  File "/usr/share/vdsm/storage/nfsSD.py", line 87, in create
    fsd = cls(os.path.join(mntPoint, sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 104, in __init__
    sdUUID = metadata[sd.DMDK_SDUUID]
  File "/usr/share/vdsm/storage/persistentDict.py", line 75, in __getitem__
    return dec(self._dict[key])
  File "/usr/share/vdsm/storage/persistentDict.py", line 183, in __getitem__
    with self._accessWrapper():
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/share/vdsm/storage/persistentDict.py", line 137, in _accessWrapper
    self.refresh()
  File "/usr/share/vdsm/storage/persistentDict.py", line 214, in refresh
    lines = self._metaRW.readlines()
  File "/usr/share/vdsm/storage/fileSD.py", line 71, in readlines
    return misc.stripNewLines(self._oop.directReadLines(self._metafile))
  File "/usr/share/vdsm/storage/processPool.py", line 53, in wrapper
    return self.runExternally(func, *args, **kwds)
  File "/usr/share/vdsm/storage/processPool.py", line 64, in runExternally
    return self._procPool.runExternally(*args, **kwargs)
  File "/usr/share/vdsm/storage/processPool.py", line 154, in runExternally
    raise err
OSError: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03 04:49:16,129::task::874::TaskManager.Task::(_run) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._run: 9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0 (6, '1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'my gluster domain', 'llm65.in.ibm.com:myvol', 1, 0) {} failed - stopping task
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::1201::TaskManager.Task::(stop) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::stopping in state preparing (force False)
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::980::TaskManager.Task::(_decref) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 1 aborting True
Thread-46::INFO::2012-03-03 04:49:16,130::task::1159::TaskManager.Task::(prepare) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::aborting: Task is aborted: "[Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'" - code 100
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::1164::TaskManager.Task::(prepare) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Prepare: aborted: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03 04:49:16,130::task::980::TaskManager.Task::(_decref) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 0 aborting True
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::915::TaskManager.Task::(_doAbort) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._doAbort: force False
Thread-46::DEBUG::2012-03-03 04:49:16,131::resourceManager::841::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::588::TaskManager.Task::(_updateState) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state preparing -> state aborting
Thread-46::DEBUG::2012-03-03 04:49:16,131::task::537::TaskManager.Task::(__state_aborting) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::_aborting: recover policy none
Thread-46::DEBUG::2012-03-03 04:49:16,132::task::588::TaskManager.Task::(_updateState) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state aborting -> state failed
Thread-46::DEBUG::2012-03-03 04:49:16,132::resourceManager::806::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-46::DEBUG::2012-03-03 04:49:16,132::resourceManager::841::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-46::ERROR::2012-03-03 04:49:16,132::dispatcher::93::Storage.Dispatcher.Protect::(run) [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 85, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1166, in prepare
    raise self.error
OSError: [Errno 22] Invalid argument: '/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Hi Saggie, wondering if you could offer some help here...
I did some more debugging and found that the metadata file (for which the above exception is thrown) is opened in O_DIRECT|O_RDONLY mode in fileUtils.py. A sample Python script I tried throws the same exception (Errno 22) when reading any file with os.read(f, 100) after opening it in O_DIRECT mode.
Going deeper into the vdsm code, I see that DirectFile.read()/readall() uses libc.read, not os.read. libc.read is being fed aligned buffers, so I am wondering why Errno 22 still occurs.
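For background on why plain os.read() fails here (this is the kernel-side rule, not vdsm code): O_DIRECT requires the user buffer address, file offset, and transfer length to be block-aligned (conservatively, page-aligned), and Linux reports a violation as EINVAL. os.read() copies into an ordinary, unaligned buffer, which is enough to trigger the error on its own. One stdlib way to get a page-aligned buffer is an anonymous mmap; a minimal sketch:

```python
import ctypes
import mmap

def aligned_buffer(size):
    """Return a writable, page-aligned buffer of at least `size` bytes,
    suitable for direct I/O (anonymous mmaps start on a page boundary)."""
    # round up to whole pages: O_DIRECT wants the length aligned, too
    pages = (size + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    return mmap.mmap(-1, pages * mmap.PAGESIZE)

buf = aligned_buffer(100)
# prove the start address really is page-aligned
addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
assert addr % mmap.PAGESIZE == 0
```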
Could it be that O_DIRECT is simply not supported on your gluster mount? Would the following script explode, too?
import storage.fileUtils

f = storage.fileUtils.open_ex('/gluster/mounted/file', 'dr')
s = f.read()
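Independent of vdsm's open_ex helper, the same probe can be written against the plain stdlib. This is only a sketch of the idea (the path argument is a placeholder, and it assumes Linux, where an unsupported O_DIRECT open or a direct read that the filesystem rejects surfaces as EINVAL):

```python
import errno
import mmap
import os

def o_direct_readable(path):
    """Try an O_DIRECT open of `path` followed by a page-aligned read.
    Returns True if both succeed, False on EINVAL (the errno Linux uses
    for 'O_DIRECT unsupported' and for misaligned direct I/O)."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    except OSError as e:
        if e.errno == errno.EINVAL:
            return False
        raise
    try:
        buf = mmap.mmap(-1, mmap.PAGESIZE)  # page-aligned buffer
        os.readv(fd, [buf])                 # aligned address and length
        return True
    except OSError as e:
        if e.errno == errno.EINVAL:
            return False
        raise
    finally:
        os.close(fd)
```

Running this against a file on the gluster mount versus one on local ext4 would separate "FUSE rejects O_DIRECT" from an alignment bug in the caller.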
I will try to figure out whether the gluster mount supports O_DIRECT. Until then, I worked around it by using readLines instead of directReadLines... that gets me past the issue, and createStorageDomain now seems successful, judging from the vdsOK print below...
{'status': {'message': 'OK', 'code': 0}}
But in vdsm.log, I see this...
Thread-31::DEBUG::2012-03-06 16:21:24,676::safelease::54::Storage.Misc.excCmd::(initLock) FAILED: <err> = 'sudo: sorry, a password is required to run sudo\n'; <rc> = 1
Thread-31::WARNING::2012-03-06 16:21:24,676::safelease::56::ClusterLock::(initLock) could not initialise spm lease (1): []
Thread-31::WARNING::2012-03-06 16:21:24,677::sd::328::Storage.StorageDomain::(initSPMlease) lease did not initialize successfully
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sd.py", line 324, in initSPMlease
    safelease.ClusterLock.initLock(self._getLeasesFilePath())
  File "/usr/share/vdsm/storage/safelease.py", line 57, in initLock
    raise se.ClusterLockInitError()
ClusterLockInitError: Could not initialize cluster lock: ()
Thread-31::INFO::2012-03-06 16:21:24,677::logUtils::39::dispatcher::(wrapper) Run and protect: createStorageDomain, Return response: None
Taking the SPM lock requires O_DIRECT, too. Let's start by understanding how to enable it over gluster.
Ah, I didn't realise that; it looks like I missed it in the code.
From #gluster I gather that FUSE still does not support O_DIRECT. From linux-fsdevel, it looks like patches to enable O_DIRECT in FUSE are just getting in.
So I am wondering why the lock init is failing... in fact, if I try createStoragePool, I get more issues.
Lastly, I found that the vdsm code does not use os.read but uses libc.read with an aligned buffer, in which case Errno 22 should have been avoided, right?
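For illustration, the libc.read-into-an-aligned-buffer pattern (roughly the shape of what DirectFile does, though not vdsm's actual code) can be sketched with ctypes. Here the fd is opened without O_DIRECT so the sketch runs on any filesystem; on a mount that supports direct I/O you would add os.O_DIRECT to the open flags, at which point the aligned buffer and aligned length become mandatory:

```python
import ctypes
import ctypes.util
import mmap
import os

# load the C library; the fallback name assumes glibc on Linux
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6",
                   use_errno=True)

def libc_read_aligned(path, size=mmap.PAGESIZE):
    """Read up to `size` bytes from `path` via libc.read() into a
    page-aligned anonymous mmap, returning the bytes read."""
    fd = os.open(path, os.O_RDONLY)  # add os.O_DIRECT where supported
    try:
        buf = mmap.mmap(-1, size)                # page-aligned buffer
        cbuf = ctypes.c_char.from_buffer(buf)    # same memory for ctypes
        n = libc.read(fd, ctypes.byref(cbuf), size)
        if n < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return buf[:n]
    finally:
        os.close(fd)
```

If this pattern still returns EINVAL on the gluster mount even with the buffer, offset, and length all aligned, that points at the filesystem (FUSE) rejecting O_DIRECT itself rather than at the caller's alignment.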
O_DIRECT is notoriously fragile. Someone has to debug the issue and understand why it fails for you... hint, hint ;-)
vdsm-devel@lists.fedorahosted.org