Hello Federico,
Here is the information you asked for:
Version:
# rpm -aq | grep vdsm
vdsm-python-zombiereaper-4.14.6-0.el6.noarch
vdsm-xmlrpc-4.14.6-0.el6.noarch
vdsm-4.14.6-0.el6.x86_64
vdsm-python-4.14.6-0.el6.x86_64
vdsm-cli-4.14.6-0.el6.noarch
vdsm-hook-hostusb-4.14.6-0.el6.noarch
The vdsm.log content:
Thread-14::INFO::2014-05-26 07:01:01,287::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='45cbd9d4-7a1c-47ff-abb2-92c0dc539554', spUUID=None, options=None)
Thread-14::INFO::2014-05-26 07:01:01,288::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 358, 'message': 'Storage domain does not exist', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': '45cbd9d4-7a1c-47ff-abb2-92c0dc539554'}}
Thread-14::INFO::2014-05-26 07:01:01,296::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
Thread-14::INFO::2014-05-26 07:01:01,317::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-14::INFO::2014-05-26 07:01:01,388::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='45cbd9d4-7a1c-47ff-abb2-92c0dc539554', spUUID=None, options=None)
Thread-14::INFO::2014-05-26 07:01:01,388::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None
Thread-17::ERROR::2014-05-26 07:01:07,404::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 0ae15b52-821f-438a-9602-2494eac3ac5b
Thread-17::ERROR::2014-05-26 07:01:07,405::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 0ae15b52-821f-438a-9602-2494eac3ac5b
Thread-17::WARNING::2014-05-26 07:01:07,672::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "0ae15b52-821f-438a-9602-2494eac3ac5b" not found']
Thread-17::ERROR::2014-05-26 07:01:07,680::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'0ae15b52-821f-438a-9602-2494eac3ac5b',)
Thread-17::ERROR::2014-05-26 07:01:07,681::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 0ae15b52-821f-438a-9602-2494eac3ac5b monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 204, in _monitorDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'0ae15b52-821f-438a-9602-2494eac3ac5b',)
Thread-16::ERROR::2014-05-26 07:01:10,508::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
Thread-16::ERROR::2014-05-26 07:01:10,509::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
Thread-16::WARNING::2014-05-26 07:01:10,829::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "ba26bc14-3aff-48eb-816c-e02e81e5fd63" not found']
Thread-16::ERROR::2014-05-26 07:01:10,836::sdc::143::Storage.StorageDomainCache::(_findDomain) domain ba26bc14-3aff-48eb-816c-e02e81e5fd63 not found
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'ba26bc14-3aff-48eb-816c-e02e81e5fd63',)
Thread-16::ERROR::2014-05-26 07:01:10,837::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain ba26bc14-3aff-48eb-816c-e02e81e5fd63 monitoring information
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/domainMonitor.py", line 204, in _monitorDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'ba26bc14-3aff-48eb-816c-e02e81e5fd63',)
Thread-14::INFO::2014-05-26 07:01:11,489::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
Thread-14::INFO::2014-05-26 07:01:11,511::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-14::INFO::2014-05-26 07:01:11,516::logUtils::44::dispatcher::(wrapper) Run and protect: getAllTasksStatuses(spUUID=None, options=None)
Thread-14::ERROR::2014-05-26 07:01:11,516::task::866::TaskManager.Task::(_setError) Task=`74051611-a84b-43d6-8cb4-5e148dd59eed`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2114, in getAllTasksStatuses
    allTasksStatus = sp.getAllTasksStatuses()
  File "/usr/share/vdsm/storage/securable.py", line 73, in wrapper
    raise SecureError("Secured object is not in safe state")
SecureError: Secured object is not in safe state
Thread-14::INFO::2014-05-26 07:01:11,517::task::1168::TaskManager.Task::(prepare) Task=`74051611-a84b-43d6-8cb4-5e148dd59eed`::aborting: Task is aborted: u'Secured object is not in safe state' - code 100
Thread-14::ERROR::2014-05-26 07:01:11,518::dispatcher::68::Storage.Dispatcher.Protect::(run) Secured object is not in safe state
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 60, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1176, in prepare
    raise self.error
SecureError: Secured object is not in safe state
Thread-14::INFO::2014-05-26 07:01:11,610::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
Thread-14::INFO::2014-05-26 07:01:11,644::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
Thread-14::INFO::2014-05-26 07:01:11,679::logUtils::44::dispatcher::(wrapper) Run and protect: spmStart(spUUID='00000002-0002-0002-0002-00000000030b', prevID=-1, prevLVER='-1', maxHostID=250, domVersion='3', options=None)
Thread-14::INFO::2014-05-26 07:01:11,681::logUtils::47::dispatcher::(wrapper) Run and protect: spmStart, Return response: None
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:11,988::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d (id: 12)
7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:11,989::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/images already exists
7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:11,990::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/dom_md already exists
7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:12,012::persistentDict::256::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
Thread-14::INFO::2014-05-26 07:01:12,329::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Thread-14::INFO::2014-05-26 07:01:12,339::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {u'3e7aa4fd-fb27-47a3-9ddb-d7a97027723d': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.00607831', 'lastCheck': '4.7', 'valid': True}, u'ba26bc14-3aff-48eb-816c-e02e81e5fd63': {'code': 358, 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '1.5', 'valid': False}, u'0ae15b52-821f-438a-9602-2494eac3ac5b': {'code': 358, 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '4.7', 'valid': False}}
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:12,344::clusterlock::235::SANLock::(acquire) Acquiring cluster lock for domain 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d (id: 12)
Thread-14::INFO::2014-05-26 07:01:12,739::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='7b861db2-05c7-44b1-aac7-b4ea018120cf', spUUID=None, options=None)
Thread-14::INFO::2014-05-26 07:01:12,742::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 0, 'message': 'Task is initializing', 'taskState': 'running', 'taskResult': '', 'taskID': '7b861db2-05c7-44b1-aac7-b4ea018120cf'}}
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,347::sp::431::Storage.StoragePool::(_upgradePool) Trying to upgrade master domain `3e7aa4fd-fb27-47a3-9ddb-d7a97027723d`
7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:13,615::persistentDict::256::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
Thread-14::INFO::2014-05-26 07:01:13,758::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='7b861db2-05c7-44b1-aac7-b4ea018120cf', spUUID=None, options=None)
Thread-14::INFO::2014-05-26 07:01:13,759::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 0, 'message': 'Task is initializing', 'taskState': 'running', 'taskResult': '', 'taskID': '7b861db2-05c7-44b1-aac7-b4ea018120cf'}}
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,878::sd::374::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_imageNS already registered
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,878::sd::382::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_volumeNS already registered
7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,879::blockSD::456::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_lvmActivationNS already registered
Thread-228151::ERROR::2014-05-26 07:01:14,188::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
Thread-228152::ERROR::2014-05-26 07:01:14,191::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 0ae15b52-821f-438a-9602-2494eac3ac5b
Thread-228151::ERROR::2014-05-26 07:01:14,192::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
Thread-228152::ERROR::2014-05-26 07:01:14,195::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 0ae15b52-821f-438a-9602-2494eac3ac5b
7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:14,201::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/master already exists
Thread-228152::WARNING::2014-05-26 07:01:14,513::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "0ae15b52-821f-438a-9602-2494eac3ac5b" not found']
Thread-228151::WARNING::2014-05-26 07:01:14,513::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "ba26bc14-3aff-48eb-816c-e02e81e5fd63" not found']
Thread-228152::ERROR::2014-05-26 07:01:14,553::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found
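
To summarize which storage domains the log above reports as missing, something like the following could be used. This is only a rough sketch and not part of vdsm: the helper name missing_domains is made up for illustration, and it assumes the "domain <UUID> not found" wording that sdc.py logs in the excerpt above. Whitespace is collapsed first, so it also works on log text that a mail client has wrapped.

```python
import re
from collections import Counter

# Matches the "_findDomain" message seen in the excerpt above, e.g.
# "domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found".
NOT_FOUND = re.compile(
    r"domain ([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}) not found"
)


def missing_domains(log_text):
    """Count 'domain <uuid> not found' messages per storage domain UUID.

    Hypothetical helper for illustration only.
    """
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    return Counter(NOT_FOUND.findall(flat))


if __name__ == "__main__":
    sample = """
    Thread-17::ERROR::2014-05-26
    07:01:07,680::sdc::143::Storage.StorageDomainCache::(_findDomain)
    domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found
    Thread-16::ERROR::2014-05-26
    07:01:10,836::sdc::143::Storage.StorageDomainCache::(_findDomain)
    domain ba26bc14-3aff-48eb-816c-e02e81e5fd63 not found
    """
    for sd_uuid, count in missing_domains(sample).items():
        print(sd_uuid, count)
```

On a real host the text would come from the vdsm log file (typically /var/log/vdsm/vdsm.log) instead of the inline sample. Here the only domains that show up are the two deleted NFS domains.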
Regards
--------------------------------------------
On Mon, 5/26/14, Federico Simoncelli <fsimonce(a)redhat.com> wrote:
Subject: Re: [vdsm] Problems with vdsm deleted Storages Domains
To: "David Andino" <david_andino(a)yahoo.com>
Cc: vdsm-devel(a)lists.fedorahosted.org
Date: Monday, May 26, 2014, 04:54 am
Can you paste the exact traceback that you see in the vdsm log?
Also, what vdsm version is it?
Thanks,
--
Federico
----- Original Message -----
> From: "David Andino" <david_andino(a)yahoo.com>
> To: vdsm-devel(a)lists.fedorahosted.org
> Sent: Monday, May 26, 2014 12:37:35 AM
> Subject: [vdsm] Problems with vdsm deleted Storages Domains
>
> Hello everyone,
>
> I have serious trouble with something that recently happened, and this is our
> story.
>
> We have 12 nodes and 1 engine server. Our infrastructure is based on iSCSI
> and NFS (for Exports and ISOs). Recently my partner followed some bad
> practices with our NFS server, and one thing we noticed in the engine was
> that all of our nodes were contending for the SPM and never agreed on which
> one would take it. The logs pointed to metadata corruption in the NFS
> domains, and after hours of this contention we tried to stabilize it.
>
> Trying to stabilize our cluster, we shut down our NFS server and nothing
> happened. The cluster was still contending for the SPM. After many tests,
> like putting the nodes in maintenance mode, rebooting, and shutting down
> the whole cluster, the last thing we did was destroy the Export and ISO
> domains from the configuration (using the web interface), leaving the Data
> Domain intact, to try to bring our guests online. That was (I think) our
> big mistake, because now we are in a serious situation: our Data Domain is
> not coming online because the vdsm on every node is looking for the NFS
> domains we already deleted.
>
> We used vdsClient to query the information vdsm is getting, and it says
> that we have 1 Storage Pool Domain and 3 Storage Domains: our iSCSI domain
> and the 2 NFS domains that were deleted.
>
> We left only one node active and the rest in maintenance, and the message
> is that it is still contending for SPM. The vdsm.log says that the node is
> not finding the NFS storage domains. We tried to delete these domains using
> vdsClient, but it says that the node has to hold the SPM, and it can't take
> the SPM because it can't find the NFS domains, so it is like a circle.
>
> Now the question is: is there any way to delete these domains from the
> vdsm configuration? What do I have to do to break this circle so that the
> node stops contending for the SPM and activates our Data Domain again?
>
> I appreciate any comments and help you can share with me.
>
> Regards
>
> David
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel(a)lists.fedorahosted.org
> https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
>