Re: [vdsm] Rv: Re: Problems with vdsm deleted Storages Domains
by Federico Simoncelli
----- Original Message -----
> From: "David Andino" <david_andino(a)yahoo.com>
> To: "Federico Simoncelli" <fsimonce(a)redhat.com>
> Sent: Monday, May 26, 2014 4:09:44 PM
> Subject: Re: Rv: Re: [vdsm] Problems with vdsm deleted Storages Domains
>
> Hello Federico
>
> I have attached the entire file so you can read it.
>
> Let me know whether you need anything else
>
> Thanks
>
> David
The relevant traceback is:
7b861db2-05c7-44b1-aac7-b4ea018120cf::ERROR::2014-05-26 07:01:15,376::sp::329::Storage.StoragePool::(startSpm) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/sp.py", line 296, in startSpm
    self._updateDomainsRole()
  File "/usr/share/vdsm/storage/securable.py", line 75, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 205, in _updateDomainsRole
    domain = sdCache.produce(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
    dom = findMethod(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'ba26bc14-3aff-48eb-816c-e02e81e5fd63',)
The problem was fixed (and backported to 3.4) in:
http://gerrit.ovirt.org/25424
http://gerrit.ovirt.org/27194
That is going to be available in vdsm-4.14.8
With that build you'll be able to start the SPM and later on use
forcedDetachStorageDomain to remove the old domain.
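For reference, once the fixed build is installed, the call could be issued from a Python
shell on the host along these lines. This is only a sketch: it assumes the vdscli helper
that vdsm ships and the (sdUUID, spUUID) argument order, so double-check both against your
version (vdsClient's help lists the exact signature). The UUIDs below are the pool and the
two stale domains taken from your log.

    from vdsm import vdscli

    conn = vdscli.connect()  # local vdsm; SSL settings are taken from the vdsm config
    SP_UUID = '00000002-0002-0002-0002-00000000030b'         # pool UUID from the log above
    for sd_uuid in ('ba26bc14-3aff-48eb-816c-e02e81e5fd63',  # stale NFS domains
                    '0ae15b52-821f-438a-9602-2494eac3ac5b'):
        res = conn.forcedDetachStorageDomain(sd_uuid, SP_UUID)
        print res['status']  # code 0 means the detach succeeded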
--
Federico
Rv: Re: Problems with vdsm deleted Storages Domains
by David Andino
> Hello Federico,
>
> Here is the information you requested:
>
> Version:
>
> # rpm -aq | grep vdsm
> vdsm-python-zombiereaper-4.14.6-0.el6.noarch
> vdsm-xmlrpc-4.14.6-0.el6.noarch
> vdsm-4.14.6-0.el6.x86_64
> vdsm-python-4.14.6-0.el6.x86_64
> vdsm-cli-4.14.6-0.el6.noarch
> vdsm-hook-hostusb-4.14.6-0.el6.noarch
>
> The vdsm.log content:
>
> Thread-14::INFO::2014-05-26 07:01:01,287::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='45cbd9d4-7a1c-47ff-abb2-92c0dc539554', spUUID=None, options=None)
> Thread-14::INFO::2014-05-26 07:01:01,288::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 358, 'message': 'Storage domain does not exist', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': '45cbd9d4-7a1c-47ff-abb2-92c0dc539554'}}
> Thread-14::INFO::2014-05-26 07:01:01,296::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
> Thread-14::INFO::2014-05-26 07:01:01,317::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-14::INFO::2014-05-26 07:01:01,388::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='45cbd9d4-7a1c-47ff-abb2-92c0dc539554', spUUID=None, options=None)
> Thread-14::INFO::2014-05-26 07:01:01,388::logUtils::47::dispatcher::(wrapper) Run and protect: clearTask, Return response: None
> Thread-17::ERROR::2014-05-26 07:01:07,404::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 0ae15b52-821f-438a-9602-2494eac3ac5b
> Thread-17::ERROR::2014-05-26 07:01:07,405::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 0ae15b52-821f-438a-9602-2494eac3ac5b
> Thread-17::WARNING::2014-05-26 07:01:07,672::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "0ae15b52-821f-438a-9602-2494eac3ac5b" not found']
> Thread-17::ERROR::2014-05-26 07:01:07,680::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: (u'0ae15b52-821f-438a-9602-2494eac3ac5b',)
> Thread-17::ERROR::2014-05-26 07:01:07,681::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 0ae15b52-821f-438a-9602-2494eac3ac5b monitoring information
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/domainMonitor.py", line 204, in _monitorDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>     domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: (u'0ae15b52-821f-438a-9602-2494eac3ac5b',)
> Thread-16::ERROR::2014-05-26 07:01:10,508::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
> Thread-16::ERROR::2014-05-26 07:01:10,509::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
> Thread-16::WARNING::2014-05-26 07:01:10,829::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "ba26bc14-3aff-48eb-816c-e02e81e5fd63" not found']
> Thread-16::ERROR::2014-05-26 07:01:10,836::sdc::143::Storage.StorageDomainCache::(_findDomain) domain ba26bc14-3aff-48eb-816c-e02e81e5fd63 not found
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: (u'ba26bc14-3aff-48eb-816c-e02e81e5fd63',)
> Thread-16::ERROR::2014-05-26 07:01:10,837::domainMonitor::239::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain ba26bc14-3aff-48eb-816c-e02e81e5fd63 monitoring information
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/domainMonitor.py", line 204, in _monitorDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 98, in produce
>     domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 122, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 141, in _findDomain
>     dom = findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 171, in _findUnfetchedDomain
>     raise se.StorageDomainDoesNotExist(sdUUID)
> StorageDomainDoesNotExist: Storage domain does not exist: (u'ba26bc14-3aff-48eb-816c-e02e81e5fd63',)
> Thread-14::INFO::2014-05-26 07:01:11,489::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
> Thread-14::INFO::2014-05-26 07:01:11,511::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-14::INFO::2014-05-26 07:01:11,516::logUtils::44::dispatcher::(wrapper) Run and protect: getAllTasksStatuses(spUUID=None, options=None)
> Thread-14::ERROR::2014-05-26 07:01:11,516::task::866::TaskManager.Task::(_setError) Task=`74051611-a84b-43d6-8cb4-5e148dd59eed`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2114, in getAllTasksStatuses
>     allTasksStatus = sp.getAllTasksStatuses()
>   File "/usr/share/vdsm/storage/securable.py", line 73, in wrapper
>     raise SecureError("Secured object is not in safe state")
> SecureError: Secured object is not in safe state
> Thread-14::INFO::2014-05-26 07:01:11,517::task::1168::TaskManager.Task::(prepare) Task=`74051611-a84b-43d6-8cb4-5e148dd59eed`::aborting: Task is aborted: u'Secured object is not in safe state' - code 100
> Thread-14::ERROR::2014-05-26 07:01:11,518::dispatcher::68::Storage.Dispatcher.Protect::(run) Secured object is not in safe state
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/dispatcher.py", line 60, in run
>     result = ctask.prepare(self.func, *args, **kwargs)
>   File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
>     return m(self, *a, **kw)
>   File "/usr/share/vdsm/storage/task.py", line 1176, in prepare
>     raise self.error
> SecureError: Secured object is not in safe state
> Thread-14::INFO::2014-05-26 07:01:11,610::logUtils::44::dispatcher::(wrapper) Run and protect: getSpmStatus(spUUID='00000002-0002-0002-0002-00000000030b', options=None)
> Thread-14::INFO::2014-05-26 07:01:11,644::logUtils::47::dispatcher::(wrapper) Run and protect: getSpmStatus, Return response: {'spm_st': {'spmId': -1, 'spmStatus': 'Free', 'spmLver': -1}}
> Thread-14::INFO::2014-05-26 07:01:11,679::logUtils::44::dispatcher::(wrapper) Run and protect: spmStart(spUUID='00000002-0002-0002-0002-00000000030b', prevID=-1, prevLVER='-1', maxHostID=250, domVersion='3', options=None)
> Thread-14::INFO::2014-05-26 07:01:11,681::logUtils::47::dispatcher::(wrapper) Run and protect: spmStart, Return response: None
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:11,988::clusterlock::184::SANLock::(acquireHostId) Acquiring host id for domain 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d (id: 12)
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:11,989::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/images already exists
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:11,990::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/dom_md already exists
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:12,012::persistentDict::256::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
> Thread-14::INFO::2014-05-26 07:01:12,329::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
> Thread-14::INFO::2014-05-26 07:01:12,339::logUtils::47::dispatcher::(wrapper) Run and protect: repoStats, Return response: {u'3e7aa4fd-fb27-47a3-9ddb-d7a97027723d': {'code': 0, 'version': 3, 'acquired': True, 'delay': '0.00607831', 'lastCheck': '4.7', 'valid': True}, u'ba26bc14-3aff-48eb-816c-e02e81e5fd63': {'code': 358, 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '1.5', 'valid': False}, u'0ae15b52-821f-438a-9602-2494eac3ac5b': {'code': 358, 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '4.7', 'valid': False}}
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:12,344::clusterlock::235::SANLock::(acquire) Acquiring cluster lock for domain 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d (id: 12)
> Thread-14::INFO::2014-05-26 07:01:12,739::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='7b861db2-05c7-44b1-aac7-b4ea018120cf', spUUID=None, options=None)
> Thread-14::INFO::2014-05-26 07:01:12,742::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 0, 'message': 'Task is initializing', 'taskState': 'running', 'taskResult': '', 'taskID': '7b861db2-05c7-44b1-aac7-b4ea018120cf'}}
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,347::sp::431::Storage.StoragePool::(_upgradePool) Trying to upgrade master domain `3e7aa4fd-fb27-47a3-9ddb-d7a97027723d`
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:13,615::persistentDict::256::Storage.PersistentDict::(refresh) data has no embedded checksum - trust it as it is
> Thread-14::INFO::2014-05-26 07:01:13,758::logUtils::44::dispatcher::(wrapper) Run and protect: getTaskStatus(taskID='7b861db2-05c7-44b1-aac7-b4ea018120cf', spUUID=None, options=None)
> Thread-14::INFO::2014-05-26 07:01:13,759::logUtils::47::dispatcher::(wrapper) Run and protect: getTaskStatus, Return response: {'taskStatus': {'code': 0, 'message': 'Task is initializing', 'taskState': 'running', 'taskResult': '', 'taskID': '7b861db2-05c7-44b1-aac7-b4ea018120cf'}}
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,878::sd::374::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_imageNS already registered
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,878::sd::382::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_volumeNS already registered
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::INFO::2014-05-26 07:01:13,879::blockSD::456::Storage.StorageDomain::(_registerResourceNamespaces) Resource namespace 3e7aa4fd-fb27-47a3-9ddb-d7a97027723d_lvmActivationNS already registered
> Thread-228151::ERROR::2014-05-26 07:01:14,188::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
> Thread-228152::ERROR::2014-05-26 07:01:14,191::sdc::137::Storage.StorageDomainCache::(_findDomain) looking for unfetched domain 0ae15b52-821f-438a-9602-2494eac3ac5b
> Thread-228151::ERROR::2014-05-26 07:01:14,192::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain ba26bc14-3aff-48eb-816c-e02e81e5fd63
> Thread-228152::ERROR::2014-05-26 07:01:14,195::sdc::154::Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain 0ae15b52-821f-438a-9602-2494eac3ac5b
> 7b861db2-05c7-44b1-aac7-b4ea018120cf::WARNING::2014-05-26 07:01:14,201::fileUtils::167::Storage.fileUtils::(createdir) Dir /rhev/data-center/mnt/blockSD/3e7aa4fd-fb27-47a3-9ddb-d7a97027723d/master already exists
> Thread-228152::WARNING::2014-05-26 07:01:14,513::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "0ae15b52-821f-438a-9602-2494eac3ac5b" not found']
> Thread-228151::WARNING::2014-05-26 07:01:14,513::lvm::377::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "ba26bc14-3aff-48eb-816c-e02e81e5fd63" not found']
> Thread-228152::ERROR::2014-05-26 07:01:14,553::sdc::143::Storage.StorageDomainCache::(_findDomain) domain 0ae15b52-821f-438a-9602-2494eac3ac5b not found
>
>
>
> Regards
>
>
> --------------------------------------------
> On Mon, 5/26/14, Federico Simoncelli <fsimonce(a)redhat.com> wrote:
>
> Subject: Re: [vdsm] Problems with vdsm deleted Storages Domains
> To: "David Andino" <david_andino(a)yahoo.com>
> Cc: vdsm-devel(a)lists.fedorahosted.org
> Date: Monday, May 26, 2014, 04:54 AM
>
> Can you paste the exact traceback that you see in the vdsm log?
> Also, what vdsm version is it?
>
> Thanks,
> --
> Federico
>
> ----- Original Message -----
> > From: "David Andino" <david_andino(a)yahoo.com>
> > To: vdsm-devel(a)lists.fedorahosted.org
> > Sent: Monday, May 26, 2014 12:37:35 AM
> > Subject: [vdsm] Problems with vdsm deleted Storages Domains
> >
> > Hello everyone,
> >
> > I have serious troubles with something that recently happend and this is our story.
> >
> > We have 12 nodes and 1 engine server. Our infrastructure is based in ISCSI and NFS (for Exports and ISOs). Recently my partner made some bad practices with our NFS server and one thing we noted in the engine, was that all of our nodes were contending the SPM and they never got agreed which one would it take it and the logs pointed to a Metadata corruption in the NFS Domains and after hours of the contending situation we tried to stabilized it.
> >
> > Trying to stabilize our cluster, we shutdown our NFS server and nothing happened. The cluster was still contending the SPM. After many tests like putting the nodes in maintain mode, reboot, shutdown all the cluster, etc, The last thing we did was destroying from the configuration (using the web interface) the Export and ISO Domains leaving the Data Domain intact trying to put online our guests, and that was (I think) our big mistake. Because now we are getting a serious situation that our Data Domain is not getting online because the vdsm in every node are looking for the NFS domains we already deleted.
> >
> > We were using vdsClient to consult the information the vdsm is getting and it says that we have 1 Storage Pool Domain and 3 Storage Domains, our ISCSI and the 2 NFS domains that were deleted.
> >
> > We let only one node active and the rest are in maintenance and the message is that it still contending for SPM. The vdsm.log says that the node is not finding the NFS Storages. We tried to delete this domains using vdsCLient but it says that the node has to have the SPM but it can't take the SPM because it can't find the NFS Domains so is like a circle.
> >
> > Now the question would be. Is there any way to delete this domains from the vdsm configuration?. What do I have to do to break this circle and the node leave to contending the SPM and activate our Data Domain again?.
> >
> > I appreciate all your coments and help you can share with me.
> >
> > Regards
> >
> > David
> >
> > _______________________________________________
> > vdsm-devel mailing list
> > vdsm-devel(a)lists.fedorahosted.org
> > https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
> >
>
>
Fwd: Storage Functional Tests.
by Vered Volansky
----- Forwarded Message -----
> From: "Vered Volansky" <vered(a)redhat.com>
> To: "infra" <infra(a)ovirt.org>
> Sent: Monday, May 26, 2014 1:53:02 PM
> Subject: Storage Functional Tests.
>
> The CI job vdsm_master_storage_functional_tests_localfs_gerrit is now enabled for any
> storage-related patch set, as well as for dependencies of the storage tests
> (storageTests.py).
> Please do not ignore test failures.
Problems with vdsm deleted Storages Domains
by David Andino
Hello everyone,
I am having serious trouble with something that recently happened, and this is our story.
We have 12 nodes and 1 engine server. Our infrastructure is based on iSCSI and NFS (for Exports and ISOs). Recently my partner followed some bad practices with our NFS server, and one thing we noticed in the engine was that all of our nodes were contending for the SPM and never agreed on which one would take it; the logs pointed to metadata corruption in the NFS domains, and after hours of contention we tried to stabilize the situation.
Trying to stabilize our cluster, we shut down our NFS server and nothing happened; the cluster was still contending for the SPM. After many tests like putting the nodes into maintenance mode, rebooting, and shutting down the whole cluster, the last thing we did was destroy the Export and ISO domains from the configuration (using the web interface), leaving the Data domain intact, in an attempt to bring our guests back online. That was (I think) our big mistake, because we are now in a serious situation: our Data domain does not come online because vdsm on every node keeps looking for the NFS domains we already deleted.
We used vdsClient to check the information vdsm holds, and it says that we have 1 storage pool and 3 storage domains: our iSCSI domain and the 2 NFS domains that were deleted.
We left only one node active and the rest in maintenance, and the message is that it is still contending for the SPM. The vdsm.log says that the node cannot find the NFS storages. We tried to delete these domains using vdsClient, but it says the node has to hold the SPM, and it cannot take the SPM because it cannot find the NFS domains, so it is a vicious circle.
Now the question is: is there any way to delete these domains from the vdsm configuration? What do I have to do to break this circle so the node stops contending for the SPM and activates our Data domain again?
I appreciate any comments and help you can share with me.
Regards
David
pep8 issue
by Amador Pahim
Building vdsm/master in F20, I've got:
./vdsm/virt/migration.py:223:19: E225 missing whitespace around operator
In vdsm/virt/migration.py:
218             e.err = (libvirt.VIR_ERR_OPERATION_ABORTED,  # error code
219                      libvirt.VIR_FROM_QEMU,              # error domain
220                      'operation aborted',                # error message
221                      libvirt.VIR_ERR_WARNING,            # error level
222                      '', '', '',                         # str1, str2, str3
223                      -1, -1)                             # int1, int2
224             raise e
pep8 is not accepting the negative integers here; instead, it treats the
minus sign as a binary operator. A quick workaround is to change -1 to
int(-1). Is this a known issue?
I'm using python-pep8-1.5.4-1.fc20.noarch
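For reference, a self-contained version of the workaround (placeholder values; only the
int() wrapping of the trailing negative integers matters, which sidesteps the minus-sign
parsing described above):

    err = ('operation aborted',    # error message
           '', '', '',             # str1, str2, str3
           int(-1), int(-1))       # int1, int2 -- no bare unary minus for pep8 to misread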
--
Pahim
vdsm disabling logical volumes
by Jiri Moskovcak
Greetings vdsm developers!
While working on adding iSCSI support to the hosted engine tools, I ran
into a problem with vdsm. It seems that when stopped, vdsm deactivates
ALL logical volumes in its volume group, and when it starts it
reactivates only specific logical volumes. This is a problem for the
hosted engine tools, because they create logical volumes in the same
volume group, and when vdsm deactivates the LVs the hosted engine tools
have no way to reactivate them: the services drop root permissions and
run as vdsm, and apparently only root can activate LVs. So far the only
suitable solution seems to be to change vdsm to deactivate/activate
only its own LVs. I would be grateful for any hints.
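To illustrate what I mean (a rough sketch only: the volume group name and the list of
"vdsm-owned" LVs are made up for illustration, and the real change would of course go
through vdsm's own lvm wrappers), something along these lines:

    import subprocess

    VG_NAME = "my-storage-domain-vg"                   # hypothetical VG name
    VDSM_OWNED_LVS = ["metadata", "ids", "leases",
                      "inbox", "outbox", "master"]     # illustrative list of vdsm's own LVs

    def set_lv_activation(vg, lv, active):
        # "lvchange -ay/-an <vg>/<lv>" (de)activates one named LV instead of the whole group
        flag = "-ay" if active else "-an"
        subprocess.check_call(["lvchange", flag, "%s/%s" % (vg, lv)])

    def deactivate_own_lvs(vg):
        # instead of deactivating the whole VG (which also hits the hosted-engine LVs),
        # touch only the LVs vdsm itself created
        for lv in VDSM_OWNED_LVS:
            set_lv_activation(vg, lv, False)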
Thank you,
Jirka