vdsm API schema
by agl@us.ibm.com
For the past few weeks I have been working on creating a schema that fully
describes the vdsm API. I am mostly finished with that effort and I wanted to
share the results with the team. Attached are two files: the raw schema and an
html document with cross-linked type information.
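To give a flavor of the format, a single command entry in the schema might
look something like the sketch below (QAPI-style; the names and types here
are illustrative, not copied from the actual schema):

##
# @VM.getStats:
#
# Get runtime statistics for a running virtual machine.
##
{'command': {'class': 'VM', 'name': 'getStats'},
 'data': {'vmID': 'UUID'},
 'returns': ['VmStats']}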
This should already be useful in its current form, but I have bigger plans. I
would first like to get help to correct errors in the schema. Then, I will
start the process of writing a code generator that will create C/GObject code
that we can compile into a libvdsm with language bindings for Python, Java, etc.
Please take a look at the attached files and let me know what you think.
P.S. I tried to attach these to the oVirt Wiki, but they are not permitted file
types.
--
Adam Litke <agl(a)us.ibm.com>
IBM Linux Technology Center
11 years, 6 months
[RFC] GlusterFS domain specific changes
by M. Mohan Kumar
We are developing a GlusterFS server translator to export block devices
as regular files to the client. Using block devices to serve VM images
gives performance improvements, since it avoids some file system
bottlenecks in the host kernel. The goal is to use one block device (i.e. one
file at the client side) per VM image and feed this file to QEMU to get the
performance improvements. QEMU will talk to the GlusterFS server directly
using libgfapi.
Currently we support exporting only Volume Groups and Logical Volumes.
Logical volumes are exported as regular files to the client. In GlusterFS
terminology, a volume capable of exporting block devices is created by
specifying the 'Volume Group' (i.e. a VG in Logical Volume management). The
Block Device translator (BD xlator) exports this volume group as a directory
and the LVs under it as regular files. On the gluster mount point, creating a
file creates a logical volume, removing a file removes the corresponding
logical volume, and so on.
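To make the mapping concrete, client-side file operations on such a mount
would look roughly like this (a sketch; the mount path is hypothetical):

import os

mnt = '/mnt/bd-volume'  # gluster mount of a BD-enabled volume
# Creating a file allocates a logical volume of that name in the backing VG:
open(os.path.join(mnt, 'vm1.img'), 'w').close()
# Removing the file removes the logical volume again:
os.unlink(os.path.join(mnt, 'vm1.img'))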
When a GlusterFS volume enabled with the BD xlator is used, directory
creation in that gluster mount path is not supported, because directories map
to Volume Groups in the BD xlator. This is an issue in a VDSM environment:
when a new VDSM volume is created for a GlusterFS domain, VDSM mounts the
storage domain, creates directories under it, and creates files inside them
for the VM image and other uses (like metadata).
Is it possible to modify this behavior in VDSM to use a flat structure
instead of creating directories with VM images and other files underneath
them? That is, for a GlusterFS domain with the BD xlator, VDSM would not
create any directories and would create all required files directly under the
mount point directory itself.
Note:
Patches to enable exporting block devices as regular files are available in
the Gluster Gerrit system:
http://review.gluster.com/3551
11 years, 8 months
Agenda for today's call
by abaron@redhat.com
Hi all,
I would like to discuss the following on today's call:
1. Gerrit vs. mailing list
2. mandatory unit tests per patch
3. pep8
4. ??
If you have anything else you'd like to discuss, please reply to this email.
Regards,
Ayal.
11 years, 9 months
Fatal error when migrating a VM with a disk on an iSCSI data center
by shuming@linux.vnet.ibm.com
Hi,
I am testing VM migration in the oVirt engine 3.1 beta release and
successfully completed the migration test on an NFS data center. However, the
VM migration test on an iSCSI data center failed.
I dumped the error and warning segments from the vdsm logs on both the source
and destination hosts. Please ignore the timestamps being out of sync between
the two hosts.
The vdsm log segment from the source host:
66-4da5-8310-6dcc970e5367`::migration downtime thread exiting
Thread-115166::ERROR::2012-07-25
23:18:55,953::vm::176::vm.Vm::(_recover)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::operation failed: Failed to
connect to remote libvirt URI qemu+tcp://9.181.129.110/system
Dummy-115148::DEBUG::2012-07-25
23:18:55,981::__init__::1249::Storage.Misc.excCmd::(_log) 'dd
if=/rhev/data-center/0b9a4ea4-d487-11e1-b614-5254001498c4/mastersd/dom_md/inbox
iflag=direct,fullblock count=1 bs=1024000' (cwd None)
Thread-115166::ERROR::2012-07-25 23:18:56,028::vm::240::vm.Vm::(run)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::Failed to migrate
Traceback (most recent call last):
File "/usr/share/vdsm/vm.py", line 223, in run
self._startUnderlyingMigration()
File "/usr/share/vdsm/libvirtvm.py", line 451, in
_startUnderlyingMigration
None, maxBandwidth)
File "/usr/share/vdsm/libvirtvm.py", line 491, in f
ret = attr(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
line 82, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1034, in
migrateToURI2
if ret == -1: raise libvirtError ('virDomainMigrateToURI2()
failed', dom=self)
libvirtError: operation failed: Failed to connect to remote libvirt URI
qemu+tcp://9.181.129.110/system
The vdsm log segment from the destination host:
Thread-1577::INFO::2012-07-25
23:19:50,407::API::601::vds::(_getNetworkIp) network None: using 0
Thread-1577::INFO::2012-07-25 23:19:50,407::API::228::vds::(create)
vmContainerLock acquired by vm 11e06222-2d66-4da5-8310-6dcc970e5367
Thread-1578::DEBUG::2012-07-25
23:19:50,411::vm::564::vm.Vm::(_startUnderlyingVm)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::Start
Thread-1577::DEBUG::2012-07-25 23:19:50,411::API::244::vds::(create)
Total desktops after creation of 11e06222-2d66-4da5-8310-6dcc970e5367 is 1
Thread-1578::DEBUG::2012-07-25
23:19:50,411::vm::568::vm.Vm::(_startUnderlyingVm)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::_ongoingCreations acquired
Thread-1577::DEBUG::2012-07-25
23:19:50,412::libvirtvm::2438::vm.Vm::(waitForMigrationDestinationPrepare)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::migration destination:
waiting 36s for path preparation
Thread-1578::INFO::2012-07-25
23:19:50,412::libvirtvm::1285::vm.Vm::(_run)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::VM wrapper has started
Thread-1578::WARNING::2012-07-25
23:19:50,413::vm::398::vm.Vm::(getConfDevices)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::Unknown type found, device:
'{'device': 'unix', 'alias': 'channel0', 'type': 'channel', 'address':
{'bus': '0', 'controller': '0', 'type': 'virtio-serial', 'port': '1'}}'
found
Thread-1578::ERROR::2012-07-25
23:24:50,484::vm::604::vm.Vm::(_startUnderlyingVm)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::The vm start process failed
Traceback (most recent call last):
File "/usr/share/vdsm/vm.py", line 584, in _startUnderlyingVm
self._waitForIncomingMigrationFinish()
File "/usr/share/vdsm/libvirtvm.py", line 1572, in
_waitForIncomingMigrationFinish
self._connection.lookupByUUIDString(self.id),
File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py",
line 82, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2608, in
lookupByUUIDString
if ret is None:raise libvirtError('virDomainLookupByUUIDString()
failed', conn=self)
libvirtError: Domain not found: no domain with matching uuid
'11e06222-2d66-4da5-8310-6dcc970e5367' - Timed out (did not receive
success event)
Thread-1578::DEBUG::2012-07-25
23:24:50,485::vm::920::vm.Vm::(setDownStatus)
vmId=`11e06222-2d66-4da5-8310-6dcc970e5367`::Changed state to Down:
Domain not found: no domain with matching uuid '11e0622
Any ideas?
--
Shu Ming <shuming(a)linux.vnet.ibm.com>
IBM China Systems and Technology Laboratory
11 years, 9 months
Using vdsm hook to exploit gluster backend of qemu
by deepakcs@linux.vnet.ibm.com
Hello,
Recently there were patches posted in qemu-devel to support gluster
as a block backend for qemu.
This introduced a new way of specifying the drive location to qemu:
-drive file=gluster:<volumefile>:<image name>
where...
volumefile is the gluster volume file name (say the gluster volume is
pre-configured on the host)
image name is the name of the image file on the gluster mount point
I wrote a vdsm standalone script using SHAREDFS (which maps to PosixFs),
taking cues from http://www.ovirt.org/wiki/Vdsm_Standalone
The conndict passed to connectStorageServer is as below...
[dict(id=1, connection="kvmfs01-hs22:dpkvol", vfs_type="glusterfs",
mnt_options="")]
Note that 'dpkvol' is the name of the gluster volume.
I am able to create and invoke a VM backed by an image file residing on the
gluster mount.
But since this goes the SHAREDFS way, the qemu -drive cmdline generated via
VDSM is...
-drive file=/rhev/datacentre/mnt/.... -- which eventually softlinks to the
image file on the gluster mount point.
I was looking to write a vdsm hook to be able to change the above to...
-drive file=gluster:<volumefile>:<image name>
which means I would need access to some of the conndict params inside the
hook, especially 'connection', to extract the volume name.
1) Looking at the current VDSM code, I don't see a way for the hook to know
anything about the storage domain setup. So the only way is to have the user
pass a custom param which provides the path to the volumefile & image, and
use it in the hook. Is there a better way? Can I use the vdsm gluster plugin
support inside the hook to determine the volfile from the volname, assuming I
only take the volname as the custom param, and determine the image name from
the existing <source file=...> tag (the basename is the image name)? Wouldn't
it be better to provide a way for hooks to access (read-only) storage domain
parameters, so that they can use them to implement the hook logic in a saner
way? (A sketch of the custom-param approach appears at the end of this mail.)
2) In talking to Eduardo, it seems there are discussions going on about how
prepareVolumePath and prepareImage could be exploited to fit gluster-based
(and, in future, other types of) images. I am not very clear on the image and
volume code of vdsm; frankly, it is very complex and hard to understand due
to the lack of comments.
I would appreciate it if someone could guide me on the best way to achieve my
goal (-drive file=gluster:<volumefile>:<image name>) here. Any short-term
solutions, if not a perfect one, are also appreciated, so that I can at least
have a working setup where I just run my VDSM standalone script and my qemu
cmdline using gluster:... is generated.
Currently I am using the <qemu:commandline> tag facility of libvirt to inject
the needed qemu options, hardcoding the volname and imagename, but I would
like to do this based on the conndict passed by the user when creating the
SHAREDFS domain.
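As promised above, here is a rough sketch of the custom-param approach as a
before_vm_start hook. It is only an illustration: 'gluster_volume' is a
made-up custom param name, and the actual rewrite of the disk element is left
open.

import os
import os.path
import hooking

# Hypothetical custom param passed at VM creation, e.g. gluster_volume=dpkvol
volname = os.environ.get('gluster_volume')

if volname:
    domxml = hooking.read_domxml()
    for disk in domxml.getElementsByTagName('disk'):
        sources = disk.getElementsByTagName('source')
        if not sources or not sources[0].getAttribute('file'):
            continue
        # The image name is the basename of the existing <source file=...>
        imgname = os.path.basename(sources[0].getAttribute('file'))
        drive_spec = 'gluster:%s:%s' % (volname, imgname)
        # ... here the disk element would be rewritten (or a
        # <qemu:commandline> override injected) to use drive_spec ...
    hooking.write_domxml(domxml)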
thanx,
deepak
11 years, 9 months
[RFC] An alternative way to provide a supported interface -- libvdsm
by Anthony Liguori
Hi,
I've been reading through the API threads here and considering the options. To
be honest, I worry a lot about the scope of these discussions and that there's a
tremendous amount of work before we have a useful end result.
I wonder if we can solve this problem by adding another layer of abstraction...
As Adam is currently building a schema for VDSM's XML-RPC, we could use the QAPI
code generators to build a libvdsm that provided a programmatic C interface for
the XML-RPC interface.
It would take some tweaking, but this could be made a supportable C interface.
The rules for having a supportable C interface are basically:
1) Never change function signatures
2) Never remove functions
3) Always allocate structures in the library and/or pad
4) Only add to structures, never remove or reorder
5) Provide flags that default to zero to indicate that fields/features are not
present.
6) Always zero-initialize structures
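None of this exists yet, but as a rough illustration of rules 3-6, here is
how such a padded, flag-gated struct might be mirrored from Python with
ctypes (every name below is hypothetical):

from ctypes import Structure, c_char, c_uint32, c_uint64

class VdsmVmInfo(Structure):
    # Rule 4: fields are only ever appended, never removed or reordered.
    _fields_ = [
        ('flags', c_uint64),        # rule 5: a set bit marks a field present
        ('vm_id', c_char * 37),
        ('mem_size_mb', c_uint32),
        ('_reserved', c_char * 64), # rule 3: padding so the struct can grow
    ]

info = VdsmVmInfo()  # rule 6: ctypes zero-initializes structures by default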
Having a libvdsm would allow the transport to change over time w/o affecting
end-users. There are lots of good tools for documenting C APIs and dealing with
versioning of C APIs.
While we can start out with a schema-generated API, over time, we can implement
libvdsm in an open-coded fashion allowing old APIs to be reimplemented in terms
of new APIs.
From a compatibility perspective, libvdsm would be fully backwards compatible
with old versions of VDSM (so it would keep XML-RPC support forever) but may
require new versions of libvdsm to talk to new versions of VDSM. That would
allow for APIs to be deprecated within VDSM without breaking old clients.
I think this would be an incremental approach to building a supportable API
today while still giving the flexibility to make changes in the long term.
And it should be fairly easy to generate a JNI binding and also port
ovirt-engine to use an interface like this (since it already uses the XML-RPC API).
Regards,
Anthony Liguori
11 years, 9 months
Additional info on patch reviewers
by dfediuck@redhat.com
Hi,
This may prove to be useful sometime...
It is possible to ask Gerrit to show the reviewer's name alongside the Verify
and Code Review columns.
To set it, go to the Settings link (upper-right corner) and choose Preferences.
Check the line "Display Person Name In Review Category".
See the attachment for a sample and what to set.
--
/d
"This message will self destruct in the future. Or not."
11 years, 9 months
RFC: Proposal to support network disk type in PosixFS
by deepakcs@linux.vnet.ibm.com
Hello,
I am proposing a method for VDSM to exploit disks of 'network' type under
PosixFS.
Although I am taking Gluster as the storage backend example, it should apply
to any other backend (that supports the network disk type) as well.
Currently under PosixFS, the design is to mount the 'server:/export' and use
that as the storage domain.
The libvirt XML generated for such a disk is something like the following:
<disk device="disk" snapshot="no" type="file">
  <source
    file="/rhev/data-center/8fe261ea-43c2-4635-a08a-ccbafe0cde0e/4f31ea5c-c01e-4578-8353-8897b2d691b4/images/c94c9cf2-fa1c-4e43-8c77-f222dbfb032d/eff4db09-1fde-43cd-a75b-34054a64182b"/>
  <target bus="ide" dev="hda"/>
  <serial>c94c9cf2-fa1c-4e43-8c77-f222dbfb032d</serial>
  <driver cache="none" error_policy="stop" io="threads" name="qemu"
          type="raw"/>
</disk>
This works well, but does not help exploit the gluster block backend of QEMU,
since the QEMU cmdline generated is -drive file='/rhev/data-center/....'
Gluster fits as a network block device in QEMU, similar to the ceph and
sheepdog backends QEMU already has.
The proposed libvirt XML for Gluster-based disks is... (WIP)
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='gluster' name='volname:imgname'>
    <host name='server' port='xxx'/>
  </source>
  <target dev='vda' bus='virtio'/>
</disk>
This causes libvirt to generate a QEMU cmdline like: -drive
file=gluster:server@port:volname:imgname. The imgname is relative to the
gluster mount point.
I am proposing the below to help VDSM exploit a disk as a network device
under PosixFS.
Here is a code snippet (taken from a vdsm standalone script) showing how a
storage domain & VM are created in VDSM:
# When the storage domain is mounted
gluster_conn = "kvmfs01-hs22:dpkvol"  # gluster_server:volume_name
vdsOK(s.connectStorageServer(SHAREDFS_DOMAIN, "my gluster mount",
                             [dict(id=1, connection=gluster_conn,
                                   vfs_type="glusterfs", mnt_options="")]))
# do other things... createStoragePool, SPM start, etc.
...
...
# Now create a VM
vmId = str(uuid.uuid4())
vdsOK(
    s.create(dict(vmId=vmId,
                  drives=[dict(poolID=spUUID, domainID=sdUUID,
                               imageID=imgUUID, volumeID=volUUID,
                               disk_type="network", protocol="gluster",
                               connection=gluster_conn)],  # Proposed way
                  # drives=[dict(poolID=spUUID, domainID=sdUUID,
                  #              imageID=imgUUID, volumeID=volUUID)],  # Existing way
                  memSize=256,
                  display="vnc",
                  vmName="vm-backed-by-gluster",
                  ))
)
1) The user (the engine, in the oVirt case) passes the disk_type, protocol &
connection keywords as depicted above. NOTE: disk_type is used instead of
just 'type' to avoid confusion with the driver type.
-- protocol and connection are already available to the user, who used them
as part of connectStorageServer (connection and vfs_type)
-- disk_type is something the user chooses instead of the default (which is
the file type)
2) Based on these extra keywords, the getXML() of 'class Drive' in
libvirtvm.py can be modified to generate <disk type='network'...> as shown
above (see the sketch below).
Some parsing is needed to extract the server and volname; the imgname
relative to the gluster mount point can be extracted from drive['path'],
which holds the fully qualified path.
3) Since these keywords are drive-specific, the user can choose which drives
should use the network protocol vs. file. Not passing these keywords defaults
to file, which is what happens today.
This approach would help VDSM support network disk types under PosixFS and
thus give the user the ability to choose file or network disk types on a
per-drive basis.
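For point 2), a rough sketch of the getXML() change (helper and attribute
names below are illustrative, not the actual libvirtvm.py code):

import os.path

def network_source_xml(doc, drive):
    # drive['connection'] is 'server:volname' as passed to
    # connectStorageServer; drive['path'] holds the fully qualified path.
    server, volname = drive['connection'].split(':', 1)
    imgname = os.path.basename(drive['path'])  # or path relative to the mount
    source = doc.createElement('source')
    source.setAttribute('protocol', drive['protocol'])
    source.setAttribute('name', '%s:%s' % (volname, imgname))
    host = doc.createElement('host')
    host.setAttribute('name', server)
    source.appendChild(host)
    # Attach under <disk type='network'> instead of <source file=...>
    return source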
I will post a RFC patch soon ( awaiting libvirt changes ), comments welcome.
thanx,
deepak
11 years, 9 months
How should we handle aborted tasks? via engine, vdsClient or both?
by lyarwood@redhat.com
Hello all,
I'm looking into the use case around an aborted task within vdsmd.
AFAICT both engine and vdsClient are unable to deal with an aborted task
at present.
Within engine we call HSMStopTaskVDSCommand; this in turn calls stopTask,
supplying _only_ the task_guid. vdsClient also calls stopTask with only the
task_guid argument.
Why does this matter? Well, the only way you can kill an aborted task is by
passing the force=True argument to the stop method of the task object itself.
Without this, the _incref method will throw a TaskAborted exception and we
will be unable to stop the task.
Could anyone give me some background on why this is the case, before I start
looking into exposing the force option up the stack to engine and vdsClient?
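(To make that concrete, plumbing the flag through might look like the sketch
below; the signatures are illustrative, not the current verb definitions:)

# Hypothetical: the stopTask verb grows an optional force argument that
# defaults to False, preserving today's behavior for existing callers.
def stopTask(self, taskID, force=False):
    task = self._getTask(taskID)  # _getTask is a made-up helper name
    return task.stop(force=force)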
Relevant code snippets from task.py below.
Thanks in advance,
Lee
vdsm/storage/task.py
    def stop(self, force=False):
        self.log.debug("stopping in state %s (force %s)",
                       self.state, force)
        self._incref(force)
        try:
            if self.state.isDone():
                self.log.debug("Task already stopped (%s), ignoring",
                               self.state)
                return
            elif (self.state.isRecovering() and not force and
                  (self.cleanPolicy == TaskCleanType.auto)):
                self.log.debug("Task (%s) in recovery and force is false, "
                               "ignoring", self.state)
                return

            self._aborting = True
            self._forceAbort = force
        finally:
            self._decref(force)

    def _incref(self, force=False):
        self.lock.acquire()
        try:
            if self.aborting() and (self._forceAbort or not force):
                raise se.TaskAborted(unicode(self))

            self.ref += 1
            ref = self.ref
            return ref
        finally:
            self.lock.release()
--
Lee Yarwood
Software Maintenance Engineer
Red Hat UK Ltd
200 Fowler Avenue IQ Farnborough, Farnborough, Hants GU14 7JP
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham (USA), Brendan Lane (Ireland), Matt
Parson(USA), Charlie Peters (USA)
GPG fingerprint : A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76
11 years, 9 months
Verifying storage data integrity after storage operations with test cases
by shuming@linux.vnet.ibm.com
Hi,
To verify storage data integrity after storage operations by VDSM, like
snapshotting and merging, here are the test cases I am pondering. I would
like your feedback on these thoughts.
1) A customized ISO image, with the required agent, is prepared for bringing
up a VM in VDSM
2) The test case informs VDSM to create a VM from the customized ISO image
3) The test case installs an IO application in the VM
4) The test case communicates with VDSM to inform the IO application in the
VM to write some data intentionally
5) The test case sends commands to VDSM to do some storage operation, like
disk snapshot, volume merging, etc.
Take the snapshot operation here as an example.
6) VDSM then tells the test case the result of the operation, like the name
of the snapshot
7) The test case reads the snapshot to verify it against the data written
in 4)
Note: currently, there is no tool to read the snapshot image directly. We
can restart the VM with the snapshot as the active disk and tell the IO
application in the VM to read the data written before. The test case can
then compare the data read with the data it asked the application to write
in 4).
8) If the two data sets match, the storage operation succeeded; otherwise it
failed. (See the sketch after this list.)
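A pseudo-Python rendering of the flow, with every API name made up purely
for illustration:

def verify_snapshot(vdsm, agent, vm_id, disk):
    # vdsm and agent are hypothetical client handles for VDSM and the
    # in-guest IO application; the steps map to 4)-8) above.
    expected = agent.write_pattern(vm_id, path='/data/test', seed=42)  # 4)
    snap = vdsm.create_snapshot(disk)                                  # 5)-6)
    vdsm.restart_with_active_disk(vm_id, snap)                         # 7) note
    actual = agent.read_pattern(vm_id, path='/data/test')              # 7)
    assert actual == expected, "snapshot data mismatch"                # 8)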
In order to write such a test case, these VDSM features will be required:
1) VDSM can create a VM from a specific ISO image (almost works)
2) The test case can install an IO application in the VM via VDSM (by
ovirt-agent?)
3) The test case must have some protocol with the IO application in the VM
for passing commands to the VM and returning results from the VM to the test
case (by ovirt-agent?)
4) The IO application can be seen as a test agent. We may extend an existing
agent like ovirt-agent to serve as the IO application.
--
Shu Ming <shuming(a)linux.vnet.ibm.com>
IBM China Systems and Technology Laboratory
11 years, 9 months