On 2012-9-7 13:21, M. Mohan Kumar wrote:
On Thu, 6 Sep 2012 18:59:19 -0400 (EDT), Ayal Baron
<abaron(a)redhat.com> wrote:
>
> ----- Original Message -----
>> ----- Original Message -----
>>> From: "M. Mohan Kumar" <mohan(a)in.ibm.com>
>>> To: vdsm-devel(a)lists.fedorahosted.org
>>> Sent: Wednesday, July 25, 2012 1:26:15 PM
>>> Subject: [vdsm] [RFC] GlusterFS domain specific changes
>>>
>>>
>>> We are developing a GlusterFS server translator to export block devices
>>> as regular files to the client. Using block devices to serve VM images
>>> gives performance improvements, since it avoids some file system
>>> bottlenecks in the host kernel. Goal is to use one block device (i.e.
>>> file at the client side) per VM image and feed this file to QEMU to get
>>> the performance improvements. QEMU will talk to glusterfs server
>>> directly using libgfapi.
>>>
>>> Currently we support only exporting Volume groups and Logical Volumes.
>>> Logical volumes are exported as regular files to the client.
> Are you actually using LVM behind the scenes?
> If so, why bother with exposing the LVs as files and not raw block devices?
>
Ayal,
The idea is to provide a FS interface for managing block devices. One
can mount the Block Device Gluster Volume and create a LV and size it
just by
$ touch lv1
$ truncate -s5G lv1
Other file commands can be used to clone and snapshot LVs:
$ ln lv1 lv2 # clones
$ ln -s lv1 lv1.sn # creates snapshot
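
For anyone who would rather drive this from code than from a shell, the
commands above can be sketched in Python roughly as follows. This is only an
illustration: the function name provision_lv_files is made up, and the calls
are plain POSIX file operations. On an ordinary file system the hard link is
just a second name for the same file; the clone and snapshot behaviour
described above would come from the BD xlator on a real gluster mount.

```python
import os

GIB = 1024 ** 3


def provision_lv_files(mount_point, size_bytes=5 * GIB):
    """Replay the touch / truncate / ln / ln -s sequence under mount_point.

    Assumption: mount_point is a BD xlator mount, where these file
    operations are expected to map to LV operations. The function itself
    only issues ordinary file system calls.
    """
    lv1 = os.path.join(mount_point, "lv1")
    lv2 = os.path.join(mount_point, "lv2")
    snap = os.path.join(mount_point, "lv1.sn")

    # touch lv1: create the file if it does not exist yet
    with open(lv1, "ab"):
        pass

    # truncate -s5G lv1: set the file (and hence the LV) size
    os.truncate(lv1, size_bytes)

    # ln lv1 lv2: hard link, described above as a clone
    os.link(lv1, lv2)

    # ln -s lv1 lv1.sn: relative symlink, described above as a snapshot
    os.symlink("lv1", snap)

    return lv1, lv2, snap
```

It would be called with the path of the mounted volume, for example
provision_lv_files("/mnt/bd-volume"), where that path is a placeholder.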
Do we have a special reason to use "ln"?
Why not use "cp" as the command to take the snapshot instead of "ln"?
With this feature enabled, GlusterFS can directly export SAN storage. We
are also planning to add a feature to export LUNs as regular files in the
future.
IMO, the major feature of GlusterFS is to export distributed local disks
to the clients.
If we have a SAN in the backend, the storage block devices should be
exported to clients naturally. Why do we need GlusterFS to export the
block devices in a SAN?
>>> In GlusterFS terminology a volume capable of exporting block devices is
>>> created by specifying the 'Volume Group' (i.e. VG in Logical Volume
>>> management). Block Device translator (BD xlator) exports this volume
>>> group as a directory and LVs under it as regular files. In the gluster
>>> mount point creating a file results in creating a logical volume,
>>> removing a file results in removing logical volume etc.
>>>
>>> When a GlusterFS volume enabled with BD xlator is used, directory
>>> creation in that gluster mount path is not supported because directory
>>> maps to Volume groups in BD xlator. But it could be an issue in VDSM
>>> environment when a new VDSM volume is created for GlusterFS domain, VDSM
>>> mounts the storage domain and creates directories under that and create
>>> files for vm image and other uses (like meta data).
>>> Is it possible to modify this behavior in VDSM to use flat structure
>>> instead of creating directories and VM images and other files underneath
>>> it? i.e. for GlusterFS domain with BD xlator VDSM will not create any
>>> directory and only creates all required files under the mount point
>>> directory itself.
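
To make the flat-structure request concrete, here is a hedged sketch of one
possible naming scheme. It is not VDSM code: the names flat_name and
parse_flat_name, the "." separator, and the "data"/"meta" suffixes are all
assumptions picked for illustration. The only point shown is that the image
and volume identifiers a directory tree would encode in the path can be
folded into a single file name directly under the mount point.

```python
SEP = "."  # assumed separator; UUID strings do not contain it


def flat_name(image_id, volume_id, kind="data"):
    """Build one file name with no directory component.

    image_id and volume_id are expected to be UUID strings; kind is a
    hypothetical suffix such as "data" or "meta".
    """
    for part in (image_id, volume_id, kind):
        if not part or "/" in part or SEP in part:
            raise ValueError("invalid name component: %r" % (part,))
    return SEP.join((image_id, volume_id, kind))


def parse_flat_name(name):
    """Split a name produced by flat_name back into its three parts."""
    parts = name.split(SEP)
    if len(parts) != 3:
        raise ValueError("not a flat volume name: %r" % (name,))
    return tuple(parts)
```

With such a scheme every file VDSM needs would sit directly in the mount
point, so no directory creation would be required on the BD xlator volume.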
>> From your description I think that the GlusterFS for block devices is
>> actually more similar to what happens with the regular block domains.
>> You would probably need to mount the share somewhere in the system and
>> then use symlinks to point to the volumes.
>>
>> Create a regular block domain and look inside
>> /rhev/data-center/mnt/blockSD, you'll probably get the idea of what I mean.
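
A rough sketch of the symlink idea, under stated assumptions: the layout
<link_root>/<image_id>/<volume_id> and the function name link_volume are
hypothetical and are not taken from VDSM. The sketch keeps the directory
hierarchy outside the gluster mount, where directories can be created, and
makes each leaf a symlink to a file that lives directly in the mount.

```python
import os


def link_volume(link_root, mount_point, image_id, volume_id, flat_file):
    """Create link_root/image_id/volume_id pointing at mount_point/flat_file.

    Directories are created only under link_root, never inside the
    gluster mount itself.
    """
    target = os.path.join(mount_point, flat_file)
    image_dir = os.path.join(link_root, image_id)
    os.makedirs(image_dir, exist_ok=True)

    link = os.path.join(image_dir, volume_id)
    # Replace a stale link so the call can be repeated safely
    if os.path.islink(link):
        os.unlink(link)
    os.symlink(target, link)
    return link
```

Callers would then open the path returned by link_volume, while the mount
itself only ever sees plain files.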
>>
>> That said we'd need to come up with a way of extending the LVs on the
>> gluster server when required (for thin provisioning).
> Why? If it's exposed as a file that probably means it supports sparseness.
> i.e. if this becomes a new type of block domain it should only support
> 'preallocated' images.
>
To start using an LV we will always truncate the file to the required
size, which resizes the LV. I didn't follow what you mentioned about
thin provisioning, but I have some rough code using dm-thin targets which
shows that the BD xlator can be extended to use dm-thin targets for
thin provisioning.
_______________________________________________
vdsm-devel mailing list
vdsm-devel(a)lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
--
---
舒明 Shu Ming
Open Virtualization Engineering; CSTL, IBM Corp.
Tel: 86-10-82451626 Tieline: 9051626 E-mail: shuming(a)cn.ibm.com or
shuming(a)linux.vnet.ibm.com
Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian District, Beijing 100193,
PRC