rpms/kernel/devel kernel-2.6-pnfs-v2.6.30-rc4.patch, NONE, 1.1.2.1 config-generic, 1.282, 1.282.4.1 kernel.spec, 1.1539, 1.1539.4.1

Steve Dickson steved at fedoraproject.org
Mon May 4 14:44:34 UTC 2009


Author: steved

Update of /cvs/pkgs/rpms/kernel/devel
In directory cvs1.fedora.phx.redhat.com:/tmp/cvs-serv25882

Modified Files:
      Tag: kernel-2_6_30-pnfs_rc4
	config-generic kernel.spec 
Added Files:
      Tag: kernel-2_6_30-pnfs_rc4
	kernel-2.6-pnfs-v2.6.30-rc4.patch 
Log Message:
- Updated to latest pNFS code (v2.6.30-rc4)


kernel-2.6-pnfs-v2.6.30-rc4.patch:

--- NEW FILE kernel-2.6-pnfs-v2.6.30-rc4.patch ---
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX
index 8dd6db7..f15621e 100644
--- a/Documentation/filesystems/00-INDEX
+++ b/Documentation/filesystems/00-INDEX
@@ -66,6 +66,10 @@ mandatory-locking.txt
 	- info on the Linux implementation of Sys V mandatory file locking.
 ncpfs.txt
 	- info on Novell Netware(tm) filesystem using NCP protocol.
+nfs41-server.txt
+	- info on the Linux server implementation of NFSv4 minor version 1.
+nfs-rdma.txt
+	- how to install and setup the Linux NFS/RDMA client and server software.
 nfsroot.txt
 	- short guide on setting up a diskless box with NFS root filesystem.
 nilfs2.txt
diff --git a/Documentation/spnfs.txt b/Documentation/spnfs.txt
new file mode 100644
index 0000000..be5815d
--- /dev/null
+++ b/Documentation/spnfs.txt
@@ -0,0 +1,210 @@
+(c) 2007 Network Appliance Inc.
+
+spNFS
+-----
+
+An spNFS system consists of a Meta Data Server (MDS), a number of Client machines (C) and a number of Data Servers (DS).
+
+A file system is mounted by the clients from the MDS, and all file data
+is striped across the DSs.
+
+Identify the machines that will be filling each of these roles.
+
+The spnfs kernel will be installed on all machines: clients, the MDS and DSs.
+
+
+Building and installing the spNFS kernel
+----------------------------------------
+
+Get the spNFS kernel from:
+
+	git://linux-nfs.org/~dmuntz/spnfs.git
+
+Add these options to your .config file:
+
+	CONFIG_NETWORK_FILESYSTEMS=y
+	CONFIG_NFS_FS=y
+	CONFIG_NFSD=y
+	CONFIG_NFS_V4_1=y
+	CONFIG_NFSD_V4_1=y
+	CONFIG_PNFS=y
+	CONFIG_PNFSD=y
+	CONFIG_SPNFS=y
+
+By default, spNFS uses whole-file layouts.  Layout segments can be enabled
+by adding:
+
+	CONFIG_SPNFS_LAYOUTSEGMENTS=y
+
+to your .config file.
+
+Building and installation of kernel+modules is as usual.
+This kernel should be installed and booted on the client, MDS and DSs.
+
+
+Building nfs-utils
+------------------
+
+Get the nfs-utils package containing spnfsd from:
+
+	git://linux-nfs.org/~dmuntz/nfs-utils.git
+
+Follow the standard instructions for building nfs-utils.  We HIGHLY recommend
+NOT doing an install of the binaries generated by this build.  You will only
+need the spnfsd binary generated by this build, and the spnfsd.conf template.
+
+After building, the spnfsd daemon will be located in utils/spnfsd.  The spnfsd
+daemon will only be needed on the MDS.
+
+
+Installation
+------------
+
+The nfs-utils package contains a default spnfsd.conf file in
+utils/spnfsd/spnfsd.conf.  Copy this file to /etc/spnfsd.conf.
+
+By default, the DS-Mount-Directory is set to /spnfs (see spnfsd.conf).  Under
+this directory, mount points must be created for each DS to
+be used for pNFS data stripes.  These mount points are named by the ip address
+of the corresponding DS.  In the sample spnfsd.conf, there are two
+DSs defined (172.16.28.134 and 172.16.28.141).
+
+Following the sample spnfsd.conf,
+
+	mkdir /spnfs
+
+on the MDS (corresponding to DS-Mount-Directory).  Then
+
+	mkdir /spnfs/172.16.28.134
+	mkdir /spnfs/172.16.28.141
+
+to create the mount points for the DSs.
+
+On the DSs, choose a directory where data stripes will be created by the MDS.
+For the sample file, this directory is /pnfs, so on each DS execute:
+
+	mkdir /pnfs
+
+This directory is specified in the spnfsd.conf file by the DS*_ROOT option
+(where * is replaced by the DS number).  DS_ROOT is specified relative to
+the directory being exported by the DSs.  In our example, our DSs are exporting
+the root directory (/) and therefore our DS_ROOT is /pnfs.  On the DSs, we have
+the following entry in /etc/exports:
+
+	/ *(rw,fsid=0,insecure,no_root_squash,sync,no_subtree_check)
+
+N.B. If we had created a /exports directory and a /pnfs directory under
+/exports, and if we were exporting /exports, then DS_ROOT would still be /pnfs
+(not /exports/pnfs).
+
+It may be useful to add entries to /etc/fstab on the MDS to automatically
+mount the DS_ROOT file systems.  For this example, our MDS fstab would
+contain:
+
+	172.16.28.134:/pnfs /spnfs/172.16.28.134 nfs    defaults        1 2
+	172.16.28.141:/pnfs /spnfs/172.16.28.141 nfs    defaults        1 2
+
+The DS mounts must be performed manually or via fstab at this time (automatic
+mounting, directory creation, etc. are on the todo list).  To perform I/O
+through the MDS, the DS mounts MUST use NFSv3 at this time (this restriction
+will eventually be removed).
+
+
+On the MDS, choose a file system to use with spNFS and export it, e.g.:
+
+	/ *(rw,fsid=0,insecure,no_root_squash,sync,no_subtree_check)
+
+Make sure nfsd and all supporting processes are running on the MDS and DSs.
+
+
+Running
+-------
+
+If rpc_pipefs is not already mounted (if you're running idmapd it probably is),
+you may want to add the following line to /etc/fstab:
+
+	rpc_pipefs    /var/lib/nfs/rpc_pipefs rpc_pipefs defaults     0 0
+
+to automatically mount rpc_pipefs.
+
+With spnfsd.conf configured for your environment and the mounts mounted as
+described above, spnfsd can now be started.
+
+On the MDS, execute spnfsd:
+
+	spnfsd
+
+The executable is located in the directory where it was built, and
+may also have been installed elsewhere depending on how you built nfs-utils.
+It will run in the foreground by default, and in fact will do so despite
+any options suggesting the contrary (it's still a debugging build).
+
+On the client, make sure the nfslayoutdriver module is loaded:
+
+	modprobe nfslayoutdriver
+
+Then mount the file system from the MDS:
+
+	mount -t nfs4 mds:/ /mnt
+
+I/O through the MDS is now supported.  To use it, do not load the
+nfslayoutdriver on the client, and mount the MDS using NFSv4 or 4.1
+(NFSv2 and v3 are not yet supported).
+
+You may now use spNFS by performing file system activities in /mnt.
+If you create files in /mnt, you should see stripe files corresponding to
+new files being created on the DSs.  The current implementation names the
+stripe files based on the inode number of the file on the MDS.  For example,
+if you create a file foo in /mnt and do an 'ls -li /mnt/foo':
+
+	# ls -li foo
+	1233 -rw-r--r-- 1 root root 0 Nov 29 15:54 foo
+
+You should see stripe files on each DS under /pnfs (per the sample) named
+1233.  The file /pnfs/1233 on DS1 will contain the first <stripe size> bytes
+of data written to foo, DS2 will contain the next <stripe size> bytes, etc.
+Removing /mnt/foo will remove the corresponding stripe files on the DSs.
+Other file system operations should behave (mostly :-) as expected.
+
+
+Layout Segments
+---------------
+
+If the kernel is compiled to support layout segments, there will
+be two files created under /proc/fs/spnfs for controlling layout
+segment functionality.
+
+To enable layout segments, write a '1' to /proc/fs/spnfs/layoutseg, e.g.:
+
[...32666 lines suppressed...]
+	len = sock->ops->sendpage(sock, virt_to_page(xbufp->head[0].iov_base),
+			(unsigned long)xbufp->head[0].iov_base & ~PAGE_MASK,
+			xbufp->head[0].iov_len, flags);
+
+	if (len != xbufp->head[0].iov_len)
+		goto out;
+
+	/*
+	 * send page data
+	 *
+	 * Check the amount of data to be sent. If it is less than the
+	 * remaining page, then send it else send the current page
+	 */
+
+	size = PAGE_SIZE - base < pglen ? PAGE_SIZE - base : pglen;
+	while (pglen > 0) {
+		if (total_len == size)
+			flags = 0;
+		result = sock->ops->sendpage(sock, *pages, base, size, flags);
+		if (result > 0)
+			len += result;
+		if (result != size)
+			goto out;
+		total_len -= size;
+		pglen -= size;
+		size = PAGE_SIZE < pglen ? PAGE_SIZE : pglen;
+		base = 0;
+		pages++;
+	}
+	/*
+	 * send tail
+	 */
+	if (xbufp->tail[0].iov_len) {
+		result = sock->ops->sendpage(sock,
+			virt_to_page(xbufp->tail[0].iov_base),
+			(unsigned long)xbufp->tail[0].iov_base & ~PAGE_MASK,
+			xbufp->tail[0].iov_len,
+			0);
+
+		if (result > 0)
+			len += result;
+	}
+out:
+	if (len != xbufp->len)
+		printk(KERN_NOTICE "Error sending entire callback!\n");
+
+	return len;
+}
+
+/*
+ * The send routine. Borrows from svc_send
+ */
+static int bc_send_request(struct rpc_task *task)
+{
+	struct rpc_rqst *req = task->tk_rqstp;
+	struct rpc_xprt *bc_xprt = req->rq_xprt;
+	struct svc_xprt	*xprt;
+	struct svc_sock         *svsk;
+	int                     len;
+
+	dprintk("sending request with xid: %08x\n", ntohl(req->rq_xid));
+	/*
+	 * Get the server socket associated with this callback xprt
+	 */
+	svsk = bc_xprt->bc_sock;
+	xprt = &svsk->sk_xprt;
+
+	mutex_lock(&xprt->xpt_mutex);
+	if (test_bit(XPT_DEAD, &xprt->xpt_flags))
+		len = -ENOTCONN;
+	else
+		len = bc_sendto(req);
+	mutex_unlock(&xprt->xpt_mutex);
+
+	return len < 0 ? len : 0;
+
+}
+
+/*
+ * The close routine. Since this is client initiated, we do nothing
+ */
+
+static void bc_close(struct rpc_xprt *xprt)
+{
+	return;
+}
+
+/*
+ * The xprt destroy routine. Again, because this connection is client
+ * initiated, we do nothing
+ */
+
+static void bc_destroy(struct rpc_xprt *xprt)
+{
+	return;
+}
+
 static struct rpc_xprt_ops xs_udp_ops = {
 	.set_buffer_size	= xs_udp_set_buffer_size,
 	.reserve_xprt		= xprt_reserve_xprt_cong,
@@ -1994,11 +2359,32 @@ static struct rpc_xprt_ops xs_tcp_ops = {
 	.buf_free		= rpc_free,
 	.send_request		= xs_tcp_send_request,
 	.set_retrans_timeout	= xprt_set_retrans_timeout_def,
+#if defined(CONFIG_NFS_V4_1)
+	.release_request	= bc_release_request,
+#endif /* CONFIG_NFS_V4_1 */
 	.close			= xs_tcp_shutdown,
 	.destroy		= xs_destroy,
 	.print_stats		= xs_tcp_print_stats,
 };
 
+/*
+ * The rpc_xprt_ops for the server backchannel
+ */
+
+static struct rpc_xprt_ops bc_tcp_ops = {
+	.reserve_xprt		= xprt_reserve_xprt,
+	.release_xprt		= xprt_release_xprt,
+	.set_port		= bc_set_port,
+	.connect		= bc_connect,
+	.buf_alloc		= bc_malloc,
+	.buf_free		= bc_free,
+	.send_request		= bc_send_request,
+	.set_retrans_timeout	= xprt_set_retrans_timeout_def,
+	.close			= bc_close,
+	.destroy		= bc_destroy,
+	.print_stats		= xs_tcp_print_stats,
+};
+
 static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
 				      unsigned int slot_table_size)
 {
@@ -2131,13 +2517,29 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
 	xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32);
 	xprt->max_payload = RPC_MAX_FRAGMENT_SIZE;
 
-	xprt->bind_timeout = XS_BIND_TO;
-	xprt->connect_timeout = XS_TCP_CONN_TO;
-	xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
-	xprt->idle_timeout = XS_IDLE_DISC_TO;
+	if (args->bc_sock) {
+		/* backchannel */
+		xprt_set_bound(xprt);
+		INIT_DELAYED_WORK(&transport->connect_worker,
+				  bc_connect_worker);
+		xprt->bind_timeout = 0;
+		xprt->connect_timeout = 0;
+		xprt->reestablish_timeout = 0;
+		xprt->idle_timeout = (~0);
 
-	xprt->ops = &xs_tcp_ops;
-	xprt->timeout = &xs_tcp_default_timeout;
+		/*
+		 * The backchannel uses the same socket connection as the
+		 * forechannel
+		 */
+		xprt->bc_sock = args->bc_sock;
+		xprt->bc_sock->sk_bc_xprt = xprt;
+		transport->sock = xprt->bc_sock->sk_sock;
+		transport->inet = xprt->bc_sock->sk_sk;
+
+		xprt->ops = &bc_tcp_ops;
+
+		goto next;
+	}
 
 	switch (addr->sa_family) {
 	case AF_INET:
@@ -2145,13 +2547,29 @@ static struct rpc_xprt *xs_setup_tcp(struct xprt_create *args)
 			xprt_set_bound(xprt);
 
 		INIT_DELAYED_WORK(&transport->connect_worker, xs_tcp_connect_worker4);
-		xs_format_ipv4_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP);
 		break;
 	case AF_INET6:
 		if (((struct sockaddr_in6 *)addr)->sin6_port != htons(0))
 			xprt_set_bound(xprt);
 
 		INIT_DELAYED_WORK(&transport->connect_worker, xs_tcp_connect_worker6);
+		break;
+	}
+	xprt->bind_timeout = XS_BIND_TO;
+	xprt->connect_timeout = XS_TCP_CONN_TO;
+	xprt->reestablish_timeout = XS_TCP_INIT_REEST_TO;
+	xprt->idle_timeout = XS_IDLE_DISC_TO;
+
+	xprt->ops = &xs_tcp_ops;
+
+next:
+	xprt->timeout = &xs_tcp_default_timeout;
+
+	switch (addr->sa_family) {
+	case AF_INET:
+		xs_format_ipv4_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP);
+		break;
+	case AF_INET6:
 		xs_format_ipv6_peer_addresses(xprt, "tcp", RPCBIND_NETID_TCP6);
 		break;
 	default:
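
The page-data loop in the backchannel send routine of the patch above sends the remainder of the first page, then whole pages, then whatever is left. The following userspace sketch reproduces only that size arithmetic so it can be checked in isolation. The function name, the `sizes` output array, and the fixed 4 KiB page size are assumptions made for illustration; none of them come from the patch, and no socket calls are made.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Hypothetical helper, not part of the patch: record the length of each
 * per-page send needed for `pglen` bytes of page data that begin `base`
 * bytes into the first page. Returns the number of chunks stored in
 * `sizes`, never more than `max`.
 */
static size_t split_page_data(size_t base, size_t pglen,
			      size_t *sizes, size_t max)
{
	size_t n = 0;
	size_t size;

	assert(base < SKETCH_PAGE_SIZE);

	/* First chunk: whatever fits in the rest of the first page. */
	size = SKETCH_PAGE_SIZE - base < pglen ? SKETCH_PAGE_SIZE - base : pglen;
	while (pglen > 0 && n < max) {
		sizes[n++] = size;
		pglen -= size;
		/* Later chunks start at offset 0 and cover at most one page. */
		size = SKETCH_PAGE_SIZE < pglen ? SKETCH_PAGE_SIZE : pglen;
	}
	return n;
}
```

For example, 10000 bytes starting 100 bytes into the first page split into chunks of 3996, 4096 and 1908 bytes, which sum to the original length.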


Index: config-generic
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/devel/config-generic,v
retrieving revision 1.282
retrieving revision 1.282.4.1
diff -u -p -r1.282 -r1.282.4.1
--- config-generic	22 Apr 2009 15:40:04 -0000	1.282
+++ config-generic	4 May 2009 14:44:03 -0000	1.282.4.1
@@ -3145,6 +3145,14 @@ CONFIG_NFSD=m
 CONFIG_NFSD_V3=y
 CONFIG_NFSD_V3_ACL=y
 CONFIG_NFSD_V4=y
+CONFIG_NFS_V4_1=y
+CONFIG_PNFSD=y
+CONFIG_PNFS=y
+CONFIG_SPNFS=y
+CONFIG_PNFSD_LOCAL_EXPORT=y
+CONFIG_PNFS_PANLAYOUT=y
+CONFIG_PNFS_BLOCK=y
+CONFIG_SPNFS_LAYOUTSEGMENTS=y
 CONFIG_NFSD_TCP=y
 CONFIG_NFS_FSCACHE=y
 CONFIG_LOCKD=m
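
The spNFS options enabled here correspond to the striping described in Documentation/spnfs.txt in the patch above: DS1 holds the first <stripe size> bytes of a file, DS2 the next <stripe size> bytes, and so on. A minimal C sketch of that mapping follows. It assumes stripes wrap round-robin back to the first DS, which the documentation suggests with "etc." but does not state outright. The function name and its parameters are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: return the 0-based index
 * of the data server whose stripe file holds byte `offset` of a file.
 * Assumes fixed-size stripes assigned round-robin across `num_ds` servers.
 */
static unsigned int spnfs_sketch_ds_index(uint64_t offset,
					  uint64_t stripe_size,
					  unsigned int num_ds)
{
	assert(stripe_size > 0 && num_ds > 0);
	return (unsigned int)((offset / stripe_size) % num_ds);
}
```

With two data servers and a 4096-byte stripe, bytes 0 through 4095 map to index 0 (DS1), bytes 4096 through 8191 map to index 1 (DS2), and byte 8192 wraps back to index 0 under the round-robin assumption.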


Index: kernel.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/devel/kernel.spec,v
retrieving revision 1.1539
retrieving revision 1.1539.4.1
diff -u -p -r1.1539 -r1.1539.4.1
--- kernel.spec	1 May 2009 18:29:15 -0000	1.1539
+++ kernel.spec	4 May 2009 14:44:03 -0000	1.1539.4.1
@@ -12,7 +12,7 @@ Summary: The Linux kernel
 # that the kernel isn't the stock distribution kernel, for example,
 # by setting the define to ".local" or ".bz123456"
 #
-# % define buildid .local
+%define buildid .pnfs_rc4
 
 # fedora_build defines which build revision of this kernel version we're
 # building. Rather than incrementing forever, as with the prior versioning
@@ -86,7 +86,7 @@ Summary: The Linux kernel
 # kernel-headers
 %define with_headers   %{?_without_headers:   0} %{?!_without_headers:   1}
 # kernel-firmware
-%define with_firmware  %{?_with_firmware:  1} %{?!_with_firmware:  0}
+%define with_firmware  %{?_with_firmware:  1} %{?!_with_firmware:  1}
 # kernel-debuginfo
 %define with_debuginfo %{?_without_debuginfo: 0} %{?!_without_debuginfo: 1}
 # kernel-bootwrapper (for creating zImages from kernel + initrd)
@@ -681,6 +681,7 @@ Patch9100: linux-2.6-iwl3945-remove-usel
 #snmp fixes
 Patch10000: linux-2.6-missing-rfc2465-stats.patch
 
+Patch20000: kernel-2.6-pnfs-v2.6.30-rc4.patch
 %endif
 
 BuildRoot: %{_tmppath}/kernel-%{KVERREL}-root
@@ -1243,6 +1244,7 @@ ApplyPatch linux-2.6-silence-acpi-blackl
 
 ApplyPatch linux-2.6-iwl3945-remove-useless-exports.patch
 
+ApplyPatch kernel-2.6-pnfs-v2.6.30-rc4.patch
 # END OF PATCH APPLICATIONS
 
 %endif
@@ -1836,6 +1838,9 @@ fi
 #	                ||----w |
 #	                ||     ||
 %changelog
+* Mon May  4 2009 Steve Dickson <steved at redhat.com>
+- Updated to latest pNFS code (v2.6.30-rc4)
+
 * Fri May 01 2009 Eric Sandeen <sandeen at redhat.com>
 - Fix ext4 corruption on partial write into prealloc block
 



