rpms/kernel/F-11 linux-2.6-xfs-radix-tree-fix.patch, NONE, 1.1.2.1 kernel.spec, 1.1679.2.20, 1.1679.2.21

Eric Sandeen sandeen at fedoraproject.org
Tue Sep 1 20:56:52 UTC 2009


Author: sandeen

Update of /cvs/pkgs/rpms/kernel/F-11
In directory cvs1.fedora.phx.redhat.com:/tmp/cvs-serv15154

Modified Files:
      Tag: private-fedora-11-2_6_29_6
	kernel.spec 
Added Files:
      Tag: private-fedora-11-2_6_29_6
	linux-2.6-xfs-radix-tree-fix.patch 
Log Message:
* Tue Sep 01 2009 Eric Sandeen <sandeen at redhat.com>
- Fix NFS vs. XFS oops (#502236)


linux-2.6-xfs-radix-tree-fix.patch:
 fs/inode.c         |   38 ++++++++------
 fs/xfs/xfs_iget.c  |  140 +++++++++++++++++++++++++++--------------------------
 fs/xfs/xfs_inode.h |   17 ------
 fs/xfs/xfs_log.c   |    2 
 include/linux/fs.h |    3 -
 5 files changed, 99 insertions(+), 101 deletions(-)

--- NEW FILE linux-2.6-xfs-radix-tree-fix.patch ---
commit 54e346215e4fe2ca8c94c54e546cc61902060510
Author: Christoph Hellwig <hch at lst.de>
Date:   Fri Aug 7 14:38:25 2009 -0300

    vfs: fix inode_init_always calling convention

    Currently inode_init_always calls into ->destroy_inode if the additional
    initialization fails.  That's not only counter-intuitive because
    inode_init_always did not allocate the inode structure, but in case of
    XFS it's actively harmful as ->destroy_inode might delete the inode from
    a radix-tree that has never been added.  This in turn might end up
    deleting the inode for the same inum that has been instanciated by
    another process and cause lots of cause subtile problems.

    Also in the case of re-initializing a reclaimable inode in XFS it would
    free an inode we still want to keep alive.

    Signed-off-by: Christoph Hellwig <hch at lst.de>
    Reviewed-by: Eric Sandeen <sandeen at sandeen.net>

commit 2e00c97e2c1d2ffc9e26252ca26b237678b0b772
Author: Christoph Hellwig <hch at lst.de>
Date:   Fri Aug 7 14:38:29 2009 -0300

    vfs: add __destroy_inode

    When we want to tear down an inode that lost the add to the cache race
    in XFS we must not call into ->destroy_inode because that would delete
    the inode that won the race from the inode cache radix tree.
    
    This patch provides the __destroy_inode helper needed to fix this,
    the actual fix will be in th next patch.  As XFS was the only reason
    destroy_inode was exported we shift the export to the new __destroy_inode.

    Signed-off-by: Christoph Hellwig <hch at lst.de>
    Reviewed-by: Eric Sandeen <sandeen at sandeen.net>

commit b36ec0428a06fcbdb67d61e9e664154e5dd9a8c7
Author: Christoph Hellwig <hch at lst.de>
Date:   Fri Aug 7 14:38:34 2009 -0300

    xfs: fix freeing of inodes not yet added to the inode cache

    When freeing an inode that lost race getting added to the inode cache we
    must not call into ->destroy_inode, because that would delete the inode
    that won the race from the inode cache radix tree.
    
    This patch uses splits a new xfs_inode_free helper out of xfs_ireclaim
    and uses that plus __destroy_inode to make sure we really only free
    the memory allocted for the inode that lost the race, and not mess with
    the inode cache state.
    
    Signed-off-by: Christoph Hellwig <hch at lst.de>
    Reviewed-by: Eric Sandeen <sandeen at sandeen.net>
    Reported-by: Alex Samad <alex at samad.com.au>
    Reported-by: Andrew Randrianasulu <randrik at mail.ru>
    Reported-by: Stephane <sharnois at max-t.com>
    Reported-by: Tommy <tommy at news-service.com>
    Reported-by: Miah Gregory <mace at darksilence.net>
    Reported-by: Gabriel Barazer <gabriel at oxeva.fr>
    Reported-by: Leandro Lucarella <llucax at gmail.com>
    Reported-by: Daniel Burr <dburr at fami.com.au>
    Reported-by: Nickolay <newmail at spaces.ru>
    Reported-by: Michael Guntsche <mike at it-loops.com>
    Reported-by: Dan Carley <dan.carley+linuxkern-bugs at gmail.com>
    Reported-by: Michael Ole Olsen <gnu at gmx.net>
    Reported-by: Michael Weissenbacher <mw at dermichi.com>
    Reported-by: Martin Spott <Martin.Spott at mgras.net>
    Reported-by: Christian Kujau <lists at nerdbynature.de>
    Tested-by: Michael Guntsche <mike at it-loops.com>
    Tested-by: Dan Carley <dan.carley+linuxkern-bugs at gmail.com>
    Tested-by: Christian Kujau <lists at nerdbynature.de>

commit a8914f3a6d72c97328597a556a99daaf5cc288ae
Author: Christoph Hellwig <hch at lst.de>
Date:   Mon Aug 10 11:32:44 2009 -0300

    xfs: fix spin_is_locked assert on uni-processor builds

    Without SMP or preemption spin_is_locked always returns false,
    so we can't do an assert with it.  Instead use assert_spin_locked,
    which does the right thing on all builds.
    
    Signed-off-by: Christoph Hellwig <hch at lst.de>
    Reviewed-by: Eric Sandeen <sandeen at sandeen.net> 
    Reported-by: Johannes Engel <jcnengel at googlemail.com> 
    Tested-by: Johannes Engel <jcnengel at googlemail.com>
    Signed-off-by: Felix Blyakher <felixb at sgi.com>

Index: linux-2.6.29.noarch/fs/inode.c
===================================================================
--- linux-2.6.29.noarch.orig/fs/inode.c
+++ linux-2.6.29.noarch/fs/inode.c
@@ -117,7 +117,7 @@ static void wake_up_inode(struct inode *
  * These are initializations that need to be done on every inode
  * allocation as the fields are not initialised by slab allocation.
  */
-struct inode *inode_init_always(struct super_block *sb, struct inode *inode)
+int inode_init_always(struct super_block *sb, struct inode *inode)
 {
 	static const struct address_space_operations empty_aops;
 	static struct inode_operations empty_iops;
@@ -147,13 +147,8 @@ struct inode *inode_init_always(struct s
 	inode->i_cdev = NULL;
 	inode->i_rdev = 0;
 	inode->dirtied_when = 0;
-	if (security_inode_alloc(inode)) {
-		if (inode->i_sb->s_op->destroy_inode)
-			inode->i_sb->s_op->destroy_inode(inode);
-		else
-			kmem_cache_free(inode_cachep, (inode));
-		return NULL;
-	}
+	if (security_inode_alloc(inode))
+		return -ENOMEM;
 
 	spin_lock_init(&inode->i_lock);
 	lockdep_set_class(&inode->i_lock, &sb->s_type->i_lock_key);
@@ -188,7 +183,7 @@ struct inode *inode_init_always(struct s
 	inode->i_private = NULL;
 	inode->i_mapping = mapping;
 
-	return inode;
+	return 0;
 }
 EXPORT_SYMBOL(inode_init_always);
 
@@ -201,22 +196,35 @@ static struct inode *alloc_inode(struct 
 	else
 		inode = kmem_cache_alloc(inode_cachep, GFP_KERNEL);
 
-	if (inode)
-		return inode_init_always(sb, inode);
-	return NULL;
+	if (!inode)
+		return NULL;
+
+	if (unlikely(inode_init_always(sb, inode))) {
+		if (inode->i_sb->s_op->destroy_inode)
+			inode->i_sb->s_op->destroy_inode(inode);
+		else
+			kmem_cache_free(inode_cachep, inode);
+		return NULL;
+	}
+
+	return inode;
 }
 
-void destroy_inode(struct inode *inode) 
+void __destroy_inode(struct inode *inode)
 {
 	BUG_ON(inode_has_buffers(inode));
 	security_inode_free(inode);
+}
+EXPORT_SYMBOL(__destroy_inode);
+
+void destroy_inode(struct inode *inode)
+{
+	__destroy_inode(inode);
 	if (inode->i_sb->s_op->destroy_inode)
 		inode->i_sb->s_op->destroy_inode(inode);
 	else
 		kmem_cache_free(inode_cachep, (inode));
 }
-EXPORT_SYMBOL(destroy_inode);
-
 
 /*
  * These are initializations that only need to be done
Index: linux-2.6.29.noarch/fs/xfs/xfs_iget.c
===================================================================
--- linux-2.6.29.noarch.orig/fs/xfs/xfs_iget.c
+++ linux-2.6.29.noarch/fs/xfs/xfs_iget.c
@@ -63,21 +63,16 @@ xfs_inode_alloc(
 	ip = kmem_zone_alloc(xfs_inode_zone, KM_SLEEP);
 	if (!ip)
 		return NULL;
+	if (inode_init_always(mp->m_super, VFS_I(ip))) {
+		kmem_zone_free(xfs_inode_zone, ip);
+		return NULL;
+	}
 
 	ASSERT(atomic_read(&ip->i_iocount) == 0);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
 	ASSERT(!spin_is_locked(&ip->i_flags_lock));
 	ASSERT(completion_done(&ip->i_flush));
 
-	/*
-	 * initialise the VFS inode here to get failures
-	 * out of the way early.
-	 */
-	if (!inode_init_always(mp->m_super, VFS_I(ip))) {
-		kmem_zone_free(xfs_inode_zone, ip);
-		return NULL;
-	}
-
 	/* initialise the xfs inode */
 	ip->i_ino = ino;
 	ip->i_mount = mp;
@@ -117,6 +112,71 @@ xfs_inode_alloc(
 	return ip;
 }
 
+STATIC void
+xfs_inode_free(
+	struct xfs_inode	*ip)
+{
+	switch (ip->i_d.di_mode & S_IFMT) {
+	case S_IFREG:
+	case S_IFDIR:
+	case S_IFLNK:
+		xfs_idestroy_fork(ip, XFS_DATA_FORK);
+		break;
+	}
+
+	if (ip->i_afp)
+		xfs_idestroy_fork(ip, XFS_ATTR_FORK);
+
+#ifdef XFS_INODE_TRACE
+	ktrace_free(ip->i_trace);
+#endif
+#ifdef XFS_BMAP_TRACE
+	ktrace_free(ip->i_xtrace);
+#endif
+#ifdef XFS_BTREE_TRACE
+	ktrace_free(ip->i_btrace);
+#endif
+#ifdef XFS_RW_TRACE
+	ktrace_free(ip->i_rwtrace);
+#endif
+#ifdef XFS_ILOCK_TRACE
+	ktrace_free(ip->i_lock_trace);
+#endif
+#ifdef XFS_DIR2_TRACE
+	ktrace_free(ip->i_dir_trace);
+#endif
+
+	if (ip->i_itemp) {
+		/*
+		 * Only if we are shutting down the fs will we see an
+		 * inode still in the AIL. If it is there, we should remove
+		 * it to prevent a use-after-free from occurring.
+		 */
+		xfs_log_item_t	*lip = &ip->i_itemp->ili_item;
+		struct xfs_ail	*ailp = lip->li_ailp;
+
+		ASSERT(((lip->li_flags & XFS_LI_IN_AIL) == 0) ||
+				       XFS_FORCED_SHUTDOWN(ip->i_mount));
+		if (lip->li_flags & XFS_LI_IN_AIL) {
+			spin_lock(&ailp->xa_lock);
+			if (lip->li_flags & XFS_LI_IN_AIL)
+				xfs_trans_ail_delete(ailp, lip);
+			else
+				spin_unlock(&ailp->xa_lock);
+		}
+		xfs_inode_item_destroy(ip);
+		ip->i_itemp = NULL;
+	}
+
+	/* asserts to verify all state is correct here */
+	ASSERT(atomic_read(&ip->i_iocount) == 0);
+	ASSERT(atomic_read(&ip->i_pincount) == 0);
+	ASSERT(!spin_is_locked(&ip->i_flags_lock));
+	ASSERT(completion_done(&ip->i_flush));
+
+	kmem_zone_free(xfs_inode_zone, ip);
+}
+
 /*
  * Check the validity of the inode we just found it the cache
  */
@@ -161,7 +221,7 @@ xfs_iget_cache_hit(
 		 * errors cleanly, then tag it so it can be set up correctly
 		 * later.
 		 */
-		if (!inode_init_always(mp->m_super, VFS_I(ip))) {
+		if (inode_init_always(mp->m_super, VFS_I(ip))) {
 			error = ENOMEM;
 			goto out_error;
 		}
@@ -293,7 +353,8 @@ out_preload_end:
 	if (lock_flags)
 		xfs_iunlock(ip, lock_flags);
 out_destroy:
-	xfs_destroy_inode(ip);
+	__destroy_inode(VFS_I(ip));
+	xfs_inode_free(ip);
 	return error;
 }
 
@@ -501,62 +562,7 @@ xfs_ireclaim(
 	XFS_QM_DQDETACH(ip->i_mount, ip);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL);
 
-	switch (ip->i_d.di_mode & S_IFMT) {
-	case S_IFREG:
-	case S_IFDIR:
-	case S_IFLNK:
-		xfs_idestroy_fork(ip, XFS_DATA_FORK);
-		break;
-	}
-
-	if (ip->i_afp)
-		xfs_idestroy_fork(ip, XFS_ATTR_FORK);
-
-#ifdef XFS_INODE_TRACE
-	ktrace_free(ip->i_trace);
-#endif
-#ifdef XFS_BMAP_TRACE
-	ktrace_free(ip->i_xtrace);
-#endif
-#ifdef XFS_BTREE_TRACE
-	ktrace_free(ip->i_btrace);
-#endif
-#ifdef XFS_RW_TRACE
-	ktrace_free(ip->i_rwtrace);
-#endif
-#ifdef XFS_ILOCK_TRACE
-	ktrace_free(ip->i_lock_trace);
-#endif
-#ifdef XFS_DIR2_TRACE
-	ktrace_free(ip->i_dir_trace);
-#endif
-	if (ip->i_itemp) {
-		/*
-		 * Only if we are shutting down the fs will we see an
-		 * inode still in the AIL. If it is there, we should remove
-		 * it to prevent a use-after-free from occurring.
-		 */
-		xfs_log_item_t	*lip = &ip->i_itemp->ili_item;
-		struct xfs_ail	*ailp = lip->li_ailp;
-
-		ASSERT(((lip->li_flags & XFS_LI_IN_AIL) == 0) ||
-				       XFS_FORCED_SHUTDOWN(ip->i_mount));
-		if (lip->li_flags & XFS_LI_IN_AIL) {
-			spin_lock(&ailp->xa_lock);
-			if (lip->li_flags & XFS_LI_IN_AIL)
-				xfs_trans_ail_delete(ailp, lip);
-			else
-				spin_unlock(&ailp->xa_lock);
-		}
-		xfs_inode_item_destroy(ip);
-		ip->i_itemp = NULL;
-	}
-	/* asserts to verify all state is correct here */
-	ASSERT(atomic_read(&ip->i_iocount) == 0);
-	ASSERT(atomic_read(&ip->i_pincount) == 0);
-	ASSERT(!spin_is_locked(&ip->i_flags_lock));
-	ASSERT(completion_done(&ip->i_flush));
-	kmem_zone_free(xfs_inode_zone, ip);
+	xfs_inode_free(ip);
 }
 
 /*
Index: linux-2.6.29.noarch/include/linux/fs.h
===================================================================
--- linux-2.6.29.noarch.orig/include/linux/fs.h
+++ linux-2.6.29.noarch/include/linux/fs.h
@@ -1906,7 +1906,7 @@ extern loff_t default_llseek(struct file
 
 extern loff_t vfs_llseek(struct file *file, loff_t offset, int origin);
 
-extern struct inode * inode_init_always(struct super_block *, struct inode *);
+extern int inode_init_always(struct super_block *, struct inode *);
 extern void inode_init_once(struct inode *);
 extern void inode_add_to_lists(struct super_block *, struct inode *);
 extern void iput(struct inode *);
@@ -1933,6 +1933,7 @@ extern void __iget(struct inode * inode)
 extern void iget_failed(struct inode *);
 extern void clear_inode(struct inode *);
 extern void destroy_inode(struct inode *);
+extern void __destroy_inode(struct inode *);
 extern struct inode *new_inode(struct super_block *);
 extern int should_remove_suid(struct dentry *);
 extern int file_remove_suid(struct file *);
Index: linux-2.6.29.noarch/fs/xfs/xfs_inode.h
===================================================================
--- linux-2.6.29.noarch.orig/fs/xfs/xfs_inode.h
+++ linux-2.6.29.noarch/fs/xfs/xfs_inode.h
@@ -309,23 +309,6 @@ static inline struct inode *VFS_I(struct
 }
 
 /*
- * Get rid of a partially initialized inode.
- *
- * We have to go through destroy_inode to make sure allocations
- * from init_inode_always like the security data are undone.
- *
- * We mark the inode bad so that it takes the short cut in
- * the reclaim path instead of going through the flush path
- * which doesn't make sense for an inode that has never seen the
- * light of day.
- */
-static inline void xfs_destroy_inode(struct xfs_inode *ip)
-{
-	make_bad_inode(VFS_I(ip));
-	return destroy_inode(VFS_I(ip));
-}
-
-/*
  * i_flags helper functions
  */
 static inline void
Index: linux-2.6.29.noarch/fs/xfs/xfs_log.c
===================================================================
--- linux-2.6.29.noarch.orig/fs/xfs/xfs_log.c
+++ linux-2.6.29.noarch/fs/xfs/xfs_log.c
@@ -3195,7 +3195,7 @@ try_again:
 STATIC void
 xlog_state_want_sync(xlog_t *log, xlog_in_core_t *iclog)
 {
-	ASSERT(spin_is_locked(&log->l_icloglock));
+	assert_spin_locked(&log->l_icloglock);
 
 	if (iclog->ic_state == XLOG_STATE_ACTIVE) {
 		xlog_state_switch_iclogs(log, iclog, 0);


Index: kernel.spec
===================================================================
RCS file: /cvs/pkgs/rpms/kernel/F-11/kernel.spec,v
retrieving revision 1.1679.2.20
retrieving revision 1.1679.2.21
diff -u -p -r1.1679.2.20 -r1.1679.2.21
--- kernel.spec	31 Aug 2009 19:31:16 -0000	1.1679.2.20
+++ kernel.spec	1 Sep 2009 20:56:52 -0000	1.1679.2.21
@@ -775,6 +775,7 @@ Patch3000: linux-2.6-btrfs-unstable-upda
 Patch3010: linux-2.6-relatime-by-default.patch
 Patch3020: linux-2.6-fiemap-header-install.patch
 Patch3030: linux-2.6-ecryptfs-overflow-fixes.patch
+Patch3031: linux-2.6-xfs-radix-tree-fix.patch
 
 Patch5000: linux-2.6-add-qcserial.patch
 
@@ -1296,6 +1297,7 @@ ApplyPatch linux-2.6-execshield.patch
 # ext4
 
 # xfs
+ApplyPatch linux-2.6-xfs-radix-tree-fix.patch
 
 # btrfs
 ApplyPatch linux-2.6-btrfs-unstable-update.patch
@@ -2186,6 +2188,9 @@ fi
 # and build.
 
 %changelog
+* Tue Sep 01 2009 Eric Sandeen <sandeen at redhat.com>
+- Fix NFS vs. XFS oops (#502236)
+
 * Mon Aug 31 2009 Jarod Wilson <jarod at redhat.com>
 - Fix audio on PVR-500 when used in same system as HVR-1800 (#480728)
 




More information about the scm-commits mailing list