[kernel/f16] readahead: fix pipeline break caused by block plug

Dave Jones davej at fedoraproject.org
Sun Feb 5 21:49:10 UTC 2012


commit 69d2ad30b8c9a671b232f1cefc9d6022dbcac318
Author: Dave Jones <davej at redhat.com>
Date:   Sun Feb 5 16:48:34 2012 -0500

    readahead: fix pipeline break caused by block plug
    
    Also, remove unnecessary block-stray-block-put-after-teardown.patch

 block-readahead-block-plug.patch           |   92 ++++++++++++++++++++++++++++
 block-stray-block-put-after-teardown.patch |   11 ---
 kernel.spec                                |    9 ++-
 3 files changed, 99 insertions(+), 13 deletions(-)
---
diff --git a/block-readahead-block-plug.patch b/block-readahead-block-plug.patch
new file mode 100644
index 0000000..0b7bca6
--- /dev/null
+++ b/block-readahead-block-plug.patch
@@ -0,0 +1,92 @@
+commit 3deaa7190a8da38453c4fabd9dec7f66d17fff67
+Author: Shaohua Li <shaohua.li at intel.com>
+Date:   Fri Feb 3 15:37:17 2012 -0800
+
+    readahead: fix pipeline break caused by block plug
+    
+    Herbert Poetzl reported a performance regression since 2.6.39.  The test
+    is a simple dd read, but with big block size.  The reason is:
+    
+    T1: ra (A, A+128k), (A+128k, A+256k)
+    T2: lock_page for page A, submit the 256k
+    T3: hit page A+128K, ra (A+256k, A+384). the range isn't submitted
+    because of plug and there isn't any lock_page till we hit page A+256k
+    because all pages from A to A+256k is in memory
+    T4: hit page A+256k, ra (A+384, A+ 512). Because of plug, the range isn't
+    submitted again.
+    T5: lock_page A+256k, so (A+256k, A+512k) will be submitted. The task is
+    waitting for (A+256k, A+512k) finish.
+    
+    There is no request to disk in T3 and T4, so readahead pipeline breaks.
+    
+    We really don't need block plug for generic_file_aio_read() for buffered
+    I/O.  The readahead already has plug and has fine grained control when I/O
+    should be submitted.  Deleting plug for buffered I/O fixes the regression.
+    
+    One side effect is plug makes the request size 256k, the size is 128k
+    without it.  This is because default ra size is 128k and not a reason we
+    need plug here.
+    
+    Vivek said:
+    
+    : We submit some readahead IO to device request queue but because of nested
+    : plug, queue never gets unplugged.  When read logic reaches a page which is
+    : not in page cache, it waits for page to be read from the disk
+    : (lock_page_killable()) and that time we flush the plug list.
+    :
+    : So effectively read ahead logic is kind of broken in parts because of
+    : nested plugging.  Removing top level plug (generic_file_aio_read()) for
+    : buffered reads, will allow unplugging queue earlier for readahead.
+    
+    Signed-off-by: Shaohua Li <shaohua.li at intel.com>
+    Signed-off-by: Wu Fengguang <fengguang.wu at intel.com>
+    Reported-by: Herbert Poetzl <herbert at 13thfloor.at>
+    Tested-by: Eric Dumazet <eric.dumazet at gmail.com>
+    Cc: Christoph Hellwig <hch at infradead.org>
+    Cc: Jens Axboe <axboe at kernel.dk>
+    Cc: Vivek Goyal <vgoyal at redhat.com>
+    Cc: <stable at vger.kernel.org>
+    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+
+diff --git a/mm/filemap.c b/mm/filemap.c
+index 97f49ed..b662757 100644
+--- a/mm/filemap.c
++++ b/mm/filemap.c
+@@ -1400,15 +1400,12 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
+ 	unsigned long seg = 0;
+ 	size_t count;
+ 	loff_t *ppos = &iocb->ki_pos;
+-	struct blk_plug plug;
+ 
+ 	count = 0;
+ 	retval = generic_segment_checks(iov, &nr_segs, &count, VERIFY_WRITE);
+ 	if (retval)
+ 		return retval;
+ 
+-	blk_start_plug(&plug);
+-
+ 	/* coalesce the iovecs and go direct-to-BIO for O_DIRECT */
+ 	if (filp->f_flags & O_DIRECT) {
+ 		loff_t size;
+@@ -1424,8 +1421,12 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
+ 			retval = filemap_write_and_wait_range(mapping, pos,
+ 					pos + iov_length(iov, nr_segs) - 1);
+ 			if (!retval) {
++				struct blk_plug plug;
++
++				blk_start_plug(&plug);
+ 				retval = mapping->a_ops->direct_IO(READ, iocb,
+ 							iov, pos, nr_segs);
++				blk_finish_plug(&plug);
+ 			}
+ 			if (retval > 0) {
+ 				*ppos = pos + retval;
+@@ -1481,7 +1482,6 @@ generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
+ 			break;
+ 	}
+ out:
+-	blk_finish_plug(&plug);
+ 	return retval;
+ }
+ EXPORT_SYMBOL(generic_file_aio_read);
diff --git a/kernel.spec b/kernel.spec
index ac2780f..84f278c 100644
--- a/kernel.spec
+++ b/kernel.spec
@@ -743,7 +743,8 @@ Patch3500: jbd-jbd2-validate-sb-s_first-in-journal_get_superblo.patch
 
 Patch12016: disable-i8042-check-on-apple-mac.patch
 
-Patch12026: block-stray-block-put-after-teardown.patch
+Patch12025: block-readahead-block-plug.patch
+
 Patch12030: epoll-limit-paths.patch
 
 Patch12303: dmar-disable-when-ricoh-multifunction.patch
@@ -1449,7 +1450,7 @@ ApplyOptionalPatch linux-2.6-v4l-dvb-experimental.patch
 ApplyPatch disable-i8042-check-on-apple-mac.patch
 
 ApplyPatch epoll-limit-paths.patch
-ApplyPatch block-stray-block-put-after-teardown.patch
+ApplyPatch block-readahead-block-plug.patch
 
 # rhbz#605888
 ApplyPatch dmar-disable-when-ricoh-multifunction.patch
@@ -2290,6 +2291,10 @@ fi
 # and build.
 
 %changelog
+* Sun Feb 05 2012 Dave Jones <davej at redhat.com>
+- Remove unnecessary block-stray-block-put-after-teardown.patch
+- readahead: fix pipeline break caused by block plug
+
 * Fri Feb 03 2012 Josh Boyer <jwboyer at redhat.com> 3.2.3-2
 - Drop patch that was NAKed upstream (rhbz 783211)
 


More information about the scm-commits mailing list