[kernel/f16] Add poll_requested_events

mchehab mchehab at fedoraproject.org
Tue Apr 10 13:29:01 UTC 2012


commit 15d159a8875ac4063340ee4debfb89eeda70116f
Author: Mauro Carvalho Chehab <mchehab at redhat.com>
Date:   Tue Apr 10 10:27:45 2012 -0300

    Add poll_requested_events
    
    The new events interface requires this function in order to work.
    
    Signed-off-by: Mauro Carvalho Chehab <mchehab at redhat.com>

 add-poll-requested-events.patch |  393 +++++++++++++++++++++++++++++++++++++++
 kernel.spec                     |    7 +-
 2 files changed, 398 insertions(+), 2 deletions(-)
---
diff --git a/add-poll-requested-events.patch b/add-poll-requested-events.patch
new file mode 100644
index 0000000..dafd359
--- /dev/null
+++ b/add-poll-requested-events.patch
@@ -0,0 +1,393 @@
+commit 626cf236608505d376e4799adb4f7eb00a8594af
+Author: Hans Verkuil <hans.verkuil at cisco.com>
+Date:   Fri Mar 23 15:02:27 2012 -0700
+
+    poll: add poll_requested_events() and poll_does_not_wait() functions
+    
+    In some cases the poll() implementation in a driver has to do different
+    things depending on the events the caller wants to poll for.  An example
+    is when a driver needs to start a DMA engine if the caller polls for
+    POLLIN, but doesn't want to do that if POLLIN is not requested but instead
+    only POLLOUT or POLLPRI is requested.  This is something that can happen
+    in the video4linux subsystem among others.
+    
+    Unfortunately, the current epoll/poll/select implementation doesn't
+    provide that information reliably.  The poll_table_struct does have it: it
+    has a key field with the event mask.  But once a poll() call matches one
+    or more bits of that mask any following poll() calls are passed a NULL
+    poll_table pointer.
+    
+    Also, the eventpoll implementation always left the key field at ~0 instead
+    of using the requested events mask.
+    
+    This was changed in eventpoll.c so the key field now contains the actual
+    events that should be polled for as set by the caller.
+    
+    The solution to the NULL poll_table pointer is to set the qproc field to
+    NULL in poll_table once poll() matches the events, not the poll_table
+    pointer itself.  That way drivers can obtain the mask through a new
+    poll_requested_events inline.
+    
+    The poll_table_struct can still be NULL since some kernel code calls it
+    internally (netfs_state_poll() in ./drivers/staging/pohmelfs/netfs.h).  In
+    that case poll_requested_events() returns ~0 (i.e.  all events).
+    
+    Very rarely drivers might want to know whether poll_wait will actually
+    wait.  If another earlier file descriptor in the set already matched the
+    events the caller wanted to wait for, then the kernel will return from the
+    select() call without waiting.  This might be useful information in order
+    to avoid doing expensive work.
+    
+    A new helper function poll_does_not_wait() is added that drivers can use
+    to detect this situation.  This is now used in sock_poll_wait() in
+    include/net/sock.h.  This was the only place in the kernel that needed
+    this information.
+    
+    Drivers should no longer access any of the poll_table internals, but use
+    the poll_requested_events() and poll_does_not_wait() access functions
+    instead.  In order to enforce that, the poll_table fields are now prepended
+    with an underscore, and a comment was added warning against using them
+    directly.
+    
+    This required a change in unix_dgram_poll() in unix/af_unix.c which used
+    the key field to get the requested events.  It's been replaced by a call
+    to poll_requested_events().
+    
+    For qproc it was especially important to change its name, since the
+    behavior of that field changes with this patch: this function pointer
+    can now be NULL, which was not possible in the past.
+    
+    Any driver accessing the qproc or key fields directly will now fail to compile.
+    
+    Some notes regarding the correctness of this patch: the driver's poll()
+    function is called with a 'struct poll_table_struct *wait' argument.  This
+    pointer may or may not be NULL; drivers can never rely on it being one or
+    the other as that depends on whether or not an earlier file descriptor in
+    the select()'s fdset matched the requested events.
+    
+    There are only three things a driver can do with the wait argument:
+    
+    1) obtain the key field:
+    
+    	events = wait ? wait->key : ~0;
+    
+       This will still work although it should be replaced with the new
+       poll_requested_events() function (which does exactly the same).
+       This will now even work better, since wait is no longer set to NULL
+       unnecessarily.
+    
+    2) use the qproc callback. This could be deadly since qproc can now be
+       NULL. Renaming qproc should prevent this from happening. There are no
+       kernel drivers that actually access this callback directly, BTW.
+    
+    3) test whether wait == NULL to determine whether poll would return without
+       waiting. This is no longer sufficient as the correct test is now
+       wait == NULL || wait->_qproc == NULL.
+    
+       However, the worst that can happen here is a slight performance hit in
+       the case where wait != NULL and wait->_qproc == NULL. In that case the
+       driver will assume that poll_wait() will actually add the fd to the set
+       of waiting file descriptors. Of course, poll_wait() will not do that
+       since it tests for wait->_qproc. This will not break anything, though.
+    
+       There is only one place in the whole kernel where this happens
+       (sock_poll_wait() in include/net/sock.h) and that code will be replaced
+       by a call to poll_does_not_wait() in the next patch.
+    
+       Note that even if wait->_qproc != NULL drivers cannot rely on poll_wait()
+       actually waiting. The next file descriptor from the set might match the
+       event mask and thus any possible waits will never happen.
+    
+    Signed-off-by: Hans Verkuil <hans.verkuil at cisco.com>
+    Reviewed-by: Jonathan Corbet <corbet at lwn.net>
+    Reviewed-by: Al Viro <viro at zeniv.linux.org.uk>
+    Cc: Davide Libenzi <davidel at xmailserver.org>
+    Signed-off-by: Hans de Goede <hdegoede at redhat.com>
+    Cc: Mauro Carvalho Chehab <mchehab at infradead.org>
+    Cc: David Miller <davem at davemloft.net>
+    Cc: Eric Dumazet <eric.dumazet at gmail.com>
+    Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
+    Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
+
+diff --git a/fs/eventpoll.c b/fs/eventpoll.c
+index 4d9d3a4..ca30007 100644
+--- a/fs/eventpoll.c
++++ b/fs/eventpoll.c
+@@ -699,9 +699,12 @@ static int ep_read_events_proc(struct eventpoll *ep, struct list_head *head,
+ 			       void *priv)
+ {
+ 	struct epitem *epi, *tmp;
++	poll_table pt;
+ 
++	init_poll_funcptr(&pt, NULL);
+ 	list_for_each_entry_safe(epi, tmp, head, rdllink) {
+-		if (epi->ffd.file->f_op->poll(epi->ffd.file, NULL) &
++		pt._key = epi->event.events;
++		if (epi->ffd.file->f_op->poll(epi->ffd.file, &pt) &
+ 		    epi->event.events)
+ 			return POLLIN | POLLRDNORM;
+ 		else {
+@@ -1097,6 +1100,7 @@ static int ep_insert(struct eventpoll *ep, struct epoll_event *event,
+ 	/* Initialize the poll table using the queue callback */
+ 	epq.epi = epi;
+ 	init_poll_funcptr(&epq.pt, ep_ptable_queue_proc);
++	epq.pt._key = event->events;
+ 
+ 	/*
+ 	 * Attach the item to the poll hooks and get current event bits.
+@@ -1191,6 +1195,9 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
+ {
+ 	int pwake = 0;
+ 	unsigned int revents;
++	poll_table pt;
++
++	init_poll_funcptr(&pt, NULL);
+ 
+ 	/*
+ 	 * Set the new event interest mask before calling f_op->poll();
+@@ -1198,13 +1205,14 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
+ 	 * f_op->poll() call and the new event set registering.
+ 	 */
+ 	epi->event.events = event->events;
++	pt._key = event->events;
+ 	epi->event.data = event->data; /* protected by mtx */
+ 
+ 	/*
+ 	 * Get current event bits. We can safely use the file* here because
+ 	 * its usage count has been increased by the caller of this function.
+ 	 */
+-	revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL);
++	revents = epi->ffd.file->f_op->poll(epi->ffd.file, &pt);
+ 
+ 	/*
+ 	 * If the item is "hot" and it is not registered inside the ready
+@@ -1239,6 +1247,9 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
+ 	unsigned int revents;
+ 	struct epitem *epi;
+ 	struct epoll_event __user *uevent;
++	poll_table pt;
++
++	init_poll_funcptr(&pt, NULL);
+ 
+ 	/*
+ 	 * We can loop without lock because we are passed a task private list.
+@@ -1251,7 +1262,8 @@ static int ep_send_events_proc(struct eventpoll *ep, struct list_head *head,
+ 
+ 		list_del_init(&epi->rdllink);
+ 
+-		revents = epi->ffd.file->f_op->poll(epi->ffd.file, NULL) &
++		pt._key = epi->event.events;
++		revents = epi->ffd.file->f_op->poll(epi->ffd.file, &pt) &
+ 			epi->event.events;
+ 
+ 		/*
+diff --git a/fs/select.c b/fs/select.c
+index e782258..ecfd0b1 100644
+--- a/fs/select.c
++++ b/fs/select.c
+@@ -223,7 +223,7 @@ static void __pollwait(struct file *filp, wait_queue_head_t *wait_address,
+ 	get_file(filp);
+ 	entry->filp = filp;
+ 	entry->wait_address = wait_address;
+-	entry->key = p->key;
++	entry->key = p->_key;
+ 	init_waitqueue_func_entry(&entry->wait, pollwake);
+ 	entry->wait.private = pwq;
+ 	add_wait_queue(wait_address, &entry->wait);
+@@ -386,13 +386,11 @@ get_max:
+ static inline void wait_key_set(poll_table *wait, unsigned long in,
+ 				unsigned long out, unsigned long bit)
+ {
+-	if (wait) {
+-		wait->key = POLLEX_SET;
+-		if (in & bit)
+-			wait->key |= POLLIN_SET;
+-		if (out & bit)
+-			wait->key |= POLLOUT_SET;
+-	}
++	wait->_key = POLLEX_SET;
++	if (in & bit)
++		wait->_key |= POLLIN_SET;
++	if (out & bit)
++		wait->_key |= POLLOUT_SET;
+ }
+ 
+ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
+@@ -414,7 +412,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
+ 	poll_initwait(&table);
+ 	wait = &table.pt;
+ 	if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
+-		wait = NULL;
++		wait->_qproc = NULL;
+ 		timed_out = 1;
+ 	}
+ 
+@@ -459,17 +457,17 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
+ 					if ((mask & POLLIN_SET) && (in & bit)) {
+ 						res_in |= bit;
+ 						retval++;
+-						wait = NULL;
++						wait->_qproc = NULL;
+ 					}
+ 					if ((mask & POLLOUT_SET) && (out & bit)) {
+ 						res_out |= bit;
+ 						retval++;
+-						wait = NULL;
++						wait->_qproc = NULL;
+ 					}
+ 					if ((mask & POLLEX_SET) && (ex & bit)) {
+ 						res_ex |= bit;
+ 						retval++;
+-						wait = NULL;
++						wait->_qproc = NULL;
+ 					}
+ 				}
+ 			}
+@@ -481,7 +479,7 @@ int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
+ 				*rexp = res_ex;
+ 			cond_resched();
+ 		}
+-		wait = NULL;
++		wait->_qproc = NULL;
+ 		if (retval || timed_out || signal_pending(current))
+ 			break;
+ 		if (table.error) {
+@@ -720,7 +718,7 @@ struct poll_list {
+  * interested in events matching the pollfd->events mask, and the result
+  * matching that mask is both recorded in pollfd->revents and returned. The
+  * pwait poll_table will be used by the fd-provided poll handler for waiting,
+- * if non-NULL.
++ * if pwait->_qproc is non-NULL.
+  */
+ static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait)
+ {
+@@ -738,9 +736,7 @@ static inline unsigned int do_pollfd(struct pollfd *pollfd, poll_table *pwait)
+ 		if (file != NULL) {
+ 			mask = DEFAULT_POLLMASK;
+ 			if (file->f_op && file->f_op->poll) {
+-				if (pwait)
+-					pwait->key = pollfd->events |
+-							POLLERR | POLLHUP;
++				pwait->_key = pollfd->events|POLLERR|POLLHUP;
+ 				mask = file->f_op->poll(file, pwait);
+ 			}
+ 			/* Mask out unneeded events. */
+@@ -763,7 +759,7 @@ static int do_poll(unsigned int nfds,  struct poll_list *list,
+ 
+ 	/* Optimise the no-wait case */
+ 	if (end_time && !end_time->tv_sec && !end_time->tv_nsec) {
+-		pt = NULL;
++		pt->_qproc = NULL;
+ 		timed_out = 1;
+ 	}
+ 
+@@ -781,22 +777,22 @@ static int do_poll(unsigned int nfds,  struct poll_list *list,
+ 			for (; pfd != pfd_end; pfd++) {
+ 				/*
+ 				 * Fish for events. If we found one, record it
+-				 * and kill the poll_table, so we don't
++				 * and kill poll_table->_qproc, so we don't
+ 				 * needlessly register any other waiters after
+ 				 * this. They'll get immediately deregistered
+ 				 * when we break out and return.
+ 				 */
+ 				if (do_pollfd(pfd, pt)) {
+ 					count++;
+-					pt = NULL;
++					pt->_qproc = NULL;
+ 				}
+ 			}
+ 		}
+ 		/*
+ 		 * All waiters have already been registered, so don't provide
+-		 * a poll_table to them on the next loop iteration.
++		 * a poll_table->_qproc to them on the next loop iteration.
+ 		 */
+-		pt = NULL;
++		pt->_qproc = NULL;
+ 		if (!count) {
+ 			count = wait->error;
+ 			if (signal_pending(current))
+diff --git a/include/linux/poll.h b/include/linux/poll.h
+index cf40010..48fe8bc 100644
+--- a/include/linux/poll.h
++++ b/include/linux/poll.h
+@@ -32,21 +32,46 @@ struct poll_table_struct;
+  */
+ typedef void (*poll_queue_proc)(struct file *, wait_queue_head_t *, struct poll_table_struct *);
+ 
++/*
++ * Do not touch the structure directly, use the access functions
++ * poll_does_not_wait() and poll_requested_events() instead.
++ */
+ typedef struct poll_table_struct {
+-	poll_queue_proc qproc;
+-	unsigned long key;
++	poll_queue_proc _qproc;
++	unsigned long _key;
+ } poll_table;
+ 
+ static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
+ {
+-	if (p && wait_address)
+-		p->qproc(filp, wait_address, p);
++	if (p && p->_qproc && wait_address)
++		p->_qproc(filp, wait_address, p);
++}
++
++/*
++ * Return true if it is guaranteed that poll will not wait. This is the case
++ * if the poll() of another file descriptor in the set got an event, so there
++ * is no need for waiting.
++ */
++static inline bool poll_does_not_wait(const poll_table *p)
++{
++	return p == NULL || p->_qproc == NULL;
++}
++
++/*
++ * Return the set of events that the application wants to poll for.
++ * This is useful for drivers that need to know whether a DMA transfer has
++ * to be started implicitly on poll(). You typically only want to do that
++ * if the application is actually polling for POLLIN and/or POLLOUT.
++ */
++static inline unsigned long poll_requested_events(const poll_table *p)
++{
++	return p ? p->_key : ~0UL;
+ }
+ 
+ static inline void init_poll_funcptr(poll_table *pt, poll_queue_proc qproc)
+ {
+-	pt->qproc = qproc;
+-	pt->key   = ~0UL; /* all events enabled */
++	pt->_qproc = qproc;
++	pt->_key   = ~0UL; /* all events enabled */
+ }
+ 
+ struct poll_table_entry {
+diff --git a/include/net/sock.h b/include/net/sock.h
+index 04bc0b3..a6ba1f8 100644
+--- a/include/net/sock.h
++++ b/include/net/sock.h
+@@ -1854,7 +1854,7 @@ static inline bool wq_has_sleeper(struct socket_wq *wq)
+ static inline void sock_poll_wait(struct file *filp,
+ 		wait_queue_head_t *wait_address, poll_table *p)
+ {
+-	if (p && wait_address) {
++	if (!poll_does_not_wait(p) && wait_address) {
+ 		poll_wait(filp, wait_address, p);
+ 		/*
+ 		 * We need to be sure we are in sync with the
+diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
+index eb4277c..d510353 100644
+--- a/net/unix/af_unix.c
++++ b/net/unix/af_unix.c
+@@ -2206,7 +2206,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
+ 	}
+ 
+ 	/* No write status requested, avoid expensive OUT tests. */
+-	if (wait && !(wait->key & (POLLWRBAND | POLLWRNORM | POLLOUT)))
++	if (!(poll_requested_events(wait) & (POLLWRBAND|POLLWRNORM|POLLOUT)))
+ 		return mask;
+ 
+ 	writable = unix_writable(sk);
diff --git a/kernel.spec b/kernel.spec
index 2494ae3..c21dc3e 100644
--- a/kernel.spec
+++ b/kernel.spec
@@ -708,7 +708,9 @@ Patch1900: linux-2.6-intel-iommu-igfx.patch
 Patch2802: linux-2.6-silence-acpi-blacklist.patch
 
 # media patches
-Patch2903: drivers-media-update.patch
+#   add-poll-requested-events.patch was added for 3.4
+Patch2900: add-poll-requested-events.patch
+Patch2901: drivers-media-update.patch
 
 # fs fixes
 
@@ -1411,8 +1413,9 @@ ApplyPatch linux-2.6-intel-iommu-igfx.patch
 ApplyPatch linux-2.6-silence-acpi-blacklist.patch
 ApplyPatch quite-apm.patch
 
-# V4L/DVB updates/fixes/experimental drivers
+# Media (V4L/DVB/IR) updates/fixes/experimental drivers
 #  apply if non-empty
+ApplyPatch add-poll-requested-events.patch
 ApplyOptionalPatch drivers-media-update.patch
 
 # Patches headed upstream
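
For reference only -- not part of the patch above or of the Fedora spec
change -- here is a minimal sketch of how a driver's poll() handler might use
the helpers this patch introduces.  The device structure, the wait queue and
my_dev_start_capture_dma() are hypothetical names invented for the example;
only poll_wait(), poll_requested_events() and poll_does_not_wait() come from
the patch.

#include <linux/fs.h>
#include <linux/poll.h>
#include <linux/types.h>
#include <linux/wait.h>

/* Hypothetical device state, only here to keep the sketch self-contained. */
struct my_dev {
	wait_queue_head_t read_wq;	/* woken when a frame is captured */
	bool dma_running;
	bool frame_ready;
};

/* Hypothetical helper that would kick the capture engine. */
static void my_dev_start_capture_dma(struct my_dev *dev)
{
	dev->dma_running = true;
}

static unsigned int my_dev_poll(struct file *file, struct poll_table_struct *wait)
{
	struct my_dev *dev = file->private_data;
	unsigned int mask = 0;

	/*
	 * The case described in the commit message: start the capture DMA
	 * engine only if the caller actually polls for read events, not when
	 * it is only interested in POLLOUT/POLLPRI.
	 */
	if ((poll_requested_events(wait) & (POLLIN | POLLRDNORM)) &&
	    !dev->dma_running)
		my_dev_start_capture_dma(dev);

	poll_wait(file, &dev->read_wq, wait);

	if (dev->frame_ready)
		mask |= POLLIN | POLLRDNORM;

	/*
	 * poll_does_not_wait(wait) is true when an earlier fd in the set has
	 * already matched, so select()/poll() will return without sleeping;
	 * a driver could use it to skip expensive preparation work that is
	 * only worthwhile if the task is really going to wait.
	 */

	return mask;
}

The old "events = wait ? wait->key : ~0;" idiom maps directly onto
poll_requested_events(wait), and because eventpoll now fills in _key with the
caller's event mask, an epoll user that registers only EPOLLIN will be seen
by the driver as requesting just POLLIN rather than all events.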

