<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/eventpoll.c, branch v3.0</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Fix common misspellings</title>
<updated>2011-03-31T14:26:23+00:00</updated>
<author>
<name>Lucas De Marchi</name>
<email>lucas.demarchi@profusion.mobi</email>
</author>
<published>2011-03-31T01:57:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25985edcedea6396277003854657b5f3cb31a628'/>
<id>25985edcedea6396277003854657b5f3cb31a628</id>
<content type='text'>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: fix compiler warning and optimize the non-blocking path</title>
<updated>2011-03-23T00:44:15+00:00</updated>
<author>
<name>Shawn Bohrer</name>
<email>shawn.bohrer@gmail.com</email>
</author>
<published>2011-03-22T23:34:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f4d93ad74c18143abd3067ca3c8ffba7d00addf4'/>
<id>f4d93ad74c18143abd3067ca3c8ffba7d00addf4</id>
<content type='text'>
Add a comment to ep_poll(), rename labels a bit clearly, fix a warning of
unused variable from gcc and optimize the non-blocking path a little.

Hinted-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;

hannes@cmpxchg.org:

: The non-blocking ep_poll path optimization introduced skipping over the
: return value setup.
:
: Initialize it properly, my userspace gets upset by epoll_wait() returning
: random things.
:
: In addition, remove the reinitialization at the fetch_events label, the
: return value is garuanteed to be zero when execution reaches there.

[hannes@cmpxchg.org: fix initialization]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a comment to ep_poll(), rename labels a bit clearly, fix a warning of
unused variable from gcc and optimize the non-blocking path a little.

Hinted-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;

hannes@cmpxchg.org:

: The non-blocking ep_poll path optimization introduced skipping over the
: return value setup.
:
: Initialize it properly, my userspace gets upset by epoll_wait() returning
: random things.
:
: In addition, remove the reinitialization at the fetch_events label, the
: return value is garuanteed to be zero when execution reaches there.

[hannes@cmpxchg.org: fix initialization]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: move ready event check into proper inline</title>
<updated>2011-03-23T00:44:15+00:00</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2011-03-22T23:34:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3fb0e584a68cd1c5085e69be441f2ad032aaee72'/>
<id>3fb0e584a68cd1c5085e69be441f2ad032aaee72</id>
<content type='text'>
Move the event readiness check into a proper inline, and use it uniformly
inside ep_poll() code.  Events in the -&gt;ovflist are no less ready than the
ones in -&gt;rdllist.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the event readiness check into a proper inline, and use it uniformly
inside ep_poll() code.  Events in the -&gt;ovflist are no less ready than the
ones in -&gt;rdllist.

Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2011-03-18T17:37:40+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-03-18T17:37:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e16b396ce314b2bcdfe6c173fe075bf8e3432368'/>
<id>e16b396ce314b2bcdfe6c173fe075bf8e3432368</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
  doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
  Update cpuset info &amp; webiste for cgroups
  dcdbas: force SMI to happen when expected
  arch/arm/Kconfig: remove one to many l's in the word.
  asm-generic/user.h: Fix spelling in comment
  drm: fix printk typo 'sracth'
  Remove one to many n's in a word
  Documentation/filesystems/romfs.txt: fixing link to genromfs
  drivers:scsi Change printk typo initate -&gt; initiate
  serial, pch uart: Remove duplicate inclusion of linux/pci.h header
  fs/eventpoll.c: fix spelling
  mm: Fix out-of-date comments which refers non-existent functions
  drm: Fix printk typo 'failled'
  coh901318.c: Change initate to initiate.
  mbox-db5500.c Change initate to initiate.
  edac: correct i82975x error-info reported
  edac: correct i82975x mci initialisation
  edac: correct commented info
  fs: update comments to point correct document
  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
  ...

Trivial conflict in fs/eventpoll.c (spelling vs addition)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
  doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
  Update cpuset info &amp; webiste for cgroups
  dcdbas: force SMI to happen when expected
  arch/arm/Kconfig: remove one to many l's in the word.
  asm-generic/user.h: Fix spelling in comment
  drm: fix printk typo 'sracth'
  Remove one to many n's in a word
  Documentation/filesystems/romfs.txt: fixing link to genromfs
  drivers:scsi Change printk typo initate -&gt; initiate
  serial, pch uart: Remove duplicate inclusion of linux/pci.h header
  fs/eventpoll.c: fix spelling
  mm: Fix out-of-date comments which refers non-existent functions
  drm: Fix printk typo 'failled'
  coh901318.c: Change initate to initiate.
  mbox-db5500.c Change initate to initiate.
  edac: correct i82975x error-info reported
  edac: correct i82975x mci initialisation
  edac: correct commented info
  fs: update comments to point correct document
  target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
  ...

Trivial conflict in fs/eventpoll.c (spelling vs addition)
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: prevent creating circular epoll structures</title>
<updated>2011-02-25T23:07:36+00:00</updated>
<author>
<name>Davide Libenzi</name>
<email>davidel@xmailserver.org</email>
</author>
<published>2011-02-25T22:44:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=22bacca48a1755f79b7e0f192ddb9fbb7fc6e64e'/>
<id>22bacca48a1755f79b7e0f192ddb9fbb7fc6e64e</id>
<content type='text'>
In several places, an epoll fd can call another file's -&gt;f_op-&gt;poll()
method with ep-&gt;mtx held.  This is in general unsafe, because that other
file could itself be an epoll fd that contains the original epoll fd.

The code defends against this possibility in its own -&gt;poll() method using
ep_call_nested, but there are several other unsafe calls to -&gt;poll
elsewhere that can be made to deadlock.  For example, the following simple
program causes the call in ep_insert recursively call the original fd's
-&gt;poll, leading to deadlock:

 #include &lt;unistd.h&gt;
 #include &lt;sys/epoll.h&gt;

 int main(void) {
     int e1, e2, p[2];
     struct epoll_event evt = {
         .events = EPOLLIN
     };

     e1 = epoll_create(1);
     e2 = epoll_create(2);
     pipe(p);

     epoll_ctl(e2, EPOLL_CTL_ADD, e1, &amp;evt);
     epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &amp;evt);
     write(p[1], p, sizeof p);
     epoll_ctl(e1, EPOLL_CTL_ADD, e2, &amp;evt);

     return 0;
 }

On insertion, check whether the inserted file is itself a struct epoll,
and if so, do a recursive walk to detect whether inserting this file would
create a loop of epoll structures, which could lead to deadlock.

[nelhage@ksplice.com: Use epmutex to serialize concurrent inserts]
Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Signed-off-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Reported-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Tested-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.34+, possibly earlier]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In several places, an epoll fd can call another file's -&gt;f_op-&gt;poll()
method with ep-&gt;mtx held.  This is in general unsafe, because that other
file could itself be an epoll fd that contains the original epoll fd.

The code defends against this possibility in its own -&gt;poll() method using
ep_call_nested, but there are several other unsafe calls to -&gt;poll
elsewhere that can be made to deadlock.  For example, the following simple
program causes the call in ep_insert recursively call the original fd's
-&gt;poll, leading to deadlock:

 #include &lt;unistd.h&gt;
 #include &lt;sys/epoll.h&gt;

 int main(void) {
     int e1, e2, p[2];
     struct epoll_event evt = {
         .events = EPOLLIN
     };

     e1 = epoll_create(1);
     e2 = epoll_create(2);
     pipe(p);

     epoll_ctl(e2, EPOLL_CTL_ADD, e1, &amp;evt);
     epoll_ctl(e1, EPOLL_CTL_ADD, p[0], &amp;evt);
     write(p[1], p, sizeof p);
     epoll_ctl(e1, EPOLL_CTL_ADD, e2, &amp;evt);

     return 0;
 }

On insertion, check whether the inserted file is itself a struct epoll,
and if so, do a recursive walk to detect whether inserting this file would
create a loop of epoll structures, which could lead to deadlock.

[nelhage@ksplice.com: Use epmutex to serialize concurrent inserts]
Signed-off-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Signed-off-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Reported-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Tested-by: Nelson Elhage &lt;nelhage@ksplice.com&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.34+, possibly earlier]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/eventpoll.c: fix spelling</title>
<updated>2011-02-17T16:38:09+00:00</updated>
<author>
<name>Daniel Baluta</name>
<email>daniel.baluta@gmail.com</email>
</author>
<published>2011-01-30T21:42:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bf6a41db7726e6c09b9c6ac993457b7260473406'/>
<id>bf6a41db7726e6c09b9c6ac993457b7260473406</id>
<content type='text'>
eventpoll.c has wonderful comments but some annoying typos
sneaked in:
	* toepoll_ctl -&gt; to epoll_ctl
	* rapresent -&gt; represents
	* sructure -&gt; structure
	* machanism -&gt; mechanism
	* trasfering -&gt; transferring

Signed-off-by: Daniel Baluta &lt;daniel.baluta@gmail.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
eventpoll.c has wonderful comments but some annoying typos
sneaked in:
	* toepoll_ctl -&gt; to epoll_ctl
	* rapresent -&gt; represents
	* sructure -&gt; structure
	* machanism -&gt; mechanism
	* trasfering -&gt; transferring

Signed-off-by: Daniel Baluta &lt;daniel.baluta@gmail.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: epoll_wait() should not use timespec_add_ns()</title>
<updated>2011-02-03T00:03:18+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-02-01T23:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0781b909b5586f4db720b5d1838b78f9d8e42f14'/>
<id>0781b909b5586f4db720b5d1838b78f9d8e42f14</id>
<content type='text'>
commit 95aac7b1cd224f ("epoll: make epoll_wait() use the hrtimer range
feature") added a performance regression because it uses timespec_add_ns()
with potential very large 'ns' values.

[akpm@linux-foundation.org: s/epoll_set_mstimeout/ep_set_mstimeout/, per Davide]
Reported-by: Simon Kirby &lt;sim@hostway.ca&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.37.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 95aac7b1cd224f ("epoll: make epoll_wait() use the hrtimer range
feature") added a performance regression because it uses timespec_add_ns()
with potential very large 'ns' values.

[akpm@linux-foundation.org: s/epoll_set_mstimeout/ep_set_mstimeout/, per Davide]
Reported-by: Simon Kirby &lt;sim@hostway.ca&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: &lt;stable@kernel.org&gt;		[2.6.37.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: convert max_user_watches to long</title>
<updated>2011-01-13T16:03:12+00:00</updated>
<author>
<name>Robin Holt</name>
<email>holt@sgi.com</email>
</author>
<published>2011-01-13T01:00:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52bd19f7691b2ea6433aef0ef94c08c57efd7e79'/>
<id>52bd19f7691b2ea6433aef0ef94c08c57efd7e79</id>
<content type='text'>
On a 16TB machine, max_user_watches has an integer overflow.  Convert it
to use a long and handle the associated fallout.

Signed-off-by: Robin Holt &lt;holt@sgi.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On a 16TB machine, max_user_watches has an integer overflow.  Convert it
to use a long and handle the associated fallout.

Signed-off-by: Robin Holt &lt;holt@sgi.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>epoll: make epoll_wait() use the hrtimer range feature</title>
<updated>2010-10-28T01:03:18+00:00</updated>
<author>
<name>Shawn Bohrer</name>
<email>shawn.bohrer@gmail.com</email>
</author>
<published>2010-10-27T22:34:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=95aac7b1cd224f568fb83937044cd303ff11b029'/>
<id>95aac7b1cd224f568fb83937044cd303ff11b029</id>
<content type='text'>
This make epoll use hrtimers for the timeout value which prevents
epoll_wait() from timing out up to a millisecond early.

This mirrors the behavior of select() and poll().

Signed-off-by: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This make epoll use hrtimers for the timeout value which prevents
epoll_wait() from timing out up to a millisecond early.

This mirrors the behavior of select() and poll().

Signed-off-by: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Acked-by: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>llseek: automatically add .llseek fop</title>
<updated>2010-10-15T13:53:27+00:00</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2010-08-15T16:52:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6038f373a3dc1f1c26496e60b6c40b164716f07e'/>
<id>6038f373a3dc1f1c26496e60b6c40b164716f07e</id>
<content type='text'>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
&lt;+...
nonseekable_open(...)
...+&gt;
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
&lt;+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+&gt;
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
&lt;+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+&gt;
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
&lt;+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+&gt;
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek &amp;&amp; has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 &amp;&amp; !fops2 &amp;&amp; !fops3 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write &amp;&amp; !has_read &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read &amp;&amp; !has_write &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read &amp;&amp; !has_write &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Julia Lawall &lt;julia@diku.dk&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.

The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.

New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time.  Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.

The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.

Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.

Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.

===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
//   but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
&lt;+...
nonseekable_open(...)
...+&gt;
}

@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
&lt;+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+&gt;
}

@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
&lt;+...
(
   *off = E
|
   *off += E
|
   func(..., off, ...)
|
   E = *off
)
...+&gt;
}

@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}

@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
&lt;+...
(
  *off = E
|
  *off += E
|
  func(..., off, ...)
|
  E = *off
)
...+&gt;
}

@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}

@ fops0 @
identifier fops;
@@
struct file_operations fops = {
 ...
};

@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
 .llseek = llseek_f,
...
};

@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
 .read = read_f,
...
};

@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
 .write = write_f,
...
};

@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
 .open = open_f,
...
};

// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek &amp;&amp; has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
...  .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};

@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
...  .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};

// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
...  .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};

// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};

// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};

@ fops3 depends on !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+	.llseek = default_llseek, /* write accesses f_pos */
};

// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////

@ fops4 depends on !fops1 &amp;&amp; !fops2 &amp;&amp; !fops3 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
 .write = write_f,
 .read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};

@ depends on has_write &amp;&amp; !has_read &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};

@ depends on has_read &amp;&amp; !has_write &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};

@ depends on !has_read &amp;&amp; !has_write &amp;&amp; !fops1 &amp;&amp; !fops2 &amp;&amp; !has_llseek &amp;&amp; !nonseekable1 &amp;&amp; !nonseekable2 &amp;&amp; !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Julia Lawall &lt;julia@diku.dk&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
