<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/select.c, branch v2.6.24</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>fs/select, remove unused macros</title>
<updated>2007-10-19T18:53:41+00:00</updated>
<author>
<name>Jiri Slaby</name>
<email>jirislaby@gmail.com</email>
</author>
<published>2007-10-19T06:40:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1276b103c20603835d9b903cae099125e8c2c5a3'/>
<id>1276b103c20603835d9b903cae099125e8c2c5a3</id>
<content type='text'>
fs/select, remove unused macros

this is due to preparation for global BIT macro

Signed-off-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fs/select, remove unused macros

this is due to preparation for global BIT macro

Signed-off-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Use ERESTART_RESTARTBLOCK if poll() is interrupted by a signal</title>
<updated>2007-10-17T15:42:53+00:00</updated>
<author>
<name>Chris Wright</name>
<email>chrisw@sous-sol.org</email>
</author>
<published>2007-10-17T06:27:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3075d9da0b4ccc88959db30de80ebd11d2dde175'/>
<id>3075d9da0b4ccc88959db30de80ebd11d2dde175</id>
<content type='text'>
Lomesh reported poll returning EINTR during suspend/resume cycle.  This is
caused by the STOP/CONT cycle that the freezer uses, generating a pending
signal for what in effect is an ignored signal.  In general poll is a
little eager in returning EINTR, when it could try not bother userspace and
simply restart the syscall.  Both select and ppoll do use ERESTARTNOHAND to
restart the syscall.  Oleg points out that simply using ERESTARTNOHAND will
cause poll to restart with original timeout value.  which could ultimately
lead to process never returning to userspace.  Instead use
ERESTART_RESTARTBLOCK, and restart poll with updated timeout value.
Inspired by Manfred's use ERESTARTNOHAND in poll patch.

[bunk@kernel.org: do_restart_poll() can become static]
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: "Agarwal, Lomesh" &lt;lomesh.agarwal@intel.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Lomesh reported poll returning EINTR during suspend/resume cycle.  This is
caused by the STOP/CONT cycle that the freezer uses, generating a pending
signal for what in effect is an ignored signal.  In general poll is a
little eager in returning EINTR, when it could try not bother userspace and
simply restart the syscall.  Both select and ppoll do use ERESTARTNOHAND to
restart the syscall.  Oleg points out that simply using ERESTARTNOHAND will
cause poll to restart with original timeout value.  which could ultimately
lead to process never returning to userspace.  Instead use
ERESTART_RESTARTBLOCK, and restart poll with updated timeout value.
Inspired by Manfred's use ERESTARTNOHAND in poll patch.

[bunk@kernel.org: do_restart_poll() can become static]
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: "Agarwal, Lomesh" &lt;lomesh.agarwal@intel.com&gt;
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;
Signed-off-by: Adrian Bunk &lt;bunk@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>do_poll: return -EINTR when signalled</title>
<updated>2007-10-17T15:42:48+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@tv-sign.ru</email>
</author>
<published>2007-10-17T06:26:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9bf084f70ffde6521d113593b89461a5bd2a303b'/>
<id>9bf084f70ffde6521d113593b89461a5bd2a303b</id>
<content type='text'>
do_poll() checks signal_pending() but returns 0 when interrupted.  This means
the caller has to check signal_pending() again.

Change it to return -EINTR when signal_pending() and count == 0.

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
do_poll() checks signal_pending() but returns 0 when interrupted.  This means
the caller has to check signal_pending() again.

Change it to return -EINTR when signal_pending() and count == 0.

Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Andi Kleen &lt;ak@suse.de&gt;
Cc: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>do_sys_poll: simplify playing with on-stack data</title>
<updated>2007-10-17T15:42:48+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@tv-sign.ru</email>
</author>
<published>2007-10-17T06:26:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=252e5725cfb55a89e54888317856903fef9d5031'/>
<id>252e5725cfb55a89e54888317856903fef9d5031</id>
<content type='text'>
Cleanup. Lessens both the source and compiled code (100 bytes) and imho makes
the code much more understandable.

With this patch "struct poll_list *head" always points to on-stack stack_pps,
so we can remove all "is it on-stack" and "was it initialized" checks.

Also, move poll_initwait/poll_freewait and -EINTR detection closer to the
do_poll()'s callsite.

[akpm@linux-foundation.org: fix warning (size_t != uint)]
Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Looks-good-to: Andi Kleen &lt;ak@suse.de&gt;
Cc: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cleanup. Lessens both the source and compiled code (100 bytes) and imho makes
the code much more understandable.

With this patch "struct poll_list *head" always points to on-stack stack_pps,
so we can remove all "is it on-stack" and "was it initialized" checks.

Also, move poll_initwait/poll_freewait and -EINTR detection closer to the
do_poll()'s callsite.

[akpm@linux-foundation.org: fix warning (size_t != uint)]
Signed-off-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Looks-good-to: Andi Kleen &lt;ak@suse.de&gt;
Cc: Davide Libenzi &lt;davidel@xmailserver.org&gt;
Cc: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fix select on /proc files without -&gt;poll</title>
<updated>2007-09-12T00:21:20+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2007-09-11T22:23:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dd23aae4f5edf4e1dbd8f7f8013a754ba3253f48'/>
<id>dd23aae4f5edf4e1dbd8f7f8013a754ba3253f48</id>
<content type='text'>
Taneli Vähäkangas &lt;vahakang@cs.helsinki.fi&gt; reported that commit
786d7e1612f0b0adb6046f19b906609e4fe8b1ba aka "Fix rmmod/read/write races
in /proc entries" broke SBCL + SLIME combo.

The old code in do_select() used DEFAULT_POLLMASK, if couldn't find
-&gt;poll handler.  The new code makes -&gt;poll always there and returns 0 by
default, which is not correct.  Return DEFAULT_POLLMASK instead.

Steps to reproduce:

	install emacs, SBCL, SLIME
	emacs
	M-x slime	in *inferior-lisp* buffer
	[watch it doing "Connecting to Swank on port X.."]

Please, apply before 2.6.23.

P.S.: why SBCL can't just read(2) /proc/cpuinfo is a mystery.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: T Taneli Vahakangas &lt;vahakang@cs.helsinki.fi&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Taneli Vähäkangas &lt;vahakang@cs.helsinki.fi&gt; reported that commit
786d7e1612f0b0adb6046f19b906609e4fe8b1ba aka "Fix rmmod/read/write races
in /proc entries" broke SBCL + SLIME combo.

The old code in do_select() used DEFAULT_POLLMASK, if couldn't find
-&gt;poll handler.  The new code makes -&gt;poll always there and returns 0 by
default, which is not correct.  Return DEFAULT_POLLMASK instead.

Steps to reproduce:

	install emacs, SBCL, SLIME
	emacs
	M-x slime	in *inferior-lisp* buffer
	[watch it doing "Connecting to Swank on port X.."]

Please, apply before 2.6.23.

P.S.: why SBCL can't just read(2) /proc/cpuinfo is a mystery.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: T Taneli Vahakangas &lt;vahakang@cs.helsinki.fi&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Style fix in fs/select.c</title>
<updated>2007-05-09T05:10:02+00:00</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2007-05-09T05:10:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ccf6780dc3d228f380e17b6858b93fc48e40afd4'/>
<id>ccf6780dc3d228f380e17b6858b93fc48e40afd4</id>
<content type='text'>
Signed-off-by: WANG Cong  &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: WANG Cong  &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ROUND_UP macro cleanup in fs/(select|compat|readdir).c</title>
<updated>2007-05-08T18:15:09+00:00</updated>
<author>
<name>Milind Arun Choudhary</name>
<email>milindchoudhary@gmail.com</email>
</author>
<published>2007-05-08T07:29:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=022a1692444cd683ef42f637cc717db4d8fd9378'/>
<id>022a1692444cd683ef42f637cc717db4d8fd9378</id>
<content type='text'>
ROUND_UP macro cleanup use,ALIGN or DIV_ROUND_UP where ever appropriate.

Signed-off-by: Milind Arun Choudhary &lt;milindchoudhary@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ROUND_UP macro cleanup use,ALIGN or DIV_ROUND_UP where ever appropriate.

Signed-off-by: Milind Arun Choudhary &lt;milindchoudhary@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>header cleaning: don't include smp_lock.h when not used</title>
<updated>2007-05-08T18:15:07+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>randy.dunlap@oracle.com</email>
</author>
<published>2007-05-08T07:28:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e63340ae6b6205fef26b40a75673d1c9c0c8bb90'/>
<id>e63340ae6b6205fef26b40a75673d1c9c0c8bb90</id>
<content type='text'>
Remove includes of &lt;linux/smp_lock.h&gt; where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove includes of &lt;linux/smp_lock.h&gt; where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] fdtable: Make fdarray and fdsets equal in size</title>
<updated>2006-12-10T17:57:22+00:00</updated>
<author>
<name>Vadim Lobanov</name>
<email>vlobanov@speakeasy.net</email>
</author>
<published>2006-12-10T10:21:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bbea9f69668a3d0cf9feba15a724cd02896f8675'/>
<id>bbea9f69668a3d0cf9feba15a724cd02896f8675</id>
<content type='text'>
Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets.  The code allows the number of fds supported by the
fdarray (fdtable-&gt;max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable-&gt;max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal.  This
patch removes fdtable-&gt;max_fdset.  As an added bonus, most of the supporting
code becomes simpler.

Signed-off-by: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, each fdtable supports three dynamically-sized arrays of data: the
fdarray and two fdsets.  The code allows the number of fds supported by the
fdarray (fdtable-&gt;max_fds) to differ from the number of fds supported by each
of the fdsets (fdtable-&gt;max_fdset).

In practice, it is wasteful for these two sizes to differ: whenever we hit a
limit on the smaller-capacity structure, we will reallocate the entire fdtable
and all the dynamic arrays within it, so any delta in the memory used by the
larger-capacity structure will never be touched at all.

Rather than hogging this excess, we shouldn't even allocate it in the first
place, and keep the capacities of the fdarray and the fdsets equal.  This
patch removes fdtable-&gt;max_fdset.  As an added bonus, most of the supporting
code becomes simpler.

Signed-off-by: Vadim Lobanov &lt;vlobanov@speakeasy.net&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] enforce RLIMIT_NOFILE in poll()</title>
<updated>2006-09-29T16:18:23+00:00</updated>
<author>
<name>Chris Snook</name>
<email>csnook@redhat.com</email>
</author>
<published>2006-09-29T09:01:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e6fd33b75602ced4c5d43e99a10a1d13f33d4f4'/>
<id>4e6fd33b75602ced4c5d43e99a10a1d13f33d4f4</id>
<content type='text'>
POSIX states that poll() shall fail with EINVAL if nfds &gt; OPEN_MAX.  In
this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
of current-&gt;signal-&gt;rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
the compile-time constant which happens to also be named OPEN_MAX.  In the
current code, an application may poll up to max_fdset file descriptors,
even if this exceeds RLIMIT_NOFILE.  The current code also breaks
applications which poll more than max_fdset descriptors, which worked circa
2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
been changed at run time with ulimit -n.

To elaborate on the rationale for this, there are three cases:

1) RLIMIT_NOFILE is at the default value of 1024

In this (default) case, the patch changes nothing.  Calls with nfds &gt; 1024
fail with EINVAL both before and after the patch, and calls with nfds &lt;=
1024 pass the check both before and after the patch, since 1024 is the
initial value of max_fdset.

2) RLIMIT_NOFILE has been raised above the default

In this case, poll() becomes more permissive, allowing polling up to
RLIMIT_NOFILE file descriptors even if less than 1024 have been opened.
The patch won't introduce new errors here.  If an application somehow
depends on poll() failing when it polls with duplicate or invalid file
descriptors, it's already broken, since this is already allowed below 1024,
and will also work above 1024 if enough file descriptors have been open at
some point to cause max_fdset to have been increased above nfds.

3) RLIMIT_NOFILE has been lowered below the default

In this case, the system administrator or the user has gone out of their
way to protect the system from inefficient (or malicious) applications
wasting kernel memory.  The current code allows polling up to 1024 file
descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
or administrator intended.  Well-written applications which only poll
valid, unique file descriptors will never notice the difference, because
they'll hit the limit on open() first.  If an application gets broken
because of the patch in this case, then it was already poorly/maliciously
designed, and allowing it to work in the past was a violation of POSIX and
a DoS risk on low-resource systems.

With this patch, poll() will permit exactly what POSIX suggests, no more,
no less, and for any run-time value set with ulimit -n, not just 256 or
1024.  There are existing apps which which poll a large number of file
descriptors, some of which may be invalid, and if those numbers stradle
1024, they currently fail with or without the patch in -mm, though they
worked fine under 2.4.18.

Signed-off-by: Chris Snook &lt;csnook@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
POSIX states that poll() shall fail with EINVAL if nfds &gt; OPEN_MAX.  In
this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
of current-&gt;signal-&gt;rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
the compile-time constant which happens to also be named OPEN_MAX.  In the
current code, an application may poll up to max_fdset file descriptors,
even if this exceeds RLIMIT_NOFILE.  The current code also breaks
applications which poll more than max_fdset descriptors, which worked circa
2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
been changed at run time with ulimit -n.

To elaborate on the rationale for this, there are three cases:

1) RLIMIT_NOFILE is at the default value of 1024

In this (default) case, the patch changes nothing.  Calls with nfds &gt; 1024
fail with EINVAL both before and after the patch, and calls with nfds &lt;=
1024 pass the check both before and after the patch, since 1024 is the
initial value of max_fdset.

2) RLIMIT_NOFILE has been raised above the default

In this case, poll() becomes more permissive, allowing polling up to
RLIMIT_NOFILE file descriptors even if less than 1024 have been opened.
The patch won't introduce new errors here.  If an application somehow
depends on poll() failing when it polls with duplicate or invalid file
descriptors, it's already broken, since this is already allowed below 1024,
and will also work above 1024 if enough file descriptors have been open at
some point to cause max_fdset to have been increased above nfds.

3) RLIMIT_NOFILE has been lowered below the default

In this case, the system administrator or the user has gone out of their
way to protect the system from inefficient (or malicious) applications
wasting kernel memory.  The current code allows polling up to 1024 file
descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
or administrator intended.  Well-written applications which only poll
valid, unique file descriptors will never notice the difference, because
they'll hit the limit on open() first.  If an application gets broken
because of the patch in this case, then it was already poorly/maliciously
designed, and allowing it to work in the past was a violation of POSIX and
a DoS risk on low-resource systems.

With this patch, poll() will permit exactly what POSIX suggests, no more,
no less, and for any run-time value set with ulimit -n, not just 256 or
1024.  There are existing apps which which poll a large number of file
descriptors, some of which may be invalid, and if those numbers stradle
1024, they currently fail with or without the patch in -mm, though they
worked fine under 2.4.18.

Signed-off-by: Chris Snook &lt;csnook@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
