linux.git/fs/eventpoll.c, branch v2.6.18

[PATCH] eventpoll.c compile fix

2006-08-27T18:01:31+00:00

Fix two compile failures in eventpoll.c code which would happen if
DEBUG_EPOLL is bigger than zero.

Signed-off-by: Masoud Sharbiani 
Cc: Davide Libenzi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] sched: cleanup, remove task_t, convert to struct task_struct

2006-07-03T22:27:11+00:00

cleanup: remove task_t and convert all the uses to struct task_struct. I
introduced it for the scheduler anno and it was a mistake.

Conversion was mostly scripted, the result was reviewed and all
secondary whitespace and style impact (if any) was fixed up by hand.

Signed-off-by: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] epoll: use unlocked wqueue operations

2006-06-25T17:01:13+00:00

A few days ago Arjan signaled a lockdep red flag on epoll locks, and
precisely between the epoll's device structure lock (->lock) and the wait
queue head lock (->lock).

Like I explained in another email, and directly to Arjan, this can't happen
in reality because of the explicit check at eventpoll.c:592, that does not
allow to drop an epoll fd inside the same epoll fd.  Since lockdep is
working on per-structure locks, it will never be able to know of policies
enforced in other parts of the code.

It was decided time ago of having the ability to drop epoll fds inside
other epoll fds, that triggers a very trick wakeup operations (due to
possibly reentrant callback-driven wakeups) handled by the
ep_poll_safewake() function.  While looking again at the code though, I
noticed that all the operations done on the epoll's main structure wait
queue head (->wq) are already protected by the epoll lock (->lock), so that
locked-style functions can be used to manipulate the ->wq member.  This
makes both a lock-acquire save, and lockdep happy.

Running totalmess on my dual opteron for a while did not reveal any problem
so far:

http://www.xmailserver.org/totalmess.c

Signed-off-by: Davide Libenzi 
Cc: Arjan van de Ven 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] VFS: Permit filesystem to override root dentry on mount

2006-06-23T14:42:45+00:00

Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers.  For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing.  In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

 (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
     pointer argument and return an integer, so most filesystems have to change
     very little.

 (*) If one of the convenience function is not used, then get_sb() should
     normally call simple_set_mnt() to instantiate the vfsmount. This will
     always return 0, and so can be tail-called from get_sb().

 (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
     dcache upon superblock destruction rather than shrink_dcache_anon().

     This is required because the superblock may now have multiple trees that
     aren't actually bound to s_root, but that still need to be cleaned up. The
     currently called functions assume that the whole tree is rooted at s_root,
     and that anonymous dentries are not the roots of trees which results in
     dentries being left unculled.

     However, with the way NFS superblock sharing are currently set to be
     implemented, these assumptions are violated: the root of the filesystem is
     simply a dummy dentry and inode (the real inode for '/' may well be
     inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
     with child trees.

     [*] Anonymous until discovered from another tree.

 (*) The documentation has been adjusted, including the additional bit of
     changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells 
Acked-by: Al Viro 
Cc: Nathan Scott 
Cc: Roland Dreier 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[RBTREE] Update eventpoll.c to use rb_parent() accessor macro.

2006-04-21T12:17:24+00:00

Signed-off-by: David Woodhouse

[PATCH] uniform POLLRDHUP handling between epoll and poll/select

2006-04-11T13:18:42+00:00

As reported by Michael Kerrisk, POLLRDHUP handling was not consistent
between epoll and poll/select, since in epoll it was unmaskeable.  This
patch brings uniformity in POLLRDHUP handling.

Signed-off-by: Davide Libenzi 
Cc: Michael Kerrisk 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] Make most file operations structs in fs/ const

2006-03-28T17:16:06+00:00

This is a conversion to make the various file_operations structs in fs/
const.  Basically a regexp job, with a few manual fixups

The goal is both to increase correctness (harder to accidentally write to
shared datastructures) and reducing the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read only and thus
cache clean)

Signed-off-by: Arjan van de Ven 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] Use __read_mostly on some hot fs variables

2006-03-26T16:56:56+00:00

I discovered on oprofile hunting on a SMP platform that dentry lookups were
slowed down because d_hash_mask, d_hash_shift and dentry_hashtable were in
a cache line that contained inodes_stat.  So each time inodes_stats is
changed by a cpu, other cpus have to refill their cache line.

This patch moves some variables to the __read_mostly section, in order to
avoid false sharing.  RCU dentry lookups can go full speed.

Signed-off-by: Eric Dumazet 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] POLLRDHUP/EPOLLRDHUP handling for half-closed devices notifications

2006-03-25T16:22:56+00:00

Implement the half-closed devices notifiation, by adding a new POLLRDHUP
(and its alias EPOLLRDHUP) bit to the existing poll/select sets.  Since the
existing POLLHUP handling, that does not report correctly half-closed
devices, was feared to be changed, this implementation leaves the current
POLLHUP reporting unchanged and simply add a new bit that is set in the few
places where it makes sense.  The same thing was discussed and conceptually
agreed quite some time ago:

http://lkml.org/lkml/2003/7/12/116

Since this new event bit is added to the existing Linux poll infrastruture,
even the existing poll/select system calls will be able to use it.  As far
as the existing POLLHUP handling, the patch leaves it as is.  The
pollrdhup-2.6.16.rc5-0.10.diff defines the POLLRDHUP for all the existing
archs and sets the bit in the six relevant files.  The other attached diff
is the simple change required to sys/epoll.h to add the EPOLLRDHUP
definition.

There is "a stupid program" to test POLLRDHUP delivery here:

 http://www.xmailserver.org/pollrdhup-test.c

It tests poll(2), but since the delivery is same epoll(2) will work equally.

Signed-off-by: Davide Libenzi 
Cc: "David S. Miller" 
Cc: Michael Kerrisk 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

[PATCH] get_empty_filp tweaks, inline epoll_init_file()

2006-03-23T15:38:17+00:00

Eliminate a handful of cache references by keeping current in a register
instead of reloading (helps x86) and avoiding the overhead of a function
call.  Inlining eventpoll_init_file() saves 24 bytes.  Also reorder file
initialization to make writes occur more sequentially.

Signed-off-by: Benjamin LaHaise 
Cc: Davide Libenzi 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds