<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/fuse/inode.c, branch linux-3.0.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>fuse: fix stat call on 32 bit platforms</title>
<updated>2012-06-17T18:23:12+00:00</updated>
<author>
<name>Pavel Shilovsky</name>
<email>piastry@etersoft.ru</email>
</author>
<published>2012-05-10T15:49:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6140710c5dba509a20b26dfe38b58f40baf2a2c8'/>
<id>6140710c5dba509a20b26dfe38b58f40baf2a2c8</id>
<content type='text'>
commit 45c72cd73c788dd18c8113d4a404d6b4a01decf1 upstream.

Now we store attr-&gt;ino at inode-&gt;i_ino, return attr-&gt;ino at the
first time and then return inode-&gt;i_ino if the attribute timeout
isn't expired. That's wrong on 32 bit platforms because attr-&gt;ino
is 64 bit and inode-&gt;i_ino is 32 bit in this case.

Fix this by saving 64 bit ino in fuse_inode structure and returning
it every time we call getattr. Also squash attr-&gt;ino into inode-&gt;i_ino
explicitly.

Signed-off-by: Pavel Shilovsky &lt;piastry@etersoft.ru&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 45c72cd73c788dd18c8113d4a404d6b4a01decf1 upstream.

Now we store attr-&gt;ino at inode-&gt;i_ino, return attr-&gt;ino at the
first time and then return inode-&gt;i_ino if the attribute timeout
isn't expired. That's wrong on 32 bit platforms because attr-&gt;ino
is 64 bit and inode-&gt;i_ino is 32 bit in this case.

Fix this by saving 64 bit ino in fuse_inode structure and returning
it every time we call getattr. Also squash attr-&gt;ino into inode-&gt;i_ino
explicitly.

Signed-off-by: Pavel Shilovsky &lt;piastry@etersoft.ru&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>more conservative S_NOSEC handling</title>
<updated>2011-06-03T22:24:58+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2011-06-03T22:24:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9e1f1de02c2275d7172e18dc4e7c2065777611bf'/>
<id>9e1f1de02c2275d7172e18dc4e7c2065777611bf</id>
<content type='text'>
Caching "we have already removed suid/caps" was overenthusiastic as merged.
On network filesystems we might have had suid/caps set on another client,
silently picked by this client on revalidate, all of that *without* clearing
the S_NOSEC flag.

AFAICS, the only reasonably sane way to deal with that is
	* new superblock flag; unless set, S_NOSEC is not going to be set.
	* local block filesystems set it in their -&gt;mount() (more accurately,
mount_bdev() does, so does btrfs -&gt;mount(), users of mount_bdev() other than
local block ones clear it)
	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
it'll need to set MS_NOSEC in sb-&gt;s_flags *AND* take care to clear S_NOSEC when
inode attribute changes are picked from other clients.

It's not an earth-shattering hole (anybody that can set suid on another client
will almost certainly be able to write to the file before doing that anyway),
but it's a bug that needs fixing.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Caching "we have already removed suid/caps" was overenthusiastic as merged.
On network filesystems we might have had suid/caps set on another client,
silently picked by this client on revalidate, all of that *without* clearing
the S_NOSEC flag.

AFAICS, the only reasonably sane way to deal with that is
	* new superblock flag; unless set, S_NOSEC is not going to be set.
	* local block filesystems set it in their -&gt;mount() (more accurately,
mount_bdev() does, so does btrfs -&gt;mount(), users of mount_bdev() other than
local block ones clear it)
	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
it'll need to set MS_NOSEC in sb-&gt;s_flags *AND* take care to clear S_NOSEC when
inode attribute changes are picked from other clients.

It's not an earth-shattering hole (anybody that can set suid on another client
will almost certainly be able to write to the file before doing that anyway),
but it's a bug that needs fixing.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block</title>
<updated>2011-03-24T17:16:26+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-03-24T17:16:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6c5103890057b1bb781b26b7aae38d33e4c517d8'/>
<id>6c5103890057b1bb781b26b7aae38d33e4c517d8</id>
<content type='text'>
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb-&gt;s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq-&gt;cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
  Documentation/iostats.txt: bit-size reference etc.
  cfq-iosched: removing unnecessary think time checking
  cfq-iosched: Don't clear queue stats when preempt.
  blk-throttle: Reset group slice when limits are changed
  blk-cgroup: Only give unaccounted_time under debug
  cfq-iosched: Don't set active queue in preempt
  block: fix non-atomic access to genhd inflight structures
  block: attempt to merge with existing requests on plug flush
  block: NULL dereference on error path in __blkdev_get()
  cfq-iosched: Don't update group weights when on service tree
  fs: assign sb-&gt;s_bdi to default_backing_dev_info if the bdi is going away
  block: Require subsystems to explicitly allocate bio_set integrity mempool
  jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
  fs: make fsync_buffers_list() plug
  mm: make generic_writepages() use plugging
  blk-cgroup: Add unaccounted time to timeslice_used.
  block: fixup plugging stubs for !CONFIG_BLOCK
  block: remove obsolete comments for blkdev_issue_zeroout.
  blktrace: Use rq-&gt;cmd_flags directly in blk_add_trace_rq.
  ...

Fix up conflicts in fs/{aio.c,super.c}
</pre>
</div>
</content>
</entry>
<entry>
<title>exportfs: Return the minimum required handle size</title>
<updated>2011-03-14T13:15:28+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2011-01-29T13:13:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5fe0c2378884e68beb532f5890cc0e3539ac747b'/>
<id>5fe0c2378884e68beb532f5890cc0e3539ac747b</id>
<content type='text'>
The exportfs encode handle function should return the minimum required
handle size. This helps user to find out the handle size by passing 0
handle size in the first step and then redoing to the call again with
the returned handle size value.

Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The exportfs encode handle function should return the minimum required
handle size. This helps user to find out the handle size by passing 0
handle size in the first step and then redoing to the call again with
the returned handle size value.

Acked-by: Serge Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>block: remove per-queue plugging</title>
<updated>2011-03-10T07:52:07+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>jaxboe@fusionio.com</email>
</author>
<published>2011-03-10T07:52:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7eaceaccab5f40bbfda044629a6298616aeaed50'/>
<id>7eaceaccab5f40bbfda044629a6298616aeaed50</id>
<content type='text'>
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops-&gt;sync_page().

Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops-&gt;sync_page().

Signed-off-by: Jens Axboe &lt;jaxboe@fusionio.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>switch fuse</title>
<updated>2011-01-13T01:02:44+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2010-12-18T16:15:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c35eebe9939f55b9d51631d03301a7af19090dcc'/>
<id>c35eebe9939f55b9d51631d03301a7af19090dcc</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse</title>
<updated>2011-01-10T15:43:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-01-10T15:43:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7d44b0440147d83a65270205b22e7d365de28948'/>
<id>7d44b0440147d83a65270205b22e7d365de28948</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix ioctl ABI
  fuse: allow batching of FORGET requests
  fuse: separate queue for FORGET requests
  fuse: ioctl cleanup

Fix up trivial conflict in fs/fuse/inode.c due to RCU lookup having done
the RCU-freeing of the inode in fuse_destroy_inode().
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix ioctl ABI
  fuse: allow batching of FORGET requests
  fuse: separate queue for FORGET requests
  fuse: ioctl cleanup

Fix up trivial conflict in fs/fuse/inode.c due to RCU lookup having done
the RCU-freeing of the inode in fuse_destroy_inode().
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: dcache reduce branches in lookup path</title>
<updated>2011-01-07T06:50:28+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@kernel.dk</email>
</author>
<published>2011-01-07T06:49:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb045adb99d9b7c562dc7fef834857f78249daa1'/>
<id>fb045adb99d9b7c562dc7fef834857f78249daa1</id>
<content type='text'>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry-&gt;d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.

Patched with:

git grep -E '[.&gt;]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)-&gt;d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&amp;\1, \2);/' -i

Signed-off-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry-&gt;d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.

Patched with:

git grep -E '[.&gt;]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)-&gt;d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&amp;\1, \2);/' -i

Signed-off-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: icache RCU free inodes</title>
<updated>2011-01-07T06:50:26+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@kernel.dk</email>
</author>
<published>2011-01-07T06:49:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fa0d7e3de6d6fc5004ad9dea0dd6b286af8f03e9'/>
<id>fa0d7e3de6d6fc5004ad9dea0dd6b286af8f03e9</id>
<content type='text'>
RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page-&gt;mapping.

The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.

Signed-off-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
RCU free the struct inode. This will allow:

- Subsequent store-free path walking patch. The inode must be consulted for
  permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
  to take i_lock no longer need to take sb_inode_list_lock to walk the list in
  the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
  page lock to follow page-&gt;mapping.

The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.

In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.

The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.

Signed-off-by: Nick Piggin &lt;npiggin@kernel.dk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: separate queue for FORGET requests</title>
<updated>2010-12-07T19:16:56+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@suse.cz</email>
</author>
<published>2010-12-07T19:16:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=07e77dca8a1f17a724a9b7449f0ca02e70e9d057'/>
<id>07e77dca8a1f17a724a9b7449f0ca02e70e9d057</id>
<content type='text'>
Terje Malmedal reports that a fuse filesystem with 32 million inodes
on a machine with lots of memory can go unresponsive for up to 30
minutes when all those inodes are evicted from the icache.

The reason is that FORGET messages, sent when the inode is evicted,
are queued up together with regular filesystem requests, and while the
huge queue of FORGET messages are processed no other filesystem
operation can proceed.

Since a full fuse request structure is allocated for each inode, these
take up quite a bit of memory as well.

To solve these issues, create a slim 'fuse_forget_link' structure
containing just the minimum of information required to send the FORGET
request and chain these on a separate queue.

When userspace is asking for a request make sure that FORGET and
non-FORGET requests are selected fairly: for each 8 non-FORGET allow
16 FORGET requests.  This will make sure FORGETs do not pile up, yet
other requests are also allowed to proceed while the queued FORGETs
are processed.

Reported-by: Terje Malmedal &lt;terje.malmedal@usit.uio.no&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Terje Malmedal reports that a fuse filesystem with 32 million inodes
on a machine with lots of memory can go unresponsive for up to 30
minutes when all those inodes are evicted from the icache.

The reason is that FORGET messages, sent when the inode is evicted,
are queued up together with regular filesystem requests, and while the
huge queue of FORGET messages are processed no other filesystem
operation can proceed.

Since a full fuse request structure is allocated for each inode, these
take up quite a bit of memory as well.

To solve these issues, create a slim 'fuse_forget_link' structure
containing just the minimum of information required to send the FORGET
request and chain these on a separate queue.

When userspace is asking for a request make sure that FORGET and
non-FORGET requests are selected fairly: for each 8 non-FORGET allow
16 FORGET requests.  This will make sure FORGETs do not pile up, yet
other requests are also allowed to proceed while the queued FORGETs
are processed.

Reported-by: Terje Malmedal &lt;terje.malmedal@usit.uio.no&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
</pre>
</div>
</content>
</entry>
</feed>
