<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/reiserfs/bitmap.c, branch linux-3.3.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>reiserfs: delay reiserfs lock until journal initialization</title>
<updated>2012-01-11T00:30:53+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-01-10T23:11:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f32485be8397ad811312bc055d2e2a5906bc7576'/>
<id>f32485be8397ad811312bc055d2e2a5906bc7576</id>
<content type='text'>
In the mount path, transactions that are made before journal
initialization don't involve the filesystem.  We can delay the reiserfs
lock until we play with the journal.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In the mount path, transactions that are made before journal
initialization don't involve the filesystem.  We can delay the reiserfs
lock until we play with the journal.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>reiserfs: Properly display mount options in /proc/mounts</title>
<updated>2012-01-07T04:20:13+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-12-21T19:17:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c3aa077648e147783a7a53b409578234647db853'/>
<id>c3aa077648e147783a7a53b409578234647db853</id>
<content type='text'>
Make reiserfs properly display mount options in /proc/mounts.

CC: reiserfs-devel@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make reiserfs properly display mount options in /proc/mounts.

CC: reiserfs-devel@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>reiserfs: use hweight_long()</title>
<updated>2011-07-26T03:57:17+00:00</updated>
<author>
<name>Akinobu Mita</name>
<email>akinobu.mita@gmail.com</email>
</author>
<published>2011-07-26T00:13:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9d6bf5aa177ee7ffdcee2a590ef8a1bf9e8ade87'/>
<id>9d6bf5aa177ee7ffdcee2a590ef8a1bf9e8ade87</id>
<content type='text'>
Use hweight_long() to count free bits in the bitmap.

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use hweight_long() to count free bits in the bitmap.

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>reiserfs: use proper little-endian bitops</title>
<updated>2011-07-26T03:57:17+00:00</updated>
<author>
<name>Akinobu Mita</name>
<email>akinobu.mita@gmail.com</email>
</author>
<published>2011-07-26T00:13:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0c2fd1bfb155947a821fdaeb3c46aa1cfa20ad64'/>
<id>0c2fd1bfb155947a821fdaeb3c46aa1cfa20ad64</id>
<content type='text'>
Using __test_and_{set,clear}_bit_le() with ignoring its return value can
be replaced with __{set,clear}_bit_le().

This introduces reiserfs_{set,clear}_le_bit for __{set,clear}_bit_le and
does the above change with them.

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using __test_and_{set,clear}_bit_le() with ignoring its return value can
be replaced with __{set,clear}_bit_le().

This introduces reiserfs_{set,clear}_le_bit for __{set,clear}_bit_le and
does the above change with them.

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-next' into for-linus</title>
<updated>2010-03-08T15:55:37+00:00</updated>
<author>
<name>Jiri Kosina</name>
<email>jkosina@suse.cz</email>
</author>
<published>2010-03-08T15:55:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=318ae2edc3b29216abd8a2510f3f80b764f06858'/>
<id>318ae2edc3b29216abd8a2510f3f80b764f06858</id>
<content type='text'>
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	Documentation/filesystems/proc.txt
	arch/arm/mach-u300/include/mach/debug-macro.S
	drivers/net/qlge/qlge_ethtool.c
	drivers/net/qlge/qlge_main.c
	drivers/net/typhoon.c
</pre>
</div>
</content>
</entry>
<entry>
<title>dquot: cleanup space allocation / freeing routines</title>
<updated>2010-03-04T23:20:28+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@infradead.org</email>
</author>
<published>2010-03-03T14:05:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5dd4056db84387975140ff2568eaa0406f07985e'/>
<id>5dd4056db84387975140ff2568eaa0406f07985e</id>
<content type='text'>
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.

Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not.  Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.

Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not.  Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tree-wide: Assorted spelling fixes</title>
<updated>2010-02-09T10:13:56+00:00</updated>
<author>
<name>Daniel Mack</name>
<email>daniel@caiaq.de</email>
</author>
<published>2010-02-03T00:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3ad2f3fbb961429d2aa627465ae4829758bc7e07'/>
<id>3ad2f3fbb961429d2aa627465ae4829758bc7e07</id>
<content type='text'>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack &lt;daniel@caiaq.de&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In particular, several occurances of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack &lt;daniel@caiaq.de&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Junio C Hamano &lt;gitster@pobox.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>reiserfs: Fix possible recursive lock</title>
<updated>2009-12-14T10:43:09+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2009-12-13T21:48:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=500f5a0bf5f0624dae34307010e240ec090e4cde'/>
<id>500f5a0bf5f0624dae34307010e240ec090e4cde</id>
<content type='text'>
While allocating the bitmap using vmalloc, we hold the reiserfs lock,
which makes lockdep later reporting a possible deadlock as we may
swap out pages to allocate memory and then take the reiserfs lock
recursively:

inconsistent {RECLAIM_FS-ON-W} -&gt; {IN-RECLAIM_FS-W} usage.
kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&amp;REISERFS_SB(s)-&gt;lock){+.+.?.}, at: [&lt;c11108a8&gt;] reiserfs_write_lock+0x28/0x40
{RECLAIM_FS-ON-W} state was registered at:
  [&lt;c104e1c2&gt;] mark_held_locks+0x62/0x90
  [&lt;c104e28a&gt;] lockdep_trace_alloc+0x9a/0xc0
  [&lt;c108e396&gt;] kmem_cache_alloc+0x26/0xf0
  [&lt;c10850ec&gt;] __get_vm_area_node+0x6c/0xf0
  [&lt;c10857de&gt;] __vmalloc_node+0x7e/0xa0
  [&lt;c108597b&gt;] vmalloc+0x2b/0x30
  [&lt;c10e00b9&gt;] reiserfs_init_bitmap_cache+0x39/0x70
  [&lt;c10f8178&gt;] reiserfs_fill_super+0x2e8/0xb90
  [&lt;c1094345&gt;] get_sb_bdev+0x145/0x180
  [&lt;c10f5a11&gt;] get_super_block+0x21/0x30
  [&lt;c10931f0&gt;] vfs_kern_mount+0x40/0xd0
  [&lt;c10932d9&gt;] do_kern_mount+0x39/0xd0
  [&lt;c10a9857&gt;] do_mount+0x2c7/0x6b0
  [&lt;c10a9ca6&gt;] sys_mount+0x66/0xa0
  [&lt;c161589b&gt;] mount_block_root+0xc4/0x245
  [&lt;c1615a75&gt;] mount_root+0x59/0x5f
  [&lt;c1615b8c&gt;] prepare_namespace+0x111/0x14b
  [&lt;c1615269&gt;] kernel_init+0xcf/0xdb
  [&lt;c10031fb&gt;] kernel_thread_helper+0x7/0x1c

This is actually fine for two reasons: we call vmalloc at mount time
then it's not in the swapping out path. Also the reiserfs lock can be
acquired recursively, but since its implementation depends on a mutex,
it's hard and not necessary worth it to teach that to lockdep.

The lock is useless at mount time anyway, at least until we replay the
journal. But let's remove it from this path later as this needs
more thinking and is a sensible change.

For now we can just relax the lock around vmalloc,

Reported-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While allocating the bitmap using vmalloc, we hold the reiserfs lock,
which makes lockdep later reporting a possible deadlock as we may
swap out pages to allocate memory and then take the reiserfs lock
recursively:

inconsistent {RECLAIM_FS-ON-W} -&gt; {IN-RECLAIM_FS-W} usage.
kswapd0/312 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&amp;REISERFS_SB(s)-&gt;lock){+.+.?.}, at: [&lt;c11108a8&gt;] reiserfs_write_lock+0x28/0x40
{RECLAIM_FS-ON-W} state was registered at:
  [&lt;c104e1c2&gt;] mark_held_locks+0x62/0x90
  [&lt;c104e28a&gt;] lockdep_trace_alloc+0x9a/0xc0
  [&lt;c108e396&gt;] kmem_cache_alloc+0x26/0xf0
  [&lt;c10850ec&gt;] __get_vm_area_node+0x6c/0xf0
  [&lt;c10857de&gt;] __vmalloc_node+0x7e/0xa0
  [&lt;c108597b&gt;] vmalloc+0x2b/0x30
  [&lt;c10e00b9&gt;] reiserfs_init_bitmap_cache+0x39/0x70
  [&lt;c10f8178&gt;] reiserfs_fill_super+0x2e8/0xb90
  [&lt;c1094345&gt;] get_sb_bdev+0x145/0x180
  [&lt;c10f5a11&gt;] get_super_block+0x21/0x30
  [&lt;c10931f0&gt;] vfs_kern_mount+0x40/0xd0
  [&lt;c10932d9&gt;] do_kern_mount+0x39/0xd0
  [&lt;c10a9857&gt;] do_mount+0x2c7/0x6b0
  [&lt;c10a9ca6&gt;] sys_mount+0x66/0xa0
  [&lt;c161589b&gt;] mount_block_root+0xc4/0x245
  [&lt;c1615a75&gt;] mount_root+0x59/0x5f
  [&lt;c1615b8c&gt;] prepare_namespace+0x111/0x14b
  [&lt;c1615269&gt;] kernel_init+0xcf/0xdb
  [&lt;c10031fb&gt;] kernel_thread_helper+0x7/0x1c

This is actually fine for two reasons: we call vmalloc at mount time
then it's not in the swapping out path. Also the reiserfs lock can be
acquired recursively, but since its implementation depends on a mutex,
it's hard and not necessary worth it to teach that to lockdep.

The lock is useless at mount time anyway, at least until we replay the
journal. But let's remove it from this path later as this needs
more thinking and is a sensible change.

For now we can just relax the lock around vmalloc,

Reported-by: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kill-the-BKL/reiserfs: release the write lock inside reiserfs_read_bitmap_block()</title>
<updated>2009-09-14T05:18:11+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2009-04-30T23:44:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4c5eface5d0e4eb7f77be346193c2850e7e3b983'/>
<id>4c5eface5d0e4eb7f77be346193c2850e7e3b983</id>
<content type='text'>
reiserfs_read_bitmap_block() uses sb_bread() to read the bitmap block. This
helper might sleep.

Then, when the bkl was used, it was released at this point. We can then
relax the write lock too here.

[ Impact: release the reiserfs write lock when it is not needed ]

Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Cc: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
reiserfs_read_bitmap_block() uses sb_bread() to read the bitmap block. This
helper might sleep.

Then, when the bkl was used, it was released at this point. We can then
relax the write lock too here.

[ Impact: release the reiserfs write lock when it is not needed ]

Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Cc: Alexander Beregalov &lt;a.beregalov@gmail.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>reiserfs: kill-the-BKL</title>
<updated>2009-09-14T05:17:59+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2009-04-07T02:19:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8ebc423238341b52912c7295b045a32477b33f09'/>
<id>8ebc423238341b52912c7295b045a32477b33f09</id>
<content type='text'>
This patch is an attempt to remove the Bkl based locking scheme from
reiserfs and is intended.

It is a bit inspired from an old attempt by Peter Zijlstra:

   http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

The bkl is heavily used in this filesystem to prevent from
concurrent write accesses on the filesystem.

Reiserfs makes a deep use of the specific properties of the Bkl:

- It can be acqquired recursively by a same task
- It is released on the schedule() calls and reacquired when schedule() returns

The two properties above are a roadmap for the reiserfs write locking so it's
very hard to simply replace it with a common mutex.

- We need a recursive-able locking unless we want to restructure several blocks
  of the code.
- We need to identify the sites where the bkl was implictly relaxed
  (schedule, wait, sync, etc...) so that we can in turn release and
  reacquire our new lock explicitly.
  Such implicit releases of the lock are often required to let other
  resources producer/consumer do their job or we can suffer unexpected
  starvations or deadlocks.

So the new lock that replaces the bkl here is a per superblock mutex with a
specific property: it can be acquired recursively by a same task, like the
bkl.

For such purpose, we integrate a lock owner and a lock depth field on the
superblock information structure.

The first axis on this patch is to turn reiserfs_write_(un)lock() function
into a wrapper to manage this mutex. Also some explicit calls to
lock_kernel() have been converted to reiserfs_write_lock() helpers.

The second axis is to find the important blocking sites (schedule...(),
wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
release of the write lock on these locations before blocking. Then we can
safely wait for those who can give us resources or those who need some.
Typically this is a fight between the current writer, the reiserfs workqueue
(aka the async commiter) and the pdflush threads.

The third axis is a consequence of the second. The write lock is usually
on top of a lock dependency chain which can include the journal lock, the
flush lock or the commit lock. So it's dangerous to release and trying to
reacquire the write lock while we still hold other locks.

This is fine with the bkl:

      T1                       T2

lock_kernel()
    mutex_lock(A)
    unlock_kernel()
    // do something
                            lock_kernel()
                                mutex_lock(A) -&gt; already locked by T1
                                schedule() (and then unlock_kernel())
    lock_kernel()
    mutex_unlock(A)
    ....

This is not fine with a mutex:

      T1                       T2

mutex_lock(write)
    mutex_lock(A)
    mutex_unlock(write)
    // do something
                           mutex_lock(write)
                              mutex_lock(A) -&gt; already locked by T1
                              schedule()

    mutex_lock(write) -&gt; already locked by T2
    deadlock

The solution in this patch is to provide a helper which releases the write
lock and sleep a bit if we can't lock a mutex that depend on it. It's another
simulation of the bkl behaviour.

The last axis is to locate the fs callbacks that are called with the bkl held,
according to Documentation/filesystem/Locking.

Those are:

- reiserfs_remount
- reiserfs_fill_super
- reiserfs_put_super

Reiserfs didn't need to explicitly lock because of the context of these callbacks.
But now we must take care of that with the new locking.

After this patch, reiserfs suffers from a slight performance regression (for now).
On UP, a high volume write with dd reports an average of 27 MB/s instead
of 30 MB/s without the patch applied.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Bron Gondwana &lt;brong@fastmail.fm&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
LKML-Reference: &lt;1239070789-13354-1-git-send-email-fweisbec@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch is an attempt to remove the Bkl based locking scheme from
reiserfs and is intended.

It is a bit inspired from an old attempt by Peter Zijlstra:

   http://lkml.indiana.edu/hypermail/linux/kernel/0704.2/2174.html

The bkl is heavily used in this filesystem to prevent from
concurrent write accesses on the filesystem.

Reiserfs makes a deep use of the specific properties of the Bkl:

- It can be acqquired recursively by a same task
- It is released on the schedule() calls and reacquired when schedule() returns

The two properties above are a roadmap for the reiserfs write locking so it's
very hard to simply replace it with a common mutex.

- We need a recursive-able locking unless we want to restructure several blocks
  of the code.
- We need to identify the sites where the bkl was implictly relaxed
  (schedule, wait, sync, etc...) so that we can in turn release and
  reacquire our new lock explicitly.
  Such implicit releases of the lock are often required to let other
  resources producer/consumer do their job or we can suffer unexpected
  starvations or deadlocks.

So the new lock that replaces the bkl here is a per superblock mutex with a
specific property: it can be acquired recursively by a same task, like the
bkl.

For such purpose, we integrate a lock owner and a lock depth field on the
superblock information structure.

The first axis on this patch is to turn reiserfs_write_(un)lock() function
into a wrapper to manage this mutex. Also some explicit calls to
lock_kernel() have been converted to reiserfs_write_lock() helpers.

The second axis is to find the important blocking sites (schedule...(),
wait_on_buffer(), sync_dirty_buffer(), etc...) and then apply an explicit
release of the write lock on these locations before blocking. Then we can
safely wait for those who can give us resources or those who need some.
Typically this is a fight between the current writer, the reiserfs workqueue
(aka the async commiter) and the pdflush threads.

The third axis is a consequence of the second. The write lock is usually
on top of a lock dependency chain which can include the journal lock, the
flush lock or the commit lock. So it's dangerous to release and trying to
reacquire the write lock while we still hold other locks.

This is fine with the bkl:

      T1                       T2

lock_kernel()
    mutex_lock(A)
    unlock_kernel()
    // do something
                            lock_kernel()
                                mutex_lock(A) -&gt; already locked by T1
                                schedule() (and then unlock_kernel())
    lock_kernel()
    mutex_unlock(A)
    ....

This is not fine with a mutex:

      T1                       T2

mutex_lock(write)
    mutex_lock(A)
    mutex_unlock(write)
    // do something
                           mutex_lock(write)
                              mutex_lock(A) -&gt; already locked by T1
                              schedule()

    mutex_lock(write) -&gt; already locked by T2
    deadlock

The solution in this patch is to provide a helper which releases the write
lock and sleep a bit if we can't lock a mutex that depend on it. It's another
simulation of the bkl behaviour.

The last axis is to locate the fs callbacks that are called with the bkl held,
according to Documentation/filesystem/Locking.

Those are:

- reiserfs_remount
- reiserfs_fill_super
- reiserfs_put_super

Reiserfs didn't need to explicitly lock because of the context of these callbacks.
But now we must take care of that with the new locking.

After this patch, reiserfs suffers from a slight performance regression (for now).
On UP, a high volume write with dd reports an average of 27 MB/s instead
of 30 MB/s without the patch applied.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Reviewed-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Bron Gondwana &lt;brong@fastmail.fm&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
LKML-Reference: &lt;1239070789-13354-1-git-send-email-fweisbec@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
