<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/jbd2, branch v3.2.41</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>jbd2: fix assertion failure in jbd2_journal_flush()</title>
<updated>2013-01-16T01:13:11+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-12-21T05:15:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7a55283222cdd70c1cd7a33df0db1e0c96462ac9'/>
<id>7a55283222cdd70c1cd7a33df0db1e0c96462ac9</id>
<content type='text'>
commit d7961c7fa4d2e3c3f12be67e21ba8799b5a7238a upstream.

The following race is possible between start_this_handle() and someone
calling jbd2_journal_flush().

Process A                              Process B
start_this_handle().
  if (journal-&gt;j_barrier_count) # false
  if (!journal-&gt;j_running_transaction) { # true
    read_unlock(&amp;journal-&gt;j_state_lock);
                                       jbd2_journal_lock_updates()
                                       jbd2_journal_flush()
                                         write_lock(&amp;journal-&gt;j_state_lock);
                                         if (journal-&gt;j_running_transaction) {
                                           # false
                                         ... wait for committing trans ...
                                         write_unlock(&amp;journal-&gt;j_state_lock);
    ...
    write_lock(&amp;journal-&gt;j_state_lock);
    if (!journal-&gt;j_running_transaction) { # true
      jbd2_get_transaction(journal, new_transaction);
    write_unlock(&amp;journal-&gt;j_state_lock);
    goto repeat; # eventually blocks on j_barrier_count &gt; 0
                                         ...
                                         J_ASSERT(!journal-&gt;j_running_transaction);
                                           # fails

We fix the race by rechecking j_barrier_count after reacquiring j_state_lock
in exclusive mode.
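
The interleaving above can be replayed in a small user-space model. This is
only a sketch, not the kernel code: the Journal class, install_transaction()
and simulate() are hypothetical names, locking is implied by the strictly
sequential order of the steps, and the recheck_barrier switch stands in for
the recheck of j_barrier_count described above.

```python
class Journal:
    """Toy stand-in for the two journal fields named in the commit message."""

    def __init__(self):
        self.barrier_count = 0
        self.running_transaction = None


def install_transaction(journal, new_transaction, recheck_barrier):
    """Model Process A's step after retaking j_state_lock in exclusive mode."""
    blocked = recheck_barrier and journal.barrier_count != 0
    if journal.running_transaction is None and not blocked:
        journal.running_transaction = new_transaction
    # In both variants the caller then goes back to 'repeat'.


def simulate(recheck_barrier):
    """Replay the race; return True if the flusher's final assertion holds."""
    journal = Journal()
    # Process A: sees no barrier and no running transaction, then drops
    # the read lock in order to allocate a new transaction.
    assert journal.barrier_count == 0
    assert journal.running_transaction is None
    # Process B: locks updates (raising the barrier) and starts the flush,
    # observing that no transaction is running.
    journal.barrier_count += 1
    # Process A: retakes the lock for writing and tries to install.
    install_transaction(journal, "T1", recheck_barrier)
    # Process B: the condition J_ASSERT checks at the end of the flush.
    return journal.running_transaction is None


print("without recheck, assertion holds:", simulate(False))
print("with recheck, assertion holds:", simulate(True))
```

Without the recheck the model installs a transaction behind the flusher's
back; with it, Process A leaves the journal untouched and waits at 'repeat'.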

Reported-by: yjwsignal@empal.com
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d7961c7fa4d2e3c3f12be67e21ba8799b5a7238a upstream.

The following race is possible between start_this_handle() and someone
calling jbd2_journal_flush().

Process A                              Process B
start_this_handle().
  if (journal-&gt;j_barrier_count) # false
  if (!journal-&gt;j_running_transaction) { # true
    read_unlock(&amp;journal-&gt;j_state_lock);
                                       jbd2_journal_lock_updates()
                                       jbd2_journal_flush()
                                         write_lock(&amp;journal-&gt;j_state_lock);
                                         if (journal-&gt;j_running_transaction) {
                                           # false
                                         ... wait for committing trans ...
                                         write_unlock(&amp;journal-&gt;j_state_lock);
    ...
    write_lock(&amp;journal-&gt;j_state_lock);
    if (!journal-&gt;j_running_transaction) { # true
      jbd2_get_transaction(journal, new_transaction);
    write_unlock(&amp;journal-&gt;j_state_lock);
    goto repeat; # eventually blocks on j_barrier_count &gt; 0
                                         ...
                                         J_ASSERT(!journal-&gt;j_running_transaction);
                                           # fails

We fix the race by rechecking j_barrier_count after reacquiring j_state_lock
in exclusive mode.

Reported-by: yjwsignal@empal.com
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: use GFP_NOFS for blkdev_issue_flush</title>
<updated>2012-05-11T12:13:59+00:00</updated>
<author>
<name>Shaohua Li</name>
<email>shli@kernel.org</email>
</author>
<published>2012-04-13T02:27:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f25543699889485de366c0ccf311fc593946c96e'/>
<id>f25543699889485de366c0ccf311fc593946c96e</id>
<content type='text'>
commit 99aa78466777083255b876293e9e83dec7cd809a upstream.

The flush request is issued in the transaction commit code path, so using
GFP_KERNEL to allocate memory for the flush request bio looks like it falls
into the classic deadlock issue.  I saw btrfs and dm get it right, but ext4,
xfs and md are using GFP_KERNEL.

Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 99aa78466777083255b876293e9e83dec7cd809a upstream.

The flush request is issued in the transaction commit code path, so using
GFP_KERNEL to allocate memory for the flush request bio looks like it falls
into the classic deadlock issue.  I saw btrfs and dm get it right, but ext4,
xfs and md are using GFP_KERNEL.

Signed-off-by: Shaohua Li &lt;shli@fusionio.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: clear BH_Delay &amp; BH_Unwritten in journal_unmap_buffer</title>
<updated>2012-04-02T16:53:03+00:00</updated>
<author>
<name>Eric Sandeen</name>
<email>sandeen@redhat.com</email>
</author>
<published>2012-02-20T22:53:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=146af184d533e62fc55ecd2cca925d3c072b5afc'/>
<id>146af184d533e62fc55ecd2cca925d3c072b5afc</id>
<content type='text'>
commit 15291164b22a357cb211b618adfef4fa82fc0de3 upstream.

journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
state ala discard_buffer(), but does not touch _Delay or _Unwritten as
discard_buffer() does.

This can be problematic in some areas of the ext4 code which assume
that if they have found a buffer marked unwritten or delay, then it's
a live one.  Perhaps those spots should check whether it is mapped
as well, but if jbd2 is going to tear down a buffer, let's really
tear it down completely.

Without this I get some fsx failures on sub-page-block filesystems
up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb
and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go
away, because buried within that large change is some more flag
clearing.  I still think it's worth doing in jbd2, since
-&gt;invalidatepage leads here directly, and it's the right place
to clear away these flags.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 15291164b22a357cb211b618adfef4fa82fc0de3 upstream.

journal_unmap_buffer()'s zap_buffer: code clears a lot of buffer head
state ala discard_buffer(), but does not touch _Delay or _Unwritten as
discard_buffer() does.

This can be problematic in some areas of the ext4 code which assume
that if they have found a buffer marked unwritten or delay, then it's
a live one.  Perhaps those spots should check whether it is mapped
as well, but if jbd2 is going to tear down a buffer, let's really
tear it down completely.

Without this I get some fsx failures on sub-page-block filesystems
up until v3.2, at which point 4e96b2dbbf1d7e81f22047a50f862555a6cb87cb
and 189e868fa8fdca702eb9db9d8afc46b5cb9144c9 make the failures go
away, because buried within that large change is some more flag
clearing.  I still think it's worth doing in jbd2, since
-&gt;invalidatepage leads here directly, and it's the right place
to clear away these flags.

Signed-off-by: Eric Sandeen &lt;sandeen@redhat.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: Unify log messages in jbd2 code</title>
<updated>2011-11-01T23:09:18+00:00</updated>
<author>
<name>Eryu Guan</name>
<email>guaneryu@gmail.com</email>
</author>
<published>2011-11-01T23:09:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f2a44523b20f323e4aef7c16261d34d6f0a4bf06'/>
<id>f2a44523b20f323e4aef7c16261d34d6f0a4bf06</id>
<content type='text'>
Some jbd2 code prints kernel messages with a "JBD2: " prefix, while
other jbd2 code prints with a "JBD: " prefix. Unify the prefix to
"JBD2: ".

Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some jbd2 code prints kernel messages with a "JBD2: " prefix, while
other jbd2 code prints with a "JBD: " prefix. Unify the prefix to
"JBD2: ".

Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd/jbd2: validate sb-&gt;s_first in journal_get_superblock()</title>
<updated>2011-11-01T23:04:59+00:00</updated>
<author>
<name>Eryu Guan</name>
<email>guaneryu@gmail.com</email>
</author>
<published>2011-11-01T23:04:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8762202dd0d6e46854f786bdb6fb3780a1625efe'/>
<id>8762202dd0d6e46854f786bdb6fb3780a1625efe</id>
<content type='text'>
I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
image has s_first = 0 in the journal superblock. The 0 is passed to
journal-&gt;j_head in journal_reset(), then to blocknr in
cleanup_journal_tail(), where the J_ASSERT finally fails.

So validate s_first in journal_get_superblock(), right after the journal
superblock is read from disk.

The following script reproduces the failure:

fstype=ext3
blocksize=1024
img=$fstype.img
offset=0
found=0
magic="c0 3b 39 98"

dd if=/dev/zero of=$img bs=1M count=8
mkfs -t $fstype -b $blocksize -F $img
filesize=`stat -c %s $img`
while [ $offset -lt $filesize ]
do
        if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
                echo "Found journal: $offset"
                found=1
                break
        fi
        offset=`echo "$offset+$blocksize" | bc`
done

if [ $found -ne 1 ];then
        echo "Magic \"$magic\" not found"
        exit 1
fi

dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1

mkdir -p ./mnt
mount -o loop $img ./mnt
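
The kind of check described above, and the effect of the dd command in the
script, can be sketched in user space. This is an illustration under stated
assumptions, not the kernel patch: s_first is taken to be a big-endian
32-bit field at byte 20 of the journal superblock (consistent with the
script zeroing byte offset+23), the helper names are hypothetical, and the
upper bound last_block is an assumption that the commit message does not
mention.

```python
import struct

JBD_MAGIC = 0xC03B3998   # the "c0 3b 39 98" bytes the script searches for
S_FIRST_OFFSET = 20      # assumed: 12-byte header, s_blocksize, s_maxlen


def s_first_is_valid(superblock, last_block):
    """Return True if s_first names a plausible first log block."""
    magic, = struct.unpack_from("!I", superblock, 0)
    if magic != JBD_MAGIC:
        return False
    s_first, = struct.unpack_from("!I", superblock, S_FIRST_OFFSET)
    # Zero is the corrupted value from the report; the upper bound is an
    # extra sanity check assumed for this sketch.
    return s_first in range(1, last_block)


def make_superblock(s_first):
    """Build a hypothetical, mostly empty 1024-byte journal superblock."""
    block = bytearray(1024)
    struct.pack_into("!I", block, 0, JBD_MAGIC)
    struct.pack_into("!I", block, S_FIRST_OFFSET, s_first)
    return block


good = make_superblock(1)
bad = make_superblock(1)
bad[23] = 0  # the same byte the dd command zeroes (offset + 23)
print(s_first_is_valid(good, 1024), s_first_is_valid(bad, 1024))
```

Zeroing byte 23 clears the low byte of the big-endian field, which turns an
s_first of 1 into 0 and makes the sketch reject the superblock.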

Cc: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
image has s_first = 0 in the journal superblock. The 0 is passed to
journal-&gt;j_head in journal_reset(), then to blocknr in
cleanup_journal_tail(), where the J_ASSERT finally fails.

So validate s_first in journal_get_superblock(), right after the journal
superblock is read from disk.

The following script reproduces the failure:

fstype=ext3
blocksize=1024
img=$fstype.img
offset=0
found=0
magic="c0 3b 39 98"

dd if=/dev/zero of=$img bs=1M count=8
mkfs -t $fstype -b $blocksize -F $img
filesize=`stat -c %s $img`
while [ $offset -lt $filesize ]
do
        if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
                echo "Found journal: $offset"
                found=1
                break
        fi
        offset=`echo "$offset+$blocksize" | bc`
done

if [ $found -ne 1 ];then
        echo "Magic \"$magic\" not found"
        exit 1
fi

dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1

mkdir -p ./mnt
mount -o loop $img ./mnt

Cc: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Eryu Guan &lt;guaneryu@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: fix build when CONFIG_BUG is not enabled</title>
<updated>2011-10-27T08:05:13+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@xenotime.net</email>
</author>
<published>2011-10-27T08:05:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=44705754610dbc63503bc7679ff9d9f84978a76f'/>
<id>44705754610dbc63503bc7679ff9d9f84978a76f</id>
<content type='text'>
Fix build error when CONFIG_BUG is not enabled:

fs/jbd2/transaction.c:1175:3: error: implicit declaration of function '__WARN'

by changing __WARN() to WARN_ON(), as suggested by
Arnaud Lacombe &lt;lacombar@gmail.com&gt;.

Signed-off-by: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Arnaud Lacombe &lt;lacombar@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix build error when CONFIG_BUG is not enabled:

fs/jbd2/transaction.c:1175:3: error: implicit declaration of function '__WARN'

by changing __WARN() to WARN_ON(), as suggested by
Arnaud Lacombe &lt;lacombar@gmail.com&gt;.

Signed-off-by: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Arnaud Lacombe &lt;lacombar@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: use gfp_t instead of int</title>
<updated>2011-09-04T14:20:14+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>error27@gmail.com</email>
</author>
<published>2011-09-04T14:20:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d2159fb7b8bac12684aabdf41d84b56da9f5c062'/>
<id>d2159fb7b8bac12684aabdf41d84b56da9f5c062</id>
<content type='text'>
This silences some Sparse warnings:
fs/jbd2/transaction.c:135:69: warning: incorrect type in argument 2 (different base types)
fs/jbd2/transaction.c:135:69:    expected restricted gfp_t [usertype] flags
fs/jbd2/transaction.c:135:69:    got int [signed] gfp_mask

Signed-off-by: Dan Carpenter &lt;error27@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This silences some Sparse warnings:
fs/jbd2/transaction.c:135:69: warning: incorrect type in argument 2 (different base types)
fs/jbd2/transaction.c:135:69:    expected restricted gfp_t [usertype] flags
fs/jbd2/transaction.c:135:69:    got int [signed] gfp_mask

Signed-off-by: Dan Carpenter &lt;error27@gmail.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: add debugging information to jbd2_journal_dirty_metadata()</title>
<updated>2011-09-04T14:18:14+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2011-09-04T14:18:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ea7a0df63630ad8197716cd313ea66e28906fc0'/>
<id>9ea7a0df63630ad8197716cd313ea66e28906fc0</id>
<content type='text'>
Add debugging information in case jbd2_journal_dirty_metadata() is
called with a buffer_head which didn't have
jbd2_journal_get_write_access() called on it, or if the journal_head
has the wrong transaction in it.  In addition, return an error code.
This won't change anything for ocfs2, which will BUG_ON() the non-zero
exit code.

For ext4, the caller of this function is ext4_handle_dirty_metadata(),
and on seeing a non-zero return code, will call __ext4_journal_stop(),
which will print the function and line number of the (buggy) calling
function and abort the journal.  This will allow us to recover instead
of bug halting, which is better from a robustness and reliability
point of view.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add debugging information in case jbd2_journal_dirty_metadata() is
called with a buffer_head which didn't have
jbd2_journal_get_write_access() called on it, or if the journal_head
has the wrong transaction in it.  In addition, return an error code.
This won't change anything for ocfs2, which will BUG_ON() the non-zero
exit code.

For ext4, the caller of this function is ext4_handle_dirty_metadata(),
and on seeing a non-zero return code, will call __ext4_journal_stop(),
which will print the function and line number of the (buggy) calling
function and abort the journal.  This will allow us to recover instead
of bug halting, which is better from a robustness and reliability
point of view.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: remove jbd2_dev_to_name() from jbd2 tracepoints</title>
<updated>2011-07-11T02:05:08+00:00</updated>
<author>
<name>Theodore Ts'o</name>
<email>tytso@mit.edu</email>
</author>
<published>2011-07-11T02:05:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4862fd6047ed02e2726667c54d35f538eecc56aa'/>
<id>4862fd6047ed02e2726667c54d35f538eecc56aa</id>
<content type='text'>
Using function calls in TP_printk causes perf heartburn, so print the
MAJOR/MINOR device numbers instead.
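
As a rough user-space analogue, a device number can be rendered as plain
integers without any name lookup. This is a sketch only: format_dev() is a
hypothetical helper, the "dev major,minor" layout is an assumption about the
output style, and the os module calls are Unix-only.

```python
import os


def format_dev(dev):
    """Render a device number as 'dev major,minor', with no name lookup."""
    return "dev {},{}".format(os.major(dev), os.minor(dev))


# Hypothetical device number built from major 8, minor 1.
print(format_dev(os.makedev(8, 1)))
```

Printing the numeric pair keeps the format string free of function calls
that a trace consumer would have to resolve.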

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using function calls in TP_printk causes perf heartburn, so print the
MAJOR/MINOR device numbers instead.

Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: use WRITE_SYNC in journal checkpoint</title>
<updated>2011-06-27T16:36:29+00:00</updated>
<author>
<name>Tao Ma</name>
<email>boyu.mt@taobao.com</email>
</author>
<published>2011-06-27T16:36:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d3ad8434aa83ef7c88bc91edcfe012cdcbab9f3e'/>
<id>d3ad8434aa83ef7c88bc91edcfe012cdcbab9f3e</id>
<content type='text'>
In journal checkpoint, we write the buffer and wait for the write to finish.
But in cfq, the async queue has a very low priority, and in our test,
if there are too many sync queues and every queue is filled up with
requests, the write request is delayed for quite a long time and
all the tasks waiting for journal space end up with errors like:

INFO: task attr_set:3816 blocked for more than 120 seconds.
"echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
attr_set      D ffff880028393480     0  3816      1 0x00000000
 ffff8802073fbae8 0000000000000086 ffff8802140847c8 ffff8800283934e8
 ffff8802073fb9d8 ffffffff8103e456 ffff8802140847b8 ffff8801ed728080
 ffff8801db4bc080 ffff8801ed728450 ffff880028393480 0000000000000002
Call Trace:
 [&lt;ffffffff8103e456&gt;] ? __dequeue_entity+0x33/0x38
 [&lt;ffffffff8103caad&gt;] ? need_resched+0x23/0x2d
 [&lt;ffffffff814006a6&gt;] ? thread_return+0xa2/0xbc
 [&lt;ffffffffa01f6224&gt;] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
 [&lt;ffffffffa01f6224&gt;] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
 [&lt;ffffffff81400d31&gt;] __mutex_lock_common+0x14e/0x1a9
 [&lt;ffffffffa021dbfb&gt;] ? brelse+0x13/0x15 [ext4]
 [&lt;ffffffff81400ddb&gt;] __mutex_lock_slowpath+0x19/0x1b
 [&lt;ffffffff81400b2d&gt;] mutex_lock+0x1b/0x32
 [&lt;ffffffffa01f927b&gt;] __jbd2_journal_insert_checkpoint+0xe3/0x20c [jbd2]
 [&lt;ffffffffa01f547b&gt;] start_this_handle+0x438/0x527 [jbd2]
 [&lt;ffffffff8106f491&gt;] ? autoremove_wake_function+0x0/0x3e
 [&lt;ffffffffa01f560b&gt;] jbd2_journal_start+0xa1/0xcc [jbd2]
 [&lt;ffffffffa02353be&gt;] ext4_journal_start_sb+0x57/0x81 [ext4]
 [&lt;ffffffffa024a314&gt;] ext4_xattr_set+0x6c/0xe3 [ext4]
 [&lt;ffffffffa024aaff&gt;] ext4_xattr_user_set+0x42/0x4b [ext4]
 [&lt;ffffffff81145adb&gt;] generic_setxattr+0x6b/0x76
 [&lt;ffffffff81146ac0&gt;] __vfs_setxattr_noperm+0x47/0xc0
 [&lt;ffffffff81146bb8&gt;] vfs_setxattr+0x7f/0x9a
 [&lt;ffffffff81146c88&gt;] setxattr+0xb5/0xe8
 [&lt;ffffffff81137467&gt;] ? do_filp_open+0x571/0xa6e
 [&lt;ffffffff81146d26&gt;] sys_fsetxattr+0x6b/0x91
 [&lt;ffffffff81002d32&gt;] system_call_fastpath+0x16/0x1b

So this patch uses WRITE_SYNC in __flush_batch so that the request is moved
into the sync queue and handled by cfq in a timely manner. We also use the new
plug, so that all the WRITE_SYNC requests can be submitted as a whole when we
unplug it.

Signed-off-by: Tao Ma &lt;boyu.mt@taobao.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: Robin Dong &lt;sanbai@taobao.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In journal checkpoint, we write the buffer and wait for the write to finish.
But in cfq, the async queue has a very low priority, and in our test,
if there are too many sync queues and every queue is filled up with
requests, the write request is delayed for quite a long time and
all the tasks waiting for journal space end up with errors like:

INFO: task attr_set:3816 blocked for more than 120 seconds.
"echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
attr_set      D ffff880028393480     0  3816      1 0x00000000
 ffff8802073fbae8 0000000000000086 ffff8802140847c8 ffff8800283934e8
 ffff8802073fb9d8 ffffffff8103e456 ffff8802140847b8 ffff8801ed728080
 ffff8801db4bc080 ffff8801ed728450 ffff880028393480 0000000000000002
Call Trace:
 [&lt;ffffffff8103e456&gt;] ? __dequeue_entity+0x33/0x38
 [&lt;ffffffff8103caad&gt;] ? need_resched+0x23/0x2d
 [&lt;ffffffff814006a6&gt;] ? thread_return+0xa2/0xbc
 [&lt;ffffffffa01f6224&gt;] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
 [&lt;ffffffffa01f6224&gt;] ? jbd2_journal_dirty_metadata+0x116/0x126 [jbd2]
 [&lt;ffffffff81400d31&gt;] __mutex_lock_common+0x14e/0x1a9
 [&lt;ffffffffa021dbfb&gt;] ? brelse+0x13/0x15 [ext4]
 [&lt;ffffffff81400ddb&gt;] __mutex_lock_slowpath+0x19/0x1b
 [&lt;ffffffff81400b2d&gt;] mutex_lock+0x1b/0x32
 [&lt;ffffffffa01f927b&gt;] __jbd2_journal_insert_checkpoint+0xe3/0x20c [jbd2]
 [&lt;ffffffffa01f547b&gt;] start_this_handle+0x438/0x527 [jbd2]
 [&lt;ffffffff8106f491&gt;] ? autoremove_wake_function+0x0/0x3e
 [&lt;ffffffffa01f560b&gt;] jbd2_journal_start+0xa1/0xcc [jbd2]
 [&lt;ffffffffa02353be&gt;] ext4_journal_start_sb+0x57/0x81 [ext4]
 [&lt;ffffffffa024a314&gt;] ext4_xattr_set+0x6c/0xe3 [ext4]
 [&lt;ffffffffa024aaff&gt;] ext4_xattr_user_set+0x42/0x4b [ext4]
 [&lt;ffffffff81145adb&gt;] generic_setxattr+0x6b/0x76
 [&lt;ffffffff81146ac0&gt;] __vfs_setxattr_noperm+0x47/0xc0
 [&lt;ffffffff81146bb8&gt;] vfs_setxattr+0x7f/0x9a
 [&lt;ffffffff81146c88&gt;] setxattr+0xb5/0xe8
 [&lt;ffffffff81137467&gt;] ? do_filp_open+0x571/0xa6e
 [&lt;ffffffff81146d26&gt;] sys_fsetxattr+0x6b/0x91
 [&lt;ffffffff81002d32&gt;] system_call_fastpath+0x16/0x1b

So this patch uses WRITE_SYNC in __flush_batch so that the request is moved
into the sync queue and handled by cfq in a timely manner. We also use the new
plug, so that all the WRITE_SYNC requests can be submitted as a whole when we
unplug it.

Signed-off-by: Tao Ma &lt;boyu.mt@taobao.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Reported-by: Robin Dong &lt;sanbai@taobao.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
