<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/jbd.h, branch v3.11.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>jbd/jbd2: relocate bit_spinlock header to jbd_common</title>
<updated>2013-06-13T03:02:35+00:00</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2013-06-13T03:02:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c9b3a8ccb77e063a28a9567eb72e0c1222c4ade4'/>
<id>c9b3a8ccb77e063a28a9567eb72e0c1222c4ade4</id>
<content type='text'>
The bit_spinlock functions are only used for the jbd_lock_bh_state
functions (and friends) in jbd_common.h and are not directly used
by either of jbd.h or jbd2.h content.

The jbd_common file is new as of commit 446066724c36 ("jdb/jbd2: factor
out common functions from the jbd[2] header files") but common
(and isolated) headers were not considered for factoring at that time.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The bit_spinlock functions are only used for the jbd_lock_bh_state
functions (and friends) in jbd_common.h and are not directly used
by either of jbd.h or jbd2.h content.

The jbd_common file is new as of commit 446066724c36 ("jdb/jbd2: factor
out common functions from the jbd[2] header files") but common
(and isolated) headers were not considered for factoring at that time.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd2: refine waiting for shadow buffers</title>
<updated>2013-06-04T16:08:56+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2013-06-04T16:08:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b34090e5e22a02fba0e4473056cce9420ad9dd0b'/>
<id>b34090e5e22a02fba0e4473056cce9420ad9dd0b</id>
<content type='text'>
Currently when we add a buffer to a transaction, we wait until the
buffer is removed from BJ_Shadow list (so that we prevent any changes
to the buffer that is just written to the journal).  This can take
unnecessarily long as a lot happens between the time the buffer is
submitted to the journal and the time when we remove the buffer from
BJ_Shadow list.  (e.g.  We wait for all data buffers in the
transaction, we issue a cache flush, etc.)  Also this creates a
dependency of do_get_write_access() on transaction commit (namely
waiting for data IO to complete) which we want to avoid when
implementing transaction reservation.

So we modify commit code to set new BH_Shadow flag when temporary
shadowing buffer is created and we clear that flag once IO on that
buffer is complete.  This allows do_get_write_access() to wait only
for BH_Shadow bit and thus removes the dependency on data IO
completion.

Reviewed-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently when we add a buffer to a transaction, we wait until the
buffer is removed from BJ_Shadow list (so that we prevent any changes
to the buffer that is just written to the journal).  This can take
unnecessarily long as a lot happens between the time the buffer is
submitted to the journal and the time when we remove the buffer from
BJ_Shadow list.  (e.g.  We wait for all data buffers in the
transaction, we issue a cache flush, etc.)  Also this creates a
dependency of do_get_write_access() on transaction commit (namely
waiting for data IO to complete) which we want to avoid when
implementing transaction reservation.

So we modify commit code to set new BH_Shadow flag when temporary
shadowing buffer is created and we clear that flag once IO on that
buffer is complete.  This allows do_get_write_access() to wait only
for BH_Shadow bit and thus removes the dependency on data IO
completion.

Reviewed-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: change journal_invalidatepage() to accept length</title>
<updated>2013-05-22T03:26:36+00:00</updated>
<author>
<name>Lukas Czerner</name>
<email>lczerner@redhat.com</email>
</author>
<published>2013-05-22T03:26:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d8c8900ac113d2b2b3d382acda198b4ae29b1b51'/>
<id>d8c8900ac113d2b2b3d382acda198b4ae29b1b51</id>
<content type='text'>
-&gt;invalidatepage() aop now accepts range to invalidate so we can make
use of it in journal_invalidatepage() and all the users in ext3 file
system. Also update ext3 trace point to print out length argument.

Signed-off-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
-&gt;invalidatepage() aop now accepts range to invalidate so we can make
use of it in journal_invalidatepage() and all the users in ext3 file
system. Also update ext3 trace point to print out length argument.

Signed-off-by: Lukas Czerner &lt;lczerner@redhat.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: use kmem_cache_zalloc instead of kmem_cache_alloc/memset</title>
<updated>2013-04-25T13:25:23+00:00</updated>
<author>
<name>Zheng Liu</name>
<email>wenqing.lz@taobao.com</email>
</author>
<published>2013-04-12T02:24:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e162b2f835eeda0d255bd463753b5eb823735205'/>
<id>e162b2f835eeda0d255bd463753b5eb823735205</id>
<content type='text'>
Now jbd_alloc_handle is only called by new_handle.  So this commit
uses kmem_cache_zalloc instead of kmem_cache_alloc/memset.

Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now jbd_alloc_handle is only called by new_handle.  So this commit
uses kmem_cache_zalloc instead of kmem_cache_alloc/memset.

Signed-off-by: Zheng Liu &lt;wenqing.lz@taobao.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: Write journal superblock with WRITE_FUA after checkpointing</title>
<updated>2012-05-15T21:34:37+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-04-07T09:05:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fd2cbd4dfa3db477dd6226d387d3f1911d36a6a9'/>
<id>fd2cbd4dfa3db477dd6226d387d3f1911d36a6a9</id>
<content type='text'>
If journal superblock is written only in disk's caches and other transaction
starts reusing space of the transaction cleaned from the log, it can happen
blocks of a new transaction reach the disk before journal superblock. When
power failure happens in such case, subsequent journal replay would still try
to replay the old transaction but some of it's blocks may be already
overwritten by the new transaction. For this reason we must use WRITE_FUA when
updating log tail and we must first write new log tail to disk and update
in-memory information only after that.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If journal superblock is written only in disk's caches and other transaction
starts reusing space of the transaction cleaned from the log, it can happen
blocks of a new transaction reach the disk before journal superblock. When
power failure happens in such case, subsequent journal replay would still try
to replay the old transaction but some of it's blocks may be already
overwritten by the new transaction. For this reason we must use WRITE_FUA when
updating log tail and we must first write new log tail to disk and update
in-memory information only after that.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: Split updating of journal superblock and marking journal empty</title>
<updated>2012-05-15T21:34:36+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-04-07T10:33:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9754e39c7bc51328f145e933bfb0df47cd67b6e9'/>
<id>9754e39c7bc51328f145e933bfb0df47cd67b6e9</id>
<content type='text'>
There are three case of updating journal superblock. In the first case, we want
to mark journal as empty (setting s_sequence to 0), in the second case we want
to update log tail, in the third case we want to update s_errno. Split these
cases into separate functions. It makes the code slightly more straightforward
and later patches will make the distinction even more important.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are three case of updating journal superblock. In the first case, we want
to mark journal as empty (setting s_sequence to 0), in the second case we want
to update log tail, in the third case we want to update s_errno. Split these
cases into separate functions. It makes the code slightly more straightforward
and later patches will make the distinction even more important.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: Refine commit writeout logic</title>
<updated>2012-04-11T09:12:44+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-02-21T16:25:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2db938bee32e7469ca8ed9bfb3a05535f28c680d'/>
<id>2db938bee32e7469ca8ed9bfb3a05535f28c680d</id>
<content type='text'>
Currently we write out all journal buffers in WRITE_SYNC mode. This improves
performance for fsync heavy workloads but hinders performance when writes
are mostly asynchronous, most noticably it slows down readers and users
complain about slow desktop response etc.

So submit writes as asynchronous in the normal case and only submit writes as
WRITE_SYNC if we detect someone is waiting for current transaction commit.

I've gathered some numbers to back this change. The first is the read latency
test. It measures time to read 1 MB after several seconds of sleeping in
presence of streaming writes.

Top 10 times (out of 90) in us:
Before		After
2131586		697473
1709932		557487
1564598		535642
1480462		347573
1478579		323153
1408496		222181
1388960		181273
1329565		181070
1252486		172832
1223265		172278

Average:
619377		82180

So the improvement in both maximum and average latency is massive.

I've measured fsync throughput by:
fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4

in presence of streaming reader. The numbers (fsyncs/s) are:
Before		After
9.9		6.3
6.8		6.0
6.3		6.2
5.8		6.1

So fsync performance seems unharmed by this change.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently we write out all journal buffers in WRITE_SYNC mode. This improves
performance for fsync heavy workloads but hinders performance when writes
are mostly asynchronous, most noticably it slows down readers and users
complain about slow desktop response etc.

So submit writes as asynchronous in the normal case and only submit writes as
WRITE_SYNC if we detect someone is waiting for current transaction commit.

I've gathered some numbers to back this change. The first is the read latency
test. It measures time to read 1 MB after several seconds of sleeping in
presence of streaming writes.

Top 10 times (out of 90) in us:
Before		After
2131586		697473
1709932		557487
1564598		535642
1480462		347573
1478579		323153
1408496		222181
1388960		181273
1329565		181070
1252486		172832
1223265		172278

Average:
619377		82180

So the improvement in both maximum and average latency is massive.

I've measured fsync throughput by:
fs_mark -n 100 -t 1 -s 16384 -d /mnt/fsync/ -S 1 -L 4

in presence of streaming reader. The numbers (fsyncs/s) are:
Before		After
9.9		6.3
6.8		6.0
6.3		6.2
5.8		6.1

So fsync performance seems unharmed by this change.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: Remove j_barrier mutex</title>
<updated>2012-01-09T12:52:09+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2011-12-22T13:52:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0048278552e9752fd578c3d8deee756987c10873'/>
<id>0048278552e9752fd578c3d8deee756987c10873</id>
<content type='text'>
j_barrier mutex is used for serializing different journal lock operations.  The
problem with it is that e.g. FIFREEZE ioctl results in process leaving kernel
with j_barrier mutex held which makes lockdep freak out. Also hibernation code
wants to freeze filesystem but it cannot do so because it then cannot hibernate
the system because of mutex being locked.

So we remove j_barrier mutex and use direct wait on j_barrier_count instead.
Since locking journal is a rare operation we don't have to care about fairness
or such things.

CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
j_barrier mutex is used for serializing different journal lock operations.  The
problem with it is that e.g. FIFREEZE ioctl results in process leaving kernel
with j_barrier mutex held which makes lockdep freak out. Also hibernation code
wants to freeze filesystem but it cannot do so because it then cannot hibernate
the system because of mutex being locked.

So we remove j_barrier mutex and use direct wait on j_barrier_count instead.
Since locking journal is a rare operation we don't have to care about fairness
or such things.

CC: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jbd: clear revoked flag on buffers before a new transaction started</title>
<updated>2011-11-22T00:20:53+00:00</updated>
<author>
<name>Yongqiang Yang</name>
<email>xiaoqiangnk@gmail.com</email>
</author>
<published>2011-11-19T09:34:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8c111b3f56332a216b18cd57950bdf04ac8f2a98'/>
<id>8c111b3f56332a216b18cd57950bdf04ac8f2a98</id>
<content type='text'>
Currently, we clear revoked flag only when a block is reused.  However,
this can tigger a false journal error.  Consider a situation when a block
is used as a meta block and is deleted(revoked) in ordered mode, then the
block is allocated as a data block to a file.  At this moment, user changes
the file's journal mode from ordered to journaled and truncates the file.
The block will be considered re-revoked by journal because it has revoked
flag still pending from the last transaction and an assertion triggers.

We fix the problem by keeping the revoked status more uptodate - we clear
revoked flag when switching revoke tables to reflect there is no revoked
buffers in current transaction any more.

Signed-off-by: Yongqiang Yang &lt;xiaoqiangnk@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, we clear revoked flag only when a block is reused.  However,
this can tigger a false journal error.  Consider a situation when a block
is used as a meta block and is deleted(revoked) in ordered mode, then the
block is allocated as a data block to a file.  At this moment, user changes
the file's journal mode from ordered to journaled and truncates the file.
The block will be considered re-revoked by journal because it has revoked
flag still pending from the last transaction and an assertion triggers.

We fix the problem by keeping the revoked status more uptodate - we clear
revoked flag when switching revoke tables to reflect there is no revoked
buffers in current transaction any more.

Signed-off-by: Yongqiang Yang &lt;xiaoqiangnk@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>jdb/jbd2: factor out common functions from the jbd[2] header files</title>
<updated>2011-10-27T08:38:18+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-10-27T08:38:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=446066724c3629664e29942a00b0aee0d6b1663a'/>
<id>446066724c3629664e29942a00b0aee0d6b1663a</id>
<content type='text'>
The state bits and the lock functions of jbd and jbd2 are
identical.  Share them.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The state bits and the lock functions of jbd and jbd2 are
identical.  Share them.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</pre>
</div>
</content>
</entry>
</feed>
