<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/bcachefs/btree_write_buffer.h, branch v6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>bcachefs: Check for bad write buffer key when moving from journal</title>
<updated>2025-06-24T19:48:00+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2025-06-23T22:42:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=865ad1dbf13244b349e3da2731c1188476ad17b8'/>
<id>865ad1dbf13244b349e3da2731c1188476ad17b8</id>
<content type='text'>
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: Move various init code to _init_early()</title>
<updated>2025-05-22T00:14:02+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2025-04-05T23:30:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a17e985be9831bf866795fe5e3da219d2061ce6c'/>
<id>a17e985be9831bf866795fe5e3da219d2061ce6c</id>
<content type='text'>
_init_early() is for initialization that cannot fail, and often must
happen for teardown partway through initialization to work.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
_init_early() is for initialization that cannot fail, and often must
happen for teardown partway through initialization to work.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: bch2_btree_write_buffer_flush_going_ro()</title>
<updated>2024-11-08T04:31:11+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2024-11-08T02:48:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ca43f73cd1720e3b0b9c49deec1a13c89c0ca1e8'/>
<id>ca43f73cd1720e3b0b9c49deec1a13c89c0ca1e8</id>
<content type='text'>
The write buffer needs to be specifically flushed when going RO: keys in
the journal that haven't yet been moved to the write buffer don't have a
journal pin yet.

This fixes numerous syzbot bugs, all with symptoms of still doing writes
after we've got RO.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The write buffer needs to be specifically flushed when going RO: keys in
the journal that haven't yet been moved to the write buffer don't have a
journal pin yet.

This fixes numerous syzbot bugs, all with symptoms of still doing writes
after we've got RO.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: Eytzinger accumulation for accounting keys</title>
<updated>2024-07-14T23:00:14+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-12-27T16:33:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b9efa9673e1d3fee530b582dbde1827d336513a8'/>
<id>b9efa9673e1d3fee530b582dbde1827d336513a8</id>
<content type='text'>
The btree write buffer takes as input keys from the journal, sorts them,
deduplicates them, and flushes them back to the btree in sorted order.

The disk space accounting rewrite is moving accounting to normal btree
keys, with update (in this case deltas) accumulated in the write buffer
and then flushed to the btree; but this is going to increase the number
of keys handled by the write buffer by perhaps as much as a factor of
3x-5x.

The overhead from copying around and sorting this many keys would cause
a significant performance regression, but: there is huge locality in
updates to accounting keys that we can take advantage of.

Instead of appending accounting keys to the list of keys to be sorted,
this patch adds an eytzinger search tree of recently seen accounting
keys. We look up the accounting key in the eytzinger search tree and
apply the delta directly, adding it if it doesn't exist, and
periodically prune the eytzinger tree of unused entries.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The btree write buffer takes as input keys from the journal, sorts them,
deduplicates them, and flushes them back to the btree in sorted order.

The disk space accounting rewrite is moving accounting to normal btree
keys, with update (in this case deltas) accumulated in the write buffer
and then flushed to the btree; but this is going to increase the number
of keys handled by the write buffer by perhaps as much as a factor of
3x-5x.

The overhead from copying around and sorting this many keys would cause
a significant performance regression, but: there is huge locality in
updates to accounting keys that we can take advantage of.

Instead of appending accounting keys to the list of keys to be sorted,
this patch adds an eytzinger search tree of recently seen accounting
keys. We look up the accounting key in the eytzinger search tree and
apply the delta directly, adding it if it doesn't exist, and
periodically prune the eytzinger tree of unused entries.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: bch2_btree_write_buffer_maybe_flush()</title>
<updated>2024-06-29T22:34:52+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2024-06-29T22:32:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92e1c29ae803f85d2f50b4450becb258acae4311'/>
<id>92e1c29ae803f85d2f50b4450becb258acae4311</id>
<content type='text'>
Add a new helper for checking references to write buffer btrees, where
we need a flush before we definitively know we have an inconsistency.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a new helper for checking references to write buffer btrees, where
we need a flush before we definitively know we have an inconsistency.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: __bch2_journal_key_to_wb -&gt; bch2_journal_key_to_wb_slowpath</title>
<updated>2024-01-06T04:24:19+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-12-28T01:26:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8feaebb0ae8882d688b4152c5d693e248d17e623'/>
<id>8feaebb0ae8882d688b4152c5d693e248d17e623</id>
<content type='text'>
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: btree write buffer now slurps keys from journal</title>
<updated>2024-01-01T16:47:41+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-11-02T22:57:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=09caeabe1a5daa3b667b797c401d43206d583f23'/>
<id>09caeabe1a5daa3b667b797c401d43206d583f23</id>
<content type='text'>
Previosuly, the transaction commit path would have to add keys to the
btree write buffer as a separate operation, requiring additional global
synchronization.

This patch introduces a new journal entry type, which indicates that the
keys need to be copied into the btree write buffer prior to being
written out. We switch the journal entry type back to
JSET_ENTRY_btree_keys prior to write, so this is not an on disk format
change.

Flushing the btree write buffer may require pulling keys out of journal
entries yet to be written, and quiescing outstanding journal
reservations; we previously added journal-&gt;buf_lock for synchronization
with the journal write path.

We also can't put strict bounds on the number of keys in the journal
destined for the write buffer, which means we might overflow the size of
the preallocated buffer and have to reallocate - this introduces a
potentially fatal memory allocation failure. This is something we'll
have to watch for, if it becomes an issue in practice we can do
additional mitigation.

The transaction commit path no longer has to explicitly check if the
write buffer is full and wait on flushing; this is another performance
optimization. Instead, when the btree write buffer is close to full we
change the journal watermark, so that only reservations for journal
reclaim are allowed.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previosuly, the transaction commit path would have to add keys to the
btree write buffer as a separate operation, requiring additional global
synchronization.

This patch introduces a new journal entry type, which indicates that the
keys need to be copied into the btree write buffer prior to being
written out. We switch the journal entry type back to
JSET_ENTRY_btree_keys prior to write, so this is not an on disk format
change.

Flushing the btree write buffer may require pulling keys out of journal
entries yet to be written, and quiescing outstanding journal
reservations; we previously added journal-&gt;buf_lock for synchronization
with the journal write path.

We also can't put strict bounds on the number of keys in the journal
destined for the write buffer, which means we might overflow the size of
the preallocated buffer and have to reallocate - this introduces a
potentially fatal memory allocation failure. This is something we'll
have to watch for, if it becomes an issue in practice we can do
additional mitigation.

The transaction commit path no longer has to explicitly check if the
write buffer is full and wait on flushing; this is another performance
optimization. Instead, when the btree write buffer is close to full we
change the journal watermark, so that only reservations for journal
reclaim are allowed.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: bch2_btree_write_buffer_flush() -&gt; bch2_btree_write_buffer_tryflush()</title>
<updated>2024-01-01T16:47:39+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-11-03T00:36:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb13f471390ce646a3d5aa9c599f7eec43ddb2ac'/>
<id>cb13f471390ce646a3d5aa9c599f7eec43ddb2ac</id>
<content type='text'>
More accurate naming.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
More accurate naming.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: bch2_btree_write_buffer_flush_locked()</title>
<updated>2024-01-01T16:47:39+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-11-03T00:32:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d3083cf28d54991c299d5c05790e40e52cf75df0'/>
<id>d3083cf28d54991c299d5c05790e40e52cf75df0</id>
<content type='text'>
Minor refactoring - improved naming, and move the responsibility for
flush_lock to the caller instead of having it be shared.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Minor refactoring - improved naming, and move the responsibility for
flush_lock to the caller instead of having it be shared.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>bcachefs: Clean up btree write buffer write ref handling</title>
<updated>2024-01-01T16:47:39+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2023-11-02T23:37:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=183bcc89b855c412bfefa545b799006d66f689a6'/>
<id>183bcc89b855c412bfefa545b799006d66f689a6</id>
<content type='text'>
__bch2_btree_write_buffer_flush() now assumes a write ref is already
held (as called by the transaction commit path); and the wrappers
bch2_write_buffer_flush() and flush_sync() take an explicit write ref.

This means internally the write buffer code can always use
BTREE_INSERT_NOCHECK_RW, instead of in the previous code passing flags
around and hoping the NOCHECK_RW flag was always carried around
correctly.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__bch2_btree_write_buffer_flush() now assumes a write ref is already
held (as called by the transaction commit path); and the wrappers
bch2_write_buffer_flush() and flush_sync() take an explicit write ref.

This means internally the write buffer code can always use
BTREE_INSERT_NOCHECK_RW, instead of in the previous code passing flags
around and hoping the NOCHECK_RW flag was always carried around
correctly.

Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
</pre>
</div>
</content>
</entry>
</feed>
