linux.git/fs/btrfs/backref.c, branch v4.15

btrfs: track refs in a rb_tree instead of a list

2017-11-01T19:45:35+00:00

If we get a significant amount of delayed refs for a single block (think
modifying multiple snapshots) we can end up spending an ungodly amount
of time looping through all of the entries trying to see if they can be
merged.  This is because we only add them to a list, so we have O(2n)
for every ref head.  This doesn't make any sense as we likely have refs
for different roots, and so they cannot be merged.  Tracking in a tree
will allow us to break as soon as we hit an entry that doesn't match,
making our worst case O(n).

With this we can also merge entries more easily.  Before we had to hope
that matching refs were on the ends of our list, but with the tree we
can search down to exact matches and merge them at insert time.
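
The merge-at-insert idea can be sketched in userspace C. This is only an
illustration under stated assumptions: struct demo_ref and the demo_ref_*
helpers are hypothetical names, the comparison key (root, parent) is a
simplification, and a plain unbalanced binary search tree stands in for the
kernel's red-black tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified ref; not the kernel's delayed ref structure. */
struct demo_ref {
	uint64_t root;		/* id of the root that owns the ref */
	uint64_t parent;	/* parent bytenr for shared refs, else 0 */
	int ref_mod;		/* net reference count change */
	struct demo_ref *left, *right;
};

/* Order a node against a key; 0 means the two refs can be merged. */
static int demo_ref_cmp(const struct demo_ref *node, uint64_t root,
			uint64_t parent)
{
	if (node->root != root)
		return node->root < root ? -1 : 1;
	if (node->parent != parent)
		return node->parent < parent ? -1 : 1;
	return 0;
}

/*
 * Insert a ref, merging into an exact match if one exists.
 * Returns 1 when merged, 0 when a new node was added, -1 on
 * allocation failure.
 */
static int demo_ref_insert(struct demo_ref **tree, uint64_t root,
			   uint64_t parent, int ref_mod)
{
	struct demo_ref **link = tree;
	struct demo_ref *node;

	assert(tree != NULL);
	while (*link) {
		int cmp = demo_ref_cmp(*link, root, parent);

		if (cmp == 0) {
			/* Exact match found on the way down: merge now. */
			(*link)->ref_mod += ref_mod;
			return 1;
		}
		/* Node sorts after the key: go left, otherwise go right. */
		link = cmp > 0 ? &(*link)->left : &(*link)->right;
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return -1;
	node->root = root;
	node->parent = parent;
	node->ref_mod = ref_mod;
	*link = node;
	return 0;
}

/* Count the nodes in the tree. */
static size_t demo_ref_count(const struct demo_ref *node)
{
	if (!node)
		return 0;
	return 1 + demo_ref_count(node->left) + demo_ref_count(node->right);
}

/* Free every node in the tree. */
static void demo_ref_free(struct demo_ref *node)
{
	if (!node)
		return;
	demo_ref_free(node->left);
	demo_ref_free(node->right);
	free(node);
}
```

Inserting refs for roots 5, 7 and then 5 again leaves two nodes in the
sketch, with the second ref for root 5 folded into the first.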

Signed-off-by: Josef Bacik 
Signed-off-by: David Sterba

btrfs: add a flag to iterate_inodes_from_logical to find all extent refs for uncompressed extents

2017-11-01T19:45:34+00:00

The LOGICAL_INO ioctl provides a backward mapping from extent bytenr and
offset (encoded as a single logical address) to a list of extent refs.
LOGICAL_INO complements TREE_SEARCH, which provides the forward mapping
(extent ref -> extent bytenr and offset, or logical address).  These are
useful capabilities for programs that manipulate extents and extent
references from userspace (e.g. dedup and defrag utilities).

When the extents are uncompressed (and not encrypted or otherwise encoded),
check_extent_in_eb performs filtering of the extent refs to remove any
extent refs which do not contain the same extent offset as the 'logical'
parameter's extent offset.  This prevents LOGICAL_INO from returning
references to more than a single block.

To find the set of extent references to an uncompressed extent from [a, b),
userspace has to run a loop like this pseudocode:

	for (i = a; i < b; ++i)
		extent_ref_set += LOGICAL_INO(i);

At each iteration of the loop (up to 32768 iterations for a 128M extent),
data we are interested in is collected in the kernel, then deleted by
the filter in check_extent_in_eb.

When the extents are compressed (or encrypted or otherwise encoded), the 'logical'
parameter must be an extent bytenr (the 'a' parameter in the loop).
No filtering by extent offset is done (or possible?) so the result is
the complete set of extent refs for the entire extent.  This removes
the need for the loop, since we get all the extent refs in one call.

Add an 'ignore_offset' argument to iterate_inodes_from_logical,
[...several levels of function call graph...], and check_extent_in_eb, so
that we can disable the extent offset filtering for uncompressed extents.
This flag can be set by an improved version of the LOGICAL_INO ioctl to
get either behavior as desired.
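
The effect of the flag can be modelled in a small userspace sketch. It is
not the kernel code: struct demo_extent_ref and demo_filter_refs() are
hypothetical names, and each ref is reduced to the range of the extent it
covers. With ignore_offset false, only refs whose range contains the
requested extent offset are kept; with it true, every ref is kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one file extent item. */
struct demo_extent_ref {
	uint64_t inum;		/* inode that holds the reference */
	uint64_t data_offset;	/* start of the referenced range in the extent */
	uint64_t num_bytes;	/* length of the referenced range */
};

/*
 * Copy the refs that pass the filter into out[] and return how many
 * were kept.  out[] must have room for n entries.
 */
static size_t demo_filter_refs(const struct demo_extent_ref *refs, size_t n,
			       uint64_t extent_item_pos, bool ignore_offset,
			       struct demo_extent_ref *out)
{
	size_t i, kept = 0;

	assert(out != NULL);
	for (i = 0; i < n; i++) {
		uint64_t start = refs[i].data_offset;
		uint64_t end = start + refs[i].num_bytes;

		/* Drop refs that do not cover the requested offset. */
		if (!ignore_offset &&
		    (extent_item_pos < start || extent_item_pos >= end))
			continue;
		out[kept++] = refs[i];
	}
	return kept;
}
```

In this model, one call with ignore_offset set returns what the per-block
loop would otherwise have to collect over many calls.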

There is no functional change in this patch.  The new flag is always
false.

Signed-off-by: Zygo Blaxell 
Reviewed-by: David Sterba 
[ minor coding style fixes ]
Signed-off-by: David Sterba

btrfs: remove delayed_ref_node from ref_head

2017-10-30T11:28:00+00:00

This is just excessive information in the ref_head, and makes the code
complicated.  It is a relic from when we had the heads and the refs in
the same tree, which is no longer the case.  With this removal I've
cleaned up a bunch of the cruft around this old assumption as well.

Signed-off-by: Josef Bacik 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

Btrfs: convert to use btrfs_get_extent_inline_ref_type

2017-08-21T15:47:43+00:00

Since we have a helper that performs a sanity check, convert all users
of btrfs_extent_inline_ref_type to it.

Signed-off-by: Liu Bo 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

btrfs: clean up extraneous computations in add_delayed_refs

2017-08-16T14:12:01+00:00

Repeating the same computation in multiple places is not
necessary.

Signed-off-by: Edmund Nadolski 
Signed-off-by: Jeff Mahoney 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

btrfs: allow backref search checks for shared extents

2017-08-16T14:12:01+00:00

When called with a struct share_check, find_parent_nodes()
will detect a shared extent and immediately return with
BACKREF_FOUND_SHARED.
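
The early return can be illustrated with a simplified userspace model. All
names here (struct demo_share_check, struct demo_data_ref,
demo_walk_refs(), DEMO_FOUND_SHARED) are hypothetical, and the sharing rule
is an assumption made for the sketch: an extent counts as shared as soon as
a ref belongs to a different root or inode than the one being checked, or
more than one reference has been counted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FOUND_SHARED 1	/* hypothetical status code */

/* Hypothetical description of the single owner we expect to find. */
struct demo_share_check {
	uint64_t root;
	uint64_t inum;
	int share_count;	/* references counted so far */
};

/* Hypothetical, simplified data ref. */
struct demo_data_ref {
	uint64_t root;
	uint64_t inum;
	int count;
};

/*
 * Walk the refs.  With sc == NULL every ref is visited.  With a
 * share check, stop as soon as sharing is detected.  *visited
 * reports how many refs were examined.
 */
static int demo_walk_refs(const struct demo_data_ref *refs, size_t n,
			  struct demo_share_check *sc, size_t *visited)
{
	size_t i;

	assert(visited != NULL);
	*visited = 0;
	for (i = 0; i < n; i++) {
		*visited = i + 1;
		if (!sc)
			continue;
		/* A ref from another root or inode means the extent is shared. */
		if (refs[i].root != sc->root || refs[i].inum != sc->inum)
			return DEMO_FOUND_SHARED;
		sc->share_count += refs[i].count;
		if (sc->share_count > 1)
			return DEMO_FOUND_SHARED;
	}
	return 0;
}
```

The point of the sketch is that a caller that only needs a yes/no answer
stops after the first conflicting ref instead of resolving all of them.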

Signed-off-by: Edmund Nadolski 
Signed-off-by: Jeff Mahoney 
Reviewed-by: Liu Bo 
Signed-off-by: David Sterba

btrfs: add cond_resched() calls when resolving backrefs

2017-08-16T14:12:01+00:00

Since backref resolution is CPU-intensive, the cond_resched calls
should help alleviate soft lockup occurrences.

Signed-off-by: Edmund Nadolski 
Signed-off-by: Jeff Mahoney 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

btrfs: backref, add tracepoints for prelim_ref insertion and merging

2017-08-16T14:12:01+00:00

This patch adds a tracepoint event for prelim_ref insertion and
merging.  For each, the ref being inserted or merged and the count
of tree nodes are emitted.

Signed-off-by: Jeff Mahoney 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

btrfs: add a node counter to each of the rbtrees

2017-08-16T14:12:01+00:00

This patch adds counters to each of the rbtrees so that we can tell
how large they are growing for a given workload.  These counters
will be exported by tracepoints in the next patch.

Signed-off-by: Jeff Mahoney 
Reviewed-by: David Sterba 
Signed-off-by: David Sterba

btrfs: convert preliminary reference tracking to use rbtrees

2017-08-16T14:11:55+00:00

It's been known for a while that the use of multiple lists
that are periodically merged was an algorithmic problem within
btrfs.  There are several workloads that don't complete in any
reasonable amount of time (e.g. btrfs/130) and others that cause
soft lockups.

The solution is to use a set of rbtrees that do insertion merging
for both indirect and direct refs, with the former converting
refs into the latter.  The result is that a btrfs/130 workload that
used to take several hours now takes about half that time. This
runtime still isn't acceptable and a future patch will address that
by moving the rbtrees higher in the stack so the lookups can be
shared across multiple calls to find_parent_nodes.

Signed-off-by: Edmund Nadolski 
Signed-off-by: Jeff Mahoney 
Reviewed-by: Liu Bo 
Signed-off-by: David Sterba