Merge branch 'bpf-replace-path-sensitive-with-path-insensitive-live-stack-analysis'

Eduard Zingerman says: ==================== bpf: replace path-sensitive with path-insensitive live stack analysis Consider the following program, assuming checkpoint is created for a state at instruction (3): 1: call bpf_get_prandom_u32() 2: *(u64 *)(r10 - 8) = 42 -- checkpoint #1 -- 3: if r0 != 0 goto +1 4: exit; 5: r0 = *(u64 *)(r10 - 8) 6: exit The verifier processes this program by exploring two paths: - 1 -> 2 -> 3 -> 4 - 1 -> 2 -> 3 -> 5 -> 6 When instruction (5) is processed, the current liveness tracking mechanism moves up the register parent links and records a "read" mark for stack slot -8 at checkpoint #1, stopping because of the "write" mark recorded at instruction (2). This patch set replaces the existing liveness tracking mechanism with a path-insensitive data flow analysis. The program above is processed as follows: - a data structure representing live stack slots for instructions 1-6 in frame #0 is allocated; - when instruction (2) is processed, record that slot -8 is written at instruction (2) in frame #0; - when instruction (5) is processed, record that slot -8 is read at instruction (5) in frame #0; - when instruction (6) is processed, propagate read mark for slot -8 up the control flow graph to instructions 3 and 2. The key difference is that the new mechanism operates on a control flow graph and associates read and write marks with pairs of (call chain, instruction index). In contrast, the old mechanism operates on verifier states and register parent links, associating read and write marks with verifier states. Motivation ========== As it stands, this patch set makes liveness tracking slightly less precise, as it no longer distinguishes individual program paths taken by the verifier during symbolic execution. See the "Impact on verification performance" section for details. However, this change is intended as a stepping stone toward the following goals: - Short term, integrate precision tracking into liveness analysis and remove the following code: - verifier backedge states accumulation in is_state_visited(); - most of the logic for precision tracking; - jump history tracking. - Long term, help with more efficient loop verification handling. Why integrating precision tracking? ----------------------------------- In a sense, precision tracking is very similar to liveness tracking. The data flow equations for liveness tracking look as follows: live_after = U [state[s].live_before for s in insn_successors(i)] state[i].live_before = (live_after / state[i].must_write) U state[i].may_read While data flow equations for precision tracking look as follows: precise_after = U [state[s].precise_before for s in insn_successors(i)] // if some of the instruction outputs are precise, // assume its inputs to be precise induced_precise = ⎧ state[i].may_read if (state[i].may_write ∩ precise_after) ≠ ∅ ⎨ ⎩ ∅ otherwise state[i].precise_before = (precise_after / state[i].must_write) ∩ induced_precise Where: - `may_read` set represents a union of all possibly read slots (any slot in `may_read` set might be by the instruction); - `must_write` set represents an intersection of all possibly written slots (any slot in `must_write` set is guaranteed to be written by the instruction). - `may_write` set represents a union of all possibly written slots (any slot in `may_write` set might be written by the instruction). This means that precision tracking can be implemented as a logical extension of liveness tracking: - track registers as well as stack slots; - add bit masks to represent `precise_before` and `may_write`; - add above equations for `precise_before` computation; - (linked registers require some additional consideration). Such extension would allow removal of: - precision propagation logic in verifier.c: - backtrack_insn() - mark_chain_precision() - propagate_{precision,backedges}() - push_jmp_history() and related data structures, which are only used by precision tracking; - add_scc_backedge() and related backedge state accumulation in is_state_visited(), superseded by per-callchain function state accumulated by liveness analysis. The hope here is that unifying liveness and precision tracking will reduce overall amount of code and make it easier to reason about. How this helps with loops? -------------------------- As it stands, this patch set shares the same deficiency as the current liveness tracking mechanism. Liveness marks on stack slots cannot be used to prune states when processing iterator-based loops: - such states still have branches to be explored; - meaning that not all stack slot reads have been discovered. For example: 1: while(iter_next()) { 2: if (...) 3: r0 = *(u64 *)(r10 - 8) 4: if (...) 5: r0 = *(u64 *)(r10 - 16) 6: ... 7: } For any checkpoint state created at instruction (1), it is only possible to rely on read marks for slots fp[-8] and fp[-16] once all child states of (1) have been explored. Thus, when the verifier transitions from (7) to (1), it cannot rely on read marks. However, sacrificing path-sensitivity makes it possible to run analysis defined in this patch set before main verification pass, if estimates for value ranges are available. E.g. for the following program: 1: while(iter_next()) { 2: r0 = r10 3: r0 += r2 4: r0 = *(u64 *)(r2 + 0) 5: ... 6: } If an estimate for `r2` range is available before the main verification pass, it can be used to populate read marks at instruction (4) and run the liveness analysis. Thus making conservative liveness information available during loops verification. Such estimates can be provided by some form of value range analysis. Value range analysis is also necessary to address loop verification from another angle: computing boundaries for loop induction variables and iteration counts. The hope here is that the new liveness tracking mechanism will support the broader goal of making loop verification more efficient. Validation ========== The change was tested on three program sets: - bpf selftests - sched_ext - Meta's internal set of programs Commit [#8] enables a special mode where both the current and new liveness analyses are enabled simultaneously. This mode signals an error if the new algorithm considers a stack slot dead while the current algorithm assumes it is alive. This mode was very useful for debugging. At the time of posting, no such errors have been reported for the above program sets. [#8] "bpf: signal error if old liveness is more conservative than new" Impact on memory consumption ============================ Debug patch [1] extends the kernel and veristat to count the amount of memory allocated for storing analysis data. This patch is not included in the submission. The maximal observed impact for the above program sets is 2.6Mb. Data below is shown in bytes. For bpf selftests top 5 consumers look as follows: File Program liveness mem ----------------------- ---------------- ------------ pyperf180.bpf.o on_event 2629740 pyperf600.bpf.o on_event 2287662 pyperf100.bpf.o on_event 1427022 test_verif_scale3.bpf.o balancer_ingress 1121283 pyperf_subprogs.bpf.o on_event 756900 For sched_ext top 5 consumers loog as follows: File Program liveness mem --------- ------------------------------- ------------ bpf.bpf.o lavd_enqueue 164686 bpf.bpf.o lavd_select_cpu 157393 bpf.bpf.o layered_enqueue 154817 bpf.bpf.o lavd_init 127865 bpf.bpf.o layered_dispatch 110129 For Meta's internal set of programs top consumer is 1Mb. [1] https://github.com/kernel-patches/bpf/commit/085588e787b7998a296eb320666897d80bca7c08 Impact on verification performance ================================== Veristat results below are reported using `-f insns_pct>1 -f !insns<500` filter and -t option (BPF_F_TEST_STATE_FREQ flag). master vs patch-set, selftests (out of ~4K programs) ---------------------------------------------------- File Program Insns (A) Insns (B) Insns (DIFF) -------------------------------- -------------------------------------- --------- --------- --------------- cpumask_success.bpf.o test_global_mask_nested_deep_array_rcu 1622 1655 +33 (+2.03%) strobemeta_bpf_loop.bpf.o on_event 2163 2684 +521 (+24.09%) test_cls_redirect.bpf.o cls_redirect 36001 42515 +6514 (+18.09%) test_cls_redirect_dynptr.bpf.o cls_redirect 2299 2339 +40 (+1.74%) test_cls_redirect_subprogs.bpf.o cls_redirect 69545 78497 +8952 (+12.87%) test_l4lb_noinline.bpf.o balancer_ingress 2993 3084 +91 (+3.04%) test_xdp_noinline.bpf.o balancer_ingress_v4 3539 3616 +77 (+2.18%) test_xdp_noinline.bpf.o balancer_ingress_v6 3608 3685 +77 (+2.13%) master vs patch-set, sched_ext (out of 148 programs) ---------------------------------------------------- File Program Insns (A) Insns (B) Insns (DIFF) --------- ---------------- --------- --------- --------------- bpf.bpf.o chaos_dispatch 2257 2287 +30 (+1.33%) bpf.bpf.o lavd_enqueue 20735 22101 +1366 (+6.59%) bpf.bpf.o lavd_select_cpu 22100 24409 +2309 (+10.45%) bpf.bpf.o layered_dispatch 25051 25606 +555 (+2.22%) bpf.bpf.o p2dq_dispatch 961 990 +29 (+3.02%) bpf.bpf.o rusty_quiescent 526 534 +8 (+1.52%) bpf.bpf.o rusty_runnable 541 547 +6 (+1.11%) Perf report =========== In relative terms, the analysis does not consume much CPU time. For example, here is a perf report collected for pyperf180 selftest: # Children Self Command Shared Object Symbol # ........ ........ ........ .................... ........................................ ... 1.22% 1.22% veristat [kernel.kallsyms] [k] bpf_update_live_stack ... Changelog ========= v1: https://lore.kernel.org/bpf/20250911010437.2779173-1-eddyz87@gmail.com/T/ v1 -> v2: - compute_postorder() fixed to handle jumps with offset -1 (syzbot). - is_state_visited() in patch #9 fixed access to uninitialized `err` (kernel test robot, Dan Carpenter). - Selftests added. - Fixed bug with write marks propagation from callee to caller, see verifier_live_stack.c:caller_stack_write() test case. - Added a patch for __not_msg() annotation for test_loader based tests. v2: https://lore.kernel.org/bpf/20250918-callchain-sensitive-liveness-v2-0-214ed2653eee@gmail.com/ v2 -> v3: - Added __diag_ignore_all("-Woverride-init", ...) in liveness.c for bpf_insn_successors() (suggested by Alexei). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> ==================== Link: https://patch.msgid.link/20250918-callchain-sensitive-liveness-v3-0-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
author: Alexei Starovoitov <ast@kernel.org> 2025-09-19 09:27:24 -0700
committer: Alexei Starovoitov <ast@kernel.org> 2025-09-19 09:27:35 -0700
commit: 815276dbfbb79abd531f09eca7d3208a847d6a0b (patch)
tree: cdbea79ce3890965b52d94378ea10870349eebc7 /include/linux
parent: 8cd189e414bb705312fbfff7f7b5605f6de2459a (diff)
parent: fdcecdff905cf712279d3ba72ec8ac7bc02be7ff (diff)
1 files changed, 26 insertions, 26 deletions
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 020de62bd09c..4c497e839526 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -26,28 +26,6 @@
 /* Patch buffer size */
 #define INSN_BUF_SIZE 32
 
-/* Liveness marks, used for registers and spilled-regs (in stack slots).
- * Read marks propagate upwards until they find a write mark; they record that
- * "one of this state's descendants read this reg" (and therefore the reg is
- * relevant for states_equal() checks).
- * Write marks collect downwards and do not propagate; they record that "the
- * straight-line code that reached this state (from its parent) wrote this reg"
- * (and therefore that reads propagated from this state or its descendants
- * should not propagate to its parent).
- * A state with a write mark can receive read marks; it just won't propagate
- * them to its parent, since the write mark is a property, not of the state,
- * but of the link between it and its parent.  See mark_reg_read() and
- * mark_stack_slot_read() in kernel/bpf/verifier.c.
- */
-enum bpf_reg_liveness {
-	REG_LIVE_NONE = 0, /* reg hasn't been read or written this branch */
-	REG_LIVE_READ32 = 0x1, /* reg was read, so we're sensitive to initial value */
-	REG_LIVE_READ64 = 0x2, /* likewise, but full 64-bit content matters */
-	REG_LIVE_READ = REG_LIVE_READ32 | REG_LIVE_READ64,
-	REG_LIVE_WRITTEN = 0x4, /* reg was written first, screening off later reads */
-	REG_LIVE_DONE = 0x8, /* liveness won't be updating this register anymore */
-};
-
 #define ITER_PREFIX "bpf_iter_"
 
 enum bpf_iter_state {
@@ -212,8 +190,6 @@ struct bpf_reg_state {
 	 * allowed and has the same effect as bpf_sk_release(sk).
 	 */
 	u32 ref_obj_id;
-	/* parentage chain for liveness checking */
-	struct bpf_reg_state *parent;
 	/* Inside the callee two registers can be both PTR_TO_STACK like
 	 * R1=fp-8 and R2=fp-8, but one of them points to this function stack
 	 * while another to the caller's stack. To differentiate them 'frameno'
@@ -226,7 +202,6 @@ struct bpf_reg_state {
 	 * patching which only happens after main verification finished.
 	 */
 	s32 subreg_def;
-	enum bpf_reg_liveness live;
 	/* if (!precise && SCALAR_VALUE) min/max/tnum don't affect safety */
 	bool precise;
 };
@@ -445,6 +420,7 @@ struct bpf_verifier_state {
 
 	bool speculative;
 	bool in_sleepable;
+	bool cleaned;
 
 	/* first and last insn idx of this verifier state */
 	u32 first_insn_idx;
@@ -665,6 +641,7 @@ struct bpf_subprog_info {
 	/* 'start' has to be the first field otherwise find_subprog() won't work */
 	u32 start; /* insn idx of function entry point */
 	u32 linfo_idx; /* The idx to the main_prog->aux->linfo */
+	u32 postorder_start; /* The idx to the env->cfg.insn_postorder */
 	u16 stack_depth; /* max. stack depth used by this function */
 	u16 stack_extra;
 	/* offsets in range [stack_depth .. fastcall_stack_off)
@@ -744,6 +721,8 @@ struct bpf_scc_info {
 	struct bpf_scc_visit visits[];
 };
 
+struct bpf_liveness;
+
 /* single container for all structs
  * one verifier_env per bpf_check() call
  */
@@ -794,7 +773,10 @@ struct bpf_verifier_env {
 	struct {
 		int *insn_state;
 		int *insn_stack;
-		/* vector of instruction indexes sorted in post-order */
+		/*
+		 * vector of instruction indexes sorted in post-order, grouped by subprogram,
+		 * see bpf_subprog_info->postorder_start.
+		 */
 		int *insn_postorder;
 		int cur_stack;
 		/* current position in the insn_postorder vector */
@@ -842,6 +824,7 @@ struct bpf_verifier_env {
 	struct bpf_insn insn_buf[INSN_BUF_SIZE];
 	struct bpf_insn epilogue_buf[INSN_BUF_SIZE];
 	struct bpf_scc_callchain callchain_buf;
+	struct bpf_liveness *liveness;
 	/* array of pointers to bpf_scc_info indexed by SCC id */
 	struct bpf_scc_info **scc_info;
 	u32 scc_cnt;
@@ -1065,4 +1048,21 @@ void print_verifier_state(struct bpf_verifier_env *env, const struct bpf_verifie
 void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate,
 		      u32 frameno);
 
+struct bpf_subprog_info *bpf_find_containing_subprog(struct bpf_verifier_env *env, int off);
+int bpf_jmp_offset(struct bpf_insn *insn);
+int bpf_insn_successors(struct bpf_prog *prog, u32 idx, u32 succ[2]);
+void bpf_fmt_stack_mask(char *buf, ssize_t buf_sz, u64 stack_mask);
+bool bpf_calls_callback(struct bpf_verifier_env *env, int insn_idx);
+
+int bpf_stack_liveness_init(struct bpf_verifier_env *env);
+void bpf_stack_liveness_free(struct bpf_verifier_env *env);
+int bpf_update_live_stack(struct bpf_verifier_env *env);
+int bpf_mark_stack_read(struct bpf_verifier_env *env, u32 frameno, u32 insn_idx, u64 mask);
+void bpf_mark_stack_write(struct bpf_verifier_env *env, u32 frameno, u64 mask);
+int bpf_reset_stack_write_marks(struct bpf_verifier_env *env, u32 insn_idx);
+int bpf_commit_stack_write_marks(struct bpf_verifier_env *env);
+int bpf_live_stack_query_init(struct bpf_verifier_env *env, struct bpf_verifier_state *st);
+bool bpf_stack_slot_alive(struct bpf_verifier_env *env, u32 frameno, u32 spi);
+void bpf_reset_live_stack_callchain(struct bpf_verifier_env *env);
+
 #endif /* _LINUX_BPF_VERIFIER_H */
author	Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:24 -0700
committer	Alexei Starovoitov <ast@kernel.org>	2025-09-19 09:27:35 -0700
commit	815276dbfbb79abd531f09eca7d3208a847d6a0b (patch)
tree	cdbea79ce3890965b52d94378ea10870349eebc7 /include/linux
parent	8cd189e414bb705312fbfff7f7b5605f6de2459a (diff)
parent	fdcecdff905cf712279d3ba72ec8ac7bc02be7ff (diff)