<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4, branch v7.1-rc7</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>udp: clear skb-&gt;dev before running a sockmap verdict</title>
<updated>2026-06-04T16:01:51+00:00</updated>
<author>
<name>Sechang Lim</name>
<email>rhkrqnwk98@gmail.com</email>
</author>
<published>2026-06-03T16:27:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3c94f241f776562c489876ff506f366224565c21'/>
<id>3c94f241f776562c489876ff506f366224565c21</id>
<content type='text'>
On the UDP receive path skb-&gt;dev is repurposed as dev_scratch (the
truesize/state cache set by udp_set_dev_scratch()), through the
union { struct net_device *dev; unsigned long dev_scratch; } in sk_buff.

When a UDP socket is in a sockmap, sk_data_ready is
sk_psock_verdict_data_ready(), which calls udp_read_skb() -&gt; recv_actor()
(sk_psock_verdict_recv) to run the attached SK_SKB verdict program in softirq.
If that program calls a socket-lookup helper (bpf_sk_lookup_tcp/udp,
bpf_skc_lookup_tcp), bpf_skc_lookup() does:

	if (skb-&gt;dev)
		caller_net = dev_net(skb-&gt;dev);

skb-&gt;dev still holds the dev_scratch value (a non-NULL integer), so dev_net()
dereferences it as a struct net_device * and the kernel takes a general
protection fault on a non-canonical address in softirq:

  Oops: general protection fault, probably for non-canonical address 0x1010000800004a0
  CPU: 1 UID: 0 PID: 1406 Comm: syz.2.19 Not tainted 7.1.0-rc6 #1 PREEMPT(full)
  RIP: 0010:bpf_skc_lookup net/core/filter.c:7033 [inline]
  RIP: 0010:bpf_sk_lookup+0x45/0x160 net/core/filter.c:7047
  Call Trace:
   &lt;IRQ&gt;
   bpf_prog_4675cb904b7071f8+0x12e/0x14e
   bpf_prog_run_pin_on_cpu+0xc6/0x1f0
   sk_psock_verdict_recv+0x1ba/0x350
   udp_read_skb+0x31a/0x370
   sk_psock_verdict_data_ready+0x2e3/0x600
   __udp_enqueue_schedule_skb+0x4c8/0x650
   udpv6_queue_rcv_one_skb+0x3ec/0x740
   udp6_unicast_rcv_skb+0x11d/0x140
   ip6_protocol_deliver_rcu+0x61e/0x950
   ip6_input_finish+0xa9/0x150
   NF_HOOK+0x286/0x2f0
   ip6_input+0x117/0x220
   NF_HOOK+0x286/0x2f0
   __netif_receive_skb+0x85/0x200
   process_backlog+0x374/0x9a0
   __napi_poll+0x4f/0x1c0
   net_rx_action+0x3b0/0x770
   handle_softirqs+0x15a/0x460
   do_softirq+0x57/0x80
   &lt;/IRQ&gt;

The rmem charge that dev_scratch accounted for is released by skb_recv_udp() on
dequeue, just above, so the scratch is dead by the time recv_actor() runs. Clear
skb-&gt;dev so bpf_skc_lookup() falls back to sock_net(skb-&gt;sk), which
skb_set_owner_sk_safe() set just above.

Fixes: 965b57b469a5 ("net: Introduce a new proto_ops -&gt;read_skb()")
Cc: stable@vger.kernel.org
Signed-off-by: Sechang Lim &lt;rhkrqnwk98@gmail.com&gt;
Reviewed-by: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260603162737.697215-1-rhkrqnwk98@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On the UDP receive path skb-&gt;dev is repurposed as dev_scratch (the
truesize/state cache set by udp_set_dev_scratch()), through the
union { struct net_device *dev; unsigned long dev_scratch; } in sk_buff.

When a UDP socket is in a sockmap, sk_data_ready is
sk_psock_verdict_data_ready(), which calls udp_read_skb() -&gt; recv_actor()
(sk_psock_verdict_recv) to run the attached SK_SKB verdict program in softirq.
If that program calls a socket-lookup helper (bpf_sk_lookup_tcp/udp,
bpf_skc_lookup_tcp), bpf_skc_lookup() does:

	if (skb-&gt;dev)
		caller_net = dev_net(skb-&gt;dev);

skb-&gt;dev still holds the dev_scratch value (a non-NULL integer), so dev_net()
dereferences it as a struct net_device * and the kernel takes a general
protection fault on a non-canonical address in softirq:

  Oops: general protection fault, probably for non-canonical address 0x1010000800004a0
  CPU: 1 UID: 0 PID: 1406 Comm: syz.2.19 Not tainted 7.1.0-rc6 #1 PREEMPT(full)
  RIP: 0010:bpf_skc_lookup net/core/filter.c:7033 [inline]
  RIP: 0010:bpf_sk_lookup+0x45/0x160 net/core/filter.c:7047
  Call Trace:
   &lt;IRQ&gt;
   bpf_prog_4675cb904b7071f8+0x12e/0x14e
   bpf_prog_run_pin_on_cpu+0xc6/0x1f0
   sk_psock_verdict_recv+0x1ba/0x350
   udp_read_skb+0x31a/0x370
   sk_psock_verdict_data_ready+0x2e3/0x600
   __udp_enqueue_schedule_skb+0x4c8/0x650
   udpv6_queue_rcv_one_skb+0x3ec/0x740
   udp6_unicast_rcv_skb+0x11d/0x140
   ip6_protocol_deliver_rcu+0x61e/0x950
   ip6_input_finish+0xa9/0x150
   NF_HOOK+0x286/0x2f0
   ip6_input+0x117/0x220
   NF_HOOK+0x286/0x2f0
   __netif_receive_skb+0x85/0x200
   process_backlog+0x374/0x9a0
   __napi_poll+0x4f/0x1c0
   net_rx_action+0x3b0/0x770
   handle_softirqs+0x15a/0x460
   do_softirq+0x57/0x80
   &lt;/IRQ&gt;

The rmem charge that dev_scratch accounted for is released by skb_recv_udp() on
dequeue, just above, so the scratch is dead by the time recv_actor() runs. Clear
skb-&gt;dev so bpf_skc_lookup() falls back to sock_net(skb-&gt;sk), which
skb_set_owner_sk_safe() set just above.

Fixes: 965b57b469a5 ("net: Introduce a new proto_ops -&gt;read_skb()")
Cc: stable@vger.kernel.org
Signed-off-by: Sechang Lim &lt;rhkrqnwk98@gmail.com&gt;
Reviewed-by: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260603162737.697215-1-rhkrqnwk98@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: restrict IPOPT_SSRR and IPOPT_LSRR options</title>
<updated>2026-06-04T01:53:14+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-06-02T16:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d3915a1f5a4bc0ac911032903c3c6ab8df9fcc7c'/>
<id>d3915a1f5a4bc0ac911032903c3c6ab8df9fcc7c</id>
<content type='text'>
This patch restricts setting Loose Source and Record Route (LSRR)
and Strict Source and Record Route (SSRR) IP options to users
with CAP_NET_RAW capability.

This prevents unprivileged applications from forcing packets to route
through attacker-controlled nodes to leak TCP ISN and possibly other
protocol information.

While LSRR and SSRR are commonly filtered in many network environments,
they may still be supported and forwarded along some network paths.

RFC 7126 (Recommendations on Filtering of IPv4 Packets Containing
IPv4 Options) recommend to drop these options in 4.3 and 4.4.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Tamir Shahar &lt;tamirthesis@gmail.com&gt;
Reported-by: Amit Klein &lt;aksecurity@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://patch.msgid.link/20260602161547.2642155-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch restricts setting Loose Source and Record Route (LSRR)
and Strict Source and Record Route (SSRR) IP options to users
with CAP_NET_RAW capability.

This prevents unprivileged applications from forcing packets to route
through attacker-controlled nodes to leak TCP ISN and possibly other
protocol information.

While LSRR and SSRR are commonly filtered in many network environments,
they may still be supported and forwarded along some network paths.

RFC 7126 (Recommendations on Filtering of IPv4 Packets Containing
IPv4 Options) recommend to drop these options in 4.3 and 4.4.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Tamir Shahar &lt;tamirthesis@gmail.com&gt;
Reported-by: Amit Klein &lt;aksecurity@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://patch.msgid.link/20260602161547.2642155-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tcp: Add preempt_{disable,enable}_nested() in reqsk_queue_hash_req().</title>
<updated>2026-06-02T18:55:56+00:00</updated>
<author>
<name>Kuniyuki Iwashima</name>
<email>kuniyu@google.com</email>
</author>
<published>2026-06-01T18:20:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e10902df24488ca722303133acfc82490f7d59ad'/>
<id>e10902df24488ca722303133acfc82490f7d59ad</id>
<content type='text'>
syzbot reported a weird reqsk-&gt;rsk_refcnt underflow in
__inet_csk_reqsk_queue_drop().

The captured reqsk_put() in __inet_csk_reqsk_queue_drop()
is called only when it successfully removes reqsk from ehash.

Moreover, reqsk_timer_handler() calls another reqsk_put()
after that.

This indicates that the reqsk was missing both refcnts for
ehash and the timer itself.

Since all the syzbot reports had PREEMPT_RT enabled, the only
possible scenario is that reqsk_queue_hash_req() is preempted
after mod_timer() and before refcount_set(), and then the timer
triggered after 1s aborts the reqsk due to its listener's close().

Let's wrap mod_timer() and refcount_set() with
preempt_disable_nested() and preempt_enable_nested().

Note that inet_ehash_insert() holds the normal spin_lock()
(mutex in PREEMPT_RT), so it must be called outside of
preempt_disable_nested(), but this is fine.

The lookup path just ignores 0 sk_refcnt entries in ehash
and tries to create another reqsk, but this will fail at
inet_ehash_insert().

[0]:
refcount_t: underflow; use-after-free.
WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28, CPU#0: ktimers/0/16
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28
Code: e4 7d d1 0a 67 48 0f b9 3a eb 4a e8 38 3d 23 fd 48 8d 3d e1 7d d1 0a 67 48 0f b9 3a eb 37 e8 25 3d 23 fd 48 8d 3d de 7d d1 0a &lt;67&gt; 48 0f b9 3a eb 24 e8 12 3d 23 fd 48 8d 3d db 7d d1 0a 67 48 0f
RSP: 0000:ffffc90000157948 EFLAGS: 00010246
RAX: ffffffff84a1301b RBX: 0000000000000003 RCX: ffff88801ca98000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff8f72ae00
RBP: ffffffff99ae3b01 R08: ffff88801ca98000 R09: 0000000000000005
R10: 0000000000000100 R11: 0000000000000004 R12: ffff8880425ef568
R13: ffff8880425ef4f8 R14: ffff8880425ef578 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff888126386000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b46710e9c CR3: 000000000dbb6000 CR4: 00000000003526f0
Call Trace:
 &lt;TASK&gt;
 __refcount_sub_and_test include/linux/refcount.h:400 [inline]
 __refcount_dec_and_test include/linux/refcount.h:432 [inline]
 refcount_dec_and_test include/linux/refcount.h:450 [inline]
 reqsk_put include/net/request_sock.h:136 [inline]
 __inet_csk_reqsk_queue_drop+0x3ce/0x440 net/ipv4/inet_connection_sock.c:1007
 reqsk_timer_handler+0x651/0xdf0 net/ipv4/inet_connection_sock.c:1137
 call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748
 expire_timers kernel/time/timer.c:1799 [inline]
 __run_timers kernel/time/timer.c:2374 [inline]
 __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386
 run_timer_base kernel/time/timer.c:2395 [inline]
 run_timer_softirq+0x67/0x170 kernel/time/timer.c:2403
 handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622
 __do_softirq kernel/softirq.c:656 [inline]
 run_ktimerd+0x69/0x100 kernel/softirq.c:1151
 smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 &lt;/TASK&gt;

Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")
Reported-by: syzbot+e809069bc15f26300526@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a1a7bcf.0a9e871e.332604.000b.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://patch.msgid.link/20260601182101.3183993-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
syzbot reported a weird reqsk-&gt;rsk_refcnt underflow in
__inet_csk_reqsk_queue_drop().

The captured reqsk_put() in __inet_csk_reqsk_queue_drop()
is called only when it successfully removes reqsk from ehash.

Moreover, reqsk_timer_handler() calls another reqsk_put()
after that.

This indicates that the reqsk was missing both refcnts for
ehash and the timer itself.

Since all the syzbot reports had PREEMPT_RT enabled, the only
possible scenario is that reqsk_queue_hash_req() is preempted
after mod_timer() and before refcount_set(), and then the timer
triggered after 1s aborts the reqsk due to its listener's close().

Let's wrap mod_timer() and refcount_set() with
preempt_disable_nested() and preempt_enable_nested().

Note that inet_ehash_insert() holds the normal spin_lock()
(mutex in PREEMPT_RT), so it must be called outside of
preempt_disable_nested(), but this is fine.

The lookup path just ignores 0 sk_refcnt entries in ehash
and tries to create another reqsk, but this will fail at
inet_ehash_insert().

[0]:
refcount_t: underflow; use-after-free.
WARNING: lib/refcount.c:28 at refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28, CPU#0: ktimers/0/16
Modules linked in:
CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G             L      syzkaller #0 PREEMPT_{RT,(full)}
Tainted: [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
RIP: 0010:refcount_warn_saturate+0xb2/0x110 lib/refcount.c:28
Code: e4 7d d1 0a 67 48 0f b9 3a eb 4a e8 38 3d 23 fd 48 8d 3d e1 7d d1 0a 67 48 0f b9 3a eb 37 e8 25 3d 23 fd 48 8d 3d de 7d d1 0a &lt;67&gt; 48 0f b9 3a eb 24 e8 12 3d 23 fd 48 8d 3d db 7d d1 0a 67 48 0f
RSP: 0000:ffffc90000157948 EFLAGS: 00010246
RAX: ffffffff84a1301b RBX: 0000000000000003 RCX: ffff88801ca98000
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff8f72ae00
RBP: ffffffff99ae3b01 R08: ffff88801ca98000 R09: 0000000000000005
R10: 0000000000000100 R11: 0000000000000004 R12: ffff8880425ef568
R13: ffff8880425ef4f8 R14: ffff8880425ef578 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff888126386000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b46710e9c CR3: 000000000dbb6000 CR4: 00000000003526f0
Call Trace:
 &lt;TASK&gt;
 __refcount_sub_and_test include/linux/refcount.h:400 [inline]
 __refcount_dec_and_test include/linux/refcount.h:432 [inline]
 refcount_dec_and_test include/linux/refcount.h:450 [inline]
 reqsk_put include/net/request_sock.h:136 [inline]
 __inet_csk_reqsk_queue_drop+0x3ce/0x440 net/ipv4/inet_connection_sock.c:1007
 reqsk_timer_handler+0x651/0xdf0 net/ipv4/inet_connection_sock.c:1137
 call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748
 expire_timers kernel/time/timer.c:1799 [inline]
 __run_timers kernel/time/timer.c:2374 [inline]
 __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386
 run_timer_base kernel/time/timer.c:2395 [inline]
 run_timer_softirq+0x67/0x170 kernel/time/timer.c:2403
 handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622
 __do_softirq kernel/softirq.c:656 [inline]
 run_ktimerd+0x69/0x100 kernel/softirq.c:1151
 smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
 kthread+0x388/0x470 kernel/kthread.c:436
 ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 &lt;/TASK&gt;

Fixes: d2d6422f8bd1 ("x86: Allow to enable PREEMPT_RT.")
Reported-by: syzbot+e809069bc15f26300526@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6a1a7bcf.0a9e871e.332604.000b.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://patch.msgid.link/20260601182101.3183993-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'ipsec-2026-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec</title>
<updated>2026-05-29T19:57:23+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-05-29T19:57:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c84ff04def255edb51e57c9f969efdfade0da16a'/>
<id>c84ff04def255edb51e57c9f969efdfade0da16a</id>
<content type='text'>
Steffen Klassert says:

====================
pull request (net): ipsec 2026-05-29

1) xfrm: route MIGRATE notifications to caller's netns
   Thread the caller's netns through km_migrate() so that
   MIGRATE notifications go to the issuing netns, fixing both the
   init_net listener leak and MOBIKE notifications inside
   non-init netns. From Maoyi Xie.

2) xfrm: ipcomp: Free destination pages on acomp errors
   Move the out_free_req label up so that allocated destination
   pages are released on decompression errors, not only on success.
   From Herbert Xu.

3) xfrm: Check for underflow in xfrm_state_mtu
   Reject configurations that cause xfrm_state_mtu() to underflow,
   preventing a negative TFCPAD value from becoming a memset size
   that triggers an out-of-bounds write of several terabytes.
   From David Ahern.

4) xfrm: ah: use skb_to_full_sk in async output callbacks
   Convert the possibly-incomplete skb-&gt;sk to a full socket pointer
   in async AH callbacks so that a request_sock or timewait_sock
   never reaches xfrm_output_resume() downstream consumers.
   From Michael Bommarito.

5) Add and revert: esp: fix page frag reference leak on skb_to_sgvec failure
   The patch does not fix te issue completely.

6) xfrm: esp: restore combined single-frag length gate
   Check the aligned post-trailer combined length against a page limit
   in the fast path, preventing skb_page_frag_refill() from falling
   back to a page too small for the destination scatterlist.
   From Jingguo Tan.

7) xfrm: iptfs: reset runtime state when cloning SAs
   Reinitialise the clone's mode_data runtime objects before
   publishing it, preventing queued skbs from being freed with
   list state copied from the original SA when migration fails.
   From Shaomin Chen.

8) xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
   Flush policy tables and drain the workqueue in a .pre_exit handler
   so that cleanup_net() pays one RCU grace period per batch instead
   of one per namespace, fixing stalls at high CLONE_NEWNET rates.
   From Usama Arif.

9) xfrm: input: hold netns during deferred transport reinjection
   Take a netns reference when queueing deferred transport reinjection
   work and drop it after the callback completes, keeping the skb-&gt;cb
   net pointer valid until the deferred work runs.
   From Zhengchuan Liang.

* tag 'ipsec-2026-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  Revert "esp: fix page frag reference leak on skb_to_sgvec failure"
  xfrm: input: hold netns during deferred transport reinjection
  xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
  xfrm: iptfs: reset runtime state when cloning SAs
  xfrm: esp: restore combined single-frag length gate
  esp: fix page frag reference leak on skb_to_sgvec failure
  xfrm: ah: use skb_to_full_sk in async output callbacks
  xfrm: Check for underflow in xfrm_state_mtu
  xfrm: ipcomp: Free destination pages on acomp errors
  xfrm: route MIGRATE notifications to caller's netns
====================

Link: https://patch.msgid.link/20260529092648.3878973-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Steffen Klassert says:

====================
pull request (net): ipsec 2026-05-29

1) xfrm: route MIGRATE notifications to caller's netns
   Thread the caller's netns through km_migrate() so that
   MIGRATE notifications go to the issuing netns, fixing both the
   init_net listener leak and MOBIKE notifications inside
   non-init netns. From Maoyi Xie.

2) xfrm: ipcomp: Free destination pages on acomp errors
   Move the out_free_req label up so that allocated destination
   pages are released on decompression errors, not only on success.
   From Herbert Xu.

3) xfrm: Check for underflow in xfrm_state_mtu
   Reject configurations that cause xfrm_state_mtu() to underflow,
   preventing a negative TFCPAD value from becoming a memset size
   that triggers an out-of-bounds write of several terabytes.
   From David Ahern.

4) xfrm: ah: use skb_to_full_sk in async output callbacks
   Convert the possibly-incomplete skb-&gt;sk to a full socket pointer
   in async AH callbacks so that a request_sock or timewait_sock
   never reaches xfrm_output_resume() downstream consumers.
   From Michael Bommarito.

5) Add and revert: esp: fix page frag reference leak on skb_to_sgvec failure
   The patch does not fix te issue completely.

6) xfrm: esp: restore combined single-frag length gate
   Check the aligned post-trailer combined length against a page limit
   in the fast path, preventing skb_page_frag_refill() from falling
   back to a page too small for the destination scatterlist.
   From Jingguo Tan.

7) xfrm: iptfs: reset runtime state when cloning SAs
   Reinitialise the clone's mode_data runtime objects before
   publishing it, preventing queued skbs from being freed with
   list state copied from the original SA when migration fails.
   From Shaomin Chen.

8) xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
   Flush policy tables and drain the workqueue in a .pre_exit handler
   so that cleanup_net() pays one RCU grace period per batch instead
   of one per namespace, fixing stalls at high CLONE_NEWNET rates.
   From Usama Arif.

9) xfrm: input: hold netns during deferred transport reinjection
   Take a netns reference when queueing deferred transport reinjection
   work and drop it after the callback completes, keeping the skb-&gt;cb
   net pointer valid until the deferred work runs.
   From Zhengchuan Liang.

* tag 'ipsec-2026-05-29' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  Revert "esp: fix page frag reference leak on skb_to_sgvec failure"
  xfrm: input: hold netns during deferred transport reinjection
  xfrm: move policy_bydst RCU sync from per-netns .exit to .pre_exit
  xfrm: iptfs: reset runtime state when cloning SAs
  xfrm: esp: restore combined single-frag length gate
  esp: fix page frag reference leak on skb_to_sgvec failure
  xfrm: ah: use skb_to_full_sk in async output callbacks
  xfrm: Check for underflow in xfrm_state_mtu
  xfrm: ipcomp: Free destination pages on acomp errors
  xfrm: route MIGRATE notifications to caller's netns
====================

Link: https://patch.msgid.link/20260529092648.3878973-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "esp: fix page frag reference leak on skb_to_sgvec failure"</title>
<updated>2026-05-29T08:23:25+00:00</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2026-05-29T08:23:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6851161feb01cea41358c9ec304bd2f981fc8505'/>
<id>6851161feb01cea41358c9ec304bd2f981fc8505</id>
<content type='text'>
This reverts commit 2982e599fff6faa21c8df147d96fc7af6c1a2f24.

The patch does not fully fix the issue and the Author does
not match the 'Signed-off-by:' tag, so revert it for now.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 2982e599fff6faa21c8df147d96fc7af6c1a2f24.

The patch does not fully fix the issue and the Author does
not match the 'Signed-off-by:' tag, so revert it for now.

Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: do not assume transport header in iptunnel_pmtud_check_icmp()</title>
<updated>2026-05-27T01:11:47+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-05-22T11:55:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=509323077ef79a26ba0c60bb556e45c12c398b2d'/>
<id>509323077ef79a26ba0c60bb556e45c12c398b2d</id>
<content type='text'>
In some cases, iptunnel_pmtud_check_icmp() can be called while
skb transport header is not set.

This triggers an out-of-bound access, because
(typeof(skb-&gt;transport_header))~0U is 65535.

Access the icmp header based on IPv4 network header,
after making sure icmp-&gt;type is present in skb linear part.

Note that iptunnel_pmtud_check_icmpv6()) is fine.

Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Reported-by: Damiano Melotti &lt;melotti@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260522115512.1519110-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In some cases, iptunnel_pmtud_check_icmp() can be called while
skb transport header is not set.

This triggers an out-of-bound access, because
(typeof(skb-&gt;transport_header))~0U is 65535.

Access the icmp header based on IPv4 network header,
after making sure icmp-&gt;type is present in skb linear part.

Note that iptunnel_pmtud_check_icmpv6()) is fine.

Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Reported-by: Damiano Melotti &lt;melotti@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260522115512.1519110-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tunnels: load network headers after skb_cow() in iptunnel_pmtud_build_icmp[v6]()</title>
<updated>2026-05-27T01:10:25+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-05-25T20:13:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b4bc94353050b1fa7b702bd4c6600710dd926cff'/>
<id>b4bc94353050b1fa7b702bd4c6600710dd926cff</id>
<content type='text'>
Sashiko found that iptunnel_pmtud_build_icmp() and
iptunnel_pmtud_build_icmpv6() were caching ip_hdr() and ipv6_hdr()
before an skb_cow() call which can reallocate skb-&gt;head.

Fix this possible UAF by initializing the local variables
after the skb_cow() call.

Remove skb_reset_network_header() calls which were not needed.

Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Link: https://patch.msgid.link/20260525201335.2361845-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko found that iptunnel_pmtud_build_icmp() and
iptunnel_pmtud_build_icmpv6() were caching ip_hdr() and ipv6_hdr()
before an skb_cow() call which can reallocate skb-&gt;head.

Fix this possible UAF by initializing the local variables
after the skb_cow() call.

Remove skb_reset_network_header() calls which were not needed.

Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Link: https://patch.msgid.link/20260525201335.2361845-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: free net-&gt;ipv4.sysctl_local_reserved_ports after unregister_net_sysctl_table()</title>
<updated>2026-05-23T02:05:31+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-05-21T12:21:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=87a1e0fe7776da7ab411be332b4be58ac8840d10'/>
<id>87a1e0fe7776da7ab411be332b4be58ac8840d10</id>
<content type='text'>
ipv4_sysctl_exit_net() is currently freeing net-&gt;ipv4.sysctl_local_reserved_ports
too soon.

Only after unregister_net_sysctl_table() we can be sure no threads can possibly
use the sysctls, including /proc/sys/net/ipv4/ip_local_reserved_ports.

Fixes: 122ff243f5f1 ("ipv4: make ip_local_reserved_ports per netns")
Reported-by: Ji'an Zhou &lt;eilaimemedsnaimel@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Reviewed-by: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Link: https://patch.msgid.link/20260521122147.3584624-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipv4_sysctl_exit_net() is currently freeing net-&gt;ipv4.sysctl_local_reserved_ports
too soon.

Only after unregister_net_sysctl_table() we can be sure no threads can possibly
use the sysctls, including /proc/sys/net/ipv4/ip_local_reserved_ports.

Fixes: 122ff243f5f1 ("ipv4: make ip_local_reserved_ports per netns")
Reported-by: Ji'an Zhou &lt;eilaimemedsnaimel@gmail.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Reviewed-by: Jiayuan Chen &lt;jiayuan.chen@linux.dev&gt;
Link: https://patch.msgid.link/20260521122147.3584624-1-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>xfrm: esp: restore combined single-frag length gate</title>
<updated>2026-05-22T07:20:26+00:00</updated>
<author>
<name>Jingguo Tan</name>
<email>tanjingguo@huawei.com</email>
</author>
<published>2026-05-18T09:06:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dfa0d7b0ff1eb6b2c416b8fdb9b4f2cefba57a40'/>
<id>dfa0d7b0ff1eb6b2c416b8fdb9b4f2cefba57a40</id>
<content type='text'>
The ESP out-of-place fast path appends the trailer in esp_output_head()
before esp_output_tail() allocates the destination page frag. The
head-side gate currently checks skb-&gt;data_len and tailen separately, but
the tail code allocates a single destination frag from the combined
post-trailer skb-&gt;data_len.

Reject the page-frag fast path when the combined aligned length exceeds a
page. Otherwise skb_page_frag_refill() may fall back to a single page while
the destination sg still spans the combined skb-&gt;data_len.

Restore this combined-length page gate for both IPv4 and IPv6.

Fixes: 5bd8baab087d ("esp: limit skb_page_frag_refill use to a single page")
Cc: stable@vger.kernel.org
Signed-off-by: Lin Ma &lt;malin89@huawei.com&gt;
Signed-off-by: Chenyuan Mi &lt;michenyuan@huawei.com&gt;
Signed-off-by: Jingguo Tan &lt;tanjingguo@huawei.com&gt;
Reviewed-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The ESP out-of-place fast path appends the trailer in esp_output_head()
before esp_output_tail() allocates the destination page frag. The
head-side gate currently checks skb-&gt;data_len and tailen separately, but
the tail code allocates a single destination frag from the combined
post-trailer skb-&gt;data_len.

Reject the page-frag fast path when the combined aligned length exceeds a
page. Otherwise skb_page_frag_refill() may fall back to a single page while
the destination sg still spans the combined skb-&gt;data_len.

Restore this combined-length page gate for both IPv4 and IPv6.

Fixes: 5bd8baab087d ("esp: limit skb_page_frag_refill use to a single page")
Cc: stable@vger.kernel.org
Signed-off-by: Lin Ma &lt;malin89@huawei.com&gt;
Signed-off-by: Chenyuan Mi &lt;michenyuan@huawei.com&gt;
Signed-off-by: Jingguo Tan &lt;tanjingguo@huawei.com&gt;
Reviewed-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>esp: fix page frag reference leak on skb_to_sgvec failure</title>
<updated>2026-05-22T07:19:37+00:00</updated>
<author>
<name>e521588</name>
<email>alessandro.schino@sbb.ch</email>
</author>
<published>2026-05-20T07:27:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2982e599fff6faa21c8df147d96fc7af6c1a2f24'/>
<id>2982e599fff6faa21c8df147d96fc7af6c1a2f24</id>
<content type='text'>
In esp_output_tail(), when esp-&gt;inplace is false, the old skb page frags
are replaced with a new page from the xfrm page_frag cache. The source
scatterlist (sg) is built from the old frags before the replacement, and
esp_ssg_unref() is responsible for releasing the old page references
after the crypto operation completes.

However, if the second skb_to_sgvec() call (which builds the destination
scatterlist from the new page) fails, the code jumps to error_free which
only calls kfree(tmp). The old page frag references captured in the
source scatterlist are never released:

  1. sg[] is built from old frags via skb_to_sgvec() (no extra get_page)
  2. nr_frags is set to 1 and frag[0] is replaced with the new page
  3. Second skb_to_sgvec() fails -&gt; goto error_free
  4. kfree(tmp) frees the sg[] memory but old frags are not unref'd
  5. kfree_skb() only releases frag[0] (the new page), not the old ones

Fix this by adding a bool parameter to esp_ssg_unref() that, when true,
unconditionally unrefs the source scatterlist frags without checking
req-&gt;src and req-&gt;dst, since those fields are not yet initialized by
aead_request_set_crypt() at the point of the error. Existing callers
pass false to preserve the original behavior.

The same issue exists in both esp4 and esp6 as the code is identical.

Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible")

Signed-off-by: Alessandro Schino &lt;7991aleschino@gmail.com&gt;
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In esp_output_tail(), when esp-&gt;inplace is false, the old skb page frags
are replaced with a new page from the xfrm page_frag cache. The source
scatterlist (sg) is built from the old frags before the replacement, and
esp_ssg_unref() is responsible for releasing the old page references
after the crypto operation completes.

However, if the second skb_to_sgvec() call (which builds the destination
scatterlist from the new page) fails, the code jumps to error_free which
only calls kfree(tmp). The old page frag references captured in the
source scatterlist are never released:

  1. sg[] is built from old frags via skb_to_sgvec() (no extra get_page)
  2. nr_frags is set to 1 and frag[0] is replaced with the new page
  3. Second skb_to_sgvec() fails -&gt; goto error_free
  4. kfree(tmp) frees the sg[] memory but old frags are not unref'd
  5. kfree_skb() only releases frag[0] (the new page), not the old ones

Fix this by adding a bool parameter to esp_ssg_unref() that, when true,
unconditionally unrefs the source scatterlist frags without checking
req-&gt;src and req-&gt;dst, since those fields are not yet initialized by
aead_request_set_crypt() at the point of the error. Existing callers
pass false to preserve the original behavior.

The same issue exists in both esp4 and esp6 as the code is identical.

Fixes: cac2661c53f3 ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f6a27 ("esp6: Avoid skb_cow_data whenever possible")

Signed-off-by: Alessandro Schino &lt;7991aleschino@gmail.com&gt;
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
