<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/ipv4, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'net-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2026-06-25T19:25:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-25T19:25:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=805185b7c7a1069e407b6f7b3bc98e44d415f484'/>
<id>805185b7c7a1069e407b6f7b3bc98e44d415f484</id>
<content type='text'>
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and IPsec.

  Current release - regressions:

   - do not acquire dev-&gt;tx_global_lock in netdev_watchdog_up()

   - ethtool: keep rtnl_lock for ops using ethtool_op_get_link()

   - fix deadlock in nested UP notifier events

  Current release - new code bugs:

   - eth:
      - cn20k: fix subbank free list indexing for search order
      - airoha: fix BQL underflow in shared QDMA TX ring

  Previous releases - regressions:

   - netfilter:
     - flowtable: fix offloaded ct timeout never being extended
     - nf_conncount: prevent connlimit drops for early confirmed ct

  Previous releases - always broken:

   - require CAP_NET_ADMIN in the originating netns when modifying
     cross-netns devices

   - report NAPI thread PID in the caller's pid namespace

   - mac802154: fix dirty frag in in-place crypto for IOT radios

   - sctp: hold socket lock when dumping endpoints in sctp_diag, avoid
     an overflow

   - eth: gve: fix header buffer corruption with header-split and HW-GRO

   - af_key: initialize alg_key_len for IPComp states, prevent OOB read"

* tag 'net-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (213 commits)
  selftests: bonding: add a test for VLAN propagation over a bonded real device
  vlan: defer real device state propagation to netdev_work
  net: add the driver-facing netdev_work scheduling API
  net: turn the rx_mode work into a generic netdev_work facility
  net: ethtool: keep rtnl_lock for ops using ethtool_op_get_link()
  rxrpc: Fix rxrpc_rotate_tx_rotate() to check there's something to rotate
  rxrpc: Fix leak of released call in recvmsg(MSG_PEEK)
  rxrpc: Fix socket notification race
  rxrpc: Fix potential infinite loop in rxrpc_recvmsg()
  rxrpc: Fix oob challenge leak in cleanup after notification failure
  rxrpc: Fix the reception of a reply packet before data transmission
  afs: Fix uncancelled rxrpc OOB message handler
  afs: Fix further netns teardown to cancel the preallocation charger
  rxrpc: Fix double unlock in rxrpc_recvmsg()
  rxrpc: Fix leak of connection from OOB challenge
  rxrpc: Fix ACKALL packet handling
  net: hns3: differentiate autoneg default values between copper and fiber
  net: hns3: fix permanent link down deadlock after reset
  net: hns3: refactor MAC autoneg and speed configuration
  net: hns3: unify copper port ksettings configuration path
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter and IPsec.

  Current release - regressions:

   - do not acquire dev-&gt;tx_global_lock in netdev_watchdog_up()

   - ethtool: keep rtnl_lock for ops using ethtool_op_get_link()

   - fix deadlock in nested UP notifier events

  Current release - new code bugs:

   - eth:
      - cn20k: fix subbank free list indexing for search order
      - airoha: fix BQL underflow in shared QDMA TX ring

  Previous releases - regressions:

   - netfilter:
     - flowtable: fix offloaded ct timeout never being extended
     - nf_conncount: prevent connlimit drops for early confirmed ct

  Previous releases - always broken:

   - require CAP_NET_ADMIN in the originating netns when modifying
     cross-netns devices

   - report NAPI thread PID in the caller's pid namespace

   - mac802154: fix dirty frag in in-place crypto for IOT radios

   - sctp: hold socket lock when dumping endpoints in sctp_diag, avoid
     an overflow

   - eth: gve: fix header buffer corruption with header-split and HW-GRO

   - af_key: initialize alg_key_len for IPComp states, prevent OOB read"

* tag 'net-7.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (213 commits)
  selftests: bonding: add a test for VLAN propagation over a bonded real device
  vlan: defer real device state propagation to netdev_work
  net: add the driver-facing netdev_work scheduling API
  net: turn the rx_mode work into a generic netdev_work facility
  net: ethtool: keep rtnl_lock for ops using ethtool_op_get_link()
  rxrpc: Fix rxrpc_rotate_tx_rotate() to check there's something to rotate
  rxrpc: Fix leak of released call in recvmsg(MSG_PEEK)
  rxrpc: Fix socket notification race
  rxrpc: Fix potential infinite loop in rxrpc_recvmsg()
  rxrpc: Fix oob challenge leak in cleanup after notification failure
  rxrpc: Fix the reception of a reply packet before data transmission
  afs: Fix uncancelled rxrpc OOB message handler
  afs: Fix further netns teardown to cancel the preallocation charger
  rxrpc: Fix double unlock in rxrpc_recvmsg()
  rxrpc: Fix leak of connection from OOB challenge
  rxrpc: Fix ACKALL packet handling
  net: hns3: differentiate autoneg default values between copper and fiber
  net: hns3: fix permanent link down deadlock after reset
  net: hns3: refactor MAC autoneg and speed configuration
  net: hns3: unify copper port ksettings configuration path
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>net: udp_tunnel: prevent double queueing in udp_tunnel_nic_device_sync</title>
<updated>2026-06-25T15:35:51+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2026-06-25T06:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ecf69d4b43370c587e48d4d70289dbdb7e039d4d'/>
<id>ecf69d4b43370c587e48d4d70289dbdb7e039d4d</id>
<content type='text'>
Yue Sun reported a use-after-free and debugobjects warning in
udp_tunnel_nic_device_sync_work() during concurrent device operations.

The workqueue core clears the internal pending bit before invoking the
worker. At that point, a concurrent thread can queue the work again.
When the already running worker eventually clears the work_pending flag
to 0, it mistakenly clears the flag for the newly queued instance.
udp_tunnel_nic_unregister() then observes work_pending as 0 and frees
the structure while the second work item is still active in the queue,
leading to UAF.

Fix this by returning early in udp_tunnel_nic_device_sync() if
work_pending is already set, preventing redundant work queueing.

Fixes: cc4e3835eff4 ("udp_tunnel: add central NIC RX port offload infrastructure")
Reported-by: Yue Sun &lt;samsun1006219@gmail.com&gt;
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260625065938.654652-2-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Yue Sun reported a use-after-free and debugobjects warning in
udp_tunnel_nic_device_sync_work() during concurrent device operations.

The workqueue core clears the internal pending bit before invoking the
worker. At that point, a concurrent thread can queue the work again.
When the already running worker eventually clears the work_pending flag
to 0, it mistakenly clears the flag for the newly queued instance.
udp_tunnel_nic_unregister() then observes work_pending as 0 and frees
the structure while the second work item is still active in the queue,
leading to UAF.

Fix this by returning early in udp_tunnel_nic_device_sync() if
work_pending is already set, preventing redundant work queueing.

Fixes: cc4e3835eff4 ("udp_tunnel: add central NIC RX port offload infrastructure")
Reported-by: Yue Sun &lt;samsun1006219@gmail.com&gt;
Suggested-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260625065938.654652-2-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/tcp-ao: fix use-after-free of key in del_async path</title>
<updated>2026-06-25T02:25:35+00:00</updated>
<author>
<name>HanQuan</name>
<email>eilaimemedsnaimel@gmail.com</email>
</author>
<published>2026-06-23T01:52:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5ba9950bc9078e19b69cca1e56d1553b125c6857'/>
<id>5ba9950bc9078e19b69cca1e56d1553b125c6857</id>
<content type='text'>
In tcp_ao_delete_key(), the del_async path skips the current_key
and rnext_key validity checks present in the synchronous path,
assuming these pointers are always NULL on LISTEN sockets.  However,
if a key was added with set_current=1/set_rnext=1 while the socket
was in CLOSE state, current_key and rnext_key will be non-NULL
after listen() transitions the socket to LISTEN.

When such a key is deleted with del_async=1, hlist_del_rcu() and
call_rcu() free the key without clearing the dangling pointers.
After the RCU grace period, getsockopt(TCP_AO_INFO) dereferences
current_key-&gt;sndid and rnext_key-&gt;rcvid from freed slab memory.

Clear current_key and rnext_key in the del_async path when they
reference the key being deleted.

Fixes: d6732b95b6fb ("net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)")
Signed-off-by: HanQuan &lt;eilaimemedsnaimel@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260623015208.1191687-1-eilaimemedsnaimel@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In tcp_ao_delete_key(), the del_async path skips the current_key
and rnext_key validity checks present in the synchronous path,
assuming these pointers are always NULL on LISTEN sockets.  However,
if a key was added with set_current=1/set_rnext=1 while the socket
was in CLOSE state, current_key and rnext_key will be non-NULL
after listen() transitions the socket to LISTEN.

When such a key is deleted with del_async=1, hlist_del_rcu() and
call_rcu() free the key without clearing the dangling pointers.
After the RCU grace period, getsockopt(TCP_AO_INFO) dereferences
current_key-&gt;sndid and rnext_key-&gt;rcvid from freed slab memory.

Clear current_key and rnext_key in the del_async path when they
reference the key being deleted.

Fixes: d6732b95b6fb ("net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)")
Signed-off-by: HanQuan &lt;eilaimemedsnaimel@gmail.com&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260623015208.1191687-1-eilaimemedsnaimel@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'ipsec-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec</title>
<updated>2026-06-23T23:22:24+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-06-23T23:22:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e9deb406c10f5a73bcfd62f42ca1187b220bc188'/>
<id>e9deb406c10f5a73bcfd62f42ca1187b220bc188</id>
<content type='text'>
Steffen Klassert says:

====================
pull request (net): ipsec 2026-06-22

1) xfrm: use compat translator only for u64 alignment mismatch
   Gate the XFRM_USER_COMPAT translator on COMPAT_FOR_U64_ALIGNMENT
   so 32-bit compat tasks on arches whose 32-bit ABI already matches
   the native 64-bit layout are no longer rejected with -EOPNOTSUPP.
   From Sanman Pradhan.

2) net: af_key: initialize alg_key_len for IPComp states
   Initialize the alg_key_len to 0 in the IPComp branch of
   pfkey_msg2xfrm_state() so an uninitialized value cannot drive
   xfrm_alg_len() into a slab-out-of-bounds kmemdup during
   XFRM_MSG_MIGRATE. From Zijing Yin.

3) xfrm: Fix dev use-after-free in xfrm async resumption
   Stash the original skb-&gt;dev and extend the RCU critical section
   across xfrm_rcv_cb() and transport_finish() to prevent a
   tunnel-device UAF and original-device refcount leak when a
   callback replaces skb-&gt;dev. From Dong Chenchen.

4) xfrm: Fix xfrm state cache insertion race
   Move the state-validity check inside xfrm_state_lock in the
   input state cache insertion path so a state cannot be killed
   between the check and the insert. From Herbert Xu.

5) xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
   Add READ_ONCE()/WRITE_ONCE() annotations on xfrm_policy_count
   and xfrm_policy_default to silence the KCSAN data race reported
   on net-&gt;xfrm.policy_count. From Eric Dumazet.

6) espintcp: use sk_msg_free_partial to fix partial send
   Replace the manual skmsg accounting in espintcp with
   sk_msg_free_partial() so the skmsg stays consistent on every
   iteration and the partial-send accounting bugs go away.
   From Sabrina Dubroca.

7) xfrm: validate selector family and prefixlen during match
   Reject mismatched address families in xfrm_selector_match() and
   bound prefixlen in addr4_match()/addr_match() to prevent the
   shift-out-of-bounds syzbot reported when an AF_UNSPEC selector
   with a large prefixlen is matched against an IPv4 flow.
   From Eric Dumazet.

* tag 'ipsec-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: validate selector family and prefixlen during match
  espintcp: use sk_msg_free_partial to fix partial send
  xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
  xfrm: Fix xfrm state cache insertion race
  xfrm: Fix dev use-after-free in xfrm async resumption
  net: af_key: initialize alg_key_len for IPComp states
  xfrm: use compat translator only for u64 alignment mismatch
====================

Link: https://patch.msgid.link/20260622075726.29685-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Steffen Klassert says:

====================
pull request (net): ipsec 2026-06-22

1) xfrm: use compat translator only for u64 alignment mismatch
   Gate the XFRM_USER_COMPAT translator on COMPAT_FOR_U64_ALIGNMENT
   so 32-bit compat tasks on arches whose 32-bit ABI already matches
   the native 64-bit layout are no longer rejected with -EOPNOTSUPP.
   From Sanman Pradhan.

2) net: af_key: initialize alg_key_len for IPComp states
   Initialize the alg_key_len to 0 in the IPComp branch of
   pfkey_msg2xfrm_state() so an uninitialized value cannot drive
   xfrm_alg_len() into a slab-out-of-bounds kmemdup during
   XFRM_MSG_MIGRATE. From Zijing Yin.

3) xfrm: Fix dev use-after-free in xfrm async resumption
   Stash the original skb-&gt;dev and extend the RCU critical section
   across xfrm_rcv_cb() and transport_finish() to prevent a
   tunnel-device UAF and original-device refcount leak when a
   callback replaces skb-&gt;dev. From Dong Chenchen.

4) xfrm: Fix xfrm state cache insertion race
   Move the state-validity check inside xfrm_state_lock in the
   input state cache insertion path so a state cannot be killed
   between the check and the insert. From Herbert Xu.

5) xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
   Add READ_ONCE()/WRITE_ONCE() annotations on xfrm_policy_count
   and xfrm_policy_default to silence the KCSAN data race reported
   on net-&gt;xfrm.policy_count. From Eric Dumazet.

6) espintcp: use sk_msg_free_partial to fix partial send
   Replace the manual skmsg accounting in espintcp with
   sk_msg_free_partial() so the skmsg stays consistent on every
   iteration and the partial-send accounting bugs go away.
   From Sabrina Dubroca.

7) xfrm: validate selector family and prefixlen during match
   Reject mismatched address families in xfrm_selector_match() and
   bound prefixlen in addr4_match()/addr_match() to prevent the
   shift-out-of-bounds syzbot reported when an AF_UNSPEC selector
   with a large prefixlen is matched against an IPv4 flow.
   From Eric Dumazet.

* tag 'ipsec-2026-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: validate selector family and prefixlen during match
  espintcp: use sk_msg_free_partial to fix partial send
  xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
  xfrm: Fix xfrm state cache insertion race
  xfrm: Fix dev use-after-free in xfrm async resumption
  net: af_key: initialize alg_key_len for IPComp states
  xfrm: use compat translator only for u64 alignment mismatch
====================

Link: https://patch.msgid.link/20260622075726.29685-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'nf-26-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf</title>
<updated>2026-06-22T17:33:38+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-06-22T17:33:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=56abdaebbf0da304b860bed1f2b5a85f5a6a16a0'/>
<id>56abdaebbf0da304b860bed1f2b5a85f5a6a16a0</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net. This batches
fixes for real crashes with trivial/correctness fixes. There is too
a rework of the conntrack expectation timeout strategy to deal with
a possible race when removing an expectation.

1) Fix the incorrect flowtable timeout extension for entries in
   hw offload, from Adrian Bente. This is correcting a defect in
   the functionality, no crash.

2) Hold reference to device under the fake dst in br_netfilter,
   from Haoze Xie. This is fixing a possible UaF if the device
   is removed while packet is sitting in nfqueue.

3) Reject template conntrack in xt_cluster, otherwise access to
   uninitialize conntrack fields are possible leading to WARN_ON
   due to unset layer 3 protocol. From Wyatt Feng.

4) Make sure the IPv6 tunnel header is in the linear skb data
   area before pulling. While at it remove incomplete NEXTHDR_DEST
   support. From Lorenzo Bianconi. This possibly leading to crash
   if IPv4 header is not in the linear area.

5) Use test_bit_acquire in ipset hash set to avoid reordering
   of subsequent memory access. This is addressing a LLM related
   report, no crash has been observed. From Jozsef Kadlecsik.

6) Use test_bit_acquire in ipset bitmap set too, for the same
   reason as in the previous patch, from Jozsef Kadlecsik.

7) Call kfree_rcu() after rcu_assign_pointer() to address a
   possible UaF if kfree_rcu() runs inmediately, which to my
   understanding never happens. Never observed in practise,
   reported by LLM. Also from Jozsef Kadlecsik.

8) Use disable_delayed_work_sync() instead cancel_delayed_work_sync()
   to avoid that ipset GC handler re-queues work as reported by LLM.
   From Jozsef Kadlecsik. This is for correctness.

9) Restore the check in nft_payload for exceeding payloda offset
    over 2^16. From Florian Westphal. This fixes a silent truncation,
    not a big deal, but better be assertive and reject it.

10) Validate NFT_META_BRI_IIFHWADDR can only run from bridge
    prerouting. From Florian Westphal. Harmless but it could allow
    to read bytes from skb-&gt;cb.

11) Zero out destination hardware address during the flowtable
    path setup, also from Florian. This is a correctness fix, LLM
    points that possible infoleak can happen but topology to achieve
    it is not clear.

12) Skip IPv4 options if present when building the IPV4 reject reply.
    Otherwise bytes in the IPv4 options header can be sent back to
    origin where the ICMP header is being expected. Again from
    Florian Westphal.

13) Replace timer API for expectation by GC worker approach. This
    is implicitly fixing a race between nf_ct_remove_expectations()
    which might fail to remove the expectation due to timer_del()
    returning false because timer has expired and callback is
    being run concurrently. This fix is addressing a crash that has
    been already reported with a reproducer.

14) Check if br_vlan_get_pvid_rcu() fails, otherwise possible stack
    infoleak of 4-bytes. From Florian Westphal.

* tag 'nf-26-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_meta_bridge: fix NFT_META_BRI_IIFPVID stack leak
  netfilter: nf_conntrack_expect: use conntrack GC to reap expectations
  netfilter: nf_reject: skip iphdr options when looking for icmp header
  netfilter: nft_flow_offload: zero device address for non-ether case
  netfilter: nft_meta_bridge: add validate callback for get operations
  netfilter: nft_payload: reject offsets exceeding 65535 bytes
  netfilter: ipset: make sure gc is properly stopped
  netfilter: ipset: fix order of kfree_rcu() and rcu_assign_pointer()
  netfilter: ipset: Don't use test_bit() in lockless RCU readers in bitmap types
  netfilter: ipset: Don't use test_bit() in lockless RCU readers in hash types
  netfilter: flowtable: fix and simplify IP6IP6 tunnel handling
  netfilter: xt_cluster: reject template conntracks in hash match
  netfilter: nf_queue: pin bridge device while NFQUEUE holds fake dst
  netfilter: flowtable: fix offloaded ct timeout never being extended
====================

Link: https://patch.msgid.link/20260620222738.112506-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net. This batches
fixes for real crashes with trivial/correctness fixes. There is too
a rework of the conntrack expectation timeout strategy to deal with
a possible race when removing an expectation.

1) Fix the incorrect flowtable timeout extension for entries in
   hw offload, from Adrian Bente. This is correcting a defect in
   the functionality, no crash.

2) Hold reference to device under the fake dst in br_netfilter,
   from Haoze Xie. This is fixing a possible UaF if the device
   is removed while packet is sitting in nfqueue.

3) Reject template conntrack in xt_cluster, otherwise access to
   uninitialize conntrack fields are possible leading to WARN_ON
   due to unset layer 3 protocol. From Wyatt Feng.

4) Make sure the IPv6 tunnel header is in the linear skb data
   area before pulling. While at it remove incomplete NEXTHDR_DEST
   support. From Lorenzo Bianconi. This possibly leading to crash
   if IPv4 header is not in the linear area.

5) Use test_bit_acquire in ipset hash set to avoid reordering
   of subsequent memory access. This is addressing a LLM related
   report, no crash has been observed. From Jozsef Kadlecsik.

6) Use test_bit_acquire in ipset bitmap set too, for the same
   reason as in the previous patch, from Jozsef Kadlecsik.

7) Call kfree_rcu() after rcu_assign_pointer() to address a
   possible UaF if kfree_rcu() runs inmediately, which to my
   understanding never happens. Never observed in practise,
   reported by LLM. Also from Jozsef Kadlecsik.

8) Use disable_delayed_work_sync() instead cancel_delayed_work_sync()
   to avoid that ipset GC handler re-queues work as reported by LLM.
   From Jozsef Kadlecsik. This is for correctness.

9) Restore the check in nft_payload for exceeding payloda offset
    over 2^16. From Florian Westphal. This fixes a silent truncation,
    not a big deal, but better be assertive and reject it.

10) Validate NFT_META_BRI_IIFHWADDR can only run from bridge
    prerouting. From Florian Westphal. Harmless but it could allow
    to read bytes from skb-&gt;cb.

11) Zero out destination hardware address during the flowtable
    path setup, also from Florian. This is a correctness fix, LLM
    points that possible infoleak can happen but topology to achieve
    it is not clear.

12) Skip IPv4 options if present when building the IPV4 reject reply.
    Otherwise bytes in the IPv4 options header can be sent back to
    origin where the ICMP header is being expected. Again from
    Florian Westphal.

13) Replace timer API for expectation by GC worker approach. This
    is implicitly fixing a race between nf_ct_remove_expectations()
    which might fail to remove the expectation due to timer_del()
    returning false because timer has expired and callback is
    being run concurrently. This fix is addressing a crash that has
    been already reported with a reproducer.

14) Check if br_vlan_get_pvid_rcu() fails, otherwise possible stack
    infoleak of 4-bytes. From Florian Westphal.

* tag 'nf-26-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nft_meta_bridge: fix NFT_META_BRI_IIFPVID stack leak
  netfilter: nf_conntrack_expect: use conntrack GC to reap expectations
  netfilter: nf_reject: skip iphdr options when looking for icmp header
  netfilter: nft_flow_offload: zero device address for non-ether case
  netfilter: nft_meta_bridge: add validate callback for get operations
  netfilter: nft_payload: reject offsets exceeding 65535 bytes
  netfilter: ipset: make sure gc is properly stopped
  netfilter: ipset: fix order of kfree_rcu() and rcu_assign_pointer()
  netfilter: ipset: Don't use test_bit() in lockless RCU readers in bitmap types
  netfilter: ipset: Don't use test_bit() in lockless RCU readers in hash types
  netfilter: flowtable: fix and simplify IP6IP6 tunnel handling
  netfilter: xt_cluster: reject template conntracks in hash match
  netfilter: nf_queue: pin bridge device while NFQUEUE holds fake dst
  netfilter: flowtable: fix offloaded ct timeout never being extended
====================

Link: https://patch.msgid.link/20260620222738.112506-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: account for fraggap on the paged allocation path</title>
<updated>2026-06-21T22:24:49+00:00</updated>
<author>
<name>Wongi Lee</name>
<email>qw3rtyp0@gmail.com</email>
</author>
<published>2026-06-16T13:38:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eca856950f7cb1a221e02b99d758409f2c5cec42'/>
<id>eca856950f7cb1a221e02b99d758409f2c5cec42</id>
<content type='text'>
In __ip_append_data(), when the paged-allocation branch is taken,
alloclen and pagedlen are computed as

	alloclen = fragheaderlen + transhdrlen;
	pagedlen = datalen - transhdrlen;

datalen already includes fraggap, but the fraggap bytes carried over
from the previous skb are copied into the new skb's linear area at
offset transhdrlen by the subsequent skb_copy_and_csum_bits(). The
linear area is therefore undersized by fraggap bytes while pagedlen is
overstated by the same amount.

The non-paged branch sets alloclen to fraglen, which already accounts
for fraggap because datalen does. Bring the paged branch in line by
adding fraggap to alloclen and subtracting it from pagedlen.

After this adjustment, copy no longer collapses to -fraggap on the
paged path, so remove the stale comment describing that old arithmetic.

Fixes: 8eb77cc73977 ("ipv4: avoid partial copy for zc")
Signed-off-by: Jungwoo Lee &lt;jwlee2217@gmail.com&gt;
Signed-off-by: Wongi Lee &lt;qw3rtyp0@gmail.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://patch.msgid.link/ajFR1eLAIs42TN3g@DESKTOP-19IMU7U.localdomain
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In __ip_append_data(), when the paged-allocation branch is taken,
alloclen and pagedlen are computed as

	alloclen = fragheaderlen + transhdrlen;
	pagedlen = datalen - transhdrlen;

datalen already includes fraggap, but the fraggap bytes carried over
from the previous skb are copied into the new skb's linear area at
offset transhdrlen by the subsequent skb_copy_and_csum_bits(). The
linear area is therefore undersized by fraggap bytes while pagedlen is
overstated by the same amount.

The non-paged branch sets alloclen to fraglen, which already accounts
for fraggap because datalen does. Bring the paged branch in line by
adding fraggap to alloclen and subtracting it from pagedlen.

After this adjustment, copy no longer collapses to -fraggap on the
paged path, so remove the stale comment describing that old arithmetic.

Fixes: 8eb77cc73977 ("ipv4: avoid partial copy for zc")
Signed-off-by: Jungwoo Lee &lt;jwlee2217@gmail.com&gt;
Signed-off-by: Wongi Lee &lt;qw3rtyp0@gmail.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://patch.msgid.link/ajFR1eLAIs42TN3g@DESKTOP-19IMU7U.localdomain
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_reject: skip iphdr options when looking for icmp header</title>
<updated>2026-06-20T22:18:27+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2026-06-18T08:49:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=af8d6ae09c0a5f8b8a0d5680203c74b3c1daa85b'/>
<id>af8d6ae09c0a5f8b8a0d5680203c74b3c1daa85b</id>
<content type='text'>
Not a big deal but this hould have used the real ip header length and not the
base header size.  As-is, if there are options then
nf_skb_is_icmp_unreach() result will be random.

Fixes: db99b2f2b3e2 ("netfilter: nf_reject: don't reply to icmp error messages")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Not a big deal but this hould have used the real ip header length and not the
base header size.  As-is, if there are options then
nf_skb_is_icmp_unreach() result will be random.

Fixes: db99b2f2b3e2 ("netfilter: nf_reject: don't reply to icmp error messages")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ipv4: bound TCP reordering sysctl writes and MTU probe sizes</title>
<updated>2026-06-17T23:35:13+00:00</updated>
<author>
<name>Wyatt Feng</name>
<email>bronzed_45_vested@icloud.com</email>
</author>
<published>2026-06-15T10:31:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=efb8763d7bbb40cff4cc55a6b62c3095a038149c'/>
<id>efb8763d7bbb40cff4cc55a6b62c3095a038149c</id>
<content type='text'>
Reject invalid `net.ipv4.tcp_reordering` values before they reach TCP
socket state. The sysctl is stored as an `int` but copied into the
`u32` `tp-&gt;reordering` field for new sockets, so negative writes wrap
to large values.

With `tcp_mtu_probing=2`, the wrapped value can overflow the
`tcp_mtu_probe()` size calculation and drive the MTU probing path into
an out-of-bounds read. Route `tcp_reordering` writes through
`proc_dointvec_minmax()` and require it to be at least 1. Also require
`tcp_max_reordering` to be at least 1 so the configured maximum cannot
become negative either.

When registering the table for a non-init network namespace, relocate
`extra2` pointers that refer into `init_net.ipv4` so the
`tcp_reordering` upper bound follows that namespace's
`tcp_max_reordering`.

Harden `tcp_mtu_probe()` itself by computing `size_needed` as `u64`.
This keeps the send queue and window checks from being bypassed through
signed integer overflow.

Fixes: 91cc17c0e5e5 ("[TCP]: MTUprobe: receiver window &amp; data available checks fixed")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan &lt;yuantan098@gmail.com&gt;
Reported-by: Zhengchuan Liang &lt;zcliangcn@gmail.com&gt;
Reported-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Wyatt Feng &lt;bronzed_45_vested@icloud.com&gt;
Signed-off-by: Ren Wei &lt;n05ec@lzu.edu.cn&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/1a5b7e1ef4d70fbad8c8ee0b82d8405f3c964a3d.1781395200.git.bronzed_45_vested@icloud.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reject invalid `net.ipv4.tcp_reordering` values before they reach TCP
socket state. The sysctl is stored as an `int` but copied into the
`u32` `tp-&gt;reordering` field for new sockets, so negative writes wrap
to large values.

With `tcp_mtu_probing=2`, the wrapped value can overflow the
`tcp_mtu_probe()` size calculation and drive the MTU probing path into
an out-of-bounds read. Route `tcp_reordering` writes through
`proc_dointvec_minmax()` and require it to be at least 1. Also require
`tcp_max_reordering` to be at least 1 so the configured maximum cannot
become negative either.

When registering the table for a non-init network namespace, relocate
`extra2` pointers that refer into `init_net.ipv4` so the
`tcp_reordering` upper bound follows that namespace's
`tcp_max_reordering`.

Harden `tcp_mtu_probe()` itself by computing `size_needed` as `u64`.
This keeps the send queue and window checks from being bypassed through
signed integer overflow.

Fixes: 91cc17c0e5e5 ("[TCP]: MTUprobe: receiver window &amp; data available checks fixed")
Cc: stable@vger.kernel.org
Reported-by: Yuan Tan &lt;yuantan098@gmail.com&gt;
Reported-by: Zhengchuan Liang &lt;zcliangcn@gmail.com&gt;
Reported-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Wyatt Feng &lt;bronzed_45_vested@icloud.com&gt;
Signed-off-by: Ren Wei &lt;n05ec@lzu.edu.cn&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/1a5b7e1ef4d70fbad8c8ee0b82d8405f3c964a3d.1781395200.git.bronzed_45_vested@icloud.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ip_vti: require CAP_NET_ADMIN in the device netns for changelink</title>
<updated>2026-06-17T23:01:52+00:00</updated>
<author>
<name>Maoyi Xie</name>
<email>maoyixie.tju@gmail.com</email>
</author>
<published>2026-06-12T08:59:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=95cceadbfd52d7239bd730afdda0655287d77425'/>
<id>95cceadbfd52d7239bd730afdda0655287d77425</id>
<content type='text'>
vti_changelink() operates on at most two netns, dev_net(dev) and the
tunnel link netns t-&gt;net. They differ once the device is created in or
moved to a netns other than the one the request runs in. The rtnl
changelink path checks CAP_NET_ADMIN only against dev_net(dev), so a
caller privileged there but not in t-&gt;net can rewrite a tunnel that
lives in t-&gt;net.

Gate vti_changelink() on rtnl_dev_link_net_capable() at its top,
before any attribute is parsed.

Reported-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/CABAhCOSzP1vaThGV35_VnsRCb=87_CPjPVsTHbq905k8A+BuUg@mail.gmail.com/
Fixes: 895de9a3488a ("vti4: Enable namespace changing")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie &lt;maoyixie.tju@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260612085941.3158249-4-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vti_changelink() operates on at most two netns, dev_net(dev) and the
tunnel link netns t-&gt;net. They differ once the device is created in or
moved to a netns other than the one the request runs in. The rtnl
changelink path checks CAP_NET_ADMIN only against dev_net(dev), so a
caller privileged there but not in t-&gt;net can rewrite a tunnel that
lives in t-&gt;net.

Gate vti_changelink() on rtnl_dev_link_net_capable() at its top,
before any attribute is parsed.

Reported-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/CABAhCOSzP1vaThGV35_VnsRCb=87_CPjPVsTHbq905k8A+BuUg@mail.gmail.com/
Fixes: 895de9a3488a ("vti4: Enable namespace changing")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie &lt;maoyixie.tju@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260612085941.3158249-4-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: ipip: require CAP_NET_ADMIN in the device netns for changelink</title>
<updated>2026-06-17T23:01:52+00:00</updated>
<author>
<name>Maoyi Xie</name>
<email>maoyixie.tju@gmail.com</email>
</author>
<published>2026-06-12T08:59:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8211a26324667980a463c069469a818e71207e02'/>
<id>8211a26324667980a463c069469a818e71207e02</id>
<content type='text'>
ipip_changelink() operates on at most two netns, dev_net(dev) and the
tunnel link netns t-&gt;net. They differ once the device is created in or
moved to a netns other than the one the request runs in. The rtnl
changelink path checks CAP_NET_ADMIN only against dev_net(dev), so a
caller privileged there but not in t-&gt;net can rewrite a tunnel that
lives in t-&gt;net.

Gate ipip_changelink() on rtnl_dev_link_net_capable() at its top,
before any attribute is parsed.

Reported-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/CABAhCOSzP1vaThGV35_VnsRCb=87_CPjPVsTHbq905k8A+BuUg@mail.gmail.com/
Fixes: 6c742e714d8c ("ipip: add x-netns support")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie &lt;maoyixie.tju@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260612085941.3158249-3-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ipip_changelink() operates on at most two netns, dev_net(dev) and the
tunnel link netns t-&gt;net. They differ once the device is created in or
moved to a netns other than the one the request runs in. The rtnl
changelink path checks CAP_NET_ADMIN only against dev_net(dev), so a
caller privileged there but not in t-&gt;net can rewrite a tunnel that
lives in t-&gt;net.

Gate ipip_changelink() on rtnl_dev_link_net_capable() at its top,
before any attribute is parsed.

Reported-by: Xiao Liang &lt;shaw.leon@gmail.com&gt;
Closes: https://lore.kernel.org/netdev/CABAhCOSzP1vaThGV35_VnsRCb=87_CPjPVsTHbq905k8A+BuUg@mail.gmail.com/
Fixes: 6c742e714d8c ("ipip: add x-netns support")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie &lt;maoyixie.tju@gmail.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20260612085941.3158249-3-maoyixie.tju@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
