<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/openvswitch, branch v7.2-rc1</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>openvswitch: conntrack: annotate ct limit hlist traversal</title>
<updated>2026-06-25T15:38:00+00:00</updated>
<author>
<name>Runyu Xiao</name>
<email>runyu.xiao@seu.edu.cn</email>
</author>
<published>2026-06-24T15:01:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0e901ee5c6f9e1a382099cd7dbee1360c80c441c'/>
<id>0e901ee5c6f9e1a382099cd7dbee1360c80c441c</id>
<content type='text'>
ct_limit_set() is documented as being called with ovs_mutex held. It
walks the ct limit hlist with hlist_for_each_entry_rcu(), but the
iterator does not currently pass the OVS lockdep condition used
elsewhere for RCU-protected OVS objects.

Pass lockdep_ovsl_is_held() to the iterator. This matches the function's
existing caller contract and lets CONFIG_PROVE_RCU_LIST distinguish the
ovs_mutex-protected update path from the RCU read-side ct_limit_get()
path.

This was found by our static analysis tool and then manually reviewed
against the current tree. In the reviewed CONFIG_PROVE_RCU_LIST triage
run, the writer-side ct limit update produced the expected "RCU-list
traversed in non-reader section!!" warning while ovs_mutex was held,
with the stack matching ct_limit_set() and ovs_ct_limit_set_zone_limit().
The change is limited to documenting the existing protection contract.

This is a lockdep annotation cleanup. It does not change the conntrack
limit list update or release behavior.

Signed-off-by: Runyu Xiao &lt;runyu.xiao@seu.edu.cn&gt;
Reviewed-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260624150149.3510541-1-runyu.xiao@seu.edu.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ct_limit_set() is documented as being called with ovs_mutex held. It
walks the ct limit hlist with hlist_for_each_entry_rcu(), but the
iterator does not currently pass the OVS lockdep condition used
elsewhere for RCU-protected OVS objects.

Pass lockdep_ovsl_is_held() to the iterator. This matches the function's
existing caller contract and lets CONFIG_PROVE_RCU_LIST distinguish the
ovs_mutex-protected update path from the RCU read-side ct_limit_get()
path.

This was found by our static analysis tool and then manually reviewed
against the current tree. In the reviewed CONFIG_PROVE_RCU_LIST triage
run, the writer-side ct limit update produced the expected "RCU-list
traversed in non-reader section!!" warning while ovs_mutex was held,
with the stack matching ct_limit_set() and ovs_ct_limit_set_zone_limit().
The change is limited to documenting the existing protection contract.

This is a lockdep annotation cleanup. It does not change the conntrack
limit list update or release behavior.

Signed-off-by: Runyu Xiao &lt;runyu.xiao@seu.edu.cn&gt;
Reviewed-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260624150149.3510541-1-runyu.xiao@seu.edu.cn
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'nf-next-26-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next</title>
<updated>2026-06-15T21:09:57+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2026-06-15T21:09:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eaf398831e35dbda6b52e46cac36bdfbcb7cd2b5'/>
<id>eaf398831e35dbda6b52e46cac36bdfbcb7cd2b5</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next.
More specifically, this contains conncount rework to address AI related
reports, assorted Netfiter updates and two small incremental updates on
IPVS:

1) Replace old obsolete workqueues (system_wq, system_unbound_wq)
   in IPVS, from Marco Crivellari.

2) Replace WARN_ON{_ONCE} by DEBUG_NET_WARN_ON_ONCE in nf_tables.
   In the recent years, reporters say that the use of WARN_ON{_ONCE}
   in conjunction with panic_on_warn=1 results in DoS. Let's replace
   it by DEBUG_NET_WARN_ON_ONCE so this is only exercised by test
   infrastructure and fuzzers, while also providing context to AI
   agents. From Fernando F. Mancera.

Five patches from Florian Westphal to address AI reports in the conncount
infrastructures:

3) Fix missing rcu read lock section when calling
   __ovs_ct_limit_get_zone_limit().

4) Add a dedicate lock per rbtree tree, this increases memory
   usage but it should improve scalability.

5) Add a helper function to find the rbtree node, no functional
   changes are intented.

6) Add sequence counter to detect concurrent tree modifications
   and retry lookups.

7) Add locks to GC conncount walk and address other nitpicks.

Then, several assorted updates:

8) Defensive Tree-wide addition of NULL checks for ct extensions.

9) Bail out if flowtable bypass cannot be fully set up from the
   flow offload expression, instead of lazy building a likely
   incomplete one.

10) Fix documentation for the new conn_max sysctl toggle in IPVS.

11) Add nf_dev_xmit_recursion*() helpers and use them, to address
    recent AI reports.

* tag 'nf-next-26-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_dup_netdev: add nf_dev_xmit_recursion*() helpers and use them
  ipvs: fix doc syntax for conn_max sysctl
  netfilter: flowtable: bail out if forward path cannot be discovered
  netfilter: conntrack: check NULL when retrieving ct extension
  netfilter: nf_conncount: gc and rcu fixes
  netfilter: nf_conncount: add sequence counter to detect tree modifications
  netfilter: nf_conncount: split count_tree_node rbtree walk into helper
  netfilter: nf_conncount: use per nf_conncount_data spinlocks
  netfilter: nf_conncount: callers must hold rcu read lock
  netfilter: nf_tables: use DEBUG_NET_WARN_ON_ONCE in packet and control paths
  ipvs: Replace use of system_unbound_wq with system_dfl_long_wq
====================

Link: https://patch.msgid.link/20260614114605.474783-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next.
More specifically, this contains conncount rework to address AI related
reports, assorted Netfiter updates and two small incremental updates on
IPVS:

1) Replace old obsolete workqueues (system_wq, system_unbound_wq)
   in IPVS, from Marco Crivellari.

2) Replace WARN_ON{_ONCE} by DEBUG_NET_WARN_ON_ONCE in nf_tables.
   In the recent years, reporters say that the use of WARN_ON{_ONCE}
   in conjunction with panic_on_warn=1 results in DoS. Let's replace
   it by DEBUG_NET_WARN_ON_ONCE so this is only exercised by test
   infrastructure and fuzzers, while also providing context to AI
   agents. From Fernando F. Mancera.

Five patches from Florian Westphal to address AI reports in the conncount
infrastructures:

3) Fix missing rcu read lock section when calling
   __ovs_ct_limit_get_zone_limit().

4) Add a dedicate lock per rbtree tree, this increases memory
   usage but it should improve scalability.

5) Add a helper function to find the rbtree node, no functional
   changes are intented.

6) Add sequence counter to detect concurrent tree modifications
   and retry lookups.

7) Add locks to GC conncount walk and address other nitpicks.

Then, several assorted updates:

8) Defensive Tree-wide addition of NULL checks for ct extensions.

9) Bail out if flowtable bypass cannot be fully set up from the
   flow offload expression, instead of lazy building a likely
   incomplete one.

10) Fix documentation for the new conn_max sysctl toggle in IPVS.

11) Add nf_dev_xmit_recursion*() helpers and use them, to address
    recent AI reports.

* tag 'nf-next-26-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_dup_netdev: add nf_dev_xmit_recursion*() helpers and use them
  ipvs: fix doc syntax for conn_max sysctl
  netfilter: flowtable: bail out if forward path cannot be discovered
  netfilter: conntrack: check NULL when retrieving ct extension
  netfilter: nf_conncount: gc and rcu fixes
  netfilter: nf_conncount: add sequence counter to detect tree modifications
  netfilter: nf_conncount: split count_tree_node rbtree walk into helper
  netfilter: nf_conncount: use per nf_conncount_data spinlocks
  netfilter: nf_conncount: callers must hold rcu read lock
  netfilter: nf_tables: use DEBUG_NET_WARN_ON_ONCE in packet and control paths
  ipvs: Replace use of system_unbound_wq with system_dfl_long_wq
====================

Link: https://patch.msgid.link/20260614114605.474783-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_conncount: callers must hold rcu read lock</title>
<updated>2026-06-14T10:51:50+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2026-06-05T13:11:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=64d7d5abe2160bba369b4a8f06bdf5630573bab0'/>
<id>64d7d5abe2160bba369b4a8f06bdf5630573bab0</id>
<content type='text'>
rcu_derefence_raw() should not have been used here, it concealed this bug.
Its used because struct rb_node lacks __rcu annotated pointers, so plain
rcu_derefence causes sparse warnings.

The major tradeoff is that rcu_derefence_raw() doesn't warn when the caller
isn't in a rcu read section.

Extend the rcu read lock scope accordingly and cause sparse warnings,
those warnings are the lesser evil.

Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Closes: https://sashiko.dev/#/patchset/20260603230610.7900-1-fw%40strlen.de
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
rcu_derefence_raw() should not have been used here, it concealed this bug.
Its used because struct rb_node lacks __rcu annotated pointers, so plain
rcu_derefence causes sparse warnings.

The major tradeoff is that rcu_derefence_raw() doesn't warn when the caller
isn't in a rcu read section.

Extend the rcu read lock scope accordingly and cause sparse warnings,
those warnings are the lesser evil.

Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Closes: https://sashiko.dev/#/patchset/20260603230610.7900-1-fw%40strlen.de
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: openvswitch: fix possible kfree_skb of ERR_PTR</title>
<updated>2026-06-09T03:13:02+00:00</updated>
<author>
<name>Adrian Moreno</name>
<email>amorenoz@redhat.com</email>
</author>
<published>2026-06-04T12:19:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee30dd2909d8b98619f4341c70ec8dc8e155ab02'/>
<id>ee30dd2909d8b98619f4341c70ec8dc8e155ab02</id>
<content type='text'>
After the patch in the "Fixes" tag, the allocation of the "reply" skb
can happen either before or after locking the ovs_mutex.

However, error cleanups still follow the classical reversed order,
assuming "reply" is allocated before locking: it is freed after unlocking.

If "reply" allocation happens after locking the mutex and it fails,
"reply" is left with an ERR_PTR, and execution jumps to the correspondent
cleanup stage which will try to free an invalid pointer.

Fix this by setting the pointer to NULL after having saved its error
value.

Fixes: 893f139b9a6c ("openvswitch: Minimize ovs_flow_cmd_new|set critical sections.")
Signed-off-by: Adrian Moreno &lt;amorenoz@redhat.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260604121946.942164-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the patch in the "Fixes" tag, the allocation of the "reply" skb
can happen either before or after locking the ovs_mutex.

However, error cleanups still follow the classical reversed order,
assuming "reply" is allocated before locking: it is freed after unlocking.

If "reply" allocation happens after locking the mutex and it fails,
"reply" is left with an ERR_PTR, and execution jumps to the correspondent
cleanup stage which will try to free an invalid pointer.

Fix this by setting the pointer to NULL after having saved its error
value.

Fixes: 893f139b9a6c ("openvswitch: Minimize ovs_flow_cmd_new|set critical sections.")
Signed-off-by: Adrian Moreno &lt;amorenoz@redhat.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260604121946.942164-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: vport: fix race between linking and the device notifier</title>
<updated>2026-05-18T23:38:45+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2026-05-14T18:46:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bae3ee802c21e83ad1eb805519e6f32ea528b4d2'/>
<id>bae3ee802c21e83ad1eb805519e6f32ea528b4d2</id>
<content type='text'>
Sashiko reports that it is technically possible that we got the device
reference, but by the time we're linking it to the OVS datapath, it
may be already in the process of being deleted.  In this case if the
notifier wins the race for RTNL, it will see that the device is not
yet in the OVS datapath (ovs_netdev_get_vport() will fail in the
dp_device_event()) and will do nothing.  Then the ovs_netdev_link()
will take the RTNL and link the unregistering device to OVS datapath.

Eventually, netdev_wait_allrefs_any() will re-broadcast the event and
the device will be properly detached, but it will take at least a
second before that happens, so it's not something we should rely on.

Let's avoid linking the non-registered device in the first place.

Note: As per documentation, RTNL doesn't protect the reg_state, but
it actually does for all the state transitions we care about here,
so it should not be necessary to use READ_ONCE or taking the instance
lock.  We can still do that, but we have a few more places even in
this file where the reg_state is accessed without those while under
RTNL, and many more places like this across the kernel code, so it
might make more sense to change all of them in a more centralized
fashion in the future, if necessary.

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260514184702.2461435-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Sashiko reports that it is technically possible that we got the device
reference, but by the time we're linking it to the OVS datapath, it
may be already in the process of being deleted.  In this case if the
notifier wins the race for RTNL, it will see that the device is not
yet in the OVS datapath (ovs_netdev_get_vport() will fail in the
dp_device_event()) and will do nothing.  Then the ovs_netdev_link()
will take the RTNL and link the unregistering device to OVS datapath.

Eventually, netdev_wait_allrefs_any() will re-broadcast the event and
the device will be properly detached, but it will take at least a
second before that happens, so it's not something we should rely on.

Let's avoid linking the non-registered device in the first place.

Note: As per documentation, RTNL doesn't protect the reg_state, but
it actually does for all the state transitions we care about here,
so it should not be necessary to use READ_ONCE or taking the instance
lock.  We can still do that, but we have a few more places even in
this file where the reg_state is accessed without those while under
RTNL, and many more places like this across the kernel code, so it
might make more sense to change all of them in a more centralized
fashion in the future, if necessary.

Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260514184702.2461435-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: vport: fix self-deadlock on release of tunnel ports</title>
<updated>2026-05-05T13:19:37+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2026-04-30T23:38:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=aa69918bd418e700309fdd08509dba324fb24296'/>
<id>aa69918bd418e700309fdd08509dba324fb24296</id>
<content type='text'>
vports are used concurrently and protected by RCU, so netdev_put()
must happen after the RCU grace period.  So, either in an RCU call or
after the synchronize_net().  The rtnl_delete_link() must happen under
RTNL and so can't be executed in RCU context.  Calling synchronize_net()
while holding RTNL is not a good idea for performance and system
stability under load in general, so calling netdev_put() in RCU call
is the right solution here.

However,
when the device is deleted, rtnl_unlock() will call netdev_run_todo()
and block until all the references are gone.  In the current code this
means that we never reach the call_rcu() and the vport is never freed
and the reference is never released, causing a self-deadlock on device
removal.

Fix that by moving the rcu_call() before the rtnl_unlock(), so the
scheduled RCU callback will be executed when synchronize_net() is
called from the rtnl_unlock()-&gt;netdev_run_todo() while the RTNL itself
is already released.

Fixes: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release")
Cc: stable@vger.kernel.org
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20260430233848.440994-2-i.maximets@ovn.org
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vports are used concurrently and protected by RCU, so netdev_put()
must happen after the RCU grace period.  So, either in an RCU call or
after the synchronize_net().  The rtnl_delete_link() must happen under
RTNL and so can't be executed in RCU context.  Calling synchronize_net()
while holding RTNL is not a good idea for performance and system
stability under load in general, so calling netdev_put() in RCU call
is the right solution here.

However,
when the device is deleted, rtnl_unlock() will call netdev_run_todo()
and block until all the references are gone.  In the current code this
means that we never reach the call_rcu() and the vport is never freed
and the reference is never released, causing a self-deadlock on device
removal.

Fix that by moving the rcu_call() before the rtnl_unlock(), so the
scheduled RCU callback will be executed when synchronize_net() is
called from the rtnl_unlock()-&gt;netdev_run_todo() while the RTNL itself
is already released.

Fixes: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release")
Cc: stable@vger.kernel.org
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://patch.msgid.link/20260430233848.440994-2-i.maximets@ovn.org
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: vport: fix race between tunnel creation and linking</title>
<updated>2026-05-05T13:14:33+00:00</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2026-04-30T21:32:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=83861c48ba122f85cc8384780764b3a791341678'/>
<id>83861c48ba122f85cc8384780764b3a791341678</id>
<content type='text'>
When a tunnel vport is created it first creates the tunnel device, e.g.,
with geneve_dev_create_fb(), then it calls ovs_netdev_link() to take a
reference and link it to the device that represents openvswitch datapath.

The creation of the device is happening under RTNL, but then RTNL is
released and re-acquired to find the device by name.  It is technically
possible for the tunnel device to be re-named or deleted within that
window while RTNL is not held, and some other device created in its
place.  This will cause a non-tunnel device to be referenced in the
vport and tunnel-specific functions used on it, e.g. vxlan_get_options()
that directly casts the private netdev data into a struct vxlan_dev
causing an invalid memory access:

 BUG: KASAN: slab-use-after-free in vxlan_get_options+0x323/0x3a0
  vxlan_get_options+0x323/0x3a0
  ovs_vport_cmd_new+0x6e3/0xd30

Fix that by taking a reference to the just created device before
releasing RTNL.  This ensures that the device in the vport is always
the one that was just created.  The search by name is only needed
for a standard vport-netdev that links pre-existing devices, so that
functionality and device type checks are moved to netdev_create().

It is also awkward that ovs_netdev_link() takes ownership of the vport
and destroys it on failure.  It doesn't know the type of the port it is
dealing with, so we need to pass down the indicator that it's a tunnel,
so the link can be properly deleted on failure.

It's possible to refactor the logic to make the ovs_netdev_link() do
only the linking part and let the callers perform a proper destruction,
but it will be much more code for each legacy tunnel port type, so it
is not worth it for the bug fix.

Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
Reported-by: Yuan Tan &lt;tanyuan98@outlook.com&gt;
Reported-by: Yifan Wu &lt;yifanwucs@gmail.com&gt;
Reported-by: Juefei Pu &lt;tomapufckgml@gmail.com&gt;
Reported-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Reported-by: Yang Yang &lt;n05ec@lzu.edu.cn&gt;
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260430213349.407991-1-i.maximets@ovn.org
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a tunnel vport is created it first creates the tunnel device, e.g.,
with geneve_dev_create_fb(), then it calls ovs_netdev_link() to take a
reference and link it to the device that represents openvswitch datapath.

The creation of the device is happening under RTNL, but then RTNL is
released and re-acquired to find the device by name.  It is technically
possible for the tunnel device to be re-named or deleted within that
window while RTNL is not held, and some other device created in its
place.  This will cause a non-tunnel device to be referenced in the
vport and tunnel-specific functions used on it, e.g. vxlan_get_options()
that directly casts the private netdev data into a struct vxlan_dev
causing an invalid memory access:

 BUG: KASAN: slab-use-after-free in vxlan_get_options+0x323/0x3a0
  vxlan_get_options+0x323/0x3a0
  ovs_vport_cmd_new+0x6e3/0xd30

Fix that by taking a reference to the just created device before
releasing RTNL.  This ensures that the device in the vport is always
the one that was just created.  The search by name is only needed
for a standard vport-netdev that links pre-existing devices, so that
functionality and device type checks are moved to netdev_create().

It is also awkward that ovs_netdev_link() takes ownership of the vport
and destroys it on failure.  It doesn't know the type of the port it is
dealing with, so we need to pass down the indicator that it's a tunnel,
so the link can be properly deleted on failure.

It's possible to refactor the logic to make the ovs_netdev_link() do
only the linking part and let the callers perform a proper destruction,
but it will be much more code for each legacy tunnel port type, so it
is not worth it for the bug fix.

Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device")
Reported-by: Yuan Tan &lt;tanyuan98@outlook.com&gt;
Reported-by: Yifan Wu &lt;yifanwucs@gmail.com&gt;
Reported-by: Juefei Pu &lt;tomapufckgml@gmail.com&gt;
Reported-by: Xin Liu &lt;bird@lzu.edu.cn&gt;
Reported-by: Yang Yang &lt;n05ec@lzu.edu.cn&gt;
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://patch.msgid.link/20260430213349.407991-1-i.maximets@ovn.org
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>openvswitch: cap upcall PID array size and pre-size vport replies</title>
<updated>2026-04-20T18:43:04+00:00</updated>
<author>
<name>Weiming Shi</name>
<email>bestswngs@gmail.com</email>
</author>
<published>2026-04-16T02:46:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2091c6aa0df6aba47deb5c8ab232b1cb60af3519'/>
<id>2091c6aa0df6aba47deb5c8ab232b1cb60af3519</id>
<content type='text'>
The vport netlink reply helpers allocate a fixed-size skb with
nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID
array via ovs_vport_get_upcall_portids().  Since
ovs_vport_set_upcall_portids() accepts any non-zero multiple of
sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID
array large enough to overflow the reply buffer, causing nla_put() to
fail with -EMSGSIZE and hitting BUG_ON(err &lt; 0).  On systems with
unprivileged user namespaces enabled (e.g., Ubuntu default), this is
reachable via unshare -Urn since OVS vport mutation operations use
GENL_UNS_ADMIN_PERM.

 kernel BUG at net/openvswitch/datapath.c:2414!
 Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
 CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1
 RIP: 0010:ovs_vport_cmd_set+0x34c/0x400
 Call Trace:
  &lt;TASK&gt;
  genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116)
  genl_rcv_msg (net/netlink/genetlink.c:1194)
  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
  genl_rcv (net/netlink/genetlink.c:1219)
  netlink_unicast (net/netlink/af_netlink.c:1344)
  netlink_sendmsg (net/netlink/af_netlink.c:1894)
  __sys_sendto (net/socket.c:2206)
  __x64_sys_sendto (net/socket.c:2209)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  &lt;/TASK&gt;
 Kernel panic - not syncing: Fatal exception

Reject attempts to set more PIDs than nr_cpu_ids in
ovs_vport_set_upcall_portids(), and pre-compute the worst-case reply
size in ovs_vport_cmd_msg_size() based on that bound, similar to the
existing ovs_dp_cmd_msg_size().  nr_cpu_ids matches the cap already
used by the per-CPU dispatch configuration on the datapath side
(ovs_dp_cmd_fill_info() serialises at most nr_cpu_ids PIDs), so the
two sides stay consistent.

Fixes: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.")
Reported-by: Xiang Mei &lt;xmei5@asu.edu&gt;
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Weiming Shi &lt;bestswngs@gmail.com&gt;
Reviewed-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Link: https://patch.msgid.link/20260416024653.153456-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The vport netlink reply helpers allocate a fixed-size skb with
nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID
array via ovs_vport_get_upcall_portids().  Since
ovs_vport_set_upcall_portids() accepts any non-zero multiple of
sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID
array large enough to overflow the reply buffer, causing nla_put() to
fail with -EMSGSIZE and hitting BUG_ON(err &lt; 0).  On systems with
unprivileged user namespaces enabled (e.g., Ubuntu default), this is
reachable via unshare -Urn since OVS vport mutation operations use
GENL_UNS_ADMIN_PERM.

 kernel BUG at net/openvswitch/datapath.c:2414!
 Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
 CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1
 RIP: 0010:ovs_vport_cmd_set+0x34c/0x400
 Call Trace:
  &lt;TASK&gt;
  genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116)
  genl_rcv_msg (net/netlink/genetlink.c:1194)
  netlink_rcv_skb (net/netlink/af_netlink.c:2550)
  genl_rcv (net/netlink/genetlink.c:1219)
  netlink_unicast (net/netlink/af_netlink.c:1344)
  netlink_sendmsg (net/netlink/af_netlink.c:1894)
  __sys_sendto (net/socket.c:2206)
  __x64_sys_sendto (net/socket.c:2209)
  do_syscall_64 (arch/x86/entry/syscall_64.c:63)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
  &lt;/TASK&gt;
 Kernel panic - not syncing: Fatal exception

Reject attempts to set more PIDs than nr_cpu_ids in
ovs_vport_set_upcall_portids(), and pre-compute the worst-case reply
size in ovs_vport_cmd_msg_size() based on that bound, similar to the
existing ovs_dp_cmd_msg_size().  nr_cpu_ids matches the cap already
used by the per-CPU dispatch configuration on the datapath side
(ovs_dp_cmd_fill_info() serialises at most nr_cpu_ids PIDs), so the
two sides stay consistent.

Fixes: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.")
Reported-by: Xiang Mei &lt;xmei5@asu.edu&gt;
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Weiming Shi &lt;bestswngs@gmail.com&gt;
Reviewed-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Link: https://patch.msgid.link/20260416024653.153456-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: use get_random_u{16,32,64}() where appropriate</title>
<updated>2026-04-10T02:27:43+00:00</updated>
<author>
<name>David Carlier</name>
<email>devnexen@gmail.com</email>
</author>
<published>2026-04-07T15:07:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9addea5d44b69d377ba97a36f7a19e1097969e18'/>
<id>9addea5d44b69d377ba97a36f7a19e1097969e18</id>
<content type='text'>
Use the typed random integer helpers instead of
get_random_bytes() when filling a single integer variable.
The helpers return the value directly, require no pointer
or size argument, and better express intent.

Skipped sites writing into __be16 (netdevsim) and __le64
(ceph) fields where a direct assignment would trigger
sparse endianness warnings.

Signed-off-by: David Carlier &lt;devnexen@gmail.com&gt;
Reviewed-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260407150758.5889-1-devnexen@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the typed random integer helpers instead of
get_random_bytes() when filling a single integer variable.
The helpers return the value directly, require no pointer
or size argument, and better express intent.

Skipped sites writing into __be16 (netdevsim) and __le64
(ceph) fields where a direct assignment would trigger
sparse endianness warnings.

Signed-off-by: David Carlier &lt;devnexen@gmail.com&gt;
Reviewed-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://patch.msgid.link/20260407150758.5889-1-devnexen@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: convert remaining ipv6_stub users to direct function calls</title>
<updated>2026-03-29T18:21:23+00:00</updated>
<author>
<name>Fernando Fernandez Mancera</name>
<email>fmancera@suse.de</email>
</author>
<published>2026-03-25T12:08:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d76f6b170a10414a9c674b6db4bd0473b1b30050'/>
<id>d76f6b170a10414a9c674b6db4bd0473b1b30050</id>
<content type='text'>
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.

Convert remaining ipv6_stub users to make direct function calls. The
fallback functions introduced previously will prevent linkage errors
when CONFIG_IPV6 is disabled.

Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Tested-by: Ricardo B. Marlière &lt;rbm@suse.com&gt;
Link: https://patch.msgid.link/20260325120928.15848-9-fmancera@suse.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As IPv6 is built-in only, the ipv6_stub infrastructure is no longer
necessary.

Convert remaining ipv6_stub users to make direct function calls. The
fallback functions introduced previously will prevent linkage errors
when CONFIG_IPV6 is disabled.

Signed-off-by: Fernando Fernandez Mancera &lt;fmancera@suse.de&gt;
Tested-by: Ricardo B. Marlière &lt;rbm@suse.com&gt;
Link: https://patch.msgid.link/20260325120928.15848-9-fmancera@suse.de
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
