linux.git/net/netlink, branch v6.11

net: netlink: remove the cb_mutex "injection" from netlink core

2024-06-10T12:15:40+00:00

Back in 2007, in commit af65bdfce98d ("[NETLINK]: Switch cb_lock spinlock
to mutex and allow to override it") netlink core was extended to allow
subsystems to replace the dump mutex lock with its own lock.

The mechanism was used by rtnetlink to take rtnl_lock but it isn't
sufficiently flexible for other users. Over the 17 years since
it was added no other user appeared. Since rtnetlink needs conditional
locking now, and doesn't use it either, axe this feature complete.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Kuniyuki Iwashima 
Signed-off-by: David S. Miller

rtnetlink: move rtnl_lock handling out of af_netlink

2024-06-10T12:15:40+00:00

Now that we have an intermediate layer of code for handling
rtnl-level netlink dump quirks, we can move the rtnl_lock
taking there.

For dump handlers with RTNL_FLAG_DUMP_SPLIT_NLM_DONE we can
avoid taking rtnl_lock just to generate NLM_DONE, once again.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Kuniyuki Iwashima 
Reviewed-by: Eric Dumazet 
Signed-off-by: David S. Miller

netlink: support all extack types in dumps

2024-04-23T17:09:49+00:00

Note that when this commit message refers to netlink dump
it only means the actual dumping part, the parsing / dump
start is handled by the same code as "doit".

Commit 4a19edb60d02 ("netlink: Pass extack to dump handlers")
added support for returning extack messages from dump handlers,
but left out other extack info, e.g. bad attribute.

This used to be fine because until YNL we had little practical
use for the machine readable attributes, and only messages were
used in practice.

YNL flips the preference 180 degrees, it's now much more useful
to point to a bad attr with NL_SET_BAD_ATTR() than type
an English message saying "attribute XYZ is $reason-why-bad".

Support all of extack. The fact that extack only gets added if
it fits remains unaddressed.

Reviewed-by: David Ahern 
Link: https://lore.kernel.org/r/20240420023543.3300306-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski

netlink: move extack writing helpers

2024-04-23T17:09:49+00:00

Next change will need them in netlink_dump_done(), pure move.

Reviewed-by: David Ahern 
Link: https://lore.kernel.org/r/20240420023543.3300306-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski

netlink: create a new header for internal genetlink symbols

2024-04-02T04:44:34+00:00

There are things in linux/genetlink.h which are only used
under net/netlink/. Move them to a new local header.
A new header with just 2 externs isn't great, but alternative
would be to include af_netlink.h in genetlink.c which feels
even worse.

Link: https://lore.kernel.org/r/20240329175710.291749-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski

net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID

2024-03-11T22:48:34+00:00

Currently getsockopt does not support NETLINK_LISTEN_ALL_NSID,
and we are unable to get the value of NETLINK_LISTEN_ALL_NSID
socket option through getsockopt.

This patch adds getsockopt support for NETLINK_LISTEN_ALL_NSID.

Signed-off-by: Juntong Deng 
Link: https://lore.kernel.org/r/AM6PR03MB58482322B7B335308DA56FE599272@AM6PR03MB5848.eurprd03.prod.outlook.com
Signed-off-by: Jakub Kicinski

genetlink: fit NLMSG_DONE into same read() as families

2024-03-06T08:07:45+00:00

Make sure ctrl_fill_info() returns sensible error codes and
propagate them out to netlink core. Let netlink core decide
when to return skb->len and when to treat the exit as an
error. Netlink core does better job at it, if we always
return skb->len the core doesn't know when we're done
dumping and NLMSG_DONE ends up in a separate read().

Reviewed-by: Eric Dumazet 
Signed-off-by: Jakub Kicinski 
Reviewed-by: Ido Schimmel 
Signed-off-by: David S. Miller

netlink: handle EMSGSIZE errors in the core

2024-03-06T08:07:44+00:00

Eric points out that our current suggested way of handling
EMSGSIZE errors ((err == -EMSGSIZE) ? skb->len : err) will
break if we didn't fit even a single object into the buffer
provided by the user. This should not happen for well behaved
applications, but we can fix that, and free netlink families
from dealing with that completely by moving error handling
into the core.

Let's assume from now on that all EMSGSIZE errors in dumps are
because we run out of skb space. Families can now propagate
the error nla_put_*() etc generated and not worry about any
return value magic. If some family really wants to send EMSGSIZE
to user space, assuming it generates the same error on the next
dump iteration the skb->len should be 0, and user space should
still see the EMSGSIZE.

This should simplify families and prevent mistakes in return
values which lead to DONE being forced into a separate recv()
call as discovered by Ido some time ago.

Reviewed-by: Eric Dumazet 
Signed-off-by: Jakub Kicinski 
Reviewed-by: Ido Schimmel 
Signed-off-by: David S. Miller

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

2024-02-29T22:24:56+00:00

Cross-merge networking fixes after downstream PR.

Conflicts:

net/mptcp/protocol.c
  adf1bb78dab5 ("mptcp: fix snd_wnd initialization for passive socket")
  9426ce476a70 ("mptcp: annotate lockless access for RX path fields")
https://lore.kernel.org/all/20240228103048.19255709@canb.auug.org.au/

Adjacent changes:

drivers/dpll/dpll_core.c
  0d60d8df6f49 ("dpll: rely on rcu for netdev_dpll_pin()")
  e7f8df0e81bf ("dpll: move xa_erase() call in to match dpll_pin_alloc() error path order")

drivers/net/veth.c
  1ce7d306ea63 ("veth: try harder when allocating queue memory")
  0bef512012b1 ("net: add netdev_lockdep_set_classes() to virtual drivers")

drivers/net/wireless/intel/iwlwifi/mvm/d3.c
  8c9bef26e98b ("wifi: iwlwifi: mvm: d3: implement suspend with MLO")
  78f65fbf421a ("wifi: iwlwifi: mvm: ensure offloading TID queue exists")

net/wireless/nl80211.c
  f78c1375339a ("wifi: nl80211: reject iftype change with mesh ID change")
  414532d8aa89 ("wifi: cfg80211: use IEEE80211_MAX_MESH_ID_LEN appropriately")

Signed-off-by: Jakub Kicinski

netlink: use kvmalloc() in netlink_alloc_large_skb()

2024-02-27T16:31:28+00:00

This is a followup of commit 234ec0b6034b ("netlink: fix potential
sleeping issue in mqueue_flush_file"), because vfree_atomic()
overhead is unfortunate for medium sized allocations.

1) If the allocation is smaller than PAGE_SIZE, do not bother
   with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
   while NLMSG_GOODSIZE is smaller than 8KB.

2) Use kvmalloc(), which might allocate one high order page
   instead of vmalloc if memory is not too fragmented.

Signed-off-by: Eric Dumazet 
Reviewed-by: Zhengchao Shao 
Link: https://lore.kernel.org/r/20240224090630.605917-1-edumazet@google.com
Signed-off-by: Jakub Kicinski