<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/Documentation/netlink/specs, branch v6.9</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>netlink: specs: Add missing bridge linkinfo attrs</title>
<updated>2024-05-07T02:06:07+00:00</updated>
<author>
<name>Donald Hunter</name>
<email>donald.hunter@gmail.com</email>
</author>
<published>2024-05-03T16:43:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9adcac6506185dd1a727f1784b89f30cd217ef7e'/>
<id>9adcac6506185dd1a727f1784b89f30cd217ef7e</id>
<content type='text'>
Attributes for FDB learned entries were added to the if_link netlink api
for bridge linkinfo but are missing from the rt_link.yaml spec. Add the
missing attributes to the spec.

Fixes: ddd1ad68826d ("net: bridge: Add netlink knobs for number / max learned FDB entries")
Signed-off-by: Donald Hunter &lt;donald.hunter@gmail.com&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20240503164304.87427-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Attributes for FDB learned entries were added to the if_link netlink api
for bridge linkinfo but are missing from the rt_link.yaml spec. Add the
missing attributes to the spec.

Fixes: ddd1ad68826d ("net: bridge: Add netlink knobs for number / max learned FDB entries")
Signed-off-by: Donald Hunter &lt;donald.hunter@gmail.com&gt;
Acked-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20240503164304.87427-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2024-03-12T03:38:36+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2024-03-12T03:37:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ed1f164038b50c5864aa85389f3ffd456f050cca'/>
<id>ed1f164038b50c5864aa85389f3ffd456f050cca</id>
<content type='text'>
Merge in late fixes to prepare for the 6.9 net-next PR.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Merge in late fixes to prepare for the 6.9 net-next PR.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>devlink: Fix length of eswitch inline-mode</title>
<updated>2024-03-11T20:13:53+00:00</updated>
<author>
<name>William Tu</name>
<email>witu@nvidia.com</email>
</author>
<published>2024-03-10T16:45:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8f4cd89bf10607de08231d6d91a73dd63336808e'/>
<id>8f4cd89bf10607de08231d6d91a73dd63336808e</id>
<content type='text'>
Set eswitch inline-mode to be u8, not u16. Otherwise, errors below

$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev \
  inline-mode network
    Error: Attribute failed policy validation.
    kernel answers: Numerical result out of rang
    netlink: 'devlink': attribute type 26 has an invalid length.

Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Signed-off-by: William Tu &lt;witu@nvidia.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://lore.kernel.org/r/20240310164547.35219-1-witu@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set eswitch inline-mode to be u8, not u16. Otherwise, errors below

$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev \
  inline-mode network
    Error: Attribute failed policy validation.
    kernel answers: Numerical result out of rang
    netlink: 'devlink': attribute type 26 has an invalid length.

Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Signed-off-by: William Tu &lt;witu@nvidia.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://lore.kernel.org/r/20240310164547.35219-1-witu@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netdev: add queue stat for alloc failures</title>
<updated>2024-03-08T05:13:26+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2024-03-06T19:55:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=92f8b1f5ca0f157f564e75cef4c63641c172e0f1'/>
<id>92f8b1f5ca0f157f564e75cef4c63641c172e0f1</id>
<content type='text'>
Rx alloc failures are commonly counted by drivers.
Support reporting those via netdev-genl queue stats.

Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Reviewed-by: Amritha Nambiar &lt;amritha.nambiar@intel.com&gt;
Reviewed-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rx alloc failures are commonly counted by drivers.
Support reporting those via netdev-genl queue stats.

Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Reviewed-by: Amritha Nambiar &lt;amritha.nambiar@intel.com&gt;
Reviewed-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Link: https://lore.kernel.org/r/20240306195509.1502746-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netdev: add per-queue statistics</title>
<updated>2024-03-08T05:13:25+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2024-03-06T19:55:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab63a2387cb906d43b72a8effb611bbaecb2d0cd'/>
<id>ab63a2387cb906d43b72a8effb611bbaecb2d0cd</id>
<content type='text'>
The ethtool-nl family does a good job exposing various protocol
related and IEEE/IETF statistics which used to get dumped under
ethtool -S, with creative names. Queue stats don't have a netlink
API, yet, and remain a lion's share of ethtool -S output for new
drivers. Not only is that bad because the names differ driver to
driver but it's also bug-prone. Intuitively drivers try to report
only the stats for active queues, but querying ethtool stats
involves multiple system calls, and the number of stats is
read separately from the stats themselves. Worse still when user
space asks for values of the stats, it doesn't inform the kernel
how big the buffer is. If number of stats increases in the meantime
kernel will overflow user buffer.

Add a netlink API for dumping queue stats. Queue information is
exposed via the netdev-genl family, so add the stats there.
Support per-queue and sum-for-device dumps. Latter will be useful
when subsequent patches add more interesting common stats than
just bytes and packets.

The API does not currently distinguish between HW and SW stats.
The expectation is that the source of the stats will either not
matter much (good packets) or be obvious (skb alloc errors).

Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Reviewed-by: Amritha Nambiar &lt;amritha.nambiar@intel.com&gt;
Reviewed-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The ethtool-nl family does a good job exposing various protocol
related and IEEE/IETF statistics which used to get dumped under
ethtool -S, with creative names. Queue stats don't have a netlink
API, yet, and remain a lion's share of ethtool -S output for new
drivers. Not only is that bad because the names differ driver to
driver but it's also bug-prone. Intuitively drivers try to report
only the stats for active queues, but querying ethtool stats
involves multiple system calls, and the number of stats is
read separately from the stats themselves. Worse still when user
space asks for values of the stats, it doesn't inform the kernel
how big the buffer is. If number of stats increases in the meantime
kernel will overflow user buffer.

Add a netlink API for dumping queue stats. Queue information is
exposed via the netdev-genl family, so add the stats there.
Support per-queue and sum-for-device dumps. Latter will be useful
when subsequent patches add more interesting common stats than
just bytes and packets.

The API does not currently distinguish between HW and SW stats.
The expectation is that the source of the stats will either not
matter much (good packets) or be obvious (skb alloc errors).

Acked-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Reviewed-by: Amritha Nambiar &lt;amritha.nambiar@intel.com&gt;
Reviewed-by: Xuan Zhuo &lt;xuanzhuo@linux.alibaba.com&gt;
Link: https://lore.kernel.org/r/20240306195509.1502746-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dpll: spec: use proper enum for pin capabilities attribute</title>
<updated>2024-03-08T04:51:00+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@nvidia.com</email>
</author>
<published>2024-03-06T12:07:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5c497a64820ef9f10962a9c37607df45d6395fa5'/>
<id>5c497a64820ef9f10962a9c37607df45d6395fa5</id>
<content type='text'>
The enum is defined, however the pin capabilities attribute does
refer to it. Add this missing enum field.

This fixes ynl cli output:

Example current output:
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}'
{'capabilities': 4,
 ...
Example new output:
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}'
{'capabilities': {'state-can-change'},
 ...

Fixes: 3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML")
Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20240306120739.1447621-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The enum is defined, however the pin capabilities attribute does
refer to it. Add this missing enum field.

This fixes ynl cli output:

Example current output:
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}'
{'capabilities': 4,
 ...
Example new output:
$ sudo ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml --do pin-get --json '{"id": 0}'
{'capabilities': {'state-can-change'},
 ...

Fixes: 3badff3a25d8 ("dpll: spec: Add Netlink spec in YAML")
Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Link: https://lore.kernel.org/r/20240306120739.1447621-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>doc/netlink/specs: Add spec for nlctrl netlink family</title>
<updated>2024-03-08T04:28:38+00:00</updated>
<author>
<name>Donald Hunter</name>
<email>donald.hunter@gmail.com</email>
</author>
<published>2024-03-06T23:10:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=768e044a5fd4cb52e8677e8e18477fa46cfc5329'/>
<id>768e044a5fd4cb52e8677e8e18477fa46cfc5329</id>
<content type='text'>
Add a spec for the nlctrl family.

Example usage:

./tools/net/ynl/cli.py \
    --spec Documentation/netlink/specs/nlctrl.yaml \
    --do getfamily --json '{"family-name": "nlctrl"}'

./tools/net/ynl/cli.py \
    --spec Documentation/netlink/specs/nlctrl.yaml \
    --dump getpolicy --json '{"family-name": "nlctrl"}'

Signed-off-by: Donald Hunter &lt;donald.hunter@gmail.com&gt;
Link: https://lore.kernel.org/r/20240306231046.97158-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a spec for the nlctrl family.

Example usage:

./tools/net/ynl/cli.py \
    --spec Documentation/netlink/specs/nlctrl.yaml \
    --do getfamily --json '{"family-name": "nlctrl"}'

./tools/net/ynl/cli.py \
    --spec Documentation/netlink/specs/nlctrl.yaml \
    --dump getpolicy --json '{"family-name": "nlctrl"}'

Signed-off-by: Donald Hunter &lt;donald.hunter@gmail.com&gt;
Link: https://lore.kernel.org/r/20240306231046.97158-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mptcp: add token for get-addr in yaml</title>
<updated>2024-03-04T13:07:45+00:00</updated>
<author>
<name>Geliang Tang</name>
<email>tanggeliang@kylinos.cn</email>
</author>
<published>2024-03-01T18:18:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9e6c88e2f05b05892113fc56fabc652ac1ee7043'/>
<id>9e6c88e2f05b05892113fc56fabc652ac1ee7043</id>
<content type='text'>
This patch adds token parameter together with addr in get-addr section in
mptcp_pm.yaml, then use the following commands to update mptcp_pm_gen.c
and mptcp_pm_gen.h:

./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --source \
        -o net/mptcp/mptcp_pm_gen.c
./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --header \
        -o net/mptcp/mptcp_pm_gen.h

Signed-off-by: Geliang Tang &lt;tanggeliang@kylinos.cn&gt;
Reviewed-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Reviewed-by: Mat Martineau &lt;martineau@kernel.org&gt;
Signed-off-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds token parameter together with addr in get-addr section in
mptcp_pm.yaml, then use the following commands to update mptcp_pm_gen.c
and mptcp_pm_gen.h:

./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --source \
        -o net/mptcp/mptcp_pm_gen.c
./tools/net/ynl/ynl-gen-c.py --mode kernel \
        --spec Documentation/netlink/specs/mptcp_pm.yaml --header \
        -o net/mptcp/mptcp_pm_gen.h

Signed-off-by: Geliang Tang &lt;tanggeliang@kylinos.cn&gt;
Reviewed-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Reviewed-by: Mat Martineau &lt;martineau@kernel.org&gt;
Signed-off-by: Matthieu Baerts (NGI0) &lt;matttbe@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2024-02-16T00:20:04+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2024-02-15T22:01:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=73be9a3aabdd976123e7f05dd20dbcf131347e84'/>
<id>73be9a3aabdd976123e7f05dd20dbcf131347e84</id>
<content type='text'>
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/dev.c
  9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()")
  723de3ebef03 ("net: free altname using an RCU callback")

net/unix/garbage.c
  11498715f266 ("af_unix: Remove io_uring code for GC.")
  25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.")

drivers/net/ethernet/renesas/ravb_main.c
  ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path"
)
  c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")

net/mptcp/protocol.c
  bdd70eb68913 ("mptcp: drop the push_pending field")
  28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/dev.c
  9f30831390ed ("net: add rcu safety to rtnl_prop_list_size()")
  723de3ebef03 ("net: free altname using an RCU callback")

net/unix/garbage.c
  11498715f266 ("af_unix: Remove io_uring code for GC.")
  25236c91b5ab ("af_unix: Fix task hung while purging oob_skb in GC.")

drivers/net/ethernet/renesas/ravb_main.c
  ed4adc07207d ("net: ravb: Count packets instead of descriptors in GbEth RX path"
)
  c2da9408579d ("ravb: Add Rx checksum offload support for GbEth")

net/mptcp/protocol.c
  bdd70eb68913 ("mptcp: drop the push_pending field")
  28e5c1380506 ("mptcp: annotate lockless accesses around read-mostly fields")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dpll: fix possible deadlock during netlink dump operation</title>
<updated>2024-02-09T02:29:21+00:00</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@nvidia.com</email>
</author>
<published>2024-02-07T11:59:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=53c0441dd2c44ee93fddb5473885fd41e4bc2361'/>
<id>53c0441dd2c44ee93fddb5473885fd41e4bc2361</id>
<content type='text'>
Recently, I've been hitting following deadlock warning during dpll pin
dump:

[52804.637962] ======================================================
[52804.638536] WARNING: possible circular locking dependency detected
[52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
[52804.639529] ------------------------------------------------------
[52804.640104] python3/2984 is trying to acquire lock:
[52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
[52804.641417]
               but task is already holding lock:
[52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
[52804.642747]
               which lock already depends on the new lock.

[52804.643551]
               the existing dependency chain (in reverse order) is:
[52804.644259]
               -&gt; #1 (dpll_lock){+.+.}-{3:3}:
[52804.644836]        lock_acquire+0x174/0x3e0
[52804.645271]        __mutex_lock+0x119/0x1150
[52804.645723]        dpll_lock_dumpit+0x13/0x20
[52804.646169]        genl_start+0x266/0x320
[52804.646578]        __netlink_dump_start+0x321/0x450
[52804.647056]        genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.647575]        genl_rcv_msg+0x1ed/0x3b0
[52804.648001]        netlink_rcv_skb+0xdc/0x210
[52804.648440]        genl_rcv+0x24/0x40
[52804.648831]        netlink_unicast+0x2f1/0x490
[52804.649290]        netlink_sendmsg+0x36d/0x660
[52804.649742]        __sock_sendmsg+0x73/0xc0
[52804.650165]        __sys_sendto+0x184/0x210
[52804.650597]        __x64_sys_sendto+0x72/0x80
[52804.651045]        do_syscall_64+0x6f/0x140
[52804.651474]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.652001]
               -&gt; #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
[52804.652650]        check_prev_add+0x1ae/0x1280
[52804.653107]        __lock_acquire+0x1ed3/0x29a0
[52804.653559]        lock_acquire+0x174/0x3e0
[52804.653984]        __mutex_lock+0x119/0x1150
[52804.654423]        netlink_dump+0xb3/0x780
[52804.654845]        __netlink_dump_start+0x389/0x450
[52804.655321]        genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.655842]        genl_rcv_msg+0x1ed/0x3b0
[52804.656272]        netlink_rcv_skb+0xdc/0x210
[52804.656721]        genl_rcv+0x24/0x40
[52804.657119]        netlink_unicast+0x2f1/0x490
[52804.657570]        netlink_sendmsg+0x36d/0x660
[52804.658022]        __sock_sendmsg+0x73/0xc0
[52804.658450]        __sys_sendto+0x184/0x210
[52804.658877]        __x64_sys_sendto+0x72/0x80
[52804.659322]        do_syscall_64+0x6f/0x140
[52804.659752]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.660281]
               other info that might help us debug this:

[52804.661077]  Possible unsafe locking scenario:

[52804.661671]        CPU0                    CPU1
[52804.662129]        ----                    ----
[52804.662577]   lock(dpll_lock);
[52804.662924]                                lock(nlk_cb_mutex-GENERIC);
[52804.663538]                                lock(dpll_lock);
[52804.664073]   lock(nlk_cb_mutex-GENERIC);
[52804.664490]

The issue as follows: __netlink_dump_start() calls control-&gt;start(cb)
with nlk-&gt;cb_mutex held. In control-&gt;start(cb) the dpll_lock is taken.
Then nlk-&gt;cb_mutex is released and taken again in netlink_dump(), while
dpll_lock still being held. That leads to ABBA deadlock when another
CPU races with the same operation.

Fix this by moving dpll_lock taking into dumpit() callback which ensures
correct lock taking order.

Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Arkadiusz Kubalewski &lt;arkadiusz.kubalewski@intel.com&gt;
Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Recently, I've been hitting following deadlock warning during dpll pin
dump:

[52804.637962] ======================================================
[52804.638536] WARNING: possible circular locking dependency detected
[52804.639111] 6.8.0-rc2jiri+ #1 Not tainted
[52804.639529] ------------------------------------------------------
[52804.640104] python3/2984 is trying to acquire lock:
[52804.640581] ffff88810e642678 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}, at: netlink_dump+0xb3/0x780
[52804.641417]
               but task is already holding lock:
[52804.642010] ffffffff83bde4c8 (dpll_lock){+.+.}-{3:3}, at: dpll_lock_dumpit+0x13/0x20
[52804.642747]
               which lock already depends on the new lock.

[52804.643551]
               the existing dependency chain (in reverse order) is:
[52804.644259]
               -&gt; #1 (dpll_lock){+.+.}-{3:3}:
[52804.644836]        lock_acquire+0x174/0x3e0
[52804.645271]        __mutex_lock+0x119/0x1150
[52804.645723]        dpll_lock_dumpit+0x13/0x20
[52804.646169]        genl_start+0x266/0x320
[52804.646578]        __netlink_dump_start+0x321/0x450
[52804.647056]        genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.647575]        genl_rcv_msg+0x1ed/0x3b0
[52804.648001]        netlink_rcv_skb+0xdc/0x210
[52804.648440]        genl_rcv+0x24/0x40
[52804.648831]        netlink_unicast+0x2f1/0x490
[52804.649290]        netlink_sendmsg+0x36d/0x660
[52804.649742]        __sock_sendmsg+0x73/0xc0
[52804.650165]        __sys_sendto+0x184/0x210
[52804.650597]        __x64_sys_sendto+0x72/0x80
[52804.651045]        do_syscall_64+0x6f/0x140
[52804.651474]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.652001]
               -&gt; #0 (nlk_cb_mutex-GENERIC){+.+.}-{3:3}:
[52804.652650]        check_prev_add+0x1ae/0x1280
[52804.653107]        __lock_acquire+0x1ed3/0x29a0
[52804.653559]        lock_acquire+0x174/0x3e0
[52804.653984]        __mutex_lock+0x119/0x1150
[52804.654423]        netlink_dump+0xb3/0x780
[52804.654845]        __netlink_dump_start+0x389/0x450
[52804.655321]        genl_family_rcv_msg_dumpit+0x155/0x1e0
[52804.655842]        genl_rcv_msg+0x1ed/0x3b0
[52804.656272]        netlink_rcv_skb+0xdc/0x210
[52804.656721]        genl_rcv+0x24/0x40
[52804.657119]        netlink_unicast+0x2f1/0x490
[52804.657570]        netlink_sendmsg+0x36d/0x660
[52804.658022]        __sock_sendmsg+0x73/0xc0
[52804.658450]        __sys_sendto+0x184/0x210
[52804.658877]        __x64_sys_sendto+0x72/0x80
[52804.659322]        do_syscall_64+0x6f/0x140
[52804.659752]        entry_SYSCALL_64_after_hwframe+0x46/0x4e
[52804.660281]
               other info that might help us debug this:

[52804.661077]  Possible unsafe locking scenario:

[52804.661671]        CPU0                    CPU1
[52804.662129]        ----                    ----
[52804.662577]   lock(dpll_lock);
[52804.662924]                                lock(nlk_cb_mutex-GENERIC);
[52804.663538]                                lock(dpll_lock);
[52804.664073]   lock(nlk_cb_mutex-GENERIC);
[52804.664490]

The issue as follows: __netlink_dump_start() calls control-&gt;start(cb)
with nlk-&gt;cb_mutex held. In control-&gt;start(cb) the dpll_lock is taken.
Then nlk-&gt;cb_mutex is released and taken again in netlink_dump(), while
dpll_lock still being held. That leads to ABBA deadlock when another
CPU races with the same operation.

Fix this by moving dpll_lock taking into dumpit() callback which ensures
correct lock taking order.

Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko &lt;jiri@nvidia.com&gt;
Reviewed-by: Arkadiusz Kubalewski &lt;arkadiusz.kubalewski@intel.com&gt;
Link: https://lore.kernel.org/r/20240207115902.371649-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
