| Age | Commit message | Author |
|
SG2042's PCIe root complexes are cache-coherent with the CPU. Mark all
four PCIe controller nodes (pcie_rc0 through pcie_rc3) as dma-coherent
so the kernel uses coherent DMA mappings instead of non-coherent bounce
buffering.
Cc: stable@vger.kernel.org
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
Link: https://patch.msgid.link/20260331171248.973014-3-gaohan@iscas.ac.cn
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
|
|
"userfaultfd: verify VMA state across UFFDIO_COPY retry", which is a
prerequisite for mm-unstable's series "userfaultfd: merge
fs/userfaultfd.c into mm/userfaultfd.c".
|
|
dsa_port_from_netdev() may return a valid port from a different switch
chip. Programming another chip's port index into the local hardware
causes redirection to the wrong port, or an out-of-bounds access if the
index exceeds the local chip's port count.
Apply a minimal fix that adds a check to catch this case and adjusts the
extack message. When cls->common.skip_sw is not set, the operation could
instead redirect to the upstream port and let the software or upstream
switch(es) handle the forward, but that is not addressed here.
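The check described above can be modeled in a few lines. This is a simplified userspace sketch in Python, not the driver code: `Switch`, `Port` and `resolve_redirect_port` are hypothetical names. The only behaviour taken from the commit message is that a port belonging to a different switch chip is rejected instead of having its index programmed into the local hardware.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Switch:
    name: str
    num_ports: int


@dataclass(frozen=True)
class Port:
    switch: Switch
    index: int


def resolve_redirect_port(local: Switch, target: Port) -> int:
    """Return the local port index to program, or raise for a foreign port."""
    # A port that is valid on another chip is not valid here: its index may
    # name an unrelated local port or exceed the local port count.
    if target.switch is not local:
        raise ValueError("redirect target is on a different switch")
    if not 0 <= target.index < local.num_ports:
        raise ValueError("port index out of range")
    return target.index
```

With a 4-port local chip and an 8-port neighbour, port 6 of the neighbour is refused before any index reaches the hardware tables.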
Signed-off-by: David Yang <mmyangfl@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20260530003940.2000994-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The address chenridong@huaweicloud.com is no longer valid; replace it
with the personal address ridong.chen@linux.dev.
Signed-off-by: Ridong Chen <ridong.chen@linux.dev>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
A WARN fires when systemd's user manager writes "+cpu +memory +pids" to
its own subtree_control while a sched_ext scheduler is loaded:
WARNING: at kernel/sched/ext.c:3227 scx_cgroup_move_task+0xa8/0xb0
scx_cgroup_move_task+0xa8/0xb0
sched_move_task+0x134/0x290
cpu_cgroup_attach+0x39/0x70
cgroup_migrate_execute+0x37d/0x450
cgroup_update_dfl_csses+0x1e3/0x270
cgroup_subtree_control_write+0x3e7/0x440
scx_cgroup_can_attach() arms cgrp_moving_from only when a task's cpu
cgroup changes. It can still be NULL when scx_cgroup_move_task() runs,
through this sequence:
Step                               Result
---------------------------------  ----------------------------------
1. cpu enabled on cgroup G         cpu css = A
2. cpu toggled off then on for G   A killed, B created (same cgroup)
3. an exiting task keeps A alive   migration skips it, A now stale
4. +memory migrates G              stale A vs current B pulls cpu in
5. cpu attach runs for all tasks   hits a live, cpu-unchanged task
6. scx_cgroup_move_task() on it    cgrp_moving_from NULL -> WARN
The mismatch is that scx_cgroup_can_attach() keys on cgroup identity
while migration drives the move on css identity, so a NULL cgrp_moving_from
here is a legitimate css-only migration, not a missing prep.
The call is already gated on cgrp_moving_from, so just drop the warning.
ops.cgroup_prep_move() and ops.cgroup_move() stay paired.
Fixes: 819513666966 ("sched_ext: Add cgroup support")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Matt Fleming <mfleming@cloudflare.com>
Closes: https://lore.kernel.org/all/20260601124156.2205704-1-mfleming@cloudflare.com/
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
|
|
Add the missing @may_sleep parameter description to the
mpc52xx_fec_stop kernel-doc comment.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260531000042.369043-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SCTP exact sock_diag lookup can hold a transport reference, block on
lock_sock(sk), and then resume after sctp_association_free() has marked
the association dead and freed its bind address list.
When that happens, inet_assoc_attr_size() and
inet_diag_msg_sctpasoc_fill() can still dereference association state
that is no longer valid for reporting. In particular,
inet_diag_msg_sctpasoc_fill() may read an empty bind-address list as a
real sctp_sockaddr_entry and trigger an out-of-bounds read from
unrelated association memory.
Reject the association after taking the socket lock if it has been
reaped or detached from the endpoint, and report the lookup as stale.
This keeps the exact dump-one path from formatting torn association
state.
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Zhao Zhang <zzhan461@ucr.edu>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/fac6043fa20a2ff68e12958c431836f692c51268.1780113823.git.zzhan461@ucr.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add dec_ttl action support to the OVS kernel datapath selftest
framework:
- Add dec_ttl nested NLA class to ovs-dpctl.py with proper
  OVS_DEC_TTL_ATTR_ACTION sub-attribute handling
- Add parse support for dec_ttl(le_1(<inner_actions>)) action
  string, consistent with the odp-util.c format where le_1()
  holds the actions taken when TTL reaches 1
- Add dpstr output formatting for dec_ttl actions
- Add test_dec_ttl() to openvswitch.sh that verifies:
  * Normal TTL packets are forwarded after decrement
  * TTL=1 packets are dropped (TTL expiry)
  * Graceful skip via ksft_skip if kernel lacks dec_ttl support
The dec_ttl class uses late-binding type resolution to reference
ovsactions for its inner action list, avoiding circular references
at class definition time.
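The late-binding idea can be illustrated with a small self-contained sketch. It does not reproduce the ovs-dpctl.py classes: `register`, `_TYPES`, `DecTtl` and `Actions` are hypothetical names, and the toy parser only handles a single inner action. The point is that the inner list type is looked up by name when it is first needed, so `DecTtl` can be defined before the type it contains.

```python
_TYPES = {}


def register(cls):
    """Record a class under its name so it can be resolved later by string."""
    _TYPES[cls.__name__] = cls
    return cls


@register
class DecTtl:
    # Actions does not exist yet when this class body runs, so only its
    # name is stored here and the class is resolved on first use.
    inner_type_name = "Actions"

    def __init__(self):
        self.inner = None

    def parse_inner(self, text):
        inner_cls = _TYPES[self.inner_type_name]  # late binding
        self.inner = inner_cls.parse(text)
        return self


@register
class Actions:
    # Actions may contain DecTtl and DecTtl contains Actions: a cycle.
    def __init__(self, items):
        self.items = items

    @classmethod
    def parse(cls, text):
        prefix, suffix = "dec_ttl(le_1(", "))"
        items = []
        for token in (t.strip() for t in text.split(",")):
            if not token:
                continue
            if token.startswith(prefix) and token.endswith(suffix):
                inner = token[len(prefix):-len(suffix)]
                items.append(DecTtl().parse_inner(inner))
            else:
                items.append(token)
        return cls(items)
```

Parsing the illustrative string `dec_ttl(le_1(drop)),2` yields a `DecTtl` whose inner list holds `drop`, followed by the plain action `2`.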
Signed-off-by: Minxi Hou <houminxi@gmail.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20260530021443.1734484-1-houminxi@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In fec_resume(), fec_enet_clk_enable() is called before
pinctrl_pm_select_default_state() in the non-WoL path, inverting the
ordering used in fec_suspend() which correctly switches to the sleep
pinctrl state before disabling clocks.
For PHYs with the PHY_RST_AFTER_CLK_EN flag (e.g. TI DP83848 or
SMSC LAN87xx), fec_enet_clk_enable() triggers a hardware reset pulse
via the phy-reset GPIO. With the GPIO pin still in sleep pinctrl state
at that point, the GPIO write has no physical effect and the PHY never
receives the required reset after clock enable, leading to unreliable
link establishment after system resume.
Fix by restoring the default pinctrl state before enabling clocks,
making resume the proper mirror of suspend. The call is made
unconditionally: fec_suspend() only switches to the sleep pinctrl state
on the non-WoL path and leaves the pins in the default state when WoL
is enabled, so on a WoL resume the device is already in the default
state and pinctrl_pm_select_default_state() is a no-op.
Fixes: de40ed31b3c5 ("net: fec: add Wake-on-LAN support")
Signed-off-by: Tapio Reijonen <tapio.reijonen@vaisala.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260529-b4-fec-resume-pinctrl-order-v3-1-6eda0f592fca@vaisala.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently there is no config fragment for the rseq selftests but there are
a couple of configuration options which are required for running them:
- CONFIG_RSEQ is required for obvious reasons; it is enabled by default,
  but it doesn't hurt to specify it in case the user is using a
  defconfig that disables it.
- CONFIG_RSEQ_SLICE_EXTENSION is tested by the slice_test test; the
  test will fail without it.
Add a configuration fragment which enables these options, helping encourage
CI systems and people doing manual testing to run the tests with all the
features. This also requires CONFIG_EXPERT since it is a dependency for
slice extension.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260424-selftests-rseq-config-fragment-v2-1-a9475996edcb@kernel.org
|
|
'net-mlx5-avoid-payload-in-skb-s-linear-part-for-better-gro-processing'
Tariq Toukan says:
====================
net/mlx5: Avoid payload in skb's linear part for better GRO-processing
This is V7 of a series originally submitted by Christoph.
When LRO is enabled on the MLX, mlx5e_skb_from_cqe_mpwrq_nonlinear
copies parts of the payload to the linear part of the skb.
This triggers suboptimal processing in GRO, causing slow throughput.
This patch series addresses this by using eth_get_headlen to compute the
size of the protocol headers and only copy those bits. This results in a
significant throughput improvement (detailed results in the specific
patch).
====================
Link: https://patch.msgid.link/20260601061522.398044-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlx5e_skb_from_cqe_mpwrq_nonlinear() copies MLX5E_RX_MAX_HEAD (256)
bytes from the page-pool to the skb's linear part. Those 256 bytes
include part of the payload.
When attempting to do GRO in skb_gro_receive, if headlen > data_offset
(and skb->head_frag is not set), we end up aggregating packets in the
frag_list.
This is of course not good when we are CPU-limited. It also causes a worse
skb->len/truesize ratio,...
So, let's avoid copying parts of the payload to the linear part. We use
eth_get_headlen() to parse the headers and compute the length of the
protocol headers, which is then used to copy only the relevant bytes to
the skb's linear part.
We still allocate MLX5E_RX_MAX_HEAD for the skb so that if the networking
stack needs to call pskb_may_pull() later on, we don't need to reallocate
memory.
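The header-only copy can be sketched in userspace. This is not the mlx5 code and not the kernel's eth_get_headlen(): `header_len` and `split_frame` are hypothetical helpers that understand only untagged Ethernet + IPv4 + TCP, and `RX_MAX_HEAD` simply mirrors the 256-byte MLX5E_RX_MAX_HEAD mentioned above.

```python
import struct

ETH_HLEN = 14
ETH_P_IP = 0x0800
IPPROTO_TCP = 6
RX_MAX_HEAD = 256  # mirrors MLX5E_RX_MAX_HEAD from the commit message


def header_len(frame: bytes) -> int:
    """Length of the Ethernet + IPv4 + TCP headers, as far as they parse."""
    if len(frame) < ETH_HLEN + 20:
        return min(len(frame), ETH_HLEN)
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return ETH_HLEN
    ihl = (frame[ETH_HLEN] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20:
        return ETH_HLEN
    hlen = ETH_HLEN + ihl
    if frame[ETH_HLEN + 9] != IPPROTO_TCP or len(frame) < hlen + 20:
        return hlen
    return hlen + (frame[hlen + 12] >> 4) * 4  # TCP data offset in bytes


def split_frame(frame: bytes):
    """Return (linear, frag): headers in linear, payload left in frag."""
    n = min(header_len(frame), RX_MAX_HEAD, len(frame))
    return frame[:n], frame[n:]
```

For a minimal TCP segment the linear part shrinks from 256 bytes (54 bytes of headers plus 202 bytes of payload) to the 54 header bytes alone.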
This gives a nice throughput increase (ARM Neoverse-V2 with CX-7 NIC and
LRO enabled):
BEFORE:
=======
(netserver pinned to core receiving interrupts)
$ netperf -H 10.221.81.118 -T 80,9 -P 0 -l 60 -- -m 256K -M 256K
87380 16384 262144 60.01 32547.82
(netserver pinned to adjacent core receiving interrupts)
$ netperf -H 10.221.81.118 -T 80,10 -P 0 -l 60 -- -m 256K -M 256K
87380 16384 262144 60.00 52531.67
AFTER:
======
(netserver pinned to core receiving interrupts)
$ netperf -H 10.221.81.118 -T 80,9 -P 0 -l 60 -- -m 256K -M 256K
87380 16384 262144 60.00 52896.06
(netserver pinned to adjacent core receiving interrupts)
$ netperf -H 10.221.81.118 -T 80,10 -P 0 -l 60 -- -m 256K -M 256K
87380 16384 262144 60.00 85094.90
Additional tests across a larger range of parameters w/ and w/o LRO, w/
and w/o IPv6-encapsulation, different MTUs (1500, 4096, 9000), different
TCP read/write-sizes as well as UDP benchmarks, all have shown equal or
better performance with this patch.
For XDP, pull at most ETH_HLEN bytes into the linear area so that XDP_PASS
can also benefit from this improvement and keep things simple when
dealing with skb geometry changes from the XDP program.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260601061522.398044-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Doing the call to dma_sync_single_for_cpu() earlier will allow us to
adjust headlen based on the actual size of the protocol headers.
Doing this earlier means that we don't need to call
mlx5e_copy_skb_header() anymore and rather can call
skb_copy_to_linear_data() directly.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Christoph Paasch <cpaasch@openai.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260601061522.398044-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the necessary bits into the gen2 platforms tables and handlers
to allow decoding streams into 10bit pixel formats.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
The 10bit pixel format can only be used when the decoder identifies the
stream as decoding into 10bit pixel format buffers, so update the
find_format helper to filter the formats and only allow the proper
formats when setting or trying a capture format.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
Update the gen2 response and vdec s_fmt code to take into account
the P010 and QC10C formats when calculating the width, height and stride.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
Add the necessary plumbing into HFI Gen2 to signal to the decoder
the right 10bit pixel format and stride when in compressed mode.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
The P010 (YUV format with 16 bits per pixel and interleaved UV)
and QC10C (P010 compressed mode, similar to QC08C) formats require specific
buffer calculations to allocate the right buffer size for the DPB
(decoded picture buffer) frames and frames consumed by userspace.
As with 8bit, the 10bit DPB frames use the QC10C format.
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
To simplify code checking for pixel formats, add helpers to
check for 8bit and 10bit formats.
Reviewed-by: Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Tested-by: Wangao Wang <wangao.wang@oss.qualcomm.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
Use div_u64() instead of mult_frac(), as division with the u64 operator
fails on 32-bit systems, which don't link against libgcc.
Fixes: 5c66647a5c3e ("media: iris: add FPS calculation and VPP FW overhead in frequency formula")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202606030132.qnBXVDkM-lkp@intel.com/
Signed-off-by: Bryan O'Donoghue <bod@kernel.org>
|
|
VLAN-tagged interfaces on lan743x devices were previously unreachable via
SSH and failed to respond to large ping packets (e.g. "ping -s 1469" given
MTU=1500). In these scenarios, "ethtool -S" reports non-zero "RX Oversize
Frame Errors". According to Microchip AN2948, the MAC_RX FSE (VLAN field
size enforcement) bit determines whether frames with VLAN tags exceeding
the base MTU plus tag length are discarded.
The driver must set the MAC_RX.FSE bit before setting MAC_RX.RXEN to allow
VLAN-tagged frames up to the interface MTU, preventing them from being
treated as oversized. As a result, both the base and VLAN-tagged interfaces
can use the same MTU without receive errors.
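The "ping -s 1469" threshold follows from simple frame-size arithmetic. The sketch below is one reading that is consistent with the reported numbers. It assumes IPv4 without options, and that the receive limit counts the Ethernet header and FCS. `frame_len` and `max_frame` are hypothetical helpers, and the exact hardware semantics of the FSE bit are not modeled.

```python
ICMP_HLEN = 8
IPV4_HLEN = 20      # assumes no IP options
ETH_HLEN = 14
VLAN_TAG_LEN = 4
FCS_LEN = 4


def frame_len(ping_payload: int, vlan: bool) -> int:
    """On-wire frame length, including FCS, for an IPv4 ICMP echo."""
    ip_len = ping_payload + ICMP_HLEN + IPV4_HLEN
    return ip_len + ETH_HLEN + (VLAN_TAG_LEN if vlan else 0) + FCS_LEN


def max_frame(mtu: int, allow_vlan_tag: bool) -> int:
    """Largest accepted frame for an MTU, with or without tag allowance."""
    return mtu + ETH_HLEN + FCS_LEN + (VLAN_TAG_LEN if allow_vlan_tag else 0)
```

With MTU=1500, a tagged ping of 1468 bytes produces a 1518-byte frame, which still fits the untagged limit of 1518. A 1469-byte ping produces 1519 bytes and only fits once the 4-byte tag allowance raises the limit to 1522.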
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Thangaraj Samynathan <Thangaraj.s@microchip.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de> # lan7430 on arm64 (RevPi
Link: https://patch.msgid.link/20260529210300.433135-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace hard-coded strings with the str_enabled_disabled() helper. This
unifies the output and helps the linker with deduplication, which can result
in a smaller binary. Additionally, address the following Coccinelle/coccicheck
warning reported by string_choices.cocci:
opportunity for str_enabled_disabled(uv_pch_intr_now_enabled)
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Kyle Meyer <kyle.meyer@hpe.com>
Link: https://patch.msgid.link/20260504181945.143928-2-thorsten.blum@linux.dev
|
|
Now that we have the ability to represent, at compile time, the context a
DRM device is in, we can start carrying this context around with GEM
object types in order to allow a driver to safely create GEM objects before
a DRM device has registered with userspace.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://patch.msgid.link/20260507220044.3204919-4-lyude@redhat.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
This is just a type alias that resolves into the AllocImpl for a given
T: drm::gem::DriverObject.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://patch.msgid.link/20260507220044.3204919-3-lyude@redhat.com
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
The vcc_sdio regulator supports 1.8V to 3.4V output range according to
its datasheet.
The current DT incorrectly limits the max voltage to 3.0V. This limit
causes issues downstream with u-boot, which refuses to apply the
out-of-range value and falls back to the minimum in that range: 1.8V.
This is insufficient to power the SD card, so driver initialisation
fails and booting from it does not work.
Set regulator-max-microvolt to 3400000 µV to match hardware capability.
This matches the rk3399-orangepi for the same regulator.
Signed-off-by: Hugo Osvaldo Barrera <hugo@whynothugo.nl>
Reviewed-by: Dang Huynh <dang.huynh@mainlining.org>
Link: https://patch.msgid.link/20260519094439.7918-1-hugo@whynothugo.nl
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
The NanoPi Zero2 has one USB 2.0 Type-A HOST port and one USB 2.0 Type-C
OTG port.
Add support for using the USB 2.0 ports on NanoPi Zero2.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260529190355.4148175-6-heiko@sntech.de
|
|
The ArmSoM Sige1 has two USB 2.0 Type-A HOST ports behind an onboard
USB hub, and one USB 2.0 Type-C OTG port.
Add support for using the USB 2.0 ports on ArmSoM Sige1.
The onboard USB hub handles OHCI so only the EHCI controller is enabled.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
[added phy-supply for otg port]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260529190355.4148175-5-heiko@sntech.de
|
|
The ROCK 2A has three USB 2.0 Type-A HOST ports behind an onboard
USB hub, and one USB 3.0 Type-A port.
And the ROCK 2F has two USB 2.0 Type-A HOST ports behind an onboard
USB hub, and one USB 2.0 Type-C OTG port.
Add support for using the USB ports on Radxa ROCK 2A/2F.
The onboard USB hub handles OHCI so only the EHCI controller is enabled.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260529190355.4148175-4-heiko@sntech.de
|
|
The Radxa E20C has one USB 2.0 Type-A HOST port and one USB 2.0 Type-C port.
The Type-C port is connected to a FE1.1s_QFN USB hub on the board, with its
ports being connected to the XHCI usb controller and a usb-uart bridge.
This also means the XHCI controller can only be used in device-mode.
Add support for using the USB 2.0 ports on Radxa E20C.
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
[set xhci to peripheral and add comment about the outward-facing hub]
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260529190355.4148175-3-heiko@sntech.de
|
|
Rockchip RK3528 has one USB 3.0 DWC3 controller and one USB 2.0 EHCI/OHCI
controller and uses an Innosilicon-USB2PHY for USB 2.0. The DWC3
controller additionally uses the Naneng Combo PHY for USB3.
Add device tree nodes to describe these USB controllers along with the
USB 2.0 PHYs.
[moved snps,dis_u2_susphy_quirk here from individual boards,
describe both usb2+3 default phy connections, usb2 boards can override]
Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/20260529190355.4148175-2-heiko@sntech.de
|
|
The RDS IB connection teardown path is written so it can run during
partial startup and on repeated shutdown attempts. It uses NULL
pointers to distinguish resources that are still owned from resources
that have already been released.
When rds_ib_setup_qp() fails after allocating i_sends but before
allocating i_recvs, the sends_out path frees i_sends without clearing
the pointer. A later shutdown pass can still treat that stale pointer
as a live send ring allocation.
Clear i_sends after vfree() in the error unwind path so the existing
shutdown logic continues to use the correct ownership state.
Fixes: 3b12f73a5c29 ("rds: ib: add error handle")
Cc: stable@kernel.org
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Yuqi Xu <xuyq21@lenovo.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Reviewed-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/5a0f7624bb9845a7b67d26166a150b59e7f394ce.1779632468.git.xuyq21@lenovo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
self-deadlock
When FUTEX_CMP_REQUEUE_PI requeues a non-top waiter that already owns the
target PI futex, task_blocks_on_rt_mutex() returns -EDEADLK before setting
waiter->task.
The subsequent remove_waiter() in rt_mutex_start_proxy_lock() dereferences
the NULL waiter->task, causing a kernel crash.
Add a self-deadlock check for non-top waiters before calling
rt_mutex_start_proxy_lock(), analogous to the top-waiter check in
futex_lock_pi_atomic().
Fixes: 3bfdc63936dd4773109b7b8c280c0f3b5ae7d349 ("rtmutex: Use waiter::task instead of current in remove_waiter()")
Signed-off-by: Ji'an Zhou <eilaimemedsnaimel@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
|
|
'net-airoha-preliminary-patches-to-support-multiple-net_devices-connected-to-the-same-gdm-port'
Lorenzo Bianconi says:
====================
net: airoha: Preliminary patches to support multiple net_devices connected to the same GDM port
EN7581 or AN7583 SoCs support connecting multiple external SerDes (e.g.
Ethernet or USB SerDes) to GDM3 or GDM4 ports via a hw arbiter that
manages the traffic in a TDM manner. As a result multiple net_devices can
connect to the same GDM{3,4} port and there is a theoretical "1:n"
relation between GDM ports and net_devices.
┌─────────────────────────────────┐
│ │ ┌──────┐
│ P1 GDM1 ├────►MT7530│
│ │ └──────┘
│ │ ETH0 (DSA conduit)
│ │
│ PSE/FE │
│ │
│ │
│ │ ┌─────┐
│ P0 CDM1 ├────►QDMA0│
│ P4 P9 GDM4 │ └─────┘
└──┬─────────────────────────┬────┘
│ │
┌──▼──┐ ┌────▼────┐
│ PPE │ │ ARB │
└─────┘ └─┬─────┬─┘
│ │
┌──▼──┐┌─▼───┐
│ ETH ││ USB │
└─────┘└─────┘
ETH1 ETH2
This is a preliminary series to introduce support for multiple net_devices
connected to the same Frame Engine (FE) GDM port (GDM3 or GDM4) via an
external hw arbiter.
====================
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-0-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a preliminary patch in order to allow the user to select if the
configured device will be used as hw lan or wan.
Please note this patch does not introduce any logical changes, just
cosmetic ones.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-6-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since now multiple net_devices connected to different QDMA blocks can
share the same GDM port, cpu_tx_packets and fwd_tx_packets fields can
be overwritten with the value from a different QDMA block. In order to
fix the issue move cpu_tx_packets and fwd_tx_packets fields from
airoha_gdm_port struct to airoha_gdm_dev one.
Tested-by: Xuegang Lu <xuegang.lu@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-5-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since now multiple net_devices connected to different QDMA blocks can
share the same GDM port, qos_sq_bmap field can be overwritten with the
configuration obtained from a net_device connected to a different QDMA
block. In order to fix the issue move qos_sq_bmap field from
airoha_gdm_port struct to airoha_gdm_dev one.
Add qos_channel_map bitmap in airoha_qdma struct to track if a shared
QDMA channel is already in use by another net_device.
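The channel-tracking bitmap can be pictured as follows. This is a hypothetical model, not the driver's data structure: `ChannelMap`, `claim` and `release` are names made up for illustration, and the only idea taken from the commit message is a per-QDMA bitmap that records whether a shared channel is already in use by another net_device.

```python
class ChannelMap:
    """Bitmap of QDMA channels currently claimed by some net_device."""

    def __init__(self, num_channels: int):
        self.num_channels = num_channels
        self.bits = 0

    def claim(self, channel: int) -> bool:
        """Mark a channel busy; return False if it is already held."""
        if not 0 <= channel < self.num_channels:
            raise ValueError("channel out of range")
        mask = 1 << channel
        if self.bits & mask:
            return False
        self.bits |= mask
        return True

    def release(self, channel: int) -> None:
        """Clear the busy bit so another net_device can claim the channel."""
        self.bits &= ~(1 << channel)
```

Because the map lives with the QDMA block rather than with a GDM port, two net_devices behind the same port but on different QDMA blocks do not see each other's claims.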
Tested-by: Xuegang Lu <xuegang.lu@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-4-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename airoha_is_lan_gdm_port() to airoha_is_lan_gdm_dev(). Moreover, rely
on airoha_gdm_dev pointer in airoha_is_lan_gdm_dev() instead of
airoha_gdm_port one.
This is a preliminary patch to support multiple net_devices connected to
the same GDM{3,4} port via an external hw arbiter.
Tested-by: Xuegang Lu <xuegang.lu@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-3-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move airoha_qdma pointer from airoha_gdm_port struct to airoha_gdm_dev
one since the QDMA block used depends on the particular net_device
WAN/LAN configuration and in the current codebase net_device pointer is
associated to airoha_gdm_dev struct.
This is a preliminary patch to support multiple net_devices connected
to the same GDM{3,4} port via an external hw arbiter.
Tested-by: Xuegang Lu <xuegang.lu@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-2-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
EN7581 and AN7583 SoCs support connecting multiple external SerDes to GDM3
or GDM4 ports via a hw arbiter that manages the traffic in a TDM manner.
As a result multiple net_devices can connect to the same GDM{3,4} port
and there is a theoretical "1:n" relation between GDM port and
net_devices.
Introduce airoha_gdm_dev struct to collect net_device related info (e.g.
net_device and external phy pointer). Please note this is just a
preliminary patch and we are still supporting a single net_device for
each GDM port. Subsequent patches will add support for multiple net_devices
connected to the same GDM port.
Tested-by: Xuegang Lu <xuegang.lu@airoha.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260527-airoha-eth-multi-serdes-preliminary-v1-1-ec6ed73ef7fc@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These link-attrs attributes were previously marked as strings:
- wireless - struct iw_event
- protinfo - a nest of ifla6-attrs or linkinfo-brport-attrs
- cost, priority - unused
Signed-off-by: Remy D. Farley <one-d-wide@protonmail.com>
Link: https://patch.msgid.link/20260529121355.1564817-1-one-d-wide@protonmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Daniel Zahka says:
====================
netdevsim: psp: fix issues with stats collection
It has come to my attention via a sashiko review of my net-next series
for aes-gcm in netdevsim [1] that there were preexisting issues with
netdevsim's implementation of psp statistics.
API usage issues:
1. not calling u64_stats_init() on the u64_stats_sync object during
init
2. not serializing usage of the writer side API during stats update
Logical Bugs:
1. We were incrementing rx stats on the sending device's stats
counters.
Fix the first set of issues by removing the u64_stats_t api entirely,
and keep track of stats with atomics. Fix the second issue by charging
events to the right netdevsim object.
[1]: https://sashiko.dev/#/patchset/20260508-nsim-psp-crypto-v1-0-4b50ed09b794%40gmail.com
TAP version 13
1..28
ok 1 psp.data_basic_send_v0_ip4
ok 2 psp.data_basic_send_v0_ip6
ok 3 psp.data_basic_send_v1_ip4
ok 4 psp.data_basic_send_v1_ip6
ok 5 psp.data_basic_send_v2_ip4
ok 6 psp.data_basic_send_v2_ip6
ok 7 psp.data_basic_send_v3_ip4
ok 8 psp.data_basic_send_v3_ip6
ok 9 psp.data_mss_adjust_ip4
ok 10 psp.data_mss_adjust_ip6
ok 11 psp.dev_list_devices
ok 12 psp.dev_get_device
ok 13 psp.dev_get_device_bad
ok 14 psp.dev_rotate
ok 15 psp.dev_rotate_spi
ok 16 psp.assoc_basic
ok 17 psp.assoc_bad_dev
ok 18 psp.assoc_sk_only_conn
ok 19 psp.assoc_sk_only_mismatch
ok 20 psp.assoc_sk_only_mismatch_tx
ok 21 psp.assoc_sk_only_unconn
ok 22 psp.assoc_version_mismatch
ok 23 psp.assoc_twice
ok 24 psp.data_send_bad_key
ok 25 psp.data_send_disconnect
ok 26 psp.data_stale_key
ok 27 psp.removal_device_rx
ok 28 psp.removal_device_bi
# Totals: pass:28 fail:0 xfail:0 xpass:0 skip:0 error:0
Dump stats on both devs; tx on one should match rx on the other:
local dev:
id=5 ifindex=2 stats={'dev-id': 5, 'key-rotations': 0,
'stale-events': 0, 'rx-packets': 1226, 'rx-bytes': 39244,
'rx-auth-fail': 0, 'rx-error': 0, 'rx-bad': 0, 'tx-packets': 1931,
'tx-bytes': 2478908, 'tx-error': 0}
remote dev:
id=3 ifindex=2 stats={'dev-id': 3, 'key-rotations': 0, 'stale-events':
0, 'rx-packets': 1931, 'rx-bytes': 2478908, 'rx-auth-fail': 0,
'rx-error': 0, 'rx-bad': 0, 'tx-packets': 1226, 'tx-bytes': 39244,
'tx-error': 0}
====================
Link: https://patch.msgid.link/20260529-fix-psp-stats-v2-0-3a194eacf18e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The existing u64_stats_t-based psp counters had two preexisting api
usage bugs: u64_stats_init() was never called on the syncp object, and
the writer side of the u64_stats_update_begin()/end() api was not
serialized. Switch the counters to atomic64_t instead. Atomics need
no initialization and are inherently safe against concurrent writers,
eliminating both bugs at once.
Use atomic64_t rather than atomic_long_t so byte counters don't wrap
at 4 GiB on 32-bit builds.
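The 4 GiB figure is the wrap point of a 32-bit unsigned counter, which is the width a long has on 32-bit builds. A minimal arithmetic sketch, with `add_wrapping` as a hypothetical helper that models fixed-width unsigned addition:

```python
U32_MASK = (1 << 32) - 1
U64_MASK = (1 << 64) - 1


def add_wrapping(counter: int, delta: int, mask: int) -> int:
    """Add delta to an unsigned counter of the width implied by mask."""
    return (counter + delta) & mask
```

A byte counter sitting 1000 bytes below 4 GiB falls back to 500 after a 1500-byte packet when it is 32 bits wide, and keeps counting when it is 64 bits wide.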
Fixes: 178f0763c5f3 ("netdevsim: implement psp device stats")
Cc: <stable+noautosel@kernel.org> # netdevsim is a test harness, it's never loaded on production systems
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20260529-fix-psp-stats-v2-2-3a194eacf18e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
nsim_do_psp() handles both tx and rx psp processing in the sending
device's nsim_start_xmit() path. The existing code has a logical bug,
where we erroneously increment rx_bytes and rx_packets on the sending
device's stats, instead of the peer device's.
Additionally, compute psp_len after psp_dev_encapsulate() and before
psp_dev_rcv(), which modifies the header region of the skb. The
existing calculation was actually correct, because psp_dev_rcv()
leaves skb_inner_transport_header pointing at the tcp header, but this
is fragile and confusing as there is no actual inner transport header
after psp_dev_rcv has removed udp encapsulation.
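The accounting rule can be shown with a toy model of two linked devices. This is only a sketch under assumptions: `PspStats`, `SimDev` and `xmit` are hypothetical names and no real PSP processing happens. It illustrates that both directions are handled in the sender's transmit path, so the rx counters must be charged to the peer.

```python
from dataclasses import dataclass, field


@dataclass
class PspStats:
    tx_packets: int = 0
    tx_bytes: int = 0
    rx_packets: int = 0
    rx_bytes: int = 0


@dataclass
class SimDev:
    name: str
    stats: PspStats = field(default_factory=PspStats)
    peer: "SimDev | None" = None


def xmit(sender: SimDev, psp_len: int) -> None:
    """Model one PSP packet crossing from sender to its peer."""
    sender.stats.tx_packets += 1
    sender.stats.tx_bytes += psp_len
    # Receive-side counters belong to the peer, not to the sender.
    receiver = sender.peer
    if receiver is not None:
        receiver.stats.rx_packets += 1
        receiver.stats.rx_bytes += psp_len
```

With this rule, tx on one device always equals rx on the other, which is the property the stats dump in the cover letter checks.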
Fixes: 178f0763c5f3 ("netdevsim: implement psp device stats")
Cc: <stable+noautosel@kernel.org> # netdevsim is a test harness, it's never loaded on production systems
Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20260529-fix-psp-stats-v2-1-3a194eacf18e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Writing to the netdevsim debugfs file
"netdevsim/netdevsimN/fib/nexthop_bucket_activity" enters
nsim_nexthop_bucket_activity_write(), which looks up a nexthop in
data->nexthop_ht under rtnl_lock(). If a network namespace teardown,
devlink reload or device deletion runs concurrently, nsim_fib_destroy()
frees that rhashtable (and the surrounding nsim_fib_data) while the
write is still in flight, leading to a slab-use-after-free:
BUG: KASAN: slab-use-after-free in nsim_nexthop_bucket_activity_write+0xb9e/0xdf0
Read of size 4 at addr ff1100001a379808 by task syz.0.11967/27894
CPU: 0 UID: 0 PID: 27894 Comm: syz.0.11967 Not tainted 7.1.0-rc4-gf6f1bfc1980a #4
Call Trace:
nsim_nexthop_bucket_activity_write+0xb9e/0xdf0
full_proxy_write+0x135/0x1a0
vfs_write+0x2e2/0x1040
ksys_write+0x146/0x270
__x64_sys_write+0x76/0xb0
do_syscall_64+0xb9/0x5b0
entry_SYSCALL_64_after_hwframe+0x74/0x7c
Allocated by task 15957:
rhashtable_init_noprof+0x3ec/0x860
nsim_fib_create+0x371/0xca0
nsim_drv_probe+0xd60/0x15c0
...
new_device_store+0x425/0x7f0
Freed by task 24:
rhashtable_free_and_destroy+0x10d/0x620
nsim_fib_destroy+0xc9/0x1c0
nsim_dev_reload_destroy+0x1e7/0x530
nsim_dev_reload_down+0x6b/0xd0
devlink_reload+0x1b5/0x770
devlink_pernet_pre_exit+0x25d/0x3a0
ops_undo_list+0x1b7/0xb90
cleanup_net+0x47f/0x8a0
The buggy address belongs to the object at ff1100001a379800
which belongs to the cache kmalloc-1k of size 1024
The freed 1k object is the bucket table of data->nexthop_ht. Shortly
after, the dangling table is dereferenced again and the machine also
takes a GPF in __rht_bucket_nested() from the same call site.
The root cause is a lifetime mismatch: the debugfs files reference
nsim_fib_data (the writer dereferences data->nexthop_ht), but the
interface is not bracketed around the lifetime of that data.
nsim_fib_destroy() freed both rhashtables and only removed the debugfs
directory afterwards, and nsim_fib_create() created the debugfs files
before the rhashtables were initialized and, on the error path, freed
them before removing the files. debugfs keeps the file itself alive
across a ->write() via debugfs_file_get()/debugfs_file_put()
(fs/debugfs/file.c), but it does not keep data->nexthop_ht alive, so the
in-flight writer dereferenced freed memory. rtnl_lock() in the writer
does not help, because the teardown path does not take rtnl around
rhashtable_free_and_destroy().
Fix it by bracketing the debugfs interface around the data it exposes,
keeping nsim_fib_create() and nsim_fib_destroy() symmetric:
- In nsim_fib_destroy(), tear down the debugfs files before the data
structures they reference. debugfs_remove_recursive() drops the
initial active-user reference and then waits for every in-flight
->write() to drop its reference before returning, and rejects new
opens (__debugfs_file_removed(), fs/debugfs/inode.c). Once it returns,
no debugfs accessor can reach the FIB data, so the rhashtables and
nsim_fib_data can be destroyed safely. This also covers the bool knobs
in the same directory, which store pointers into the same
nsim_fib_data, and the final kfree(data).
- In nsim_fib_create(), create the debugfs files after the rhashtables
and notifiers are set up. This closes the same race on the
error-unwind path, where a concurrent writer could otherwise observe a
half-constructed instance or a table that the unwind has already
freed. (With only the destroy-side change, a writer racing the create
window instead dereferences an uninitialized data->nexthop_ht.)
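
The ordering rule in the two items above can be sketched as a small
single-threaded user-space model. All names here (demo_fib,
demo_fib_init, demo_fib_fini, demo_iface_write) are hypothetical, and a
boolean stands in for the debugfs files. The model shows only the
publish-last / unpublish-first ordering; it does not reproduce the
waiting for in-flight writers that debugfs_remove_recursive() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the table the interface dereferences. */
struct demo_table {
	int buckets[4];
};

struct demo_fib {
	struct demo_table *nexthop_ht; /* data exposed by the interface */
	bool iface_visible;            /* stand-in for the debugfs files */
};

/* Accessor: only touches the data while the interface is published. */
static int demo_iface_write(struct demo_fib *fib, int idx, int val)
{
	if (!fib->iface_visible)
		return -1;
	assert(fib->nexthop_ht != NULL);
	fib->nexthop_ht->buckets[idx & 3] = val;
	return 0;
}

/* Create: set up the data first, publish the interface last. */
static int demo_fib_init(struct demo_fib *fib)
{
	fib->iface_visible = false;
	fib->nexthop_ht = calloc(1, sizeof(*fib->nexthop_ht));
	if (!fib->nexthop_ht)
		return -1; /* nothing was published, nothing to unwind */
	fib->iface_visible = true;
	return 0;
}

/* Destroy: unpublish the interface first, free the data last. */
static void demo_fib_fini(struct demo_fib *fib)
{
	fib->iface_visible = false;
	free(fib->nexthop_ht);
	fib->nexthop_ht = NULL;
}
```

Because init and fini mirror each other, an accessor can never see the
interface published without valid data behind it: a write after
demo_fib_fini() is rejected instead of touching the freed table.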
This is reproducible by racing, in a loop, writes to
/sys/kernel/debug/netdevsim/netdevsimN/fib/nexthop_bucket_activity
against a teardown of the same netdevsim instance -- a devlink reload
("devlink dev reload netdevsim/netdevsimN"), destroying the network
namespace it lives in, or "echo N > /sys/bus/netdevsim/del_device". It
was found with syzkaller; a syzkaller reproducer is available. A
standalone C reproducer does not trigger it reliably because the race
needs the netns-teardown/reload path.
Cc: <stable+noautosel@kernel.org> # netdevsim is a test harness, it's never loaded on production systems
Signed-off-by: Zijing Yin <yzjaurora@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260529135718.1804031-1-yzjaurora@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Certain Samsung S2M series PMICs have a MUIC device which reports
various cable states by measuring the ID-GND resistance with an internal
ADC. Document the devicetree schema for this device.
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Signed-off-by: Kaustabh Chakraborty <kauschluss@disroot.org>
Link: https://patch.msgid.link/20260516-s2mu005-pmic-v7-2-73f9702fb461@disroot.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
This Kconfig symbol is no longer used. Remove it.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260519-vdso-generic_time_vsyscal-v1-3-5c2a5905d5f5@linutronix.de
|
|
Both the compilation of kernel/time/vsyscall.c, which contains the real
definition of update_vsyscall(), and the other vDSO definitions in
timekeeper_internal.h use CONFIG_GENERIC_GETTIMEOFDAY and not
CONFIG_GENERIC_TIME_VSYSCALL.
Align the code to use a single Kconfig symbol.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260519-vdso-generic_time_vsyscal-v1-2-5c2a5905d5f5@linutronix.de
|
|
The syscall definitions can be built just fine for 32-bit systems.
The guard also does not cover __arch_get_hw_counter(), which is always
used together with those system call fallbacks, and this header is
unused anyway when no vDSO is built.
Drop the ifdeffery. The logic will be simpler to understand. Furthermore,
this prepares for the complete removal of CONFIG_GENERIC_TIME_VSYSCALL.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260519-vdso-generic_time_vsyscal-v1-1-5c2a5905d5f5@linutronix.de
|
|
These pointers are only modified once in vdso_setup_data_pages(),
during the init phase. Make them read-only after that.
Drop __refdata as that would conflict with __ro_after_init.
Modpost does accept the reference from a __ro_after_init symbol to
an __init one.
Fixes: 05988dba1179 ("vdso/datastore: Allocate data pages dynamically")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260513-vdso-ro-after-init-v1-1-4b51f74015a4@linutronix.de
|
|
Add the bootph-all property to ULP watchdog nodes for i.MX7ULP, ensuring
the watchdog is available during all boot phases.
Signed-off-by: Alice Guo <alice.guo@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
|