| Age | Commit message | Author |
|
In the panfrost driver, the platform data of several Mediatek SoCs
declares and uses different power domain arrays according to the
number of GPU cores present in the SoC:
- mediatek_mt8186_pm_domains (2 cores)
- mediatek_mt8183_pm_domains (3 cores)
- mediatek_mt8192_pm_domains (5 cores)
As they all are fixed arrays, starting with the same entries and the
platform data also has a power domains array length field
(num_pm_domains), they can be replaced by a single array, containing
all entries, if the num_pm_domains field of the platform data is also
set to the matching core number.
So, create a generic power domain array (mediatek_pm_domains) and use
it in the mt8183(b), mt8186, mt8188 and mt8192 platform data instead.
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20250509-mt8370-enable-gpu-v6-3-2833888cb1d3@collabora.com
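The consolidation described above can be pictured with a small userspace sketch. Only mediatek_pm_domains and num_pm_domains are names taken from the commit message; the struct layout, the helper and the "core0".."core4" entry names are simplified assumptions, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* One shared list; the entry names are illustrative assumptions. */
static const char * const mediatek_pm_domains[] = {
	"core0", "core1", "core2", "core3", "core4",
};

/* Simplified stand-in for the driver's platform data. */
struct compatible_data {
	size_t num_pm_domains;		/* how many leading entries are used */
	const char * const *pm_domain_names;
};

static const struct compatible_data mt8186_data = {
	.num_pm_domains = 2,
	.pm_domain_names = mediatek_pm_domains,
};

static const struct compatible_data mt8192_data = {
	.num_pm_domains = 5,
	.pm_domain_names = mediatek_pm_domains,
};

/* Return the i-th domain name, or NULL if the SoC has fewer cores. */
static const char *pm_domain_name(const struct compatible_data *d, size_t i)
{
	assert(d->num_pm_domains <= ARRAY_SIZE(mediatek_pm_domains));
	return i < d->num_pm_domains ? d->pm_domain_names[i] : NULL;
}
```

Because every per-SoC list is a prefix of the longest one, the length field alone distinguishes the variants.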
|
|
In the panfrost driver, the platform data of several Mediatek SoCs
declares and uses custom supply array definitions
(mediatek_mt8192_supplies, mediatek_mt8183_b_supplies), that are the
same as default_supplies (used by default platform data).
So drop these duplicated definitions and use default_supplies instead.
Also, rename mediatek_mt8183_supplies to a more generic name too
(legacy_supplies).
Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20250509-mt8370-enable-gpu-v6-2-2833888cb1d3@collabora.com
|
|
Syzkaller can create many uhid devices that trigger
repeated warnings like:
"hid-generic xxxx: unknown main item tag 0x0"
These messages can flood the system log, especially if a crash occurs
(e.g., with a slow UART console, leading to soft lockups). To mitigate
this, convert `hid_warn()` to use `dev_warn_ratelimited()`.
This helps reduce log noise and improves system stability under fuzzing
or faulty device scenarios.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
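To illustrate the general idea of rate limiting a log message, here is a minimal userspace sketch. It is not the kernel's implementation: the struct, the function name and the interval/burst values used with it are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical limiter state: at most `burst` messages per `interval` ticks. */
struct ratelimit {
	unsigned long interval;
	unsigned int burst;
	unsigned long window_start;
	unsigned int printed;
	unsigned int missed;
};

/* Return true if a message emitted at time `now` should be printed. */
static bool ratelimit_allow(struct ratelimit *rl, unsigned long now)
{
	assert(rl->interval > 0);

	/* Start a new window once the current one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;	/* suppressed; a real limiter could report this count */
	return false;
}
```

A flood of identical warnings inside one window is then reduced to at most `burst` printed lines, which is what keeps a slow console from falling behind.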
|
|
Commit 0f9a1739dd0e ("efi: zboot specific mechanism for embedding SBAT
section") neglected to adjust the sizes of the .data section when
CONFIG_EFI_SBAT_FILE is set. As a result, the produced PE binary is
incorrect and some tools complain about it. E.g. 'sbsign' reports:
# sbsign --key my.key --cert my.crt arch/arm64/boot/vmlinuz.efi
warning: file-aligned section .data extends beyond end of file
warning: checksum areas are greater than image size. Invalid section table?
Note, '__data_size' is also used in the PE optional header and it is not
entirely clear whether .sbat needs to be accounted as part of
SizeOfInitializedData or not. As the header seems to be unused by
real-world firmware, keep the field equal to __data_size.
Fixes: 0f9a1739dd0e ("efi: zboot specific mechanism for embedding SBAT section")
Reported-by: Heinrich Schuchardt <heinrich.schuchardt@gmx.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
The `separate_colour_plane_flag` element is only present in the SPS if
`chroma_format_idc == 3`, so the corresponding flag should be disabled
whenever that is not the case and not just on profiles where
`chroma_format_idc` is not present.
Fixes: b32e48503df0 ("media: controls: Validate H264 stateless controls")
Signed-off-by: James Cowgill <james.cowgill@blaize.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
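The rule stated above can be sketched as follows. The struct and the flag macro are hypothetical stand-ins, not the V4L2 control definitions; only the condition on `chroma_format_idc` comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bit and SPS subset, for illustration only. */
#define SPS_FLAG_SEPARATE_COLOUR_PLANE (1u << 0)

struct sps {
	uint8_t chroma_format_idc;	/* 0..3 */
	uint32_t flags;
};

/* Clear the flag whenever the bitstream could not have carried it. */
static void sps_sanitize(struct sps *sps)
{
	assert(sps->chroma_format_idc <= 3);
	if (sps->chroma_format_idc != 3)
		sps->flags &= ~SPS_FLAG_SEPARATE_COLOUR_PLANE;
}
```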
|
|
The 'used' and 'new' bitmaps are local to this function, so there is no
need to use atomic access because concurrency cannot happen.
Use the non-atomic __set_bit() to save a few cycles.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
Add video_device_release() in label 'err_m2m' to release the memory
allocated by video_device_alloc() and prevent potential memory leaks.
Remove the redundant code in label 'err_m2m'.
Fixes: a8ef0488cc59 ("media: imx: add csc/scaler mem2mem device")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
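The error-path pattern can be sketched in userspace as below. All names except the 'err_m2m' label are hypothetical stand-ins (plain calloc()/free() replace the video device helpers), and the `m2m_ok` parameter only simulates a later initialization step failing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the allocated video device. */
struct vdev { int unused; };

static int live_vdevs;	/* tracks allocations so a leak is observable */

static struct vdev *vdev_alloc(void)
{
	struct vdev *v = calloc(1, sizeof(*v));

	if (v)
		live_vdevs++;
	return v;
}

static void vdev_release(struct vdev *v)
{
	free(v);
	live_vdevs--;
}

/* `m2m_ok` simulates whether the later mem2mem init step succeeds. */
static int csc_scaler_init(int m2m_ok, struct vdev **out)
{
	struct vdev *v;
	int ret;

	assert(out != NULL);
	v = vdev_alloc();
	if (!v)
		return -1;

	if (!m2m_ok) {
		ret = -2;
		goto err_m2m;
	}

	*out = v;
	return 0;

err_m2m:
	vdev_release(v);	/* without this, the allocation above leaks */
	return ret;
}
```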
|
|
On errors, the rkvdec chip self resets. This can clear the addresses
programmed in the iommu. This case is signaled by the
RKVDEC_SOFTRESET_RDY status bit.
Since the iommu framework does not have a restore functionality, and
as recommended by the iommu subsystem maintainers, this patch
restores the iommu programming by attaching and detaching an empty
domain, which will clear and restore the default domain.
Suggested-by: Detlev Casanova <detlev.casanova@collabora.com>
Tested-by: Detlev Casanova <detlev.casanova@collabora.com>
Reviewed-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
The desired clock frequency was correctly set to 400MHz in the device tree
but was lowered by the driver to 300MHz breaking 4K 60Hz content playback.
Fix the issue by removing the driver call to clk_set_rate(), which also
reduces the amount of board-specific code.
Fixes: 003afda97c65 ("media: verisilicon: Enable AV1 decoder on rk3588")
Cc: stable@vger.kernel.org
Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
If VPU supports untiled output, it actually supports several different
YUV 4:2:0 layouts, namely NV12, NV21, YUV420 and YVU420.
Add support for all of them.
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Reviewed-by: Paul Kocialkowski <paulk@sys-base.io>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
[hverkuil: add 'default' case to switch to fix warning with old compiler]
|
|
SDCA (SoundWire Device Class for Audio) uses HID to convey
input events from peripheral devices. Add a bus define for the
SoundWire bus to prepare support for this.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Acked-by: Jiri Kosina <jkosina@suse.com>
Link: https://patch.msgid.link/20250616114907.855452-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Applications may set data_offset when it refers to an output queue, so
the driver needs to account for it when computing the start address of
the input image in the plane.
Meanwhile the mxc-jpeg codec requires the address (plane address +
data_offset) to be 16-aligned.
Fixes: 2db16c6ed72c ("media: imx-jpeg: Add V4L2 driver for i.MX8 JPEG Encoder/Decoder")
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
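The address computation and the alignment constraint amount to the following sketch. The function and macro names are hypothetical; only the "plane address + data_offset" sum and the 16-byte alignment come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define JPEG_ADDR_ALIGN 16u	/* alignment stated in the commit message */

/* Start of the image data inside a plane, honouring data_offset. */
static uint64_t plane_start(uint64_t plane_addr, uint32_t data_offset)
{
	assert(plane_addr <= UINT64_MAX - data_offset);	/* no wrap-around */
	return plane_addr + data_offset;
}

/* The codec needs (plane address + data_offset) to be 16-byte aligned. */
static bool plane_start_is_valid(uint64_t plane_addr, uint32_t data_offset)
{
	return plane_start(plane_addr, data_offset) % JPEG_ADDR_ALIGN == 0;
}
```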
|
|
For the H264 and HEVC formats, the firmware can report the parsed profile
idc and level idc to the driver; this information may be useful.
Implement the H264 and HEVC profile and level control to report them.
Signed-off-by: Ming Qian <ming.qian@oss.nxp.com>
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing them to indicate failures. Convert the driver to
using them.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://patch.msgid.link/20250610-gpiochip-set-rv-ssb-v1-1-0bee5b45b411@linaro.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Don't populate the read-only array cfg_offset on the stack at run time,
instead make it static const.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20250619082554.1834654-1-colin.i.king@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Since secs_to_jiffies() (commit b35108a51cf7) has been introduced, we can
use it to avoid scaling the time to msec.
Signed-off-by: Yuesong Li <liyuesong@vivo.com>
Link: https://patch.msgid.link/20250613102624.3077418-1-liyuesong@vivo.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
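The equivalence being relied on can be checked with a simplified userspace model. The `model_*` helpers and the HZ value are assumptions for illustration; they are not the kernel's definitions and only hold when HZ divides 1000 evenly.

```c
#include <assert.h>

#define HZ 250UL		/* assumed tick rate; configuration dependent */
#define MSEC_PER_SEC 1000UL

/* Simplified model, valid only when HZ divides MSEC_PER_SEC evenly. */
static unsigned long model_msecs_to_jiffies(unsigned long ms)
{
	unsigned long ms_per_tick = MSEC_PER_SEC / HZ;

	assert(MSEC_PER_SEC % HZ == 0);
	return (ms + ms_per_tick - 1) / ms_per_tick;	/* round up */
}

/* Direct conversion: no intermediate scaling to milliseconds. */
static unsigned long model_secs_to_jiffies(unsigned long secs)
{
	return secs * HZ;
}
```

Under this model a timeout of 5 seconds yields the same tick count either way, so the call site can drop the explicit multiplication by 1000.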
|
|
Since secs_to_jiffies() (commit b35108a51cf7) has been introduced, we can
use it to avoid scaling the time to msec.
Signed-off-by: Yuesong Li <liyuesong@vivo.com>
Acked-by: Stanislaw Gruszka <stf_xl@wp.pl>
Link: https://patch.msgid.link/20250612021446.3465972-1-liyuesong@vivo.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Enable the host MLME flag to allow supported W8997 chipsets to
use WPA3. This feature requires firmware support (V2 API key), which
the driver validates before activation.
Tested using sdsd8997_combo_v4.bin from commit
211fbc287a0b ("linux-firmware: Update FW files for MRVL SD8997 chips")
[ 5.956510] mwifiex_sdio mmc2:0001:1: info: FW download over, size 623352 bytes
...
[ 6.825456] mwifiex_sdio mmc2:0001:1: WLAN FW is active
...
[ 12.171950] mwifiex_sdio mmc2:0001:1: host_mlme: enable, key_api: 2
[ 12.226206] mwifiex_sdio mmc2:0001:1: info: MWIFIEX VERSION: mwifiex 1.0 (16.68.1.p197)
root@verdin-imx8mm-14700070:~# strings /lib/firmware/mrvl/sdsd8997_combo_v4.bin |grep 16
$Id: w8997o-V4, RF878X, FP68_LINUX, 16.68.1.p197.1 $
Signed-off-by: Rafael Beims <rafael.beims@toradex.com>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Acked-by: Brian Norris <briannorris@chromium.org>
Link: https://patch.msgid.link/20250530094711.915574-1-rafael@beims.me
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Limit rate_idx to IL_LAST_OFDM_RATE for the 5GHz band, to cover the
conceivable case where the index is incorrect.
Reported-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reported-by: Alexei Safin <a.safin@rosa.ru>
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Reviewed-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20250525144524.GA172583@wp.pl
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The usage of the mockdev pointer in struct gpio_device is limited to the
GPIO sysfs code. There's no reason to keep it in this top-level
structure. Create a separate structure containing the reference to the
GPIO device and the dummy class device that will be passed to
device_create_with_groups(). The !gdev->mockdev checks can be removed as
long as we make sure that all operations on the GPIO class are protected
with the sysfs lock.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250610-gpio-sysfs-chip-export-v1-6-a8c7aa4478b1@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
No symbols from the linux/idr.h or linux/spinlock.h headers are used in
this file so remove them. We also don't technically need linux/list.h
currently but one of the follow-up commits will start using it so let's
leave it.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250610-gpio-sysfs-chip-export-v1-5-a8c7aa4478b1@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Update the code to be more consistent with the rest of the codebase.
Mostly correctly align line-breaks, remove unneeded tabs, stray newlines
& spaces and tweak the comment style.
No functional change.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250610-gpio-sysfs-chip-export-v1-4-a8c7aa4478b1@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
While not critical, it's useful to have the corresponding call to
mutex_destroy() whenever we use mutex_init(). Add the call right before
kfreeing the GPIO data.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20250610-gpio-sysfs-chip-export-v1-3-a8c7aa4478b1@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Static 'struct regmap_range_cfg' array is not modified so can be changed
to const for more safety.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250528194453.567324-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Enabling the compile test should not cause automatic enabling of all
drivers. Restrict the default to ARCH also for the individual driver, even
though its choice is not visible without selecting parent Kconfig
symbol, because otherwise selecting parent would select the child during
compile testing.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250404-kconfig-defaults-clk-v1-3-4d2df5603332@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Enabling the compile test should not cause automatic enabling of all
drivers. Restrict the default to ARCH also for the individual driver, even
though its choice is not visible without selecting parent Kconfig
symbol, because otherwise selecting parent would select the child during
compile testing.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20250404-kconfig-defaults-clk-v1-2-4d2df5603332@linaro.org
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Add kunit test suites for clk_hw_get_dev() and clk_hw_get_of_node(),
covering clocks registered with clk_hw_register() and of_clk_hw_register().
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20250417-clk-hw-get-helpers-v1-2-7743e509612a@baylibre.com
Reviewed-by: Brian Masney <bmasney@redhat.com>
[sboyd@kernel.org: Drop genparams, rename tests, drop inits,
combine suites, add test for non-DT platform device]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Rename clk_register_clk_parent_data_device_driver() to
kunit_of_platform_driver_dev() and have it return a struct device
pointer while accepting a match table. This will be useful to
find the device associated with an OF node for more tests than
only the clk_parent_data tests.
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
[sboyd@kernel.org: Split out from next patch, carry SoB and
authorship, rename API, return device pointer]
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Documentation/filesystems/sysfs.rst mentions that show() should only
use sysfs_emit() or sysfs_emit_at() when formatting the value to be
returned to user space. So replace scnprintf() with sysfs_emit().
Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Add the "Thinkpad X1 Tablet Gen 2 Keyboard" PID to the hid-lenovo driver to fix the trackpoint not working.
Signed-off-by: Akira Inoue <niyarium@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Don't populate the read-only array reconnect_event on the stack
at run time, instead make it static const.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
This improves code readability by using the standard
kernel macro for minimal value selection while maintaining identical
functionality.
Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
During appletb_kbd_probe, probe attempts to get the backlight device
by name. When this happens backlight_device_get_by_name looks for a
device in the backlight class which has name "appletb_backlight" and
upon finding a match it increments the reference count for the device
and returns it to the caller. However this reference is never released
leading to a reference leak.
Fix this by decrementing the backlight device reference count on removal
via put_device and on probe failure.
Fixes: 93a0fc489481 ("HID: hid-appletb-kbd: add support for automatic brightness control while using the touchbar")
Cc: stable@vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00@gmail.com>
Reviewed-by: Aditya Garg <gargaditya08@live.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
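The get/put pairing described above can be sketched in userspace. The types and helpers here are hypothetical stand-ins for the backlight device and the driver's probe and remove callbacks; `later_step_ok` merely simulates the rest of probe succeeding or failing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted device, standing in for the backlight device. */
struct dev { int refcount; };

static struct dev backlight = { .refcount = 1 };

/* The lookup takes a reference that the caller must drop later. */
static struct dev *dev_get_by_name(void)
{
	backlight.refcount++;
	return &backlight;
}

static void dev_put(struct dev *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

struct kbd { struct dev *bl; };

/* `later_step_ok` simulates whether the rest of probe succeeds. */
static int kbd_probe(struct kbd *k, int later_step_ok)
{
	k->bl = dev_get_by_name();
	if (!later_step_ok) {
		dev_put(k->bl);		/* drop the reference on probe failure */
		k->bl = NULL;
		return -1;
	}
	return 0;
}

static void kbd_remove(struct kbd *k)
{
	if (k->bl)
		dev_put(k->bl);		/* drop the reference on removal */
	k->bl = NULL;
}
```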
|
|
When GPIOLIB is enabled, the driver exposes the chip's pins as GPIO
pins. However, the GPIO function of these pins may not be enabled in
the flash of the chip, in which case GPIO access fails.
When CONFIG_IIO is not enabled, we can avoid this issue in the driver
simply by enabling the GPIO mode for all pins.
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Pull block fixes from Jens Axboe:
- Two fixes for aoe which fixes issues dating back to when this driver
was converted to blk-mq
- Fix for ublk, checking for valid queue depth and count values before
setting up a device
* tag 'block-6.16-20250619' of git://git.kernel.dk/linux:
ublk: santizize the arguments from userspace when adding a device
aoe: defer rexmit timer downdev work to workqueue
aoe: clean device rq_list in aoedev_downdev()
|
|
Pull drm fixes from Dave Airlie:
"Bit of an uptick in fixes for rc3, msm and amdgpu leading the way,
with i915/xe/nouveau with a few each and then some scattered misc
bits, nothing looks too crazy:
msm:
- Display:
- Fixed DP output on SDM845
- Fixed 10nm DSI PLL init
- GPU:
- SUBMIT ioctl error path leak fixes
- drm half of stall-on-fault fixes
- a7xx: Missing CP_RESET_CONTEXT_STATE
- Skip GPU component bind if GPU is not in the device table
i915:
- Fix MIPI vtotal programming off by one on Broxton
- Fix PMU code for GCOV and AutoFDO enabled build
xe:
- A workaround update
- Fix memset on iomem
- Fix early wedge on GuC Load failure
amdgpu:
- DP tunneling fix
- LTTPR fix
- DSC fix
- DML2.x ABGR16161616 fix
- RMCM fix
- Backlight fixes
- GFX11 kicker support
- SDMA reset fixes
- VCN 5.0.1 fix
- Reset fix
- Misc small fixes
amdkfd:
- SDMA reset fix
- Fix race in GWS scheduling
nouveau:
- update docs reference
- fix backlight name buffer size
- fix UAF in r535 gsp rpc msg
- fix undefined shift
mgag200:
- drop export header
ast:
- drop export header
malidp:
- drop informational error
ssd130x:
- fix clear columns
etnaviv:
- scheduler locking fix
v3d:
- null pointer crash fix"
* tag 'drm-fixes-2025-06-20' of https://gitlab.freedesktop.org/drm/kernel: (50 commits)
drm/xe: Fix early wedge on GuC load failure
drm/xe: Fix memset on iomem
drm/xe/bmg: Update Wa_16023588340
drm/amdgpu/sdma5.2: init engine reset mutex
drm/amdkfd: Fix race in GWS queue scheduling
drm/amdgpu/sdma5: init engine reset mutex
drm/amdgpu: switch job hw_fence to amdgpu_fence
drm/amdgpu: Fix SDMA UTC_L1 handling during start/stop sequences
drm/amdgpu: Release reset locks during failures
drm/amd/display: Check dce_hwseq before dereferencing it
drm/amdgpu: VCN v5_0_1 to prevent FW checking RB during DPG pause
drm/amdgpu: Use logical instance ID for SDMA v4_4_2 queue operations
drm/amdgpu: Fix SDMA engine reset with logical instance ID
drm/amdgpu: add kicker fws loading for gfx11/smu13/psp13
drm/amdgpu: Add kicker device detection
drm/amd/display: Export full brightness range to userspace
drm/amd/display: Only read ACPI backlight caps once
drm/amd/display: Fix RMCM programming seq errors
drm/amd/display: Fix mpv playback corruption on weston
drm/amd/display: Add more checks for DSC / HUBP ONO guarantees
...
|
|
After reverting the transition to the generic min heap library, bcache no
longer depends on MIN_HEAP. The select entry can be removed to reduce
code size and shrink the kernel's attack surface.
This change effectively reverts the bcache-related part of commit
92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API
functions").
This is part of a series of changes to address a performance regression
caused by the use of the generic min_heap implementation.
As reported by Robert, bcache now suffers from latency spikes, with P100
(max) latency increasing from 600 ms to 2.4 seconds every 5 minutes.
These regressions degrade bcache's effectiveness as a low-latency cache
layer and lead to frequent timeouts and application stalls in production
environments.
Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8BOmH73p1m+KQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20250614202353.1632957-4-visitorckw@gmail.com
Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap")
Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API functions")
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reported-by: Robert Pang <robertpang@google.com>
Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgCtitbD-ryuPpXQA-JG5XXw@mail.gmail.com
Acked-by: Coly Li <colyli@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This reverts commit 866898efbb25bb44fd42848318e46db9e785973a.
The generic bottom-up min_heap implementation causes performance
regression in invalidate_buckets_lru(), a hot path in bcache. Before the
cache is fully populated, new_bucket_prio() often returns zero, leading to
many equal comparisons. In such cases, bottom-up sift_down performs up to
2 * log2(n) comparisons, while the original top-down approach completes
with just O(1) comparisons, resulting in a measurable performance gap.
The performance degradation is further worsened by the non-inlined
min_heap API functions introduced in commit 92a8b224b833 ("lib/min_heap:
introduce non-inline versions of min heap API functions"), adding function
call overhead to this critical path.
As reported by Robert, bcache now suffers from latency spikes, with P100
(max) latency increasing from 600 ms to 2.4 seconds every 5 minutes.
These regressions degrade bcache's effectiveness as a low-latency cache
layer and lead to frequent timeouts and application stalls in production
environments.
This revert aims to restore bcache's original low-latency behavior.
Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8BOmH73p1m+KQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20250614202353.1632957-3-visitorckw@gmail.com
Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap")
Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API functions")
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reported-by: Robert Pang <robertpang@google.com>
Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgCtitbD-ryuPpXQA-JG5XXw@mail.gmail.com
Acked-by: Coly Li <colyli@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
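The difference in comparison counts can be reproduced with a small userspace sketch. This is neither bcache's original heap code nor lib/min_heap; it is an illustrative int-keyed min-heap, and all names (`less`, `cmp_count`, `count_comparisons` and so on) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

static size_t cmp_count;	/* number of element comparisons performed */

static int less(int a, int b)
{
	cmp_count++;
	return a < b;
}

static void swap_int(int *a, int *b)
{
	int t = *a;

	*a = *b;
	*b = t;
}

/* Top-down: stop as soon as the element is no larger than its smaller child. */
static void sift_down_top_down(int *heap, size_t n, size_t pos)
{
	for (;;) {
		size_t child = 2 * pos + 1;

		if (child >= n)
			break;
		if (child + 1 < n && less(heap[child + 1], heap[child]))
			child++;
		if (!less(heap[child], heap[pos]))
			break;
		swap_int(&heap[pos], &heap[child]);
		pos = child;
	}
}

/* Bottom-up: follow the smaller children down to a leaf, then backtrack. */
static void sift_down_bottom_up(int *heap, size_t n, size_t pos)
{
	size_t i = pos, j;
	int value = heap[pos];

	/* Descend to a leaf: one comparison per level, whatever the value. */
	while ((j = 2 * i + 1) + 1 < n)
		i = less(heap[j + 1], heap[j]) ? j + 1 : j;
	/* Special case: the last node has a single child. */
	if (2 * i + 2 == n)
		i = 2 * i + 1;
	/* Climb back up until `value` fits at position i. */
	while (i != pos && less(value, heap[i]))
		i = (i - 1) / 2;
	/* Rotate the path so `value` lands at i and its ancestors move up. */
	j = i;
	while (i != pos) {
		i = (i - 1) / 2;
		swap_int(&heap[i], &heap[j]);
	}
}

/* Count comparisons for one sift-down from the root of an all-equal heap. */
static size_t count_comparisons(void (*sift)(int *, size_t, size_t), size_t n)
{
	static int heap[1024];

	assert(n <= 1024);
	for (size_t k = 0; k < n; k++)
		heap[k] = 0;
	cmp_count = 0;
	sift(heap, n, 0);
	return cmp_count;
}
```

With all keys equal, the top-down variant in this sketch stops after comparing the root against its children, while the bottom-up variant still walks a full root-to-leaf path before backtracking, so its cost grows with the height of the heap.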
|
|
Patch series "bcache: Revert min_heap migration due to performance
regression".
This patch series reverts the migration of bcache from its original heap
implementation to the generic min_heap library. While the original change
aimed to simplify the code and improve maintainability, it introduced a
severe performance regression in real-world scenarios.
As reported by Robert, systems using bcache now suffer from periodic
latency spikes, with P100 (max) latency increasing from 600 ms to 2.4
seconds every 5 minutes. This degrades bcache's value as a low-latency
caching layer, and leads to frequent timeouts and application stalls in
production environments.
The primary cause of this regression is the behavior of the generic
min_heap implementation's bottom-up sift_down, which performs up to 2 *
log2(n) comparisons when many elements are equal. The original top-down
variant used by bcache only required O(1) comparisons in such cases. The
issue was further exacerbated by commit 92a8b224b833 ("lib/min_heap:
introduce non-inline versions of min heap API functions"), which
introduced non-inlined versions of the min_heap API, adding function call
overhead to a performance-critical hot path.
This patch (of 3):
This reverts commit 3d8a9a1c35227c3f1b0bd132c9f0a80dbda07b65.
Although removing the custom swap function simplified the code, this
change is part of a broader migration to the generic min_heap API that
introduced significant performance regressions in bcache.
As reported by Robert, bcache now suffers from latency spikes, with P100
(max) latency increasing from 600 ms to 2.4 seconds every 5 minutes.
These regressions degrade bcache's effectiveness as a low-latency cache
layer and lead to frequent timeouts and application stalls in production
environments.
This revert is part of a series of changes to restore previous performance
by undoing the min_heap transition.
Link: https://lkml.kernel.org/r/20250614202353.1632957-1-visitorckw@gmail.com
Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8BOmH73p1m+KQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20250614202353.1632957-2-visitorckw@gmail.com
Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap")
Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API functions")
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Reported-by: Robert Pang <robertpang@google.com>
Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgCtitbD-ryuPpXQA-JG5XXw@mail.gmail.com
Acked-by: Coly Li <colyli@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
dma_map_XXX() can fail and should be tested for errors with
dma_mapping_error().
Fixes: a63e78eb2b0f ("scsi: fnic: Add support for fabric based solicited requests and responses")
Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com>
Link: https://lore.kernel.org/r/20250618065715.14740-2-fourier.thomas@gmail.com
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Replace KERN_INFO with KERN_DEBUG for a log message.
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Reviewed-by: Arun Easi <aeasi@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/stable/20250612002212.4144-1-kartilak%40cisco.com
Link: https://lore.kernel.org/r/20250618003431.6314-4-kartilak@cisco.com
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add logs in FDMI and FDMI ABTS paths.
Modify log text in these paths.
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Reviewed-by: Arun Easi <aeasi@cisco.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20250618003431.6314-3-kartilak@cisco.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When the link goes down and comes up, FDMI requests are not sent out
anymore.
Fix bug by turning off FNIC_FDMI_ACTIVE when the link goes down.
Fixes: 09c1e6ab4ab2 ("scsi: fnic: Add and integrate support for FDMI")
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Reviewed-by: Arun Easi <aeasi@cisco.com>
Tested-by: Karan Tilak Kumar <kartilak@cisco.com>
Cc: stable@vger.kernel.org
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20250618003431.6314-2-kartilak@cisco.com
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
When both the RHBA and RPA FDMI requests time out, fnic reuses a frame to
send ABTS for each of them. On send completion, this causes an attempt to
free the same frame twice that leads to a crash.
Fix crash by allocating separate frames for RHBA and RPA, and modify ABTS
logic accordingly.
Tested by checking MDS for FDMI information.
Tested by using instrumented driver to:
- Drop PLOGI response
- Drop RHBA response
- Drop RPA response
- Drop RHBA and RPA response
- Drop PLOGI response + ABTS response
- Drop RHBA response + ABTS response
- Drop RPA response + ABTS response
- Drop RHBA and RPA response + ABTS response for both of them
Fixes: 09c1e6ab4ab2 ("scsi: fnic: Add and integrate support for FDMI")
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Reviewed-by: Gian Carlo Boffa <gcboffa@cisco.com>
Tested-by: Arun Easi <aeasi@cisco.com>
Co-developed-by: Arun Easi <aeasi@cisco.com>
Signed-off-by: Arun Easi <aeasi@cisco.com>
Tested-by: Karan Tilak Kumar <kartilak@cisco.com>
Cc: stable@vger.kernel.org
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20250618003431.6314-1-kartilak@cisco.com
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
With the ATA error model, an NCQ command failure always triggers an abort
(termination) of all NCQ commands queued on the device. In such case, the
SAT or the host must handle the failed command according to the command
sense data and immediately retry all other NCQ commands that were aborted
due to the failed NCQ command.
For SAS HBAs controlled by the mpt3sas driver, NCQ command aborts are not
handled by the HBA SAT and sent back to the host, with an ioc log
information equal to 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code
PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function
_scsih_io_done() always forces a retry of commands terminated with the
status MPI2_IOCSTATUS_SCSI_IOC_TERMINATED using the SCSI result
DID_SOFT_ERROR, regardless of the log_info for the command. This
correctly forces the retry of collateral NCQ abort commands, but with the
retry counter for the command being incremented. If a command to an ATA
device is subject to too many retries due to other NCQ commands failing
(e.g. read commands trying to access unreadable sectors), the collateral
NCQ abort commands may be terminated with an error as they run out of
retries. This violates the SAT specification and causes hard-to-debug
command errors.
Solve this issue by modifying the handling of the
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an
ATA device and if the command loginfo indicates an NCQ collateral
abort. If that is the case, force the command retry using the SCSI result
DID_IMM_RETRY to avoid incrementing the command retry count.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250606052747.742998-3-dlemoal@kernel.org
Tested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
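The decision described above, and why it matters for the retry budget, can be sketched in userspace. The enum, struct and function names are hypothetical and do not mirror the driver or SCSI midlayer code; only the 0x31080000 log info value is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Log info value quoted in the commit message. */
#define LOGINFO_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR 0x31080000u

/* Hypothetical result kinds mirroring the two SCSI results discussed. */
enum retry_kind {
	RETRY_SOFT_ERROR,	/* counts against the command's retry budget */
	RETRY_IMMEDIATE,	/* retried without consuming a retry */
};

/* Decide how to requeue a command that the IOC reported as terminated. */
static enum retry_kind ioc_terminated_result(bool is_ata_device,
					     uint32_t log_info)
{
	if (is_ata_device &&
	    log_info == LOGINFO_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR)
		return RETRY_IMMEDIATE;	/* collateral NCQ abort */
	return RETRY_SOFT_ERROR;
}

/* Hypothetical per-command retry accounting. */
struct cmd {
	int retries;
	int allowed;
};

/* Return false once the command has run out of retries. */
static bool requeue(struct cmd *c, enum retry_kind kind)
{
	assert(c->allowed > 0);
	if (kind == RETRY_SOFT_ERROR && ++c->retries > c->allowed)
		return false;
	return true;
}
```

In this model a command hit by any number of collateral aborts keeps its full retry budget, whereas treating each abort as a soft error eventually fails it.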
|
|
With the ATA error model, an NCQ command failure always triggers an abort
(termination) of all NCQ commands queued on the device. In such case, the
SAT or the host must handle the failed command according to the command
sense data and immediately retry all other NCQ commands that were aborted
due to the failed NCQ command.
For SAS HBAs controlled by the mpi3mr driver, NCQ command aborts are not
handled by the HBA SAT and sent back to the host, with an ioc log
information equal to 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code
PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function
mpi3mr_process_op_reply_desc() always forces a retry of commands
terminated with the status MPI3_IOCSTATUS_SCSI_IOC_TERMINATED using the
SCSI result DID_SOFT_ERROR, regardless of the ioc_loginfo for the
command. This correctly forces the retry of collateral NCQ abort
commands, but with the retry counter for the command being incremented.
If a command to an ATA device is subject to too many retries due to other
NCQ commands failing (e.g. read commands trying to access unreadable
sectors), the collateral NCQ abort commands may be terminated with an
error as they run out of retries. This violates the SAT specification and
causes hard-to-debug command errors.
Solve this issue by modifying the handling of the
MPI3_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an
ATA device and if the command ioc_loginfo indicates an NCQ collateral
abort. If that is the case, force the command retry using the SCSI result
DID_IMM_RETRY to avoid incrementing the command retry count.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250606052747.742998-2-dlemoal@kernel.org
Tested-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
This change frees resources after an error is detected.
Signed-off-by: Francisco Gutierrez <frankramirez@google.com>
Link: https://lore.kernel.org/r/20250617210443.989058-1-frankramirez@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Previously, the response buffer (ucd_rsp_ptr) was cleared in multiple
UPIU preparation functions. Do it once.
Signed-off-by: Avri Altman <avri.altman@sandisk.com>
Link: https://lore.kernel.org/r/20250617095611.89229-2-avri.altman@sandisk.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In ufshcd_host_reset_and_restore(), scale up clocks only when clock
scaling is supported. Without this change CPU latency is voted for 0
(ufshcd_pm_qos_update) during resume unconditionally.
Signed-off-by: anvithdosapati <anvithdosapati@google.com>
Link: https://lore.kernel.org/r/20250616085734.2133581-1-anvithdosapati@google.com
Fixes: a3cd5ec55f6c ("scsi: ufs: add load based scaling of UFS gear")
Cc: stable@vger.kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
On a system with DRAM interleave enabled, out-of-bound access is
detected:
megaraid_sas 0000:3f:00.0: requested/available msix 128/128 poll_queue 0
------------[ cut here ]------------
UBSAN: array-index-out-of-bounds in ./arch/x86/include/asm/topology.h:72:28
index -1 is out of range for type 'cpumask *[1024]'
dump_stack_lvl+0x5d/0x80
ubsan_epilogue+0x5/0x2b
__ubsan_handle_out_of_bounds.cold+0x46/0x4b
megasas_alloc_irq_vectors+0x149/0x190 [megaraid_sas]
megasas_probe_one.cold+0xa4d/0x189c [megaraid_sas]
local_pci_probe+0x42/0x90
pci_device_probe+0xdc/0x290
really_probe+0xdb/0x340
__driver_probe_device+0x78/0x110
driver_probe_device+0x1f/0xa0
__driver_attach+0xba/0x1c0
bus_for_each_dev+0x8b/0xe0
bus_add_driver+0x142/0x220
driver_register+0x72/0xd0
megasas_init+0xdf/0xff0 [megaraid_sas]
do_one_initcall+0x57/0x310
do_init_module+0x90/0x250
init_module_from_file+0x85/0xc0
idempotent_init_module+0x114/0x310
__x64_sys_finit_module+0x65/0xc0
do_syscall_64+0x82/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fix it accordingly.
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20250604042556.3731059-1-yu.c.chen@intel.com
Fixes: 8049da6f3943 ("scsi: megaraid_sas: Use irq_set_affinity_and_hint()")
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|