| Age | Commit message (Collapse) | Author |
|
Document the iio_push_to_buffers_with_ts() function.
This is copied and slightly cleaned up from
iio_push_to_buffers_with_timestamp().
Signed-off-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|
|
Updating drm-misc-next to the state of v6.18-rc1.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
Commit 6d4405b16d37 ("ata: libata-core: Cache the general purpose log
directory") introduced caching of a device general purpose log directory
to avoid repeated access to this log page during device scan. This
change also added a check on this log page to verify that the log page
version is 0x0001 as mandated by the ACS specifications.
And it turns out that some devices do not bother reporting this version,
instead reporting a version 0, resulting in error messages such as:
ata6.00: Invalid log directory version 0x0000
and to the device being marked as not supporting the general purpose log
directory log page.
Since before commit 6d4405b16d37 the log page version check did not
exist and things were still working correctly for these devices, relax
ata_read_log_directory() version check and only warn about the invalid
log page version number without disabling access to the log directory
page.
Fixes: 6d4405b16d37 ("ata: libata-core: Cache the general purpose log directory")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220635
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Alt Mode drivers are responsible for sending Enter Mode through the TCPM,
but only a DFP is allowed to send Enter Mode. typec_get_data_role gets
the port's data role, which can then be used in altmode drivers via
typec_altmode_get_data_role to know if Enter Mode should be sent.
Signed-off-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250923181606.1583584-5-rdbabiera@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add device tree clock binding definitions for CMU_MFC
Signed-off-by: Raghav Sharma <raghav.s@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Add device tree clock binding definitions for CMU_M2M
Signed-off-by: Raghav Sharma <raghav.s@samsung.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
|
|
Cross-merge BPF and other fixes after downstream PR.
No conflicts.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This reverts commit 1a2b423be6a89dd07d5fc27ea042be68697a6a49 because we
got a regression report and need time to find out the details.
Reported-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Closes: https://lore.kernel.org/r/29ec0082-4dd4-4120-acd2-44b35b4b9487@oss.qualcomm.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
Pull SCSI fixes from James Bottomley:
"Fixes only in drivers (ufs, mvsas, qla2xxx, target) that came in just
before or during the merge window.
The most important one is the qla2xxx which reverts a conversion to
fix flexible array member warnings, that went up in this merge window
but which turned out on further testing to be causing data corruption"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS
scsi: ufs: sysfs: Make HID attributes visible
scsi: mvsas: Fix use-after-free bugs in mvs_work_queue
scsi: ufs: core: Fix PM QoS mutex initialization
scsi: ufs: core: Fix runtime suspend error deadlock
Revert "scsi: qla2xxx: Fix memcpy() field-spanning write issue"
scsi: target: target_core_configfs: Add length check to avoid buffer overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Borislav Petkov:
- Remove a bunch of asm implementing condition flags testing in KVM's
emulator in favor of int3_emulate_jcc() which is written in C
- Replace KVM fastops with C-based stubs which avoids problems with the
fastop infra related to latter not adhering to the C ABI due to their
special calling convention and, more importantly, bypassing compiler
control-flow integrity checking because they're written in asm
- Remove wrongly used static branches and other ugliness accumulated
over time in hyperv's hypercall implementation with a proper static
function call to the correct hypervisor call variant
- Add some fixes and modifications to allow running FRED-enabled
kernels in KVM even on non-FRED hardware
- Add kCFI improvements like validating indirect calls and prepare for
enabling kCFI with GCC. Add cmdline params documentation and other
code cleanups
- Use the single-byte 0xd6 insn as the official #UD single-byte
undefined opcode instruction as agreed upon by both x86 vendors
- Other smaller cleanups and touchups all over the place
* tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86,retpoline: Optimize patch_retpoline()
x86,ibt: Use UDB instead of 0xEA
x86/cfi: Remove __noinitretpoline and __noretpoline
x86/cfi: Add "debug" option to "cfi=" bootparam
x86/cfi: Standardize on common "CFI:" prefix for CFI reports
x86/cfi: Document the "cfi=" bootparam options
x86/traps: Clarify KCFI instruction layout
compiler_types.h: Move __nocfi out of compiler-specific header
objtool: Validate kCFI calls
x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y
x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
x86/fred: Install system vector handlers even if FRED isn't fully enabled
x86/hyperv: Use direct call to hypercall-page
x86/hyperv: Clean up hv_do_hypercall()
KVM: x86: Remove fastops
KVM: x86: Convert em_salc() to C
KVM: x86: Introduce EM_ASM_3WCL
KVM: x86: Introduce EM_ASM_1SRC2
KVM: x86: Introduce EM_ASM_2CL
KVM: x86: Introduce EM_ASM_2W
...
|
|
Pull bpf fixes from Alexei Starovoitov:
- Finish constification of 1st parameter of bpf_d_path() (Rong Tao)
- Harden userspace-supplied xdp_desc validation (Alexander Lobakin)
- Fix metadata_dst leak in __bpf_redirect_neigh_v{4,6}() (Daniel
Borkmann)
- Fix undefined behavior in {get,put}_unaligned_be32() (Eric Biggers)
- Use correct context to unpin bpf hash map with special types (KaFai
Wan)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add test for unpinning htab with internal timer struct
bpf: Avoid RCU context warning when unpinning htab with internal structs
xsk: Harden userspace-supplied xdp_desc validation
bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6}
libbpf: Fix undefined behavior in {get,put}_unaligned_be32()
bpf: Finish constification of 1st parameter of bpf_d_path()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more updates from Andrew Morton:
"Just one series here - Mike Rappoport has taught KEXEC handover to
preserve vmalloc allocations across handover"
* tag 'mm-nonmm-stable-2025-10-10-15-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
lib/test_kho: use kho_preserve_vmalloc instead of storing addresses in fdt
kho: add support for preserving vmalloc allocations
kho: replace kho_preserve_phys() with kho_preserve_pages()
kho: check if kho is finalized in __kho_preserve_order()
MAINTAINERS, .mailmap: update Umang's email address
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"7 hotfixes. All 7 are cc:stable and all 7 are for MM.
All singletons, please see the changelogs for details"
* tag 'mm-hotfixes-stable-2025-10-10-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: hugetlb: avoid soft lockup when mprotect to large memory area
fsnotify: pass correct offset to fsnotify_mmap_perm()
mm/ksm: fix flag-dropping behavior in ksm_madvise
mm/damon/vaddr: do not repeat pte_offset_map_lock() until success
mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
mm/thp: fix MTE tag mismatch when replacing zero-filled subpages
memcg: skip cgroup_file_notify if spinning is not allowed
|
|
Allow mmap() on guest_memfd instances for x86 VMs with private memory as
the need to track private vs. shared state in the guest_memfd instance is
only pertinent to INIT_SHARED. Doing mmap() on private memory isn't
terrible useful (yet!), but it's now possible, and will be desirable when
guest_memfd gains support for other VMA-based syscalls, e.g. mbind() to
set NUMA policy.
Lift the restriction now, before MMAP support is officially released, so
that KVM doesn't need to add another capability to enumerate support for
mmap() on private memory.
Fixes: 3d3a04fad25a ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20251003232606.4070510-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a guest_memfd flag to allow userspace to state that the underlying
memory should be configured to be initialized as shared, and reject user
page faults if the guest_memfd instance's memory isn't shared. Because
KVM doesn't yet support in-place private<=>shared conversions, all
guest_memfd memory effectively follows the initial state.
Alternatively, KVM could deduce the initial state based on MMAP, which for
all intents and purposes is what KVM currently does. However, implicitly
deriving the default state based on MMAP will result in a messy ABI when
support for in-place conversions is added.
For x86 CoCo VMs, which don't yet support MMAP, memory is currently private
by default (otherwise the memory would be unusable). If MMAP implies
memory is shared by default, then the default state for CoCo VMs will vary
based on MMAP, and from userspace's perspective, will change when in-place
conversion support is added. I.e. to maintain guest<=>host ABI, userspace
would need to immediately convert all memory from shared=>private, which
is both ugly and inefficient. The inefficiency could be avoided by adding
a flag to state that memory is _private_ by default, irrespective of MMAP,
but that would lead to an equally messy and hard to document ABI.
Bite the bullet and immediately add a flag to control the default state so
that the effective behavior is explicit and straightforward.
Fixes: 3d3a04fad25a ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20251003232606.4070510-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rework the not-yet-released KVM_CAP_GUEST_MEMFD_MMAP into a more generic
KVM_CAP_GUEST_MEMFD_FLAGS capability so that adding new flags doesn't
require a new capability, and so that developers aren't tempted to bundle
multiple flags into a single capability.
Note, kvm_vm_ioctl_check_extension_generic() can only return a 32-bit
value, but that limitation can be easily circumvented by adding e.g.
KVM_CAP_GUEST_MEMFD_FLAGS2 in the unlikely event guest_memfd supports more
than 32 flags.
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Tested-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20251003232606.4070510-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Pull more drm fixes from Dave Airlie:
"Just the follow up fixes for rc1 from the next branch, amdgpu and xe
mostly with a single v3d fix in there.
amdgpu:
- DC DCE6 fixes
- GPU reset fixes
- Secure diplay messaging cleanup
- MES fix
- GPUVM locking fixes
- PMFW messaging cleanup
- PCI US/DS switch handling fix
- VCN queue reset fix
- DC FPU handling fix
- DCN 3.5 fix
- DC mirroring fix
amdkfd:
- Fix kfd process ref leak
- mmap write lock handling fix
- Fix comments in IOCTL
xe:
- Fix build with clang 16
- Fix handling of invalid configfs syntax usage and spell out the
expected syntax in the documentation
- Do not try late bind firmware when running as VF since it shouldn't
handle firmware loading
- Fix idle assertion for local BOs
- Fix uninitialized variable for late binding
- Do not require perfmon_capable to expose free memory at page
granularity. Handle it like other drm drivers do
- Fix lock handling on suspend error path
- Fix I2C controller resume after S3
v3d:
- fix fence locking"
* tag 'drm-next-2025-10-11-1' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
drm/amd/display: Incorrect Mirror Cositing
drm/amd/display: Enable Dynamic DTBCLK Switch
drm/amdgpu: Report individual reset error
drm/amdgpu: partially revert "revert to old status lock handling v3"
drm/amd/display: Fix unsafe uses of kernel mode FPU
drm/amd/pm: Disable VCN queue reset on SMU v13.0.6 due to regression
drm/amdgpu: Fix general protection fault in amdgpu_vm_bo_reset_state_machine
drm/amdgpu: Check swus/ds for switch state save
drm/amdkfd: Fix two comments in kfd_ioctl.h
drm/amd/pm: Avoid interface mismatch messaging
drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init
drm/amd/amdgpu: Fix the mes version that support inv_tlbs
drm/amd: Check whether secure display TA loaded successfully
drm/amdkfd: Fix mmap write lock not release
drm/amdkfd: Fix kfd process ref leaking when userptr unmapping
drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.
drm/amd/display: Disable scaling on DCE6 for now
drm/amd/display: Properly disable scaling on DCE6
drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6
drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Allow child nodes on renesas-bsc bus binding
- Drop node name pattern on allwinner,sun50i-a64-de2 bus binding
- Switch DT patchwork to kernel.org from ozlabs.org
- Fix some typos in docs and bindings
- Fix reference count in PCI node unittest
* tag 'devicetree-fixes-for-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: bus: renesas-bsc: allow additional properties
dt-bindings: bus: allwinner,sun50i-a64-de2: don't check node names
MAINTAINERS: Move DT patchwork to kernel.org
of: unittest: Fix device reference count leak in of_unittest_pci_node_verify
of: doc: Fix typo in doc comments.
dt-bindings: mmc: Correct typo "upto" to "up to"
|
|
Pull ceph updates from Ilya Dryomov:
- some messenger improvements (Eric and Max)
- address an issue (also affected userspace) of incorrect permissions
being granted to users who have access to multiple different CephFS
instances within the same cluster (Kotresh)
- a bunch of assorted CephFS fixes (Slava)
* tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client:
ceph: add bug tracking system info to MAINTAINERS
ceph: fix multifs mds auth caps issue
ceph: cleanup in ceph_alloc_readdir_reply_buffer()
ceph: fix potential NULL dereference issue in ceph_fill_trace()
libceph: add empty check to ceph_con_get_out_msg()
libceph: pass the message pointer instead of loading con->out_msg
libceph: make ceph_con_get_out_msg() return the message pointer
ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag
ceph: refactor wake_up_bit() pattern of calling
ceph: fix potential race condition in ceph_ioctl_lazyio()
ceph: fix overflowed constant issue in ceph_do_objects_copy()
ceph: fix wrong sizeof argument issue in register_session()
ceph: add checking of wait_for_completion_killable() return value
ceph: make ceph_start_io_*() killable
libceph: Use HMAC-SHA256 library instead of crypto_shash
|
|
The arraymap and hashtab duplicate the logic that checks for and frees
internal structs (timer, workqueue, task_work) based on
BTF record flags. Centralize this by introducing two helpers:
* bpf_map_has_internal_structs(map)
Returns true if the map value contains any of internal structs:
BPF_TIMER | BPF_WORKQUEUE | BPF_TASK_WORK.
* bpf_map_free_internal_structs(map, obj)
Frees the internal structs for a single value object.
Convert arraymap and both the prealloc/malloc hashtab paths to use the
new generic functions. This keeps the functionality for when/how to free
these special fields in one place and makes it easier to add support for
new internal structs in the future without touching every map
implementation.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20251010164606.147298-3-mykyta.yatsenko5@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Don't include __GFP_NOWARN for loop worker allocation, as it already
uses GFP_NOWAIT which has __GFP_NOWARN set already
- Small series cleaning up the recent bio_iov_iter_get_pages() changes
- loop fix for leaking the backing reference file, if validation fails
- Update of a comment pertaining to disk/partition stat locking
* tag 'block-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
loop: remove redundant __GFP_NOWARN flag
block: move bio_iov_iter_get_bdev_pages to block/fops.c
iomap: open code bio_iov_iter_get_bdev_pages
block: rename bio_iov_iter_get_pages_aligned to bio_iov_iter_get_pages
block: remove bio_iov_iter_get_pages
block: Update a comment of disk statistics
loop: fix backing file reference leak on validation error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Fixup indentation in the UAPI header
- Two fixes for zcrx. One fixes receiving too much in some cases, and
the other deals with not correctly incrementing the source in the
fallback copy loop
- Fix for a race in the IORING_OP_WAITID command, where there was a
small window where the request would be left on the wait_queue_head
list even though it was being canceled/completed
- Update liburing git URL in the kernel tree
* tag 'io_uring-6.18-20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/zcrx: increment fallback loop src offset
io_uring/zcrx: fix overshooting recv limit
io_uring: use tab indentation for IORING_SEND_VECTORIZED comment
io_uring/waitid: always prune wait queue entry in io_waitid_wait()
io_uring: update liburing git URL
|
|
Rename the storage_get_func_atomic flag to a more generic non_sleepable
flag that tracks whether a helper or kfunc may be called from a
non-sleepable context. This makes the flag more broadly applicable
beyond just storage_get helpers. See [0] for more context.
The flag is now set unconditionally for all helpers and kfuncs when:
- RCU critical section is active.
- Preemption is disabled.
- IRQs are disabled.
- In a non-sleepable context within a sleepable program (e.g., timer
callbacks), which is indicated by !in_sleepable().
Previously, the flag was only set for storage_get helpers in these
contexts. With this change, it can be used by any code that needs to
differentiate between sleepable and non-sleepable contexts at the
per-instruction level.
The existing usage in do_misc_fixups() for storage_get helpers is
preserved by checking is_storage_get_function() before using the flag.
[0]: https://lore.kernel.org/bpf/CAP01T76cbaNi4p-y8E0sjE2NXSra2S=Uja8G4hSQDu_SbXxREQ@mail.gmail.com
Cc: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Mykyta Yatsenko <mykyta.yatsenko5@gmail.com>
Link: https://lore.kernel.org/r/20251007220349.3852807-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull more i2c updates from Wolfram Sang:
- Second part of rtl9300 updates since dependencies are in now:
- general cleanups
- implement block read/write support
- add RTL9310 support
- DT schema conversion of hix5hd2 binding
- namespace cleanup for i2c-algo-pca
- minor simplification for mt65xx
* tag 'i2c-for-6.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
dt-bindings: i2c: hisilicon,hix5hd2: convert to DT schema
i2c: mt65xx: convert set_speed function to void
i2c: rename wait_for_completion callback to wait_for_completion_cb
i2c: rtl9300: add support for RTL9310 I2C controller
dt-bindings: i2c: realtek,rtl9301-i2c: extend for RTL9310 support
i2c: rtl9300: use scoped guard instead of explicit lock/unlock
i2c: rtl9300: separate xfer configuration and execution
i2c: rtl9300: do not set read mode on every transfer
i2c: rtl9300: move setting SCL frequency to config_io
i2c: rtl9300: rename internal sda_pin to sda_num
dt-bindings: i2c: realtek,rtl9301-i2c: fix wording and typos
i2c: rtl9300: use regmap fields and API for registers
i2c: rtl9300: Implement I2C block read and write
|
|
The current shenanigans for duration calculation introduce too much
complexity for a trivial problem, and further the code is hard to patch and
maintain.
Address these issues with a flat look-up table, which is easy to understand
and patch. If leaf driver specific patching is required in future, it is
easy enough to make a copy of this table during driver initialization and
add the chip parameter back.
'chip->duration' is retained for TPM 1.x.
As the first entry for this new behavior address TCG spec update mentioned
in this issue:
https://github.com/raspberrypi/linux/issues/7054
Therefore, for TPM_SelfTest the duration is set to 3000 ms.
This does not categorize a as bug, given that this is introduced to the
spec after the feature was originally made.
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.18-2025-10-09:
amdgpu:
- DC DCE6 fixes
- GPU reset fixes
- Secure diplay messaging cleanup
- MES fix
- GPUVM locking fixes
- PMFW messaging cleanup
- PCI US/DS switch handling fix
- VCN queue reset fix
- DC FPU handling fix
- DCN 3.5 fix
- DC mirroring fix
amdkfd:
- Fix kfd process ref leak
- mmap write lock handling fix
- Fix comments in IOCTL
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20251009162915.981503-1-alexander.deucher@amd.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- mlx5: fix pre-2.40 binutils assembler error
Current release - new code bugs:
- net: psp: don't assume reply skbs will have a socket
- eth: fbnic: fix missing programming of the default descriptor
Previous releases - regressions:
- page_pool: fix PP_MAGIC_MASK to avoid crashing on some 32-bit arches
- tcp:
- take care of zero tp->window_clamp in tcp_set_rcvlowat()
- don't call reqsk_fastopen_remove() in tcp_conn_request()
- eth:
- ice: release xa entry on adapter allocation failure
- usb: asix: hold PM usage ref to avoid PM/MDIO + RTNL deadlock
Previous releases - always broken:
- netfilter: validate objref and objrefmap expressions
- sctp: fix a null dereference in sctp_disposition sctp_sf_do_5_1D_ce()
- eth:
- mlx4: prevent potential use after free in mlx4_en_do_uc_filter()
- mlx5: prevent tunnel mode conflicts between FDB and NIC IPsec tables
- ocelot: fix use-after-free caused by cyclic delayed work
Misc:
- add support for MediaTek PCIe 5G HP DRMR-H01"
* tag 'net-6.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
net: airoha: Fix loopback mode configuration for GDM2 port
selftests: drv-net: pp_alloc_fail: add necessary optoins to config
selftests: drv-net: pp_alloc_fail: lower traffic expectations
selftests: drv-net: fix linter warnings in pp_alloc_fail
eth: fbnic: fix reporting of alloc_failed qstats
selftests: drv-net: xdp: add test for interface level qstats
selftests: drv-net: xdp: rename netnl to ethnl
eth: fbnic: fix saving stats from XDP_TX rings on close
eth: fbnic: fix accounting of XDP packets
eth: fbnic: fix missing programming of the default descriptor
selftests: netfilter: query conntrack state to check for port clash resolution
selftests: netfilter: nft_fib.sh: fix spurious test failures
bridge: br_vlan_fill_forward_path_pvid: use br_vlan_group_rcu()
netfilter: nft_objref: validate objref and objrefmap expressions
net: pse-pd: tps23881: Fix current measurement scaling
net/mlx5: fix pre-2.40 binutils assembler error
net/mlx5e: Do not fail PSP init on missing caps
net/mlx5e: Prevent tunnel reformat when tunnel mode not allowed
net/mlx5: Prevent tunnel mode conflicts between FDB and NIC IPsec tables
net: usb: asix: hold PM usage ref to avoid PM/MDIO + RTNL deadlock
...
|
|
This pointer is in a register anyway, so let's use that instead of
reloading from memory everywhere.
[ idryomov: formatting ]
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The caller in messenger_v1.c loads it anyway, so let's keep the
pointer in the register instead of reloading it from memory. This
eliminates a tiny bit of unnecessary overhead.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Use the HMAC-SHA256 library functions instead of crypto_shash. This is
simpler and faster.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Pull more VFIO updates from Alex Williamson:
- Optimizations for DMA map and unmap opertions through the type1 vfio
IOMMU backend.
This uses various means of batching and hints from the mm structures
to improve efficiency and therefore performance, resulting in a
significant speedup for huge page use cases (Li Zhe)
- Expose supported device migration features through debugfs (Cédric Le
Goater)
* tag 'vfio-v6.18-rc1-pt2' of https://github.com/awilliam/linux-vfio:
vfio: Dump migration features under debugfs
vfio/type1: optimize vfio_unpin_pages_remote()
vfio/type1: introduce a new member has_rsvd for struct vfio_dma
vfio/type1: batch vfio_find_vpfn() in function vfio_unpin_pages_remote()
vfio/type1: optimize vfio_pin_pages_remote()
mm: introduce num_pages_contiguous()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- Conversions to yaml/json schema and fixes for input-related device
tree bindings
- New drivers:
- Awinic AW86927 haptic chip
- Hynitron CST816x series controller
- Himax HX852x(ES) touchscreen controller
- Fix uinput to not leak kernel memory via a gap in
uinput_ff_upload_compat structure
- Prevent overflow in pressure calculation in tsc2007 driver causing
phantom touches
- Make the Atmel maxTouch driver support generic touchscreen
configuration (flip, rotate, etc)
- Drop support for platform data in tca8418_keypad, pxa27x-keypad,
spear-keyboard and twl4030_keypad drivers, they all now rely on
generic device properties for configuration
- Other assorted changes and fixes
* tag 'input-for-v6.18-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (50 commits)
Input: atmel_mxt_ts - allow reset GPIO to sleep
Input: aw86927 - fix error code in probe()
Input: psxpad-spi - add a check for the return value of spi_setup()
Input: uinput - zero-initialize uinput_ff_upload_compat to avoid info leak
Input: aw86927 - add driver for Awinic AW86927
dt-bindings: input: Add Awinic AW86927
dt-bindings: touchscreen: remove touchscreen.txt
dt-bindings: arm: bcm: raspberrypi,bcm2835-firmware: Add touchscreen child node
dt-bindings: touchscreen: convert eeti bindings to json schema
Input: pm8941-pwrkey - disable wakeup for resin by default
dt-bindings: input: pm8941-pwrkey: Document wakeup-source property
Input: add driver for Hynitron CST816x series
dt-bindings: input: touchscreen: add hynitron cst816x series
Input: imx6ul_tsc - set glitch threshold by DTS property
dt-bindings: touchscreen: fsl,imx6ul-tsc: support glitch thresold
dt-bindings: touchscreen: add debounce-delay-us property
Input: ps2-gpio - fix typo
Input: atmel_mxt_ts - add support for generic touchscreen configurations
dt-bindings: input: maxtouch: add common touchscreen properties
dt-bindings: touchscreen: convert zet6223 bindings to json schema
...
|
|
syzkaller discovered the following crash: (kernel BUG)
[ 44.607039] ------------[ cut here ]------------
[ 44.607422] kernel BUG at mm/userfaultfd.c:2067!
[ 44.608148] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN NOPTI
[ 44.608814] CPU: 1 UID: 0 PID: 2475 Comm: reproducer Not tainted 6.16.0-rc6 #1 PREEMPT(none)
[ 44.609635] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 44.610695] RIP: 0010:userfaultfd_release_all+0x3a8/0x460
<snip other registers, drop unreliable trace>
[ 44.617726] Call Trace:
[ 44.617926] <TASK>
[ 44.619284] userfaultfd_release+0xef/0x1b0
[ 44.620976] __fput+0x3f9/0xb60
[ 44.621240] fput_close_sync+0x110/0x210
[ 44.622222] __x64_sys_close+0x8f/0x120
[ 44.622530] do_syscall_64+0x5b/0x2f0
[ 44.622840] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 44.623244] RIP: 0033:0x7f365bb3f227
Kernel panics because it detects UFFD inconsistency during
userfaultfd_release_all(). Specifically, a VMA which has a valid pointer
to vma->vm_userfaultfd_ctx, but no UFFD flags in vma->vm_flags.
The inconsistency is caused in ksm_madvise(): when user calls madvise()
with MADV_UNMEARGEABLE on a VMA that is registered for UFFD in MINOR mode,
it accidentally clears all flags stored in the upper 32 bits of
vma->vm_flags.
Assuming x86_64 kernel build, unsigned long is 64-bit and unsigned int and
int are 32-bit wide. This setup causes the following mishap during the &=
~VM_MERGEABLE assignment.
VM_MERGEABLE is a 32-bit constant of type unsigned int, 0x8000'0000.
After ~ is applied, it becomes 0x7fff'ffff unsigned int, which is then
promoted to unsigned long before the & operation. This promotion fills
upper 32 bits with leading 0s, as we're doing unsigned conversion (and
even for a signed conversion, this wouldn't help as the leading bit is 0).
& operation thus ends up AND-ing vm_flags with 0x0000'0000'7fff'ffff
instead of intended 0xffff'ffff'7fff'ffff and hence accidentally clears
the upper 32-bits of its value.
Fix it by changing `VM_MERGEABLE` constant to unsigned long, using the
BIT() macro.
Note: other VM_* flags are not affected: This only happens to the
VM_MERGEABLE flag, as the other VM_* flags are all constants of type int
and after ~ operation, they end up with leading 1 and are thus converted
to unsigned long with leading 1s.
Note 2:
After commit 31defc3b01d9 ("userfaultfd: remove (VM_)BUG_ON()s"), this is
no longer a kernel BUG, but a WARNING at the same place:
[ 45.595973] WARNING: CPU: 1 PID: 2474 at mm/userfaultfd.c:2067
but the root-cause (flag-drop) remains the same.
[akpm@linux-foundation.org: rust bindgen wasn't able to handle BIT(), from Miguel]
Link: https://lore.kernel.org/oe-kbuild-all/202510030449.VfSaAjvd-lkp@intel.com/
Link: https://lkml.kernel.org/r/20251001090353.57523-2-acsjakub@amazon.de
Fixes: 7677f7fd8be7 ("userfaultfd: add minor fault registration mode")
Signed-off-by: Jakub Acs <acsjakub@amazon.de>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: SeongJae Park <sj@kernel.org>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Xu Xin <xu.xin16@zte.com.cn>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Generally memcg charging is allowed from all the contexts including NMI
where even spinning on spinlock can cause locking issues. However one
call chain was missed during the addition of memcg charging from any
context support. That is try_charge_memcg() -> memcg_memory_event() ->
cgroup_file_notify().
The possible function call tree under cgroup_file_notify() can acquire
many different spin locks in spinning mode. Some of them are
cgroup_file_kn_lock, kernfs_notify_lock, pool_workqeue's lock. So, let's
just skip cgroup_file_notify() from memcg charging if the context does not
allow spinning.
Alternative approach was also explored where instead of skipping
cgroup_file_notify(), we defer the memcg event processing to irq_work [1].
However it adds complexity and it was decided to keep things simple until
we need more memcg events with !allow_spinning requirement.
Link: https://lore.kernel.org/all/5qi2llyzf7gklncflo6gxoozljbm4h3tpnuv4u4ej4ztysvi6f@x44v7nz2wdzd/ [1]
Link: https://lkml.kernel.org/r/20250922220203.261714-1-shakeel.butt@linux.dev
Fixes: 3ac4638a734a ("memcg: make memcg_rstat_updated nmi safe")
Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Closes: https://lore.kernel.org/all/20250905061919.439648-1-yepeilin@google.com/
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peilin Ye <yepeilin@google.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A vmalloc allocation is preserved using binary structure similar to global
KHO memory tracker. It's a linked list of pages where each page is an
array of physical address of pages in vmalloc area.
kho_preserve_vmalloc() hands out the physical address of the head page to
the caller. This address is used as the argument to kho_vmalloc_restore()
to restore the mapping in the vmalloc address space and populate it with
the preserved pages.
[pasha.tatashin@soleen.com: free chunks using free_page() not kfree()]
Link: https://lkml.kernel.org/r/mafs0a52idbeg.fsf@kernel.org
[akpm@linux-foundation.org: coding-style cleanups]
Link: https://lkml.kernel.org/r/20250921054458.4043761-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
to make it clear that KHO operates on pages rather than on a random
physical address.
The kho_preserve_pages() will be also used in upcoming support for vmalloc
preservation.
Link: https://lkml.kernel.org/r/20250921054458.4043761-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alexander Graf <graf@amazon.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Changyuan Lyu <changyuanl@google.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"Two small fixes for the recently performed code refactoring (Shigeru
Yoshida) and missing handling of direction parameter in DMA debug code
(Petr Tesarik)"
* tag 'dma-mapping-6.18-2025-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: fix direction in dma_alloc direction traces
kmsan: fix kmsan_handle_dma() to avoid false positives
|
|
Queue read and write pointers are "to KFD", not "from KFD".
Suggested-by: Robert Liu <robert.liu@amd.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Robert Liu <robert.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are cpufreq fixes and cleanups on top of the material merged
previously, a power management core code fix and updates of the
runtime PM framework including unit tests, documentation updates and
introduction of auto-cleanup macros for runtime PM "resume and get"
and "get without resuming" operations.
Specifics:
- Make cpufreq drivers setting the default CPU transition latency to
CPUFREQ_ETERNAL specify a proper default transition latency value
instead which addresses a regression introduced during the 6.6
cycle that broke CPUFREQ_ETERNAL handling (Rafael Wysocki)
- Make the cpufreq CPPC driver use a proper transition delay value
when CPUFREQ_ETERNAL is returned by cppc_get_transition_latency()
to indicate an error condition (Rafael Wysocki)
- Make cppc_get_transition_latency() return a negative error code to
indicate error conditions instead of using CPUFREQ_ETERNAL for this
purpose and drop CPUFREQ_ETERNAL that has no other users (Rafael
Wysocki, Gopi Krishna Menon)
- Fix device leak in the mediatek cpufreq driver (Johan Hovold)
- Set target frequency on all CPUs sharing a policy during frequency
updates in the tegra186 cpufreq driver and make it initialize all
cores to max frequencies (Aaron Kling)
- Rust cpufreq helper cleanup (Thorsten Blum)
- Make pm_runtime_put*() family of functions return 1 when the given
device is already suspended which is consistent with the
documentation (Brian Norris)
- Add basic kunit tests for runtime PM API contracts and update
return values in kerneldoc comments for the runtime PM API (Brian
Norris, Dan Carpenter)
- Add auto-cleanup macros for runtime PM "resume and get" and "get
without resume" operations, use one of them in the PCI core and
drop the existing "free" macro introduced for similar purpose, but
somewhat cumbersome to use (Rafael Wysocki)
- Make the core power management code avoid waiting on device links
marked as SYNC_STATE_ONLY which is consistent with the handling of
those device links elsewhere (Pin-yen Lin)"
* tag 'pm-6.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
docs/zh_CN: Fix malformed table
docs/zh_TW: Fix malformed table
PM: runtime: Fix error checking for kunit_device_register()
PM: runtime: Introduce one more usage counter guard
cpufreq: Drop unused symbol CPUFREQ_ETERNAL
ACPI: CPPC: Do not use CPUFREQ_ETERNAL as an error value
cpufreq: CPPC: Avoid using CPUFREQ_ETERNAL as transition delay
cpufreq: Make drivers using CPUFREQ_ETERNAL specify transition latency
PM: runtime: Drop DEFINE_FREE() for pm_runtime_put()
PCI/sysfs: Use runtime PM guard macro for auto-cleanup
PM: runtime: Add auto-cleanup macros for "resume and get" operations
cpufreq: tegra186: Initialize all cores to max frequencies
cpufreq: tegra186: Set target frequency for all cpus in policy
rust: cpufreq: streamline find_supply_names
cpufreq: mediatek: fix device leak on probe failure
PM: sleep: Do not wait on SYNC_STATE_ONLY device links
PM: runtime: Update kerneldoc return codes
PM: runtime: Make put{,_sync}() return 1 when already suspended
PM: runtime: Add basic kunit tests for API contracts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"There's a bunch of patches here across drivers/clk/ to migrate drivers
to use struct clk_ops::determine_rate() instead of the round_rate()
one so that we can remove the round_rate clk_op entirely. Brian has
taken up that task which nobody else has wanted to do for close to a
decade. Thanks Brian!
This is all prerequisite work to get to the real task of improving the
clk rate setting process. Once we have determine_rate() used
everywhere, we'll be able to do things like chain the rate request
structs in linked lists to order the rate setting operations or add
more parameters without having to change every clk driver in
existence. It's also nice to not have multiple ways to do something
which just causes confusion for clk driver authors. Overall I'm glad
this is getting done.
Beyond this change we also have a tweak to the clk_lookup() function
in the core framework to use hashing on the clk name instead of a clk
tree walk with string comparisons. We _still_ rely on the clk name to
be unique, because historically we've used globally unique strings to
describe the clk tree topology. This tree walk becomes increasingly
slow as more clks are added to the system. Searching from the roots
for a duplicate is simple but pretty dumb and it wastes boot time so
we're using a hash table as an improvement. Ideally we wouldn't rely
on the strings to be unique at all, relegating them to simply debug
information, but that is future work that will likely require some
sort of Kconfig knob indicating strings aren't used for topology
description.
Outside of the core framework changes we have the usual new SoC
support and fixes to clk drivers for things that were discovered once
the clks were used by consumer drivers. Nothing in particular is
jumping out at me in the "misc" pile, except maybe the Amlogic driver
that has gone through a refactoring. That series got a fix from
testing in -next though so it seems likely that things have been
getting good test coverage for a couple weeks already"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (299 commits)
clk: microchip: core: remove duplicate roclk_determine_rate()
reset: aspeed: register AST2700 reset auxiliary bus device
dt-bindings: clock: ast2700: modify soc0/1 clock define
clk: tegra: do not overallocate memory for bpmp clocks
clk: ep93xx: Use int type to store negative error codes
clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver
clk: loongson2: Add clock definitions for Loongson-2K0300 SoC
clk: loongson2: Avoid hardcoding firmware name of the reference clock
clk: loongson2: Allow zero divisors for dividers
clk: loongson2: Support scale clocks with an alternative mode
clk: loongson2: Allow specifying clock flags for gate clock
dt-bindings: clock: loongson2: Add Loongson-2K0300 compatible
clk: clocking-wizard: Fix output clock register offset for Versal platforms
clk: xilinx: Optimize divisor search in clk_wzrd_get_divisors_ver()
clk: mmp: pxa1908: Instantiate power driver through auxiliary bus
clk: s2mps11: add support for S2MPG10 PMIC clock
dt-bindings: clock: samsung,s2mps11: add s2mpg10
dt-bindings: stm32: cosmetic fixes for STM32MP25 clock and reset bindings
clk: stm32: introduce clocks for STM32MP21 platform
dt-bindings: stm32: add STM32MP21 clocks and reset bindings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Unify guest entry code for KVM and MSHV (Sean Christopherson)
- Switch Hyper-V MSI domain to use msi_create_parent_irq_domain()
(Nam Cao)
- Add CONFIG_HYPERV_VMBUS and limit the semantics of CONFIG_HYPERV
(Mukesh Rathor)
- Add kexec/kdump support on Azure CVMs (Vitaly Kuznetsov)
- Deprecate hyperv_fb in favor of Hyper-V DRM driver (Prasanna
Kumar T S M)
- Miscellaneous enhancements, fixes and cleanups (Abhishek Tiwari,
Alok Tiwari, Nuno Das Neves, Wei Liu, Roman Kisel, Michael Kelley)
* tag 'hyperv-next-signed-20251006' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyperv: Remove the spurious null directive line
MAINTAINERS: Mark hyperv_fb driver Obsolete
fbdev/hyperv_fb: deprecate this in favor of Hyper-V DRM driver
Drivers: hv: Make CONFIG_HYPERV bool
Drivers: hv: Add CONFIG_HYPERV_VMBUS option
Drivers: hv: vmbus: Fix typos in vmbus_drv.c
Drivers: hv: vmbus: Fix sysfs output format for ring buffer index
Drivers: hv: vmbus: Clean up sscanf format specifier in target_cpu_store()
x86/hyperv: Switch to msi_create_parent_irq_domain()
mshv: Use common "entry virt" APIs to do work in root before running guest
entry: Rename "kvm" entry code assets to "virt" to genericize APIs
entry/kvm: KVM: Move KVM details related to signal/-EINTR into KVM proper
mshv: Handle NEED_RESCHED_LAZY before transferring to guest
x86/hyperv: Add kexec/kdump support on Azure CVMs
Drivers: hv: Simplify data structures for VMBus channel close message
Drivers: hv: util: Cosmetic changes for hv_utils_transport.c
mshv: Add support for a new parent partition configuration
clocksource: hyper-v: Skip unnecessary checks for the root partition
hyperv: Add missing field to hv_output_map_device_interrupt
|
|
Keep bio_iov_iter_get_bdev_pages local with the callers, as blindly
looking at the bdev logical block size is often not the best idea
unless on a block device.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Now that the bio_iov_iter_get_pages is free again, use it instead of
the more complicated now. Also drop the unused export.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Switch the only caller to bio_iov_iter_get_pages, and explain why it does
not have any alignment requirements.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Be consistent with tab style of "liburing/src/include/liburing/io_uring.h".
Signed-off-by: Haiyue Wang <haiyuewa@163.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Maintain two separate RB trees per order - one for clear (zeroed) blocks
and another for dirty (uncleared) blocks. This separation improves
code clarity and makes it more obvious which tree is being searched
during allocation. It also improves scalability and efficiency when
searching for a specific type of block, avoiding unnecessary checks
and making the allocator more predictable under fragmentation.
The changes have been validated using the existing drm_buddy_test
KUnit test cases, along with selected graphics workloads,
to ensure correctness and avoid regressions.
v2: Missed adding the suggested-by tag. Added it in v2.
v3(Matthew):
- Remove the double underscores from the internal functions.
- Rename the internal functions to have less generic names.
- Fix the error handling code.
- Pass tree argument for the tree macro.
- Use the existing dirty/free bit instead of new tree field.
- Make free_trees[] instead of clear_tree and dirty_tree for
more cleaner approach.
v4:
- A bug was reported by Intel CI and it is fixed by
Matthew Auld.
- Replace the get_root function with
&mm->free_trees[tree][order] (Matthew)
- Remove the unnecessary rbtree_is_empty() check (Matthew)
- Remove the unnecessary get_tree_for_flags() function.
- Rename get_tree_for_block() name with get_block_tree() for more
clarity.
v5(Jani Nikula):
- Don't use static inline in .c files.
- enum free_tree and enumerator names are quite generic for a header
and usage and the whole enum should be an implementation detail.
v6:
- Rewrite the __force_merge() function using the rb_last() and rb_prev().
v7(Matthew):
- Replace the open-coded tree iteration for loops with the
for_each_free_tree() macro throughout the code.
- Fixed out_free_roots to prevent double decrement of i,
addressing potential crash.
- Replaced enum drm_buddy_free_tree with unsigned int
in for_each_free_tree loops.
Cc: stable@vger.kernel.org
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4260
Link: https://lore.kernel.org/r/20251006095124.1663-2-Arunpravin.PaneerSelvam@amd.com
|
|
Replace the freelist (O(n)) used for free block management with a
red-black tree, providing more efficient O(log n) search, insert,
and delete operations. This improves scalability and performance
when managing large numbers of free blocks per order (e.g., hundreds
or thousands).
In the VK-CTS memory stress subtest, the buddy manager merges
fragmented memory and inserts freed blocks into the freelist. Since
freelist insertion is O(n), this becomes a bottleneck as fragmentation
increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU
with the freelist, compared to just 0.03% with the RB tree
(rbtree_insert.isra.0), despite performing the same sorted insert.
This also improves performance in heavily fragmented workloads,
such as games or graphics tests that stress memory.
As the buddy allocator evolves with new features such as clear-page
tracking, the resulting fragmentation and complexity have grown.
These RB-tree based design changes are introduced to address that
growth and ensure the allocator continues to perform efficiently
under fragmented conditions.
The RB tree implementation with separate clear/dirty trees provides:
- O(n log n) aggregate complexity for all operations instead of O(n^2)
- Elimination of soft lockups and system instability
- Improved code maintainability and clarity
- Better scalability for large memory systems
- Predictable performance under fragmentation
v3(Matthew):
- Remove RB_EMPTY_NODE check in force_merge function.
- Rename rb for loop macros to have less generic names and move to
.c file.
- Make the rb node rb and link field as union.
v4(Jani Nikula):
- The kernel-doc comment should be "/**"
- Move all the rbtree macros to rbtree.h and add parens to ensure
correct precedence.
v5:
- Remove the inline in a .c file (Jani Nikula).
v6(Peter Zijlstra):
- Add rb_add() function replacing the existing rbtree_insert() code.
v7:
- A full walk iteration in rbtree is slower than the list (Peter Zijlstra).
- The existing rbtree_postorder_for_each_entry_safe macro should be used
in scenarios where traversal order is not a critical factor (Christian).
v8(Matthew):
- Remove the rbtree_is_empty() check in this patch as well.
Cc: stable@vger.kernel.org
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20251006095124.1663-1-Arunpravin.PaneerSelvam@amd.com
|
|
Merge cpufreq fixes and cleanups, mostly on top of those fixes, for
6.18-rc1:
- Make cpufreq drivers setting the default CPU transition latency to
CPUFREQ_ETERNAL specify a proper default transition latency value
instead which addresses a regression introduced during the 6.6 cycle
that broke CPUFREQ_ETERNAL handling (Rafael Wysocki)
- Make the cpufreq CPPC driver use a proper transition delay value
when CPUFREQ_ETERNAL is returned by cppc_get_transition_latency() to
indicate an error condition (Rafael Wysocki)
- Make cppc_get_transition_latency() return a negative error code to
indicate error conditions instead of using CPUFREQ_ETERNAL for this
purpose and drop CPUFREQ_ETERNAL that has no other users (Rafael
Wysocki, Gopi Krishna Menon)
- Fix device leak in the mediatek cpufreq driver (Johan Hovold)
- Set target frequency on all CPUs sharing a policy during frequency
updates in the tegra186 cpufreq driver and make it initialize all
cores to max frequencies (Aaron Kling)
- Rust cpufreq helper cleanup (Thorsten Blum)
* pm-cpufreq:
docs/zh_CN: Fix malformed table
docs/zh_TW: Fix malformed table
cpufreq: Drop unused symbol CPUFREQ_ETERNAL
ACPI: CPPC: Do not use CPUFREQ_ETERNAL as an error value
cpufreq: CPPC: Avoid using CPUFREQ_ETERNAL as transition delay
cpufreq: Make drivers using CPUFREQ_ETERNAL specify transition latency
cpufreq: tegra186: Initialize all cores to max frequencies
cpufreq: tegra186: Set target frequency for all cpus in policy
rust: cpufreq: streamline find_supply_names
cpufreq: mediatek: fix device leak on probe failure
|
|
Merge runtime PM framework updates and a core power management code fix
for 6.18-rc1:
- Make pm_runtime_put*() family of functions return 1 when the
given device is already suspended which is consistent with the
documentation (Brian Norris)
- Add basic kunit tests for runtime PM API contracts and update return
values in kerneldoc coments for the runtime PM API (Brian Norris,
Dan Carpenter)
- Add auto-cleanup macros for runtime PM "resume and get" and "get
without resume" operations, use one of them in the PCI core and
drop the existing "free" macro introduced for similar purpose, but
somewhat cumbersome to use (Rafael Wysocki)
- Make the core power management code avoid waiting on device links
marked as SYNC_STATE_ONLY which is consistent with the handling of
those device links elsewhere (Pin-yen Lin)
* pm-core:
PM: sleep: Do not wait on SYNC_STATE_ONLY device links
* pm-runtime:
PM: runtime: Fix error checking for kunit_device_register()
PM: runtime: Introduce one more usage counter guard
PM: runtime: Drop DEFINE_FREE() for pm_runtime_put()
PCI/sysfs: Use runtime PM guard macro for auto-cleanup
PM: runtime: Add auto-cleanup macros for "resume and get" operations
PM: runtime: Update kerneldoc return codes
PM: runtime: Make put{,_sync}() return 1 when already suspended
PM: runtime: Add basic kunit tests for API contracts
|
|
Pull nfsd updates from Chuck Lever:
"Mike Snitzer has prototyped a mechanism for disabling I/O caching in
NFSD. This is introduced in v6.18 as an experimental feature. This
enables scaling NFSD in /both/ directions:
- NFS service can be supported on systems with small memory
footprints, such as low-cost cloud instances
- Large NFS workloads will be less likely to force the eviction of
server-local activity, helping it avoid thrashing
Jeff Layton contributed a number of fixes to the new attribute
delegation implementation (based on a pending Internet RFC) that we
hope will make attribute delegation reliable enough to enable by
default, as it is on the Linux NFS client.
The remaining patches in this pull request are clean-ups and minor
optimizations. Many thanks to the contributors, reviewers, testers,
and bug reporters who participated during the v6.18 NFSD development
cycle"
* tag 'nfsd-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (42 commits)
nfsd: discard nfserr_dropit
SUNRPC: Make RPCSEC_GSS_KRB5 select CRYPTO instead of depending on it
NFSD: Add io_cache_{read,write} controls to debugfs
NFSD: Do the grace period check in ->proc_layoutget
nfsd: delete unnecessary NULL check in __fh_verify()
NFSD: Allow layoutcommit during grace period
NFSD: Disallow layoutget during grace period
sunrpc: fix "occurence"->"occurrence"
nfsd: Don't force CRYPTO_LIB_SHA256 to be built-in
nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
NFSD: Reduce DRC bucket size
NFSD: Delay adding new entries to LRU
SUNRPC: Move the svc_rpcb_cleanup() call sites
NFS: Remove rpcbind cleanup for NFSv4.0 callback
nfsd: unregister with rpcbind when deleting a transport
NFSD: Drop redundant conversion to bool
sunrpc: eliminate return pointer in svc_tcp_sendmsg()
sunrpc: fix pr_notice in svc_tcp_sendto() to show correct length
nfsd: decouple the xprtsec policy check from check_nfsd_access()
NFSD: Fix destination buffer size in nfsd4_ssc_setup_dul()
...
|