| Age | Commit message (Collapse) | Author |
|
If KVM toggles "CPU dirty logging", a.k.a. Page-Modification Logging (PML),
while L2 is active, temporarily load vmcs01 and immediately update the
relevant controls instead of deferring the update until the next nested
VM-Exit. For PML, deferring the update is relatively straightforward, but
for several APICv related updates, deferring updates creates ordering and
state consistency problems, e.g. KVM at-large thinks APICv is enabled, but
vmcs01 is still running with stale (and effectively unknown) state.
Convert PML first precisely because it's the simplest case to handle: if
something is broken with the vmcs01 <=> vmcs02 dance, then hopefully bugs
will bisect here.
Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a test to verify KVM correctly handles a variety of edge cases related
to APICv updates, and in particular updates that are triggered while L2 is
actively running.
Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
When neither sysenter32 nor syscall32 is available (on either
FRED-capable 64-bit hardware or old 32-bit hardware), there is no
reason to do a bunch of stack shuffling in __kernel_vsyscall.
Unfortunately, just overwriting the initial "push" instructions will
mess up the CFI annotations, so suffer the 3-byte NOP if not
applicable.
Similarly, inline the int $0x80 when doing inline system calls in the
vdso instead of calling __kernel_vsyscall.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-11-hpa@zytor.com
|
|
In most cases, the use of "fast 32-bit system call" depends either on
X86_FEATURE_SEP or X86_FEATURE_SYSENTER32 || X86_FEATURE_SYSCALL32.
However, nearly all the logic for both is identical.
Define X86_FEATURE_SYSFAST32 which indicates that *either* SYSENTER32 or
SYSCALL32 should be used, for either 32- or 64-bit kernels. This
defaults to SYSENTER; use SYSCALL if the SYSCALL32 bit is also set.
As this removes ALL existing uses of X86_FEATURE_SYSENTER32, which is
a kernel-only synthetic feature bit, simply remove it and replace it
with X86_FEATURE_SYSFAST32.
This leaves an unused alternative for a true 32-bit kernel, but that
should really not matter in any way.
The clearing of X86_FEATURE_SYSCALL32 can be removed once the patches
for automatically clearing disabled features has been merged.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-10-hpa@zytor.com
|
|
Abstract out the calling of true system calls from the vdso into
macros.
It has been a very long time since gcc did not allow %ebx or %ebp in
inline asm in 32-bit PIC mode; remove the corresponding hacks.
Remove the use of memory output constraints in gettimeofday.h in favor
of "memory" clobbers. The resulting code is identical for the current
use cases, as the system call is usually a terminal fallback anyway,
and it merely complicates the macroization.
This patch adds only a handful of more lines of code than it removes,
and in fact could be made substantially smaller by removing the macros
for the argument counts that aren't currently used, however, it seems
better to be general from the start.
[ v3: remove stray comment from prototyping; remove VDSO_SYSCALL6()
since it would require special handling on 32 bits and is
currently unused. (Uros Biszjak)
Indent nested preprocessor directives. ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Uros Bizjak <ubizjak@gmail.com>
Link: https://patch.msgid.link/20251216212606.1325678-9-hpa@zytor.com
|
|
Currently the vdso doesn't include .note.gnu.property or a GNU noexec
stack annotation (the -z noexecstack in the linker script is
ineffective because we specify PHDRs explicitly.)
The motivation is that the dynamic linker currently do not check
these.
However, this is a weak excuse: the vdso*.so are also supposed to be
usable at link libraries, and there is no reason why the dynamic
linker might not want or need to check these in the future, so add
them back in -- it is trivial enough.
Use symbolic constants for the PHDR permission flags.
[ v4: drop unrelated formatting changes ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-8-hpa@zytor.com
|
|
The vdso32 sigreturn.S contains open-coded DWARF bytecode, which
includes a hack for gdb to not try to step back to a previous call
instruction when backtracing from a signal handler.
Neither of those are necessary anymore: the backtracing issue is
handled by ".cfi_entry simple" and ".cfi_signal_frame", both of which
have been supported for a very long time now, which allows the
remaining frame to be built using regular .cfi annotations.
Add a few more register offsets to the signal frame just for good
measure.
Replace the nop on fallthrough of the system call (which should never,
ever happen) with a ud2a trap.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-7-hpa@zytor.com
|
|
A macro SYSCALL_ENTER_KERNEL was defined in sigreturn.S, with the
ability of overriding it. The override capability, however, is not
used anywhere, and the macro name is potentially confusing because it
seems to imply that sysenter/syscall could be used here, which is NOT
true: the sigreturn system calls MUST use int $0x80.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-6-hpa@zytor.com
|
|
There is no fundamental reason to use the int80_landing_pad symbol to
adjust ip when moving the vdso. If ip falls within the vdso, and the
vdso is moved, we should change the ip accordingly, regardless of mode
or location within the vdso. This *currently* can only happen on 32
bits, but there isn't any reason not to do so generically.
Note that if this is ever possible from a vdso-internal call, then the
user space stack will also needed to be adjusted (as well as the
shadow stack, if enabled.) Fortunately this is not currently the case.
At the moment, we don't even consider other threads when moving the
vdso. The assumption is that it is only used by process freeze/thaw
for migration, where this is not an issue.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-5-hpa@zytor.com
|
|
Donglin Peng says:
====================
Improve the performance of BTF type lookups with binary search
From: Donglin Peng <pengdonglin@xiaomi.com>
The series addresses the performance limitations of linear search in large
BTFs by:
1. Adding BTF permutation support
2. Using resolve_btfids to sort BTF during the build phase
3. Checking BTF sorting
4. Using binary search when looking up types
Patch #1 introduces an interface for btf__permute in libbpf to relay out BTF.
Patch #2 adds test cases to validate the functionality of btf__permute in base
and split BTF scenarios.
Patch #3 introduces a new phase in the resolve_btfids tool to sort BTF by name
in ascending order.
Patches #4-#7 implement the sorting check and binary search.
Patches #8-#10 optimize type lookup performance of some functions by skipping
anonymous types or invoking btf_find_by_name_kind.
Patch #11 refactors the code by calling str_is_empty.
Here is a simple performance test result [1] for lookups to find 87,584 named
types in vmlinux BTF:
./vmtest.sh -- ./test_progs -t btf_permute/perf -v
Results:
| Condition | Lookup Time | Improvement |
|--------------------|-------------|--------------|
| Unsorted (Linear) | 36,534 ms | Baseline |
| Sorted (Binary) | 15 ms | 2437x faster |
The binary search implementation reduces lookup time from 36.5 seconds to 15
milliseconds, achieving a **2437x** speedup for large-scale type queries.
Changelog:
v12:
- Set the start_id to 1 instead of btf->start_id in the btf__find_by_name (AI)
v11:
- Link: https://lore.kernel.org/bpf/20260108031645.1350069-1-dolinux.peng@gmail.com/
- PATCH #1: Modify implementation of btf__permute: id_map[0] must be 0 for base BTF (Andrii)
- PATCH #3: Refactor the code (Andrii)
- PATCH #4~8:
- Revert to using the binary search in v7 to simplify the code (Andrii)
- Refactor the code of btf_check_sorted (Andrii, Eduard)
- Rename sorted_start_id to named_start_id
- Rename btf_sorted_start_id to btf_named_start_id, and add comments (Andrii, Eduard)
v10:
- Link: https://lore.kernel.org/all/20251218113051.455293-1-dolinux.peng@gmail.com/
- Improve btf__permute() documentation (Eduard)
- Fall back to linear search when locating anonymous types (Eduard)
- Remove redundant NULL name check in libbpf's linear search path (Eduard)
- Simplify btf_check_sorted() implementation (Eduard)
- Treat kernel modules as unsorted by default
- Introduce btf_is_sorted and btf_sorted_start_id for clarity (Eduard)
- Fix optimizations in btf_find_decl_tag_value() and btf_prepare_func_args()
to support split BTF
- Remove linear search branch in determine_ptr_size()
- Rebase onto Ihor's v4 patch series [4]
v9:
- Link: https://lore.kernel.org/bpf/20251208062353.1702672-1-dolinux.peng@gmail.com/
- Optimize the performance of the function determine_ptr_size by invoking
btf__find_by_name_kind
- Optimize the performance of btf_find_decl_tag_value/btf_prepare_func_args/
bpf_core_add_cands by skipping anonymous types
- Rebase the patch series onto Ihor's v3 patch series [3]
v8
- Link: https://lore.kernel.org/bpf/20251126085025.784288-1-dolinux.peng@gmail.com/
- Remove the type dropping feature of btf__permute (Andrii)
- Refactor the code of btf__permute (Andrii, Eduard)
- Make the self-test code cleaner (Eduard)
- Reconstruct the BTF sorting patch based on Ihor's patch series [2]
- Simplify the sorting logic and place anonymous types before named types
(Andrii, Eduard)
- Optimize type lookup performance of two kernel functions
- Refactoring the binary search and type lookup logic achieves a 4.2%
performance gain, reducing the average lookup time (via the perf test
code in [1] for 60,995 named types in vmlinux BTF) from 10,217 us (v7) to
9,783 us (v8).
v7:
- Link: https://lore.kernel.org/all/20251119031531.1817099-1-dolinux.peng@gmail.com/
- btf__permute API refinement: Adjusted id_map and id_map_cnt parameter
usage so that for base BTF, id_map[0] now contains the new id of original
type id 1 (instead of VOID type id 0), improving logical consistency
- Selftest updates: Modified test cases to align with the API usage changes
- Refactor the code of resolve_btfids
v6:
- Link: https://lore.kernel.org/all/20251117132623.3807094-1-dolinux.peng@gmail.com/
- ID Map-based reimplementation of btf__permute (Andrii)
- Build-time BTF sorting using resolve_btfids (Alexei, Eduard)
- Binary search method refactoring (Andrii)
- Enhanced selftest coverage
v5:
- Link: https://lore.kernel.org/all/20251106131956.1222864-1-dolinux.peng@gmail.com/
- Refactor binary search implementation for improved efficiency
(Thanks to Andrii and Eduard)
- Extend btf__permute interface with 'ids_sz' parameter to support
type dropping feature (suggested by Andrii). Plan subsequent reimplementation of
id_map version for comparative analysis with current sequence interface
- Add comprehensive test coverage for type dropping functionality
- Enhance function comment clarity and accuracy
v4:
- Link: https://lore.kernel.org/all/20251104134033.344807-1-dolinux.peng@gmail.com/
- Abstracted btf_dedup_remap_types logic into a helper function (suggested by Eduard).
- Removed btf_sort.c and implemented sorting separately for libbpf and kernel (suggested by Andrii).
- Added test cases for both base BTF and split BTF scenarios (suggested by Eduard).
- Added validation for name-only sorting of types (suggested by Andrii)
- Refactored btf__permute implementation to reduce complexity (suggested by Andrii)
- Add doc comments for btf__permute (suggested by Andrii)
v3:
- Link: https://lore.kernel.org/all/20251027135423.3098490-1-dolinux.peng@gmail.com/
- Remove sorting logic from libbpf and provide a generic btf__permute() interface (suggested
by Andrii)
- Omitted the search direction patch to avoid conflicts with base BTF (suggested by Eduard).
- Include btf_sort.c directly in btf.c to reduce function call overhead
v2:
- Link: https://lore.kernel.org/all/20251020093941.548058-1-dolinux.peng@gmail.com/
- Moved sorting to the build phase to reduce overhead (suggested by Alexei).
- Integrated sorting into btf_dedup_compact_and_sort_types (suggested by Eduard).
- Added sorting checks during BTF parsing.
- Consolidated common logic into btf_sort.c for sharing (suggested by Alan).
v1:
- Link: https://lore.kernel.org/all/20251013131537.1927035-1-dolinux.peng@gmail.com/
[1] https://github.com/pengdonglin137/btf_sort_test
[2] https://lore.kernel.org/bpf/20251126012656.3546071-1-ihor.solodrai@linux.dev/
[3] https://lore.kernel.org/bpf/20251205223046.4155870-1-ihor.solodrai@linux.dev/
[4] https://lore.kernel.org/bpf/20251218003314.260269-1-ihor.solodrai@linux.dev/
====================
Link: https://patch.msgid.link/20260109130003.3313716-1-dolinux.peng@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Calling the str_is_empty function to clarify the code and
no functional changes are introduced.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-12-dolinux.peng@gmail.com
|
|
Currently, vmlinux BTF is unconditionally sorted during
the build phase. The function btf_find_by_name_kind
executes the binary search branch, so find_bpffs_btf_enums
can be optimized by using btf_find_by_name_kind.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-10-dolinux.peng@gmail.com
|
|
Currently, vmlinux and kernel module BTFs are unconditionally
sorted during the build phase, with named types placed at the
end. Thus, anonymous types should be skipped when starting the
search. In my vmlinux BTF, the number of anonymous types is
61,747, which means the loop count can be reduced by 61,747.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-9-dolinux.peng@gmail.com
|
|
This patch checks whether the BTF is sorted by name in ascending order.
If sorted, binary search will be used when looking up types.
Specifically, vmlinux and kernel module BTFs are always sorted during
the build phase with anonymous types placed before named types, so we
only need to identify the starting ID of named types.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-8-dolinux.peng@gmail.com
|
|
Improve btf_find_by_name_kind() performance by adding binary search
support for sorted types. Falls back to linear search for compatibility.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-7-dolinux.peng@gmail.com
|
|
The kernel specifies LUT table sizes in a separate property, however it
doesn't enforce it as a maximum. Some drivers implement max size check
on their own in the atomic_check path. Other drivers simply ignore the
issue. Perform LUT size validation in the generic place.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260106-drm-fix-lut-checks-v3-3-f7f979eb73c8@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
|
|
The function drm_property_replace_blob_from_id() allows checking whether
the blob size is equal to a predefined value. In case of variable-size
properties (like the gamma / degamma LUTs) we might want to check for
the blob size against the maximum, allowing properties of the size
lesser than the max supported by the hardware. Extend the function in
order to support such checks.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260106-drm-fix-lut-checks-v3-2-f7f979eb73c8@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
|
|
We have a helper to get property values for non-atomic drivers and
another one default property values for atomic drivers. In some cases we
need the ability to get value of immutable property, no matter what kind
of driver it is. Implement new property-related helper,
drm_object_immutable_property_get_value(), which lets the caller to get
the value of the immutable property.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260106-drm-fix-lut-checks-v3-1-f7f979eb73c8@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
|
|
This patch checks whether the BTF is sorted by name in ascending
order. If sorted, binary search will be used when looking up types.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-6-dolinux.peng@gmail.com
|
|
This patch introduces binary search optimization for BTF type lookups
when the BTF instance contains sorted types.
The optimization significantly improves performance when searching for
types in large BTF instances with sorted types. For unsorted BTF, the
implementation falls back to the original linear search.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-5-dolinux.peng@gmail.com
|
|
This introduces a new BTF sorting phase that specifically sorts
BTF types by name in ascending order, so that the binary search
can be used to look up types.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-4-dolinux.peng@gmail.com
|
|
This patch introduces test cases for the btf__permute function to ensure
it works correctly with both base BTF and split BTF scenarios.
The test suite includes:
- test_permute_base: Validates permutation on base BTF
- test_permute_split: Tests permutation on split BTF
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com
|
|
Introduce btf__permute() API to allow in-place rearrangement of BTF types.
This function reorganizes BTF type order according to a provided array of
type IDs, updating all type references to maintain consistency.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-2-dolinux.peng@gmail.com
|
|
- Separate out the vdso sources into common, vdso32, and vdso64
directories.
- Build the 32- and 64-bit vdsos in their respective subdirectories;
this greatly simplifies the build flags handling.
- Unify the mangling of Makefile flags between the 32- and 64-bit
vdso code as much as possible; all common rules are put in
arch/x86/entry/vdso/common/Makefile.include. The remaining
is very simple for 32 bits; the 64-bit one is only slightly more
complicated because it contains the x32 generation rule.
- Define __DISABLE_EXPORTS when building the vdso. This need seems to
have been masked by different ordering compile flags before.
- Change CONFIG_X86_64 to BUILD_VDSO32_64 in vdso32/system_call.S,
to make it compatible with including fake_32bit_build.h.
- The -fcf-protection= option was "leaking" from the kernel build,
for reasons that was not clear to me. Furthermore, several
distributions ship with it set to a default value other than
"-fcf-protection=none". Make it match the configuration options
for *user space*.
Note that this patch may seem large, but the vast majority of it is
simply code movement.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-4-hpa@zytor.com
|
|
It is generally better to build tools in arch/x86/tools to keep host
cflags proliferation down, and to reduce makefile sequencing issues.
Move the vdso build tool vdso2c into arch/x86/tools in preparation for
refactoring the vdso makefiles.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-3-hpa@zytor.com
|
|
The vdso .so files are named vdso*.so. These structures are binary
images and descriptions of these files, so it is more consistent for
them to have a naming that more directly mirrors the filenames.
It is also very slightly more compact (by one character...) and
simplifies the Makefile just a little bit.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://patch.msgid.link/20251216212606.1325678-2-hpa@zytor.com
|
|
Now that we've had one bug that occurred in nouveau as the result of
nv50_head_flush_* being called without the appropriate locks, let's add
some lockdep asserts to make sure this doesn't happen in the future.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patch.msgid.link/20251219215344.170852-3-lyude@redhat.com
|
|
For a while, I've been seeing a strange issue where some (usually not all)
of the display DMA channels will suddenly hang, particularly when there is
a visible cursor on the screen that is being frequently updated, and
especially when said cursor happens to go between two screens. While this
brings back lovely memories of fixing Intel Skylake bugs, I would quite
like to fix it :).
It turns out the problem that's happening here is that we're managing to
reach nv50_head_flush_set() in our atomic commit path without actually
holding nv50_disp->mutex. This means that cursor updates happening in
parallel (along with any other atomic updates that need to use the core
channel) will race with eachother, which eventually causes us to corrupt
the pushbuffer - leading to a plethora of various GSP errors, usually:
nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000218 00102680 00000004 00800003
nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 0000021c 00040509 00000004 00000001
nouveau 0000:c1:00.0: gsp: Xid:56 CMDre 00000000 00000000 00000000 00000001 00000001
The reason this is happening is because generally we check whether we need
to set nv50_atom->lock_core at the end of nv50_head_atomic_check().
However, curs507a_prepare is called from the fb_prepare callback, which
happens after the atomic check phase. As a result, this can lead to commits
that both touch the core channel but also don't grab nv50_disp->mutex.
So, fix this by making sure that we set nv50_atom->lock_core in
cus507a_prepare().
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 1590700d94ac ("drm/nouveau/kms/nv50-: split each resource type into their own source files")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://patch.msgid.link/20251219215344.170852-2-lyude@redhat.com
|
|
The "select" schema is not necessary because this schema is referenced by
riscv/cpus.yaml schema.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Add descriptions for the Sha extension and the seven extensions it
comprises: Shcounterenw, Shgatpa, Shtvala, Shvsatpa, Shvstvala, Shvstvecd,
and Ssstateen.
Sha is ratified in the RVA23 Profiles Version 1.0 (commit 0273f3c921b6
"rva23/rvb23 ratified") as a new profile-defined extension that captures
the full set of features that are mandated to be supported along with
the H extension.
Extensions Shcounterenw, Shgatpa, Shtvala, Shvsatpa, Shvstvala, Shvstvecd,
and Ssstateen are ratified in the RISC-V Profiles Version 1.0 (commit
b1d806605f87 "Updated to ratified state").
The requirement status for Sha and its comprised extension in RISC-V
Profiles are:
- Sha: Mandatory in RVA23S64
- H: Optional in RVA22S64; Mandatory in RVA23S64
- Shcounterenw: Optional in RVA22S64; Mandatory in RVA23S64
- Shgatpa: Optional in RVA22S64; Mandatory in RVA23S64
- Shtvala: Optional in RVA22S64; Mandatory in RVA23S64
- Shvsatpa: Optional in RVA22S64; Mandatory in RVA23S64
- Shvstvala: Optional in RVA22S64; Mandatory in RVA23S64
- Shvstvecd: Optional in RVA22S64; Mandatory in RVA23S64
- Ssstateen: Optional in RVA22S64; Mandatory in RVA23S64
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Add descriptions for five new extensions: Ssccptr, Sscounterenw, Sstvala,
Sstvecd, and Ssu64xl. These extensions are ratified in RISC-V Profiles
Version 1.0 (commit b1d806605f87 "Updated to ratified state.").
They are introduced as new extension names for existing features and
regulate implementation details for RISC-V Profile compliance. According
to RISC-V Profiles Version 1.0 and RVA23 Profiles Version 1.0, their
requirement status are:
- Ssccptr: Mandatory in RVA20S64, RVA22S64, RVA23S64
- Sscounterenw: Mandatory in RVA22S64, RVA23S64
- Sstvala: Mandatory in RVA20S64, RVA22S64, RVA23S64
- Sstvecd: Mandatory in RVA20S64, RVA22S64, RVA23S64
- Ssu64xl: Optional in RVA20S64, RVA22S64; Mandatory in RVA23S64
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Add descriptions for four extensions: Za64rs, Ziccamoa, Ziccif, and
Zicclsm. These extensions are ratified in RISC-V Profiles Version 1.0
(commit b1d806605f87 "Updated to ratified state.").
They are introduced as new extension names for existing features and
regulate implementation details for RISC-V Profile compliance. According
to RISC-V Profiles Version 1.0 and RVA23 Profiles Version 1.0, they are
mandatory for the following profiles:
- za64rs: Mandatory in RVA22U64, RVA23U64
- ziccamoa: Mandatory in RVA20U64, RVA22U64, RVA23U64
- ziccif: Mandatory in RVA20U64, RVA22U64, RVA23U64
- zicclsm: Mandatory in RVA20U64, RVA22U64, RVA23U64
Ziccrse specifies the main memory must support "RsrvEventual", which is
one (totally there are four) of the support level for Load-Reserved/
Store-Conditional (LR/SC) atomic instructions. Thus it depends on Zalrsc.
Ziccamoa specifies the main memory must support AMOArithmetic, among the
four levels of PMA support defined for AMOs in the A extension. Thus it
depends on Zaamo.
Za64rs defines reservation sets are contiguous, naturally aligned, and a
maximum of 64 bytes. Za64rs is consumed by two extensions: Zalrsc and
Zawrs. Zawrs itself depends on Zalrsc too.
Based on the relationship that "A" = Zaamo + Zalrsc, add the following
dependencies checks:
Za64rs -> Zalrsc or A
Ziccrse -> Zalrsc or A
Ziccamoa -> Zaamo or A
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Add description of the single-letter B extension for Bit Manipulation.
B is mandatory for RVA23U64.
The B extension is ratified in the 20240411 version of the unprivileged
ISA specification. According to the ratified spec, the B standard
extension comprises instructions provided by the Zba, Zbb, and Zbs
extensions.
Add two-way dependency check to enforce that B implies Zba/Zbb/Zbs; and
when Zba/Zbb/Zbs (all of them) are specified, then B must be added too.
The reason why B/Zba/Zbb/Zbs must coexist at the same time is that
unlike other single-letter extensions, B was ratified (Apr/2024) much
later than its component extensions Zba/Zbb/Zbs (Jun/2021).
When "b" is specified, zba/zbb/zbs must be present to ensure
backward compatibility with existing software and kernels that only
look for the explicit component strings.
When all three components zba/zbb/zbs are specified, "b" should also be
present. Making "b" mandatory when all three components are present.
Existing devicetrees with zba/zbb/zbs but without "b" will generate
warnings that can be fixed in follow-up patches.
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
The descriptions for h, svinval, svnapot, and svpbmt extensions currently
reference the "20191213 version of the privileged ISA specification".
While an Unprivileged ISA document exists with that date, there is no
corresponding ratified Privileged ISA specification.
These extensions were ratified in the RISC-V Instruction Set Manual,
Volume II: Privileged Architecture, Version 20211203. Update the
descriptions to reference the correct specification version.
RISC-V International hosts a website [1] for ratified specifications.
Following the "Ratified ISA Specifications", historical versions of
Volume II Privileged ISA can be found.
Link: https://riscv.org/specifications/ratified/ [1]
Fixes: aeb71e42caae ("dt-bindings: riscv: deprecate riscv,isa")
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Guodong Xu <guodong@riscstar.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
|
|
Commit 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when
needed") attempted to avoid useless evaluation of LPS0 _DSM Function 1
in lps0_device_attach() because pm_debug_messages_on might never be set
(and that is the case on production systems most of the time), but it
turns out that LPS0 _DSM Function 1 is generally problematic on some
platforms and causes suspend issues to occur when pm_debug_messages_on
is set now.
In Linux, LPS0 _DSM Function 1 is only useful for diagnostics and only
in the cases when the system does not reach the deepest platform idle
state during suspend-to-idle for some reason. If such diagnostics is
not necessary, evaluating it is a loss of time, so using it along with
the other pm_debug_messages_on diagnostics is questionable because the
latter is expected to be suitable for collecting debug information even
during production use of system suspend.
For this reason, add a module parameter called check_lps0_constraints
to control whether or not the list of LPS0 constraints will be checked
in acpi_s2idle_prepare_late_lps0() and so whether or not to evaluate
LPS0 _DSM Function 1 (once) in acpi_s2idle_begin_lps0().
Fixes: 32ece31db4df ("ACPI: PM: s2idle: Only retrieve constraints when needed")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Link: https://patch.msgid.link/2827214.mvXUDI8C0e@rafael.j.wysocki
|
|
Currently, there is no straightforward way for a user to inspect
which quirks are active for a given device from userspace.
Add a new "quirks" sysfs attribute to the nvme controller device.
Reading this file will display a human-readable list
of all active quirks, with each quirk name on a new line.
If no quirks are active, it will display "none".
Tested-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Commit edd17206e363 ("nvmet: remove redundant subsysnqn field from
ctrl") replaced ctrl->subsysnqn with ctrl->subsys->subsysnqn. This
change works as expected because both point to strings with the same
data. However, their memory allocation lengths differ. ctrl->subsysnqn
had the fixed size defined as NVMF_NQN_FILED_LEN, while
ctrl->subsys->subsysnqn has variable length determined by kstrndup().
Due to this difference, KASAN slab-out-of-bounds occurs at memcpy() in
nvmet_passthru_override_id_ctrl() after the commit. The failure can be
recreated by running the blktests test case nvme/033. To prevent such
failures, replace memcpy() with strscpy(), which copies only the string
length and avoids overruns.
Fixes: edd17206e363 ("nvmet: remove redundant subsysnqn field from ctrl")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Instead of assigning the probe function for each driver individually, use
.probe() and .remove() from the pci_express bus. Rename the functions for
consistency.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/83d1edc7d619423331fa6802f0e7da3919a308a9.1764688034.git.u.kleine-koenig@baylibre.com
|
|
The driver core ensures that in .probe() and .remove() both dev and
dev->driver are valid. So drop the respective check.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/2cc2e15e05318b9f0d7b6a2b69b3169d2a6f0bd3.1764688034.git.u.kleine-koenig@baylibre.com
|
|
Fix up some minor typos in the nvme host driver and a comment
style to conform to the standard kernel style.
Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Conceptually the pci_express bus doesn't belong in generic PCI code.
Move pcie_port_bus_match() and pcie_port_bus_type to pcie/portdrv.c.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/420d771f0091dea7cf18f445b94301576dcee4c8.1764688034.git.u.kleine-koenig@baylibre.com
|
|
The driver core ensures that the match function is only called for drivers
and devices of the right bus. So drop the useless check.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/09ca261912a37d2b253f43359a5dfeec42c016dc.1764688034.git.u.kleine-koenig@baylibre.com
|
|
.shutdown() is an optional callback and the core only calls it if the
pointer in struct device_driver is non-NULL. So make nothing in a bit
shorter time and remove the empty function.
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/283fef06ac51efbb7df25f347d6f3a2967f96429.1764688034.git.u.kleine-koenig@baylibre.com
|
|
pcie_port_probe_service() unconditionally calls get_device() (unless it
fails). So drop that reference also unconditionally as it's fine for a
PCIe driver to not have a remove callback.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/e1c68c3b3f1af8427e98ca5e2c79f8bf0ebe2ce4.1764688034.git.u.kleine-koenig@baylibre.com
|
|
Driver is ready to use intel_gpio_add_pin_ranges() directly instead of
custom approach. Convert it now.
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
Driver is ready to use intel_gpio_add_pin_ranges() directly instead of
custom approach. Convert it now.
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
BPF token support was introduced to allow a privileged process to delegate
limited BPF functionality—such as map creation and program loading—to
an unprivileged process:
https://lore.kernel.org/linux-security-module/20231130185229.2688956-1-andrii@kernel.org/
This patch adds SELinux support for controlling BPF token access. With
this change, SELinux policies can now enforce constraints on BPF token
usage based on both the delegating (privileged) process and the recipient
(unprivileged) process.
Supported operations currently include:
- map_create
- prog_load
High-level workflow:
1. An unprivileged process creates a VFS context via `fsopen()` and
obtains a file descriptor.
2. This descriptor is passed to a privileged process, which configures
BPF token delegation options and mounts a BPF filesystem.
3. SELinux records the `creator_sid` of the privileged process during
mount setup.
4. The unprivileged process then uses this BPF fs mount to create a
token and attach it to subsequent BPF syscalls.
5. During verification of `map_create` and `prog_load`, SELinux uses
`creator_sid` and the current SID to check policy permissions via:
avc_has_perm(creator_sid, current_sid, SECCLASS_BPF,
BPF__MAP_CREATE, NULL);
The implementation introduces two new permissions:
- map_create_as
- prog_load_as
At token creation time, SELinux verifies that the current process has the
appropriate `*_as` permission (depending on the `allowed_cmds` value in
the bpf_token) to act on behalf of the `creator_sid`.
Example SELinux policy:
allow test_bpf_t self:bpf {
map_create map_read map_write prog_load prog_run
map_create_as prog_load_as
};
Additionally, a new policy capability bpf_token_perms is added to ensure
backward compatibility. If disabled, previous behavior ((checks based on
current process SID)) is preserved.
Signed-off-by: Eric Suen <ericsu@linux.microsoft.com>
Tested-by: Daniel Durning <danieldurning.work@gmail.com>
Reviewed-by: Daniel Durning <danieldurning.work@gmail.com>
[PM: merge fuzz, subject tweaks, whitespace tweaks, line length tweaks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The function addr_location__put() was renamed addr_location__exit() in
commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy
functions"). Make the comment preceding the function consistent with
the function itself.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kexin Sun <kexinsun@smail.nju.edu.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ratnadira Widyasari <ratnadiraw@smu.edu.sg>
Cc: Xutong Ma <xutong.ma@inria.fr>
Cc: Yumbo Lyu <yunbolyu@smu.edu.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There are typically a lot of PMUs registered, but in many cases only few
of them have an event registered (like the "cpu" PMU in the presence of
the watchdog). As the mutex is already held, it's safe to just check for
existing events before doing the cross CPU call.
This change saves tens of milliseconds from kexec time (perceived as
steal time during a hypervisor host update), with <2ms remaining for
this step in the shutdown. There might be additional potential for
parallelization or we could just disable performance monitoring during
the actual shutdown and be less graceful about it.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
These are hard to interpret in the raw output because they are printed
as hex but are defined in perf_event.h as decimal. Make it much easier
to read the raw callchains by just printing their names.
For example:
$ perf report -D
1798195372321 0x4638 [0xb0]: PERF_RECORD_SAMPLE(IP, 0x4002): 44922/44922: 0x7c8046dd3400 period: 120218 addr: 0
... FP chain: nr:12
..... 0: fffffffffffffe00 (PERF_CONTEXT_USER)
..... 1: 00007c8046dd3400
..... 2: 00007c8046db86d3
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ Add PERF_CONTEXT_USER_DEFERRED too, as per Namhyung's review comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|