| Age | Commit message | Author |
|
component has component->val_bytes which is set via
snd_soc_component_setup_regmap(). But it can be calculated via
component->regmap. No need to keep it as component->val_bytes.
Add snd_soc_component_regmap_val_bytes() for it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/878qbxzywo.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
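
As an illustration only, the idea of deriving the value size from the regmap on demand can be modelled in plain userspace C. The mock_* types and functions below are hypothetical stand-ins, not the kernel's structures; the real helper presumably queries the regmap (for example via regmap_get_val_bytes()), and returning 0 for a missing regmap is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's regmap and component structs. */
struct mock_regmap {
    unsigned int val_bits;          /* register value width in bits */
};

struct mock_component {
    struct mock_regmap *regmap;     /* may be NULL if no regmap is attached */
};

/* Width of one register value in bytes, rounded up to a whole byte. */
static int mock_regmap_get_val_bytes(const struct mock_regmap *map)
{
    return (int)((map->val_bits + 7) / 8);
}

/*
 * Derive the value size on demand instead of caching it in the component.
 * Returning 0 when no regmap is attached is an assumption of this sketch.
 */
static int mock_component_regmap_val_bytes(const struct mock_component *c)
{
    if (c == NULL || c->regmap == NULL)
        return 0;
    return mock_regmap_get_val_bytes(c->regmap);
}
```

Computing the value each time removes the risk of the cached copy going stale if the regmap is replaced.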
|
|
The clocksource watchdog code has over time reached the state of an
impenetrable maze of duct tape and staples. The original design, which was
made in the context of systems far smaller than today, is based on the
assumption that the to be monitored clocksource (TSC) can be trivially
compared against a known to be stable clocksource (HPET/ACPI-PM timer).
Over the years it turned out that this approach has major flaws:
- Long delays between watchdog invocations can result in wrap arounds
of the reference clocksource
- Scalability of the reference clocksource readout can degrade on large
multi-socket systems due to interconnect congestion
This was addressed with various heuristics which degraded the accuracy of
the watchdog to the point that it fails to detect actual TSC problems on
older hardware which exposes slow inter CPU drifts due to firmware
manipulating the TSC to hide SMI time.
To address this and bring back sanity to the watchdog, rewrite the code
completely with a different approach:
1) Restrict the validation against a reference clocksource to the boot
CPU, which is usually the CPU/Socket closest to the legacy block which
contains the reference source (HPET/ACPI-PM timer). Validate that the
reference readout is within a bound latency so that the actual
comparison against the TSC stays within 500ppm as long as the clocks
are stable.
2) Compare the TSCs of the other CPUs in a round robin fashion against
the boot CPU in the same way the TSC synchronization on CPU hotplug
works. This still can suffer from delayed reaction of the remote CPU
to the SMP function call and the latency of the control variable cache
line. But this latency is not affecting correctness. It only affects
the accuracy. With low contention the readout latency is in the low
nanoseconds range, which detects even slight skews between CPUs. Under
high contention this becomes obviously less accurate, but still
detects slow skews reliably as it solely relies on subsequent readouts
being monotonically increasing. It just can take slightly longer to
detect the issue.
3) Rewrite the watchdog test so it tests the various mechanisms one by
one and validates the result against the expectation.
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Reviewed-by: Daniel J Blueman <daniel@quora.org>
Link: https://patch.msgid.link/20260123231521.926490888@kernel.org
Link: https://patch.msgid.link/87h5qeomm5.ffs@tglx
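
The two checks described above can be sketched in userspace C. This is a minimal model, not the kernel code: the function names are hypothetical, the 500ppm bound is the figure quoted in the message, and the intervals are assumed short enough that the multiplication does not overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if measured_ns deviates from reference_ns by at most ppm parts per million. */
static bool within_ppm(uint64_t reference_ns, uint64_t measured_ns, uint64_t ppm)
{
    uint64_t diff = reference_ns > measured_ns ? reference_ns - measured_ns
                                               : measured_ns - reference_ns;
    /* Assumes reference_ns * ppm fits in 64 bits (short intervals). */
    uint64_t limit = reference_ns * ppm / 1000000u;

    return diff <= limit;
}

/* The cross-CPU check only needs successive readouts never to go backwards. */
static bool readouts_monotonic(const uint64_t *readouts, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (readouts[i] < readouts[i - 1])
            return false;
    }
    return true;
}
```

For a 1 ms reference interval the 500ppm bound allows a deviation of 500 ns. The monotonicity check has no such threshold, which is why latency affects only how quickly a skew is detected, not whether it is detected.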
|
|
Change /** to /* for the DMA attributes list comment in dma-mapping.h.
The comment is not a kernel-doc structured comment and should not use
the kernel-doc opening marker.
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Kit Dallege <xaum.io@gmail.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260315171001.66010-1-xaum.io@gmail.com
|
|
When running in an unprivileged domU under Xen, the privcmd driver
is restricted to allow only hypercalls against a target domain, for
which the current domU is acting as a device model.
Add a boot parameter "unrestricted" to allow all hypercalls (the
hypervisor will still refuse destructive hypercalls affecting other
guests).
Make this new parameter effective only in case the domU wasn't started
using secure boot, as otherwise hypercalls targeting the domU itself
might result in violating the secure boot functionality.
This is achieved by adding another lockdown reason, which is checked
to be unset before the "unrestricted" option is applied.
This is part of XSA-482
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- new patch
|
|
Mapping buffers that carry this attribute require a DMA-coherent system.
This means that they cannot take the SWIOTLB path, may overlap CPU cache
lines, and do not perform cache flushing.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-4-1dde90a7f08b@nvidia.com
|
|
Rename the DMA_ATTR_CPU_CACHE_CLEAN attribute to better reflect that it
is a debugging aid to inform DMA core code that CPU cache line overlaps are
allowed, and refine the documentation describing its use.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-3-1dde90a7f08b@nvidia.com
|
|
Tracing prints decoded DMA attribute flags, but it does not yet
include the recently added DMA_ATTR_CPU_CACHE_CLEAN. Add support
for decoding and displaying this attribute in the trace output.
Fixes: 61868dc55a11 ("dma-mapping: add DMA_ATTR_CPU_CACHE_CLEAN")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260316-dma-debug-overlap-v3-2-1dde90a7f08b@nvidia.com
|
|
Drivers having a struct drm_bridge pointer pointing to a bridge in many
cases hold that reference until the owning device is removed. In those
cases the reference to the bridge can be put in the .remove callback
(possibly using devm actions) or in the .destroy func (possibly with the
help of struct drm_bridge::next_bridge). At those moments the driver should
not be operating anymore and won't dereference the bridge pointer after it
is put.
However there are cases when drivers need to stop holding a reference to a
bridge even when their device is not being removed. This is the case for
bridge hot-unplug, when a bridge is removed but the previous entity (bridge
or encoder) is staying. In such case the "previous entity" needs to put it
but cannot do it via devm or .destroy, because it is not being removed.
The easy way to dispose of such a pointer is:
    drm_bridge_put(my_priv->some_bridge);
    my_priv->some_bridge = NULL;
However this is risky because there is a time window between the two lines
where the reference is put, and thus the bridge could be deallocated, but
the pointer is still assigned. If other functions of the same driver were
invoked concurrently they might dereference my_priv->some_bridge during
that window, resulting in use-after-free.
A correct solution is to clear the pointer before putting the reference,
but that needs a temporary variable:
    struct drm_bridge *temp = my_priv->some_bridge;
    my_priv->some_bridge = NULL;
    drm_bridge_put(temp);
This solution is however annoying to write, so the incorrect version might
still sneak in.
Add a simple, easy to use function to put a bridge after setting its
pointer to NULL in the correct way.
Acked-by: Maxime Ripard <mripard@kernel.org>
Link: https://patch.msgid.link/20260310-drm-bridge-atomic-vs-remove-clear_and_put-v2-1-51fe222f3cf0@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
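
A minimal userspace sketch of the clear-then-put ordering follows. The mock_* names are hypothetical and the refcount is a plain int; the real helper operates on struct drm_bridge and its own reference counting, and this model makes no claim about memory ordering between concurrent threads.

```c
#include <stddef.h>

/* Hypothetical refcounted object standing in for struct drm_bridge. */
struct mock_bridge {
    int refcount;
};

/* Drop one reference; a real implementation would free the object at zero. */
static void mock_bridge_put(struct mock_bridge *bridge)
{
    if (bridge != NULL)
        bridge->refcount--;
}

/*
 * Clear the caller's pointer first, then put the reference through a
 * temporary, so the stored pointer never refers to a bridge whose
 * reference has already been dropped.
 */
static void mock_bridge_clear_and_put(struct mock_bridge **bridge_pp)
{
    struct mock_bridge *temp = *bridge_pp;

    *bridge_pp = NULL;
    mock_bridge_put(temp);
}
```

Taking a pointer to the pointer lets the helper hide the temporary variable, so callers get the safe ordering from a single call.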
|
|
Update the kerneldoc with a more elaborate description of some members,
including the gfp_retry_mayfail member. Use inline kerneldoc.
Suggested-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
Acked-by: Christian König <christian.koening@amd.com>
Link: https://patch.msgid.link/20260317141856.237876-4-thomas.hellstrom@linux.intel.com
|
|
In situations where the system is very short on RAM, the shmem
readback from swap-space may invoke the OOM killer.
However, since this might be a recoverable situation where the caller
is indicating this by setting
struct ttm_operation_ctx::gfp_retry_mayfail to true, adjust the gfp
value used by the allocation accordingly.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <dev@lankhorst.se>
Acked-by: Christian König <christian.koening@amd.com>
Link: https://patch.msgid.link/20260317141856.237876-3-thomas.hellstrom@linux.intel.com
|
|
Now that all drm_private_objs users have been converted to use
atomic_create_state instead of the old ad-hoc initialization, we can
remove the state parameter from drm_private_obj_init and the fallback
code.
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260224-drm-private-obj-reset-v5-4-5a72f8ec9934@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
Add support for handling aggregation-based interrupts when operating in MCQ
mode.
In legacy interrupt mode, an IE.IAGES is triggered when the counter or
timer threshold is reached. To manage this, the handler now resets the
aggregation counter and timer by writing to the MCQIACRy.CTR register.
Since the register layout of MCQIACRy is identical to the existing UTRIACR
register, this implementation reuses the previously defined bitfield masks
to maintain consistency and reduce code duplication.
Extend ufshcd_handle_mcq_cq_events() with a boolean iag parameter. If set,
the handler resets the MCQ IAG counter and timer.
Define MCQ_IAG_EVENT_STATUS (0x200000) and include it in
UFSHCD_ENABLE_MCQ_INTRS to ensure the interrupt is unmasked during
initialization.
Signed-off-by: Vamshi Gajjela <vamshigajjela@google.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260310190308.2474956-1-vamshigajjela@google.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
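
The flow described above can be modelled roughly as follows. Only the MCQ_IAG_EVENT_STATUS value (0x200000) comes from the message; the state struct and the mock_* and intr_status_has_iag() names are hypothetical, and the real handler resets the hardware by writing the MCQIACRy.CTR register rather than touching software counters.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCQ_IAG_EVENT_STATUS 0x200000u  /* value quoted in the commit message */

/* Hypothetical model of the aggregation counter and timer state. */
struct mock_iag_state {
    unsigned int counter;
    unsigned int timer;
};

/* True if the interrupt status word reports an aggregation event. */
static bool intr_status_has_iag(uint32_t intr_status)
{
    return (intr_status & MCQ_IAG_EVENT_STATUS) != 0;
}

/* Model of the CQ event handler: when iag is set, reset counter and timer. */
static void mock_handle_mcq_cq_events(struct mock_iag_state *st, bool iag)
{
    /* ... completion queue processing would happen here ... */
    if (iag) {
        st->counter = 0;
        st->timer = 0;
    }
}
```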
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- hci_ll: Fix firmware leak on error path
- hci_sync: annotate data-races around hdev->req_status
- L2CAP: Fix null-ptr-deref on l2cap_sock_ready_cb
- L2CAP: Validate PDU length before reading SDU length in l2cap_ecred_data_rcv()
- L2CAP: Fix regressions caused by reusing ident
- L2CAP: Fix stack-out-of-bounds read in l2cap_ecred_conn_req
- MGMT: Fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
- SCO: Fix use-after-free in sco_recv_frame() due to missing sock_hold
* tag 'for-net-2026-03-19' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix regressions caused by reusing ident
Bluetooth: L2CAP: Fix null-ptr-deref on l2cap_sock_ready_cb
Bluetooth: hci_ll: Fix firmware leak on error path
Bluetooth: hci_sync: annotate data-races around hdev->req_status
Bluetooth: MGMT: Fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
Bluetooth: SCO: Fix use-after-free in sco_recv_frame() due to missing sock_hold
Bluetooth: L2CAP: Validate PDU length before reading SDU length in l2cap_ecred_data_rcv()
Bluetooth: L2CAP: Fix stack-out-of-bounds read in l2cap_ecred_conn_req
====================
Link: https://patch.msgid.link/20260319190455.135302-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Simplify allocations by using a flexible array member in this struct.
Add __counted_by to get extra runtime analysis.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Link: https://patch.msgid.link/20260318191037.5661-1-rosenp@gmail.com
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
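
As a rough userspace illustration of the pattern, the sketch below uses a hypothetical struct, since the message does not name the real one. In the kernel the array would additionally be annotated with __counted_by(count) and sized with struct_size(); both are omitted here to keep the example portable.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical struct; the commit message does not name the real one. */
struct mock_table {
    size_t count;   /* number of valid entries in vals[] */
    int vals[];     /* flexible array member, sized at allocation time */
};

/* Allocate header and entries in one zeroed block; returns NULL on failure. */
static struct mock_table *mock_table_alloc(size_t count)
{
    struct mock_table *t;

    /* Reject counts whose total byte size would overflow size_t. */
    if (count > (SIZE_MAX - sizeof(*t)) / sizeof(t->vals[0]))
        return NULL;

    t = calloc(1, sizeof(*t) + count * sizeof(t->vals[0]));
    if (t != NULL)
        t->count = count;   /* set the count before any access to vals[] */
    return t;
}
```

One allocation replaces the separate header and array allocations, and a single free() releases both.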
|
|
Add /sys/module/*/import_ns to expose imported namespaces for
currently loaded modules. The file contains one namespace per line and
only exists for modules that import at least one namespace.
Previously, the only way for userspace to inspect the symbol
namespaces a module imports was to locate the .ko on disk and invoke
modinfo(8) to decompress/parse the metadata. The kernel validated
namespaces at load time, but the information was otherwise discarded.
Exposing this data via sysfs provides a runtime mechanism to verify
which namespaces are being used by modules. For example, this allows
userspace to audit driver API access in Android GKI, which uses symbol
namespaces to restrict vendor drivers from using specific kernel
interfaces (e.g., direct filesystem access).
Signed-off-by: Nicholas Sielicki <linux@opensource.nslick.com>
[Sami: Updated the commit message to explain motivation.]
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
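
A userspace reader for the new file might look like the sketch below. It relies only on the path and the one-namespace-per-line format described above. The function names are hypothetical, and the fixed buffer size is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdio.h>

/* Count non-empty lines in a NUL-terminated buffer. */
static int count_namespace_lines(const char *text)
{
    int count = 0;
    int in_line = 0;

    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            in_line = 0;
        } else if (!in_line) {
            in_line = 1;
            count++;
        }
    }
    return count;
}

/*
 * Number of namespaces imported by a loaded module, or -1 if the file
 * cannot be opened (module not loaded, no imports, or no kernel support).
 * Assumes the file content fits in the local buffer.
 */
static int module_import_ns_count(const char *module)
{
    char path[256];
    char buf[8192];
    size_t len;
    FILE *f;
    int n;

    n = snprintf(path, sizeof(path), "/sys/module/%s/import_ns", module);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    len = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[len] = '\0';
    return count_namespace_lines(buf);
}
```

Because the file only exists for modules that import at least one namespace, a caller cannot distinguish "no imports" from "module not loaded" using this file alone.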
|
|
Cross-merge networking fixes after downstream PR (net-7.0-rc5).
net/netfilter/nft_set_rbtree.c
598adea720b97 ("netfilter: revert nft_set_rbtree: validate open interval overlap")
3aea466a43998 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
https://lore.kernel.org/abgaQBpeGstdN4oq@sirena.org.uk
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When io_should_commit() returns true (e.g. for non-pollable files), buffer
commit happens at buffer selection time and sel->buf_list is set to
NULL. When __io_put_kbufs() generates CQE flags at completion time, it
calls __io_put_kbuf_ring() which finds a NULL buffer_list and hence
cannot determine whether the buffer was consumed or not. This means that
IORING_CQE_F_BUF_MORE is never set for non-pollable input with
incrementally consumed buffers.
Likewise for io_buffers_select(), which always commits upfront and
discards the return value of io_kbuf_commit().
Add REQ_F_BUF_MORE to store the result of io_kbuf_commit() during early
commit. Then __io_put_kbuf_ring() can check this flag and set
IORING_CQE_F_BUF_MORE accordingly.
Reported-by: Martin Michaelis <code@mgjm.de>
Cc: stable@vger.kernel.org
Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption")
Link: https://github.com/axboe/liburing/issues/1553
Signed-off-by: Jens Axboe <axboe@kernel.dk>
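
The mechanism can be modelled in a few lines of userspace C. Everything named mock_* or MOCK_* below is hypothetical, including the bit values and the convention that the commit function returns true once the buffer is fully consumed; the sketch only shows how a result recorded at early-commit time can be replayed at completion time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real flags live in the kernel headers. */
#define MOCK_REQ_F_BUF_MORE (1u << 0)   /* per-request flag */
#define MOCK_CQE_F_BUF_MORE (1u << 4)   /* completion flag seen by userspace */

struct mock_req {
    uint32_t flags;
    bool has_buf_list;  /* false once an early commit dropped the buffer list */
};

/* Assumed convention: returns true when the buffer is fully consumed. */
static bool mock_kbuf_commit(unsigned int buf_len, unsigned int used)
{
    return used >= buf_len;
}

/* Early commit at selection time: remember the outcome in the request. */
static void mock_early_commit(struct mock_req *req, unsigned int buf_len,
                              unsigned int used)
{
    if (!mock_kbuf_commit(buf_len, used))
        req->flags |= MOCK_REQ_F_BUF_MORE;
    req->has_buf_list = false;
}

/* Completion time: with no buffer list left, fall back to the saved flag. */
static uint32_t mock_put_kbuf_cqe_flags(const struct mock_req *req)
{
    uint32_t cflags = 0;

    if (!req->has_buf_list && (req->flags & MOCK_REQ_F_BUF_MORE))
        cflags |= MOCK_CQE_F_BUF_MORE;
    return cflags;
}
```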
|
|
This attempts to fix regressions caused by reusing ident, which apparently
is not handled well on certain stacks and causes the stack to stop
responding to requests. Instead of simply returning the first unallocated
id, store the last used tx_ident and then attempt to use the next one until
all available ids are exhausted, then cycle back to 1.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221120
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221177
Fixes: 6c3ea155e5ee ("Bluetooth: L2CAP: Fix not tracking outstanding TX ident")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Christian Eggers <ceggers@arri.de>
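
The allocation order described above can be sketched as follows. The pool structure, the function names and the 1..255 range are assumptions made for illustration; the kernel tracks outstanding idents with its own data structures.

```c
#include <stdbool.h>
#include <stdint.h>

#define MOCK_IDENT_MAX 255  /* assumed usable range 1..255; 0 means "none" */

struct mock_ident_pool {
    bool in_use[MOCK_IDENT_MAX + 1];
    uint8_t last;           /* last ident handed out, 0 before first use */
};

/*
 * Return the next free ident after the last one used, wrapping back to 1.
 * Returns 0 if every ident is currently in use.
 */
static uint8_t mock_ident_alloc(struct mock_ident_pool *pool)
{
    for (unsigned int step = 1; step <= MOCK_IDENT_MAX; step++) {
        unsigned int cand = (pool->last + step - 1) % MOCK_IDENT_MAX + 1;

        if (!pool->in_use[cand]) {
            pool->in_use[cand] = true;
            pool->last = (uint8_t)cand;
            return (uint8_t)cand;
        }
    }
    return 0;
}

/* Mark an ident as free again; it is not reused until the cycle wraps. */
static void mock_ident_free(struct mock_ident_pool *pool, uint8_t ident)
{
    if (ident != 0)
        pool->in_use[ident] = false;
}
```

With this order, an ident that was just released is not handed out again until the allocator has cycled through the rest of the range, which is the behaviour the fix relies on.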
|
|
Add the relevant IFC bits for querying an extra migration state from the
device.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20260317161753.18964-5-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Introduce a core helper function for VFIO_MIG_GET_PRECOPY_INFO and adapt
all drivers to use it.
It centralizes the common code and ensures that output flags are cleared
on entry when the user opts in to VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2.
This prevents any unintended echoing of userspace data back to
userspace.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20260317161753.18964-4-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Currently, existing VFIO_MIG_GET_PRECOPY_INFO implementations don't
assign info.flags before copy_to_user().
Because they copy the struct in from userspace first, this effectively
echoes userspace-provided flags back as output, preventing the field
from being used to report new reliable data from the drivers.
Add support for a new device feature named
VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2.
On SET, it enables the v2 pre_copy_info behaviour, where
vfio_precopy_info.flags is a valid output field.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20260317161753.18964-3-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
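
The difference between the legacy and v2 behaviour can be modelled in userspace C. The struct below is a simplified stand-in for the ioctl argument, and mock_get_precopy_info() and its parameters are hypothetical; the sketch assumes the structure has already been copied in from userspace, as the message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the ioctl argument structure. */
struct mock_precopy_info {
    uint32_t argsz;
    uint32_t flags;
    uint64_t initial_bytes;
    uint64_t dirty_bytes;
};

/*
 * Hypothetical model of the ioctl body. 'info' already holds the bytes
 * copied in from userspace. With v2 enabled, flags is overwritten so that
 * only driver-provided bits are returned; otherwise it is left untouched
 * and the userspace-provided value is echoed back.
 */
static void mock_get_precopy_info(struct mock_precopy_info *info,
                                  bool v2_enabled, uint32_t driver_flags,
                                  uint64_t initial, uint64_t dirty)
{
    if (v2_enabled)
        info->flags = driver_flags;

    info->initial_bytes = initial;
    info->dirty_bytes = dirty;
}
```

Making the new behaviour opt-in keeps existing userspace working, since callers that never enabled the feature see exactly what they saw before.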
|
|
As currently defined, initial_bytes is monotonically decreasing and
precedes dirty_bytes when reading from the saving file descriptor.
The transition from initial_bytes to dirty_bytes is unidirectional and
irreversible.
The initial_bytes are considered critical data that is highly
recommended to be transferred to the target as part of PRE_COPY. Without
this data, the PRE_COPY phase would be ineffective.
This change addresses the case where a new chunk of critical data is
introduced during the PRE_COPY phase and the driver would like to report
an entirely new value for the initial_bytes.
For that, we extend the VFIO_MIG_GET_PRECOPY_INFO ioctl with an output
flag named VFIO_PRECOPY_INFO_REINIT to allow drivers reporting a new
initial_bytes value during the PRE_COPY phase.
Currently, existing VFIO_MIG_GET_PRECOPY_INFO implementations don't
assign info.flags before copy_to_user(), this effectively echoes
userspace-provided flags back as output, preventing the field from being
used to report new reliable data from the drivers.
Reliable use of the new VFIO_PRECOPY_INFO_REINIT flag requires userspace
to explicitly opt in by enabling the
VFIO_DEVICE_FEATURE_MIG_PRECOPY_INFOv2 device feature.
When the caller opts in, the driver may report an entirely new
value for initial_bytes. It may be larger, it may be smaller, it may
include the previous unread initial_bytes, it may discard the previous
unread initial_bytes, up to the driver logic and state.
The presence of the VFIO_PRECOPY_INFO_REINIT output flag set by the
driver indicates that new initial data is present on the stream.
Once the caller sees this flag, the initial_bytes value should be
re-evaluated relative to the readiness state for transition to
STOP_COPY.
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/20260317161753.18964-2-yishaih@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
The file contains a spelling error in a source comment (succes).
Typos in comments reduce readability and make text searches less reliable
for developers and maintainers.
Replace 'succes' with 'success' in the affected comment. This is a
comment-only cleanup and does not change behavior.
Signed-off-by: Joseph Salisbury <joseph.salisbury@oracle.com>
Link: https://lore.kernel.org/r/20260316185617.166414-1-joseph.salisbury@oracle.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless, Bluetooth and netfilter.
Nothing too exciting here, mostly fixes for corner cases.
Current release - fix to a fix:
- bonding: prevent potential infinite loop in bond_header_parse()
Current release - new code bugs:
- wifi: mac80211: check tdls flag in ieee80211_tdls_oper
Previous releases - regressions:
- af_unix: give up GC if MSG_PEEK intervened
- netfilter: conntrack: add missing netlink policy validations
- NFC: nxp-nci: allow GPIOs to sleep"
* tag 'net-7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits)
MPTCP: fix lock class name family in pm_nl_create_listen_socket
icmp: fix NULL pointer dereference in icmp_tag_validation()
net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error paths
net: shaper: protect from late creation of hierarchy
net: shaper: protect late read accesses to the hierarchy
net: mvpp2: guard flow control update with global_tx_fc in buffer switching
nfnetlink_osf: validate individual option lengths in fingerprints
netfilter: nf_tables: release flowtable after rcu grace period on error
netfilter: bpf: defer hook memory release until rcu readers are done
net: bonding: fix NULL deref in bond_debug_rlb_hash_show
udp_tunnel: fix NULL deref caused by udp_sock_create6 when CONFIG_IPV6=n
net/mlx5e: Fix race condition during IPSec ESN update
net/mlx5e: Prevent concurrent access to IPSec ASO context
net/mlx5: qos: Restrict RTNL area to avoid a lock cycle
ipv6: add NULL checks for idev in SRv6 paths
NFC: nxp-nci: allow GPIOs to sleep
net: macb: fix uninitialized rx_fs_lock
net: macb: fix use-after-free access to PTP clock
netdevsim: drop PSP ext ref on forward failure
wifi: mac80211: always free skb on ieee80211_tx_prepare_skb() failure
...
|
|
GICv5 systems will likely not support the full set of PPIs. The
presence of any virtual PPI is tied to the presence of the physical
PPI. Therefore, the available PPIs will be limited by the physical
host. Userspace cannot drive any PPIs that are not implemented.
Moreover, it is not desirable to expose all PPIs to the guest in the
first place, even if they are supported in hardware. Some devices,
such as the arch timer, are implemented in KVM, and hence those PPIs
shouldn't be driven by userspace, either.
Provide a new UAPI:
KVM_DEV_ARM_VGIC_GRP_CTRL => KVM_DEV_ARM_VGIC_USERSPACE_PPIs
This allows userspace to query which PPIs it is able to drive via
KVM_IRQ_LINE.
Additionally, introduce a check in kvm_vm_ioctl_irq_line() to reject
any PPIs not in the userspace mask.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-40-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
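
The mask check can be illustrated with a small userspace model. The mock_* names are hypothetical, and the figure of 128 PPIs, held in two 64-bit words, is an assumption of this sketch rather than something stated in the message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_NR_PPIS 128    /* assumed number of PPIs for this sketch */

/* One bit per PPI that userspace is allowed to drive. */
struct mock_ppi_mask {
    uint64_t bits[MOCK_NR_PPIS / 64];
};

static bool mock_ppi_allowed(const struct mock_ppi_mask *mask, unsigned int ppi)
{
    if (ppi >= MOCK_NR_PPIS)
        return false;
    return (mask->bits[ppi / 64] >> (ppi % 64)) & 1u;
}

/* Model of the irq_line check: reject PPIs userspace may not drive. */
static int mock_irq_line(const struct mock_ppi_mask *mask, unsigned int ppi)
{
    if (!mock_ppi_allowed(mask, ppi))
        return -EINVAL;
    /* ... the interrupt would be injected here ... */
    return 0;
}
```

Userspace can read the same mask through the query interface first, so a rejection here should only happen for callers that skipped the query.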
|
|
Only the KVM_DEV_ARM_VGIC_GRP_CTRL->KVM_DEV_ARM_VGIC_CTRL_INIT op is
currently supported. All other ops are stubbed out.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-36-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Make it mandatory to use the architected PPI when running a GICv5
guest. Attempts to set anything other than the architected PPI (23)
are rejected.
Additionally, KVM_ARM_VCPU_PMU_V3_INIT is relaxed to no longer require
KVM_ARM_VCPU_PMU_V3_IRQ to be called for GICv5-based guests. In this
case, the architected PPI is automatically used.
Documentation is bumped accordingly.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-33-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Now that GICv5 has arrived, the arch timer requires some TLC to
address some of the key differences introduced with GICv5.
For PPIs on GICv5, the queue_irq_unlock irq_op is used as AP lists are
not required at all for GICv5. The arch timer also introduces an
irq_op - get_input_level. Extend the arch-timer-provided irq_ops to
include the PPI op for vgic_v5 guests.
When possible, DVI (Direct Virtual Interrupt) is set for PPIs when
using a vgic_v5, which directly injects the pending state into the
guest. This means that the host never sees these interrupts on behalf
of the guest. This has three impacts.
* First of all, the kvm_cpu_has_pending_timer check is updated to
explicitly check if the timers are expected to fire.
* Secondly, mapped timers (which use DVI) must be masked on
the host prior to entering a GICv5 guest, and unmasked on the return
path. This is handled in set_timer_irq_phys_masked.
* Thirdly, it makes zero sense to attempt to inject state for a DVI'd
interrupt. Track which timers are direct, and skip the call to
kvm_vgic_inject_irq() for these.
The final, but rather important, change is that the architected PPIs
for the timers are made mandatory for a GICv5 guest. Attempts to set
them to anything else are actively rejected. Once a vgic_v5 is
initialised, the arch timer PPIs are also explicitly reinitialised to
ensure the correct GICv5-compatible PPIs are used - this also adds in
the GICv5 PPI type to the intid.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-32-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Update kvm_vgic_create to create a vgic_v5 device. When creating a
vgic, FEAT_GCIE in the ID_AA64PFR2 is only exposed to vgic_v5-based
guests, and is hidden otherwise. GIC in ID_AA64PFR0_EL1 is never
exposed for a vgic_v5 guest.
When initialising a vgic_v5, skip kvm_vgic_dist_init as GICv5 doesn't
support one. The current vgic_v5 implementation only supports PPIs, so
no SPIs are initialised either.
The current vgic_v5 support doesn't extend to nested guests. Therefore,
vgic_v5_init fails the init of a vgic_v5 for a nested guest.
As the current vgic_v5 doesn't require any resources to be mapped,
vgic_v5_map_resources is simply used to check that the vgic has indeed
been initialised. Again, this will change as more GICv5 support is
merged in.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-29-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
GICv5 adds support for directly injected PPIs. The mechanism for
setting this up is GICv5 specific, so rather than adding
GICv5-specific code to the common vgic code, we introduce a new
irq_op.
This new irq_op is intended to be used to enable or disable direct
injection for interrupts that support it. As it is an irq_op, it has
no effect unless explicitly populated in the irq_ops structure for a
particular interrupt. The usage is demonstrated in the subsequent
change.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-26-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
There are times when the default behaviour of vgic_queue_irq_unlock()
is undesirable. This is because some GICs, such as GICv5 which is the
main driver for this change, handle the majority of the interrupt
lifecycle in hardware. In this case, there is no need for a per-VCPU
AP list as the interrupt can be made pending directly. This is done
either via the ICH_PPI_x_EL2 registers for PPIs, or with the VDPEND
system instruction for SPIs and LPIs.
The vgic_queue_irq_unlock() function is made overridable using a new
function pointer in struct irq_ops. vgic_queue_irq_unlock() is
overridden if the function pointer is non-null.
This new irq_op is unused in this change - it is purely providing the
infrastructure itself. The subsequent PPI injection changes provide a
demonstration of the usage of the queue_irq_unlock irq_op.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-20-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
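
The override mechanism is the familiar optional-function-pointer pattern. The sketch below is a userspace model with hypothetical mock_* types; the counters exist only so the two paths can be told apart and do not correspond to anything in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_irq;

/* Hypothetical subset of an irq_ops table. */
struct mock_irq_ops {
    /* Optional override; returns true if the interrupt was queued. */
    bool (*queue_irq_unlock)(struct mock_irq *irq);
};

struct mock_irq {
    const struct mock_irq_ops *ops; /* may be NULL */
    int ap_list_adds;               /* counts default-path queueing */
    int direct_pends;               /* counts override-path pending */
};

/* Default path: queue the interrupt on a per-VCPU list. */
static bool mock_default_queue(struct mock_irq *irq)
{
    irq->ap_list_adds++;
    return true;
}

/* Example override: make the interrupt pending without any list. */
static bool mock_direct_pend(struct mock_irq *irq)
{
    irq->direct_pends++;
    return true;
}

/* Use the override when the function pointer is non-NULL. */
static bool mock_queue_irq_unlock(struct mock_irq *irq)
{
    if (irq->ops != NULL && irq->ops->queue_irq_unlock != NULL)
        return irq->ops->queue_irq_unlock(irq);
    return mock_default_queue(irq);
}
```

Interrupts without an ops table, or with a table that leaves the pointer NULL, keep the default behaviour unchanged.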
|
|
We only want to expose a subset of the PPIs to a guest. If a PPI does
not have an owner, it is not being actively driven by a device. The
SW_PPI is a special case, as it is likely for userspace to wish to
inject that.
Therefore, just prior to running the guest for the first time, we need
to finalize the PPIs. A mask is generated which, when combined with
trapping a guest's PPI accesses, allows for the guest's view of the
PPI to be filtered. This mask is global to the VM as all VCPUs' PPI
configurations must match.
In addition, the PPI HMR is calculated.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-19-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This change introduces GICv5 load/put. Additionally, it plumbs in
save/restore for:
* PPIs (ICH_PPI_x_EL2 regs)
* ICH_VMCR_EL2
* ICH_APR_EL2
* ICC_ICSR_EL1
A GICv5-specific enable bit is added to struct vgic_vmcr as this
differs from previous GICs. On GICv5-native systems, the VMCR only
contains the enable bit (driven by the guest via ICC_CR0_EL1.EN) and
the priority mask (PCR).
A struct gicv5_vpe is also introduced. This currently only contains a
single field - bool resident - which is used to track if a VPE is
currently running or not, and is used to avoid a case of double load
or double put on the WFI path for a vCPU. This struct will be extended
as additional GICv5 support is merged, specifically for VPE doorbells.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-18-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
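
The role of the resident flag can be shown with a tiny userspace model. The mock_vpe struct and its counters are hypothetical; the real struct gicv5_vpe is described above as holding only the bool.

```c
#include <stdbool.h>

/* Hypothetical model: the counters stand in for real hardware accesses. */
struct mock_vpe {
    bool resident;  /* true while the VPE state is loaded */
    int hw_loads;   /* number of modelled hardware loads */
    int hw_puts;    /* number of modelled hardware puts */
};

/* Load the VPE unless it is already resident. */
static void mock_vpe_load(struct mock_vpe *vpe)
{
    if (vpe->resident)
        return;     /* already loaded: avoid a double load */
    vpe->hw_loads++;
    vpe->resident = true;
}

/* Put the VPE unless it has already been put. */
static void mock_vpe_put(struct mock_vpe *vpe)
{
    if (!vpe->resident)
        return;     /* not loaded: avoid a double put */
    vpe->hw_puts++;
    vpe->resident = false;
}
```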
|
|
Introduce the following hyp functions to save/restore GICv5 state:
* __vgic_v5_save_apr()
* __vgic_v5_restore_vmcr_apr()
* __vgic_v5_save_ppi_state() - no hypercall required
* __vgic_v5_restore_ppi_state() - no hypercall required
* __vgic_v5_save_state() - no hypercall required
* __vgic_v5_restore_state() - no hypercall required
Note that the functions tagged as not requiring hypercalls are always
called directly from the same context. They are either called via the
vgic_save_state()/vgic_restore_state() path when running with VHE, or
via __hyp_vgic_save_state()/__hyp_vgic_restore_state() otherwise. This
mimics how vgic_v3_save_state()/vgic_v3_restore_state() are
implemented.
Overall, the state of the following registers is saved/restored:
* ICC_ICSR_EL1
* ICH_APR_EL2
* ICH_PPI_ACTIVERx_EL2
* ICH_PPI_DVIRx_EL2
* ICH_PPI_ENABLERx_EL2
* ICH_PPI_PENDRx_EL2
* ICH_PPI_PRIORITYRx_EL2
* ICH_VMCR_EL2
All of these are saved/restored to/from the KVM vgic_v5 CPUIF shadow
state, with the exception of the PPI active, pending, and enable
state. The pending state is saved and restored from kvm_host_data as
any changes here need to be tracked and propagated back to the
vgic_irq shadow structures (coming in a future commit). Therefore, an
entry and an exit copy is required. The active and enable state is
restored from the vgic_v5 CPUIF, but is saved to kvm_host_data. Again,
this needs to be synced back into the shadow data structures.
The ICSR must be saved/restored as this register is shared between host
and guest. Therefore, to avoid leaking host state to the guest, this
must be saved and restored. Moreover, as this can be used by the host
at any time, it must be saved/restored eagerly. Note: the host state is
not preserved as the host should only use this register when
preemption is disabled.
As with GICv3, the VMCR is eagerly saved as this is required when
checking if interrupts can be injected or not, and therefore impacts
things such as WFI.
As part of restoring the ICH_VMCR_EL2 and ICH_APR_EL2, GICv3-compat
mode is also disabled by setting the ICH_VCTLR_EL2.V3 bit to 0. The
corresponding GICv3-compat mode enable is part of the VMCR & APR
restore for a GICv3 guest as it only takes effect when actually
running a guest.
Co-authored-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-17-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add in a sanitization function for ID_AA64PFR2_EL1, preserving the
already-present behaviour for the FPMR, MTEFAR, and MTESTOREONLY
fields. Add sanitisation for the GCIE field, which is set to IMP if
the host supports a GICv5 guest, and to NI otherwise.
Extend the sanitisation that takes place in kvm_vgic_create() to zero
the ID_AA64PFR2.GCIE field when a non-GICv5 GIC is created. More
importantly, move this sanitisation to a separate function,
kvm_vgic_finalize_sysregs(), and call it from kvm_finalize_sys_regs().
We are required to finalize the GIC and GCIE fields a second time in
kvm_finalize_sys_regs() due to how QEMU blindly reads out then
verbatim restores the system register state. This avoids the issue
where both the GCIE and GIC features are marked as present (an
architecturally invalid combination), and hence guests fall over. See
the comment in kvm_finalize_sys_regs() for more details.
Overall, the following happens:
* Before an irqchip is created, FEAT_GCIE is presented if the host
supports GICv5-based guests.
* Once an irqchip is created, all other supported irqchips are hidden
from the guest; system register state reflects the guest's irqchip.
* Userspace is allowed to set invalid irqchip feature combinations in
the system registers, but...
* ...invalid combinations are removed a second time prior to the first
run of the guest, and things hopefully just work.
All of this extra work is required to make sure that "legacy" GICv3
guests based on QEMU transparently work on compatible GICv5 hosts
without modification.
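
The finalisation step listed above can be sketched in plain user-space C as
follows. This is only an illustration under stated assumptions: the
ex_idregs structure, the ex_irqchip enum and ex_finalize_irqchip_fields()
are hypothetical names, the two booleans stand in for the GIC and GCIE ID
register fields, and the case without an in-kernel irqchip is deliberately
not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the guest's irqchip model. */
enum ex_irqchip {
	EX_IRQCHIP_NONE,
	EX_IRQCHIP_GICV3,
	EX_IRQCHIP_GICV5,
};

/* Hypothetical stand-in for the two ID register fields discussed above. */
struct ex_idregs {
	bool gic;	/* GICv3 system register interface advertised */
	bool gcie;	/* GICv5 CPU interface (GCIE) advertised */
};

/*
 * Remove the feature that does not match the created irqchip, so that
 * "both present" can never reach the guest, even if userspace restored
 * such a combination verbatim.
 */
static void ex_finalize_irqchip_fields(struct ex_idregs *regs,
				       enum ex_irqchip chip)
{
	switch (chip) {
	case EX_IRQCHIP_GICV3:
		regs->gcie = false;
		break;
	case EX_IRQCHIP_GICV5:
		regs->gic = false;
		break;
	default:
		/* No irqchip created: not modelled in this sketch. */
		break;
	}
}
```

Running the same function a second time on already-sanitised state is a
no-op, which is what makes a repeated call before the first guest run
harmless in this sketch.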
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-13-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
As part of booting the system and initialising KVM, create and
populate a mask of the implemented PPIs. This mask allows future PPI
operations (such as save/restore of state, or syncing back into the
shadow state) to only consider PPIs that are actually implemented on
the host.
The set of implemented virtual PPIs matches the set of implemented
physical PPIs for a GICv5 host. Therefore, this mask represents all
PPIs that could ever be used by a GICv5-based guest on a specific
host, albeit pre-filtered by what we support in KVM (see next
paragraph).
Only architected PPIs are currently supported in KVM with
GICv5. Moreover, as KVM only supports a subset of all possible PPIs
(Timers, PMU, GICv5 SW_PPI) the PPI mask only includes these PPIs, if
present. The timers are always assumed to be present; if we have KVM
we have EL2, which means that we have the EL1 & EL2 Timer PPIs. If we
have a PMU (v3), then the PMUIRQ is present. The GICv5 SW_PPI is
always assumed to be present.
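
A minimal sketch of how such a mask could be assembled, written as plain
user-space C. The PPI numbers, the EX_ prefixed names and the
ex_build_ppi_mask() / ex_ppi_implemented() helpers are placeholders chosen
for illustration and are not taken from the patch; only the policy (timers
and SW_PPI always present, PMUIRQ only with a PMU) follows the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder PPI numbers, for illustration only. */
enum {
	EX_PPI_SW	= 3,
	EX_PPI_PMUIRQ	= 23,
	EX_PPI_TIMER_A	= 26,
	EX_PPI_TIMER_B	= 27,
	EX_PPI_TIMER_C	= 28,
	EX_PPI_TIMER_D	= 30,
};

/* Build the mask once, at init time. */
static uint64_t ex_build_ppi_mask(bool has_pmuv3)
{
	uint64_t mask = 0;

	/* Timers and the SW_PPI are assumed to always be present. */
	mask |= UINT64_C(1) << EX_PPI_SW;
	mask |= UINT64_C(1) << EX_PPI_TIMER_A;
	mask |= UINT64_C(1) << EX_PPI_TIMER_B;
	mask |= UINT64_C(1) << EX_PPI_TIMER_C;
	mask |= UINT64_C(1) << EX_PPI_TIMER_D;

	/* The PMU interrupt only exists if a PMU does. */
	if (has_pmuv3)
		mask |= UINT64_C(1) << EX_PPI_PMUIRQ;

	return mask;
}

/* Later PPI operations can skip anything not set in the mask. */
static bool ex_ppi_implemented(uint64_t mask, unsigned int ppi)
{
	return ppi < 64 && ((mask >> ppi) & 1);
}
```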
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-12-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
GICv5 has moved from using interrupt ranges for different interrupt
types to using some of the upper bits of the interrupt ID to denote
the interrupt type. This is not compatible with older GICs (which rely
on ranges of interrupts to determine the type), and hence a set of
helpers is introduced. These helpers take a struct kvm*, and use the
vgic model to determine how to interpret the interrupt ID.
Helpers are introduced for PPIs, SPIs, and LPIs. Additionally, a
helper is introduced to determine if an interrupt is private - SGIs
and PPIs for older GICs, and PPIs only for GICv5.
Additionally, vgic_is_v5() is introduced (which unsurprisingly returns
true when running a GICv5 guest), and the existing vgic_is_v3() check
is moved from vgic.h to arm_vgic.h (to live alongside the vgic_is_v5()
one), and has been converted into a macro.
The helpers are plumbed into the core vgic code, as well as the Arch
Timer and PMU code.
There should be no functional changes as part of this change.
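
The idea behind the helpers can be sketched in user-space C as below. The
ex_ names are hypothetical and do not match the helpers added by the patch.
The range checks use the GICv3 INTID ranges (SGIs 0-15, PPIs 16-31, SPIs
32-1019, LPIs from 8192). The GICv5 encoding (type in bits [31:29], with
PPI = 1, LPI = 2, SPI = 3) is an assumption made for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vgic model of a VM. */
enum ex_vgic_model { EX_VGIC_V3, EX_VGIC_V5 };

/* Assumed GICv5 layout: interrupt type in the top three bits of the ID. */
#define EX_V5_TYPE_SHIFT	29
#define EX_V5_TYPE_PPI		1u
#define EX_V5_TYPE_LPI		2u
#define EX_V5_TYPE_SPI		3u

static uint32_t ex_v5_type(uint32_t intid)
{
	return intid >> EX_V5_TYPE_SHIFT;
}

static bool ex_irq_is_ppi(enum ex_vgic_model model, uint32_t intid)
{
	if (model == EX_VGIC_V5)
		return ex_v5_type(intid) == EX_V5_TYPE_PPI;
	return intid >= 16 && intid < 32;
}

static bool ex_irq_is_spi(enum ex_vgic_model model, uint32_t intid)
{
	if (model == EX_VGIC_V5)
		return ex_v5_type(intid) == EX_V5_TYPE_SPI;
	return intid >= 32 && intid <= 1019;
}

static bool ex_irq_is_lpi(enum ex_vgic_model model, uint32_t intid)
{
	if (model == EX_VGIC_V5)
		return ex_v5_type(intid) == EX_V5_TYPE_LPI;
	return intid >= 8192;
}

/* Private: SGIs and PPIs on older GICs, PPIs only on GICv5. */
static bool ex_irq_is_private(enum ex_vgic_model model, uint32_t intid)
{
	if (model == EX_VGIC_V5)
		return ex_irq_is_ppi(model, intid);
	return intid < 32;
}
```

The same numeric ID can therefore classify differently depending on the
model, which is why every helper in this sketch takes the model as an
argument instead of looking at the ID alone.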
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-10-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This is the base GICv5 device which is to be used with the
KVM_CREATE_DEVICE ioctl to create a GICv5-based vgic.
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260319154937.3619520-9-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Prior to this change, the act of mapping a virtual IRQ to a physical
one also set the irq_ops. Unmapping then reset the irq_ops to NULL. So
far, this has been fine and hasn't caused any major issues.
Now, however, as GICv5 support is being added to KVM, it has become
apparent that conflating mapping/unmapping IRQs and setting/clearing
irq_ops can cause issues. The reason is that the upcoming GICv5
support introduces a set of default irq_ops for PPIs, and removing
this when unmapping will cause things to break rather horribly.
Split out the mapping/unmapping of IRQs from the setting/clearing of
irq_ops. The arch timer code is updated to set the irq_ops following a
successful map. The irq_ops are intentionally not removed again on an
unmap as the only irq_op introduced by the arch timer only takes
effect if the hw bit in struct vgic_irq is set. Therefore, it is safe
to leave this in place, and it avoids additional complexity when GICv5
support is introduced.
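
The separation described above can be sketched as follows, in user-space C.
The structures and function names (ex_irq, ex_irq_ops, ex_map_phys_irq(),
ex_unmap_phys_irq(), ex_set_irq_ops()) are hypothetical simplifications and
are not the kernel's definitions; the point is only that unmapping no
longer touches the ops pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the vgic structures. */
struct ex_irq_ops {
	const char *name;
};

struct ex_irq {
	bool hw;			/* backed by a physical interrupt */
	unsigned int host_irq;
	const struct ex_irq_ops *ops;
};

/* Mapping only records the physical interrupt. */
static void ex_map_phys_irq(struct ex_irq *irq, unsigned int host_irq)
{
	irq->hw = true;
	irq->host_irq = host_irq;
}

/* Unmapping clears the mapping but leaves irq->ops alone. */
static void ex_unmap_phys_irq(struct ex_irq *irq)
{
	irq->hw = false;
	irq->host_irq = 0;
}

/* Setting the ops is a separate, explicit step. */
static void ex_set_irq_ops(struct ex_irq *irq, const struct ex_irq_ops *ops)
{
	irq->ops = ops;
}
```

A caller in this sketch would map first and install its ops only after the
map succeeded; any default ops installed earlier survive a later unmap.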
Signed-off-by: Sascha Bischoff <sascha.bischoff@arm.com>
Link: https://patch.msgid.link/20260319154937.3619520-6-sascha.bischoff@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Add an io_uring command handler to the generic BSG layer. The new
.uring_cmd file operation validates io_uring features and delegates
handling to a per-queue bsg_uring_cmd_fn callback.
Extend bsg_register_queue() so transport drivers can register both
sg_io and io_uring command handlers.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-3-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Add the bsg_uring_cmd structure to the BSG UAPI header to support
io_uring-based SCSI passthrough operations via IORING_OP_URING_CMD.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260317072226.2598233-2-yangxiuwei@kylinos.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For a short time this file was used by the SHA-256 and Poly1305 library
code, but they are no longer using it. Remove this unused file.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20260314173526.17349-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
|
|
This chip identifies as Tegra238, so update the BPMP ABI header to refer
to it by the correct name.
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add support for HID devices that report multiple batteries, each
identified by its report ID.
The hid_device->battery pointer is replaced with a batteries list.
Batteries are named using the pattern hid-{uniq}-battery-{report_id}.
The hid_get_battery() helper returns the first battery in the list for
backwards compatibility with single-battery drivers.
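
As an illustration of the naming pattern and the first-battery lookup, here
is a small user-space C sketch. The ex_ structures and functions are
hypothetical and use a simple singly linked list in place of the kernel's
list type; printing the report ID in decimal is also an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the HID structures. */
struct ex_battery {
	unsigned int report_id;
	struct ex_battery *next;
};

struct ex_hid_device {
	const char *uniq;
	struct ex_battery *batteries;	/* head of the battery list */
};

/* Format a name following the hid-{uniq}-battery-{report_id} pattern. */
static int ex_battery_name(char *buf, size_t len,
			   const struct ex_hid_device *hdev,
			   const struct ex_battery *bat)
{
	return snprintf(buf, len, "hid-%s-battery-%u",
			hdev->uniq, bat->report_id);
}

/* Single-battery callers keep working by taking the first list entry. */
static struct ex_battery *ex_get_battery(const struct ex_hid_device *hdev)
{
	return hdev->batteries;
}
```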
Signed-off-by: Lucas Zampieri <lcasmz54@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
Introduce struct hid_battery to encapsulate individual battery state,
preparing for future multi-battery support.
The new structure contains all battery-related fields previously stored
directly in hid_device (capacity, min, max, report_type, report_id,
charge_status, etc.). The hid_device->battery pointer type changes from
struct power_supply* to struct hid_battery*, and all battery functions
are refactored accordingly.
A hid_get_battery() helper is added for external drivers, with
hid-apple.c and hid-magicmouse.c updated to use the new API. The
hid-input-test.c KUnit tests are also updated for the new structure.
No functional changes for single-battery devices.
Signed-off-by: Lucas Zampieri <lcasmz54@gmail.com>
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Aside from various small improvements/cleanups, not much:
- cfg80211/mac80211: S1G and UHR improvements
- hwsim: incumbent signal report test support
* tag 'wireless-next-2026-03-19' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (31 commits)
qtnfmac: use alloc_netdev macro for single queue devices
wifi: libertas: don't kill URBs in interrupt context
wifi: libertas: use USB anchors for tracking in-flight URBs
wifi: nl80211: use int for band coming from netlink
wifi: rsi_91x_usb: do not pause rfkill polling when stopping mac80211
wifi: mac80211: fix STA link removal during link removal
wifi: nl80211: reject S1G/60G with HT chantype
wifi: ieee80211: fix definition of EHT-MCS 15 in MRU
wifi: cfg80211: check non-S1G width with S1G chandef
wifi: cfg80211: restrict cfg80211_chandef_create() to only HT-based bands
wifi: mac80211: don't use cfg80211_chandef_create() for default chandef
wifi: mac80211: Remove deleted sta links in ieee80211_ml_reconf_work()
wifi: b43: use register definitions in nphy_op_software_rfkill
wifi: cfg80211: split control freq check from chandef check
wifi: mac80211: always use full chanctx compatible check
wifi: mac80211: refactor chandef tracing macros
wifi: mac80211: validate HE 6 GHz operation when EHT is used
wifi: nl80211: split out UHR operation information
wifi: mwifiex: drop redundant device reference
wifi: rt2x00: drop redundant device reference
...
====================
Link: https://patch.msgid.link/20260319082439.79875-3-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch adds VF610_CLK_ESW and VF610_CLK_ESW_MAC_TAB{0123}
macro definitions for the L2 switch.
Those definitions describe clocks for MoreThanIP switch IP block;
the switch itself and the MAC address lookup table clocks.
Signed-off-by: Lukasz Majewski <lukma@nabladev.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://patch.msgid.link/20260129095442.1646748-4-lukma@nabladev.com
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
|
|
The VF610_CLK_END should be dropped as it is not part of the ABI.
Signed-off-by: Lukasz Majewski <lukma@nabladev.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://patch.msgid.link/20260129095442.1646748-3-lukma@nabladev.com
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
|
|
In network setup as below:
fastpath bypass
.----------------------------------------.
/ \
| IP - forwarding |
| / \ v
| / wan ...
| /
| |
| |
| brlan.1
| |
| +-------------------------------+
| | vlan 1 |
| | |
| | brlan (vlan-filtering) |
| | +---------------+
| | | DSA-SWITCH |
| | vlan 1 | |
| | to | |
| | untagged 1 vlan 1 |
| +---------------+---------------+
. / \
----->wlan1 lan0
. .
. ^
^ vlan 1 tagged packets
untagged packets
br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when
filling in from brlan.1 towards wlan1. But it should be set to
DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV
is not correct. The dsa switchdev adds it as a foreign port.
The same problem exists for all foreignly added dsa vlans on the bridge.
First add the vlan, trying only native devices.
If this fails, we know this may be a vlan from a foreign device.
Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG_HW
is set only when there is no foreign device involved.
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
Link: https://patch.msgid.link/20260317110347.363875-1-ericwouds@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|