| Age | Commit message (Collapse) | Author |
|
The core library's `From` implementations do not cover conversions
that are not portable or future-proof. For instance, even though it is
safe today, `From<usize>` is not implemented for `u64` because of the
possibility to support larger-than-64bit architectures in the future.
However, the kernel supports a narrower set of architectures, and in the
case of Nova we only support 64-bit. This makes it helpful and desirable
to provide more infallible conversions, lest we need to rely on the `as`
keyword and carry the risk of silently losing data.
Thus, introduce a new module `num` that provides safe const functions
performing more conversions allowed by the build target, as well as
`FromSafeCast` and `IntoSafeCast` traits that are just extensions of
`From` and `Into` to conversions that are known to be lossless.
Suggested-by: Danilo Krummrich <dakr@kernel.org>
Link: https://lore.kernel.org/rust-for-linux/DDK4KADWJHMG.1FUPL3SDR26XF@kernel.org/
Acked-by: Danilo Krummrich <dakr@kernel.org>
[acourbot@nvidia.com: fix merge conflicts after rebase.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251029-nova-as-v3-4-6a30c7333ad9@nvidia.com>
|
|
Pull drm fixes from Dave Airlie:
"Back from travel, thanks to Simona for handling things. regular fixes,
seems about the right size, but spread out a bit.
amdgpu has the usual range of fixes, xe has a few fixes, and nouveau
has a couple of fixes, one for blackwell modifiers on 8/16 bit
surfaces.
Otherwise a few small fixes for mediatek, sched, imagination and
pixpaper.
sched:
- Fix deadlock
amdgpu:
- Reset fixes
- Misc fixes
- Panel scaling fixes
- HDMI fix
- S0ix fixes
- Hibernation fix
- Secure display fix
- Suspend fix
- MST fix
amdkfd:
- Process cleanup fix
xe:
- Fix missing synchronization on unbind
- Fix device shutdown when doing FLR
- Fix user fence signaling order
i915:
- Avoid lock inversion when pinning to GGTT on CHV/BXT+VTD
- Fix conversion between clock ticks and nanoseconds
mediatek:
- Disable AFBC support on Mediatek DRM driver
- Add pm_runtime support for GCE power control
imagination:
- kconfig: Fix dependencies
nouveau:
- Set DMA mask earlier
- Advertize correct modifiers for GB20x
pixpaper:
- kconfig: Fix dependencies"
* tag 'drm-fixes-2025-11-08' of https://gitlab.freedesktop.org/drm/kernel: (26 commits)
drm/xe: Enforce correct user fence signaling order using
drm/xe: Do clean shutdown also when using flr
drm/xe: Move declarations under conditional branch
drm/xe/guc: Synchronize Dead CT worker with unbind
drm/amd/display: Enable mst when it's detected but yet to be initialized
drm/amdgpu: Fix wait after reset sequence in S3
drm/amd: Fix suspend failure with secure display TA
drm/amdgpu: fix gpu page fault after hibernation on PF passthrough
drm/tiny: pixpaper: add explicit dependency on MMU
drm/nouveau: Advertise correct modifiers on GB20x
drm: define NVIDIA DRM format modifiers for GB20x
drm/nouveau: set DMA mask before creating the flush page
drm/sched: Fix deadlock in drm_sched_entity_kill_jobs_cb
drm/amd/display: Fix NULL deref in debugfs odm_combine_segments
drm/amdkfd: Don't clear PT after process killed
drm/amdgpu/smu: Handle S0ix for vangogh
drm/amdgpu: Drop PMFW RLC notifier from amdgpu_device_suspend()
drm/amd/display: Fix black screen with HDMI outputs
drm/amd/display: Don't stretch non-native images by default in eDP
drm/amd/pm: fix missing device_attr cleanup in amdgpu_pm_sysfs_init()
...
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix missing synchronization on unbind (Balasubramani Vivekanandan)
- Fix device shutdown when doing FLR (Jouni Högander)
- Fix user fence signaling order (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/mvfyflloncy76a7nmkatpj6f2afddavwsibz3y4u4wo6gznro5@rdulkuh5wvje
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
- Syzkaller found a case where maths overflows can cause divide by 0
- Typo in a compiler bug warning fix in the selftests broke the
selftests
- type1 compatability had a mismatch when unmapping an already unmapped
range, it should succeed
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd: Make vfio_compat's unmap succeed if the range is already empty
iommufd/selftest: Fix ioctl return value in _test_cmd_trigger_vevents()
iommufd: Don't overflow during division for dirty tracking
|
|
Add proper error handling and resource cleanup to prevent memory leaks
in add_boot_memory_ranges(). The function now checks for NULL return
from kobject_create_and_add(), uses local buffer for range names to
avoid dynamic allocation, and implements a cleanup path that removes
previously created sysfs groups and kobjects on failure.
This prevents resource leaks when kobject creation or sysfs group
creation fails during boot memory range initialization.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20251030023228.3956296-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Replace kfree() with ACPI_FREE() in pch_fivr_read() to follow ACPICA
memory management conventions.
While functionally equivalent in Linux (ACPI_FREE() is implemented
as kfree()), using ACPI_FREE() maintains consistency with ACPICA
coding standards for deallocating ACPI buffer objects.
Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
[ rjw: Subject and changelog edits ]
Link: https://patch.msgid.link/20251028051554.2862049-1-kaushlendra.kumar@intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Using the IS_ENABLED() macro in the int340x_thermal_handler_attach()
forces the kernel to be recompiled when thermal drivers are enabled
or disabled, which is a significant limitation of its modularity.
The IS_ENABLED() macro is particularly problematic for the Android
Generic Kernel Image (GKI) project which uses unified core kernel
while SoC/board support is moved to loadable vendor modules.
The Intel Dynamic Platform and Thermal Framework (DPTF) requires
thermal drivers to be loaded at runtime, thus ACPI bus scan handler
is not needed and acpi_default_enumeration() may create all platform
devices, regardless of the actual setting of CONFIG_INT340X_THERMAL.
Signed-off-by: Slawomir Rosek <srosek@google.com>
Link: https://patch.msgid.link/20251103162516.2606158-3-srosek@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The IRQ used by the Intel SoC DTS thermal device for critical
overheating notification is listed in _CRS of device INT3401 which
therefore needs to be enumerated for Intel SoC DTS thermal to work.
The enumeration happens by binding the int3401_thermal driver to the
INT3401 platform device. Thus CONFIG_INT340X_THERMAL is in fact
necessary for enumerating it, so checking CONFIG_INTEL_SOC_DTS_THERMAL
in int340x_thermal_handler_attach() is pointless and INT340X_THERMAL
may as well be selected by INTEL_SOC_DTS_THERMAL.
Signed-off-by: Slawomir Rosek <srosek@google.com>
[ rjw: New subject ]
Link: https://patch.msgid.link/20251103162516.2606158-2-srosek@google.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
The current code assumes that only DDR errors have split messages. Ensure
proper logging of non-standard event errors that may be split across multiple
messages too.
[ bp: Massage, move comment too, fix it up. ]
Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller")
Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251023113108.3467132-1-shubhrajyoti.datta@amd.com
|
|
Add test cases to check outcome of fair GuC context or doorbells
IDs allocations for regular and admin-only PF mode.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patch.msgid.link/20251106165932.2143-1-michal.wajdeczko@intel.com
|
|
Instead of trying very hard to find the largest fair number of GuC
doorbell IDs that could be allocated for VFs on the current GT, pick
some smaller rounded down to power-of-two value that is more likely
to be provisioned in the same manner by the other PF instance:
num VFs | num doorbells
--------+--------------
63..32 | 4
31..16 | 8
15..8 | 16
7..4 | 32
3..2 | 64
1 | 128 (regular PF)
1 | 240 (admin only PF)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patch.msgid.link/20251105183253.863-3-michal.wajdeczko@intel.com
|
|
Instead of trying very hard to find the largest fair number of GuC
context IDs that could be allocated for VFs on the current GT, pick
some smaller rounded down to power-of-two value that is more likely
to be provisioned in the same manner by the other PF instance:
num VFs | num contexts
--------+-------------
63..32 | 1024
31..16 | 2048
15..8 | 4096
7..4 | 8192
3..2 | 16384
1 | 32768 (regular PF)
1 | 64512 (admin only PF)
Add also helper function to determine if the PF is admin-only,
and for now use .probe_display flag for that.
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com>
Link: https://patch.msgid.link/20251105183253.863-2-michal.wajdeczko@intel.com
|
|
For whatever unknown reason the pmdemand code is using a custom
50 usec fast polling timeout instead of the normal 2 usec
value. Switch to the standard value to get rid of the special
case.
The eventual aim is to get rid of the fast vs. slow timeout
entirely and switch over to poll_timeout_us().
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-11-ville.syrjala@linux.intel.com
|
|
For whatever unknown reason the HDCP code is using a custom
10 usec fast polling timeout instead of the normal 2 usec
value. Switch to the standard value to get rid of the special
case.
The eventual aim is to get rid of the fast vs. slow timeout
entirely and switch over to poll_timeout_us().
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-10-ville.syrjala@linux.intel.com
|
|
The LT PHY code is abusing intel_de_wait_custom() in all kinds of weird
ways. Get rid of the weird fast timeouts, and just use the slow ones.
For consistency with intel_wait_for_register() we'll stick to the
default 2 usec fast timeout for all cases.
Someone really needs to properly document where all these magic numbers
came from...
This will let us eventually nuke intel_de_wait_custom() and convert
over to poll_timeout_us().
v2: Go for the longer (ms) timeout in case it actually matters
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-9-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Include the units the in the define name for XELPDP_PORT_RESET_END_TIMEOUT
to make it match all its other counterparts.
v2: It's _MS not _US (Jani)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106155249.2810-1-ville.syrjala@linux.intel.com
|
|
The slow vs. fast timeout stuff is really just an implementation
detail. Let's not spread that terminology in random timeout defines.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-7-ville.syrjala@linux.intel.com
|
|
XELPDP_MSGBUS_TIMEOUT_FAST_US looks to be just an obfuscated version
of the default 2 microsecond fast timeout used by
intel_wait_for_register(). Get rid of it to make it clear what's going
on here.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-6-ville.syrjala@linux.intel.com
|
|
XELPDP_PORT_POWERDOWN_UPDATE_TIMEOUT_MS
There was a completely unjustified change to the cx0 powerdown
timeout, and the way it was done now prevents future conversion
to poll_timeout_us().
Assuming there was some reason the bigger timeout let's nuke
the old short timeout (XELPDP_PORT_POWERDOWN_UPDATE_TIMEOUT_US)
nd replace it with the bigger timeout
(XELPDP_PORT_POWERDOWN_UPDATE_TIMEOUT_MS).
For consistency with intel_wait_for_register() we'll stick to the
default 2 usec for the fast timeout.
v2: Go for the longer (ms) timeout in case it actually matters
v3: Note the defaullt 2 usec fast timeout (Jani)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
The actual timeout used isn't particularly interesting, so
don't print it. Makes the code simpler.
The debugs are also using some random capitalizaton rule.
Clean that up a bit while at it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
The actual timeout used isn't particularly interesting, so
don't print it. Makes the code simpler.
The debugs are also using some random capitalizaton rule.
Clean that up a bit while at it.
Also intel_cx0_powerdown_change_sequence() used one timeout
in the actual code but printed a different one.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251106152049.21115-3-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Rename irq_polling_work to qaic_irq_polling_work to reduce ambiguity
and avoid potential naming conflicts in the future.
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251031192511.3179130-1-zachary.mckevitt@oss.qualcomm.com
|
|
After subsystem of the device has crashed it sends a message with
command DEBUG_TRANSFER_INFO to kernel(host). Send ACK for that message
and then prepare to collect the ramdump of the subsystem
Steps of crashdump collection is as follows,
1) Device sends DEBUG_TRANSFER_INFO message indicating that device wants
to send crashdump.
2) Send an acknowledgment to that message either ACK or NACK.
a) NACK will inform the device that host will not download the
crashdump
b) ACK will inform the device that host will download the crashdump
3) Along with the DEBUG_TRANSFER_INFO we receive a table base address and
its length, use that to download that table from device.
a) This table is meta data of the crashdump and not the actual
crashdump.
4) After we respond as ACK for message received on step 1) we start
downloading the table. Use series of MEMORY_READ/MEMORY_READ_RSP SSR
commands to download the entire table.
5) Each entry in the table represents a segment of crashdump. Once the
table downloading is complete, iterate through each entry of table
and download each crashdump segment(same as table itself). Table entry
contains the memory base address and length along with other info.
6) After the entire crashdump is downloaded send DEBUG_TRANSFER_DONE
which marks that host is terminating the crashdump transfer. This
message can be send in both success or error case.
7) After receiving DEBUG_TRANSFER_DONE_RSP hand over the crashdump to
dev_coredumpv() and free all the necessary memory.
Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Co-developed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <pkanojiy@codeaurora.org>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251031174059.2814445-4-zachary.mckevitt@oss.qualcomm.com
|
|
Subsystem restart (SSR) for a qaic device means that a NSP has crashed,
and will be restarted. However the restart process will lose any state
associated with activation, so the user will need to do some recovery.
While SSR has the provision to collect a crash dump, this patch does not
implement support for it.
Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Co-developed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Co-developed-by: Troy Hanson <quic_thanson@quicinc.com>
Signed-off-by: Troy Hanson <quic_thanson@quicinc.com>
Co-developed-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: Fix minor checkpatch whitespace issues]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251031174059.2814445-3-zachary.mckevitt@oss.qualcomm.com
|
|
Expose sysfs files for each DBC representing the current state of that DBC.
For example, sysfs for DBC ID 0 and accel minor number 0 looks like this,
/sys/class/accel/accel0/dbc0_state
Following are the states and their corresponding values,
DBC_STATE_IDLE (0)
DBC_STATE_ASSIGNED (1)
DBC_STATE_BEFORE_SHUTDOWN (2)
DBC_STATE_AFTER_SHUTDOWN (3)
DBC_STATE_BEFORE_POWER_UP (4)
DBC_STATE_AFTER_POWER_UP (5)
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251031174059.2814445-2-zachary.mckevitt@oss.qualcomm.com
|
|
acpi_processor_setup_cstates() and acpi_processor_setup_cpuidle_cx()
are called after successfully obtaining power information. Among other
things, these setup functions check the C-state count against zero.
However, that check is done by acpi_processor_get_power_info_cst()
which will cause acpi_processor_get_power_info() to fail if it does
no pass, so the checks in the two functions mentioned above are
redundant.
Drop those redundant checks.
No intentional functional impact.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
[ rjw: Subject and changelog rewrite ]
Link: https://patch.msgid.link/20251105093647.3557248-1-lihuisong@huawei.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
For 'always-on' and 'boot-on' regulators, the set_machine_constraints()
may enable supply before enabling the main regulator, however if the
latter fails, the function returns with an error but the supply remains
enabled.
When this happens, the regulator_register() function continues on the
error path where it puts the supply regulator. Since enabling the supply
is not balanced with a disable call, a warning similar to the following
gets issued from _regulator_put():
[ 1.603889] WARNING: CPU: 2 PID: 44 at _regulator_put+0x8c/0xa0
[ 1.603908] Modules linked in:
[ 1.603926] CPU: 2 UID: 0 PID: 44 Comm: kworker/u16:3 Not tainted 6.18.0-rc4 #0 NONE
[ 1.603938] Hardware name: Qualcomm Technologies, Inc. IPQ9574/AP-AL02-C7 (DT)
[ 1.603945] Workqueue: async async_run_entry_fn
[ 1.603958] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.603967] pc : _regulator_put+0x8c/0xa0
[ 1.603976] lr : _regulator_put+0x7c/0xa0
...
[ 1.604140] Call trace:
[ 1.604145] _regulator_put+0x8c/0xa0 (P)
[ 1.604156] regulator_register+0x2ec/0xbf0
[ 1.604166] devm_regulator_register+0x60/0xb0
[ 1.604178] rpm_reg_probe+0x120/0x208
[ 1.604187] platform_probe+0x64/0xa8
...
In order to avoid this, change the set_machine_constraints() function to
disable the supply if enabling the main regulator fails.
Fixes: 05f224ca6693 ("regulator: core: Clean enabling always-on regulators + their supplies")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Link: https://patch.msgid.link/20251107-regulator-disable-supply-v1-1-c95f0536f1b5@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPU via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_perf_ctrs_in_pcc() checks if the CPPC
perf-ctrs are in a PCC region for all the present CPUs, which breaks
when the kernel is booted with "nosmt=force".
Hence, limit the check only to the online CPUs.
Fixes: ae2df912d1a5 ("ACPI: CPPC: Disable FIE if registers in PCC regions")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-5-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function cppc_allow_fast_switch() checks for the validity
of the _CPC object for all the present CPUs. This breaks when the
kernel is booted with "nosmt=force".
Check fast_switch capability only on online CPUs
Fixes: 15eece6c5b05 ("ACPI: CPPC: Fix NULL pointer dereference when nosmp is used")
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-4-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
per_cpu(cpc_desc_ptr, cpu) object is initialized for only the online
CPUs via acpi_soft_cpu_online() --> __acpi_processor_start() -->
acpi_cppc_processor_probe().
However the function acpi_cpc_valid() checks for the validity of the
_CPC object for all the present CPUs. This breaks when the kernel is
booted with "nosmt=force".
Hence check the validity of the _CPC objects of only the online CPUs.
Fixes: 2aeca6bd0277 ("ACPI: CPPC: Check present CPUs for determining _CPC is valid")
Reported-by: Christopher Harris <chris.harris79@gmail.com>
Closes: https://lore.kernel.org/lkml/CAM+eXpdDT7KjLV0AxEwOLkSJ2QtrsvGvjA2cCHvt1d0k2_C4Cw@mail.gmail.com/
Suggested-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: "Mario Limonciello (AMD) (kernel.org)" <superm1@kernel.org>
Tested-by: Chrisopher Harris <chris.harris79@gmail.com>
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Link: https://patch.msgid.link/20251107074145.2340-3-gautham.shenoy@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Currently if a user enqueues a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This continues the effort to refactor workqueue APIs, which began with
the introduction of new workqueues and a new alloc_workqueue flag in:
commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Failing to set power off indicates an unrecoverable hardware or firmware
error. Update the driver to treat such a failure as a fatal condition
and stop further operations that depend on successful power state
transition.
This prevents undefined behavior when the hardware remains in an
unexpected state after a failed power-off attempt.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251106180521.1095218-1-lizhi.hou@amd.com
|
|
Enable Wildcat Lake platforms to achieve PMC information from
Intel PMC SSRAM Telemetry driver and substate requirements data
from telemetry region.
Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
Link: https://patch.msgid.link/20251105215020.1984036-2-xi.pardee@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Add Wildcat Lake PMT telemetry support.
Signed-off-by: Xi Pardee <xi.pardee@linux.intel.com>
Link: https://patch.msgid.link/20251105215020.1984036-1-xi.pardee@linux.intel.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The HPA to DPA translation for poison injection assumes that the
base address starts from where the CXL region begins. When the
extended linear cache is active, the offset can be within the DRAM
region. Adjust the offset so that it correctly reflects the offset
within the CXL region.
[ dj: Add fixes tag from Alison ]
Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset")
Link: https://patch.msgid.link/20251031173224.3537030-5-dave.jiang@intel.com
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- use the firmware node of the GPIO chip, not its label for software
node lookup
- fix invalid pointer access in GPIO debugfs
- drop unused functions from gpio-tb10x
- fix a regression in gpio-aggregator: restore the set_config()
callback in the driver
- correct schema $id path in ti,twl4030 DT bindings
* tag 'gpio-fixes-for-v6.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: tb10x: Drop unused tb10x_set_bits() function
gpio: aggregator: restore the set_config operation
gpiolib: fix invalid pointer access in debugfs
gpio: swnode: don't use the swnode's name as the key for GPIO lookup
dt-bindings: gpio: ti,twl4030: Correct the schema $id path
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"All fixes in the UFS driver.
The big contributor to the diffstats is the Intel controller S0ix/S3
fix which has to special case the suspend/resume patch for intel
controllers in ufshcd-pci.c"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Fix invalid probe error return value
scsi: ufs: ufs-pci: Set UFSHCD_QUIRK_PERFORM_LINK_STARTUP_ONCE for Intel ADL
scsi: ufs: core: Add a quirk to suppress link_startup_again
scsi: ufs: ufs-pci: Fix S0ix/S3 for Intel controllers
scsi: ufs: core: Revert "Make HID attributes visible"
scsi: ufs: core: Reduce link startup failure logging
scsi: ufs: core: Fix a race condition related to the "hid" attribute group
scsi: ufs: ufs-qcom: Fix UFS OCP issue during UFS power down (PC=3)
|
|
s/i915_gem_object_get_frontbuffer/i915_gem_object_frontbuffer_lookup/
The i915_gem_object_get_frontbuffer() name is rather confusing wrt.
intel_frontbuffer_get(). Rename to i915_gem_object_frontbuffer_lookup()
to make things less confusing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-11-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
The current attempted split between xe/i915 vs. display
for intel_frontbuffer is a mess:
- the i915 rcu leaks through the interface to the display side
- the obj->frontbuffer write-side is now protected by a display
specific spinlock even though the actual obj->framebuffer
pointer lives in a i915 specific structure
- the kref is getting poked directly from both sides
- i915_active is still on the display side
Clean up the mess by moving everything about the frontbuffer
lifetime management to the i915/xe side:
- the rcu usage is now completely contained in i915
- frontbuffer_lock is moved into i915
- kref is on the i915/xe side (xe needs the refcount as well
due to intel_frontbuffer_queue_flush()->intel_frontbuffer_ref())
- the bo (and its refcounting) is no longer on the display side
- i915_active is contained in i915
I was pondering whether we could do this in some kind of smaller
steps, and perhaps we could, but it would probably have to start
with a bunch of reverts (which for sure won't go cleanly anymore).
So not convinced it's worth the hassle.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-10-ville.syrjala@linux.intel.com
|
|
After upcoming intel_frontbuffer lifetime related changes we
won't need intel_frontbuffer::obj for anything apart from
getting at the display. Add a direct pointer for that instead
so that the obj pointer can be completely eliminated.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-9-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
I want to hide the kref from the high level frontbuffer code.
To that end abstract the kref_get() in intel_frontbuffer_queue_flush()
(which is the only high level function that needs this) as a new
intel_frontbuffer_ref().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-8-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Our fb_tracking.lock is serving a double duty:
- protects fb_tracking.busy_bits
- provides the write-side protection for obj->frontbuffer
Split obj->frontbuffer role into a separate lock so that
we can clean up the current mess with the frontbuffer lifetime
management.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-7-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
intel_frontbuffer_flush()
intel_bo_frontbuffer_flush_for_display() is a bit too low level
to be directly in the high level dirtyfb code. Move the calls
into intel_frontbuffer_flush().
There is a slight behavioural change here in that we now skip
the flush if the bo is not a current scanout buffer (front->bits
== 0). But that is fine as the flush will eventually happen via
the fb pinning code if/when the bo becomes a scanout buffer again.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-6-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
operation
Convert intel_bo_flush_if_display() to be an operation on the
frontbuffer object rather than the underlying gem bo. This
will help with cleaning up the frontbuffer xe/i915 vs. display
split.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Get rid of intel_frontbuffer_flip_{prepare,complete}() (and
the accompanying flip_bits) since they are unused.
I suppose these could technically provide a minor optimization
over intel_frontbuffer_flip() in that the flush would get
deferred further if new rendering were to sneak in between the
prepare() and complete() calls. But for correctness it should
not make any difference since another flush will anyway follow
once the new rendering finishes.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-4-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
|
|
Get rid of intel_frontbuffer_flip_{prepare,complete}() from
the overlay code and just use intel_frontbuffer_flip() instead.
The only difference between these are the light interactions
with the ORIGIN_CS busyness tracking, but since the only user
of this is the overlay/xf86-video-intel/Xv the buffer will
always be filled by the CPU and thus we'll never see any
ORIGIN_CS frontbuffer activity there anyway. Also I don't
think we actually have anything covered by the frontbuffer
tracking that affects the overlay (FBC is on the primary
plane, DRRS isn't currently enabled on the platforms with
overlay, and PSR doesn't exist in the hardware).
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-3-ville.syrjala@linux.intel.com
|
|
I don't even know why we have this DIRTYFB flush in the overlay
code. We'll anyway call intel_frontbuffer_flip() so there should
be no need to pretend that this is some kind of frontbuffer only
rendering operation.
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20251016185408.22735-2-ville.syrjala@linux.intel.com
|
|
Fix the include of iommu-pages.h in the KUnit tests for the IOMMU
generic page-table code to make the tests compile again.
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/all/9844d4cb-f517-478b-9911-b6dc1a963b8e@leemhuis.info/
Reported-by: "Longia, Amandeep Kaur" <amandeepkaur.longia@amd.com>
Closes: https://lore.kernel.org/all/e641c955-25ad-4eae-b3fe-4392966cf768@amd.com/
Fixes: 1dd4187f53c3 ("iommupt: Add a kunit test for Generic Page Table")
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
This patch solves one of the existing mentions of COHA, a task
in the Nova task list about improving the `CoherentAllocation` API.
It uses the `write` method from `CoherentAllocation`.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-3-delcastillodelarosadaniel@gmail.com>
|
|
Some comments that already existed didn't start with a capital
letter, this patch fixes that.
Signed-off-by: Daniel del Castillo <delcastillodelarosadaniel@gmail.com>
[acourbot@nvidia.com: set prefix to "gpu: nova-core:".]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Message-ID: <20251104193756.57726-2-delcastillodelarosadaniel@gmail.com>
|