| Age | Commit message (Collapse) | Author |
|
dc_process_dmub_aux_transfer_async
[Why&How]
dc_process_dmub_aux_transfer_async() copies payload->length bytes into a
16-byte stack buffer (dpaux.data[16]) guarded only by an ASSERT(), which
is a no-op in release builds. If a caller ever passes length > 16 this
results in a stack buffer overflow via memcpy.
Additionally, link_index is used to dereference dc->links[] without
bounds checking against dc->link_count, risking an out-of-bounds access.
Replace the ASSERT with a hard runtime check that returns false when
payload->length exceeds the destination buffer size, and add a bounds
check for link_index before it is used.
Assisted-by: GitHub Copilot:Claude claude-4-opus
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba4caa9fecdf7a38f98c878ad05a8a64148b6881)
Cc: stable@vger.kernel.org
|
|
DCE 6.x doesn't support 10-bit truncation and 10-bit dithering
because the following fields are 1-bit only:
FMT_TEMPORAL_DITHER_DEPTH
FMT_SPATIAL_DITHER_DEPTH
FMT_TRUNCATE_DEPTH
Programming these fields to "2" will program them as if the
dithering option was 6-bit, resulting in sub-par picture
quality and an ugly "color banding" effect.
Note that a recent commit changed the default 10-bit dithering
option to DITHER_OPTION_SPATIAL10 which improves the picture
quality because it happens to look better, but is still not
actually supported by DCE 6.x versions.
When the color depth is 10-bit or more, just disable
any kind of dithering options on DCE 6.x.
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5151
Fixes: 529cad0f945c ("drm/amd/display: Add function to set dither option")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6be8ced880dfe29ce38c2d5e74489822da5c250e)
|
|
[Why]
The explicit include of linux/array_size.h in Display Core (DC) is
redundant. The ARRAY_SIZE macro is already provided by dm_services.h
(via os_types.h) which DC includes.
[How]
Remove the unnecessary #include <linux/array_size.h> from
dc_hw_sequencer.c and dce_clock_source.c.
Fixes: 2d2366176445 ("drm/amd/display: Replace inline NUM_ELEMENTS macro with ARRAY_SIZE")
CC: Linus Probert <linus.probert@gmail.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Replaces the use of local NUM_ELEMENTS macro with the ARRAY_SIZE macro
defined in <linux/array_size.h>.
This aligns with existing coccinelle script array_size.cocci which has
been applied to other sources in order to remove inline
sizeof(a)/sizeof(a[0]) patterns from other source files.
Suggested-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Linus Probert <linus.probert@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit d5df648ec830 ("drm/amd/display: Change dither policy for 10bpc to
round") degraded display of 12 bpc color precision output to 10 bpc sinks
by switching 10 bpc output from dithering to "truncate to 10 bpc".
I don't find the argumentation in that commit convincing, but the
consequences highly unfortunate, especially for applications that
require effective > 10 bpc precision output of > 10 bpc framebuffers.
The argument wasn't something strong like "there are hardware design
defects or limitations which require us to work around broken dithering
to 10 bpc", or "there are some special use cases which do require
truncation to 10 bpc", but essentially "at some point in the past we
used truncation in Polaris/Vega times and it looks like it got
inadvertently changed for Navi, so let's do that again". I couldn't find
evidence for that in the git commit logs for this. The commit message also
acknowledges that using dithering "...makes some sense for FP16...
...but not for ARGB2101010 surfaces..."
The problem with this is that it makes fp16 surfaces, and especially
rgba16 fixed point surfaces, less useful. These are now well
supported by Mesa 25.3 and later via OpenGL + EGL, Vulkan/WSI, and by
OSS AMDVLK Vulkan/WSI/display, and also by GNOME 50 mutter under Wayland,
and they used to provide more than 10 bpc effective precision at the
output.
Even for 8 or 10 bpc surfaces, the color pipeline behind the framebuffer,
e.g., gamma tables, CTM, can be used for color correction and will
benefit from an effective > 10 bpc output precision via dithering,
retaining some precision that would get lost on the way through the
pipeline, e.g., due to non-linear gamma functions.
Scientific apps rely on this for > 10 bpc display precision. Truncating
to 10 bpc, instead of dithering the pipeline internal 12 bpc precision
down to 10 bpc, causes a serious loss of precision. This also creates the
undesirable and slightly absurd situation that using a cheap monitor
with only 8 bpc input and display panel will yield roughly 12 bpc
precision via dithering from 12 -> 8 bpc, whereas investment into a
more expensive monitor with 10 bpc input and native 10 bpc display will
only yield 10 bpc, even if a fp16 or rgb16 framebuffer and/or a properly
set up color pipeline (gamma tables, CTM's etc. with more than 10 bpc out
precision) would allow effective 12 bpc precision output.
Therefore this patch proposes reverting that commit and going back to
dithering down to 10 bpc, consistent with the behaviour for 6 bpc or 8 bpc
output.
Successfully tested on AMD Polaris DCE 11.2 and Raven Ridge DCN 1.0 with
a native 10 bpc capable monitor, outputting a RGBA16 unorm framebuffer and
measuring resulting color precision with a photometer. No apparent visual
artifacts or problems were observed, and effective precision was measured
to be 12 bpc again, as expected.
Fixes: d5df648ec830 ("drm/amd/display: Change dither policy for 10bpc to round")
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: stable@vger.kernel.org
Cc: Aric Cyr <aric.cyr@amd.com>
Cc: Anthony Koo <anthony.koo@amd.com>
Cc: Rodrigo Siqueira <rodrigo.siqueira@amd.com>
Cc: Krunoslav Kovac <krunoslav.kovac@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Resolve compiler warnings by marking unused parameters explicitly.
[How]
In .c and .h function definitions, keep parameter names
in signatures and add a line with `(void)param;` in function body
Preserved function signatures and avoids breaking code paths that
may reference the parameter under conditional compilation.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that
can manipulates floats. To do this properly, the FPU guards must be used
in a file that is not compiled as a FPU unit. If the guards are used in
a file that is a FPU unit, other sections in the file that aren't guarded
may be end up being compiled to use FPU operations.
[How]
Added DC_FP_START and DC_FP_END to DC functions that call DML functions
using FPU.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Revert commit c24bb00cc6cf ("drm/amd/display: Refactor DC update checks")
[WHY]
Causing issues with PSR/Replay, reverting until those can be fixed.
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Revert commit 7d59465de38e ("drm/amd/display: Add 3DLUT DMA broadcast support")
[WHY&HOW]
Dependencies of this change are still causing issues, so reverting until
those can be fixed.
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
VBIOS supports DSC for seamless boot on newer hardware.
Reading hardware state allows proper DSC validation without breaking
existing boot display.
[What]
Remove DSC block for boot timing validation and implement hardware state
reading to populate DSC configuration from VBIOS-configured state.
Enhance dsc_read_state function in DCN401 to read additional
DSC parameters.
Reviewed-by: Yihan Zhu <yihan.zhu@amd.com>
Signed-off-by: Mohit Bawa <Mohit.Bawa@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix compiler warnings by consistently use the same signedness for
a given value
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Implicit signed-to-unsigned conversions caused compiler
warnings in DC paths.
[How]
Added explicit (unsigned int)/(uint32_t) casts for sentinel -1
assignments and IRQ ~MASK initializers, with small cast alignment
in logging/DPCD code.
Functionality and behavior is unchanged; only type intent is explicit.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
'update_planes_and_stream_state'
Add missing info for the update_descriptor parameter in
update_planes_and_stream_state().
Fixes the below with gcc W=1:
../display/dc/core/dc.c:3630 function parameter 'update_descriptor' not described in 'update_planes_and_stream_state'
Fixes: c24bb00cc6cf ("drm/amd/display: Refactor DC update checks")
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Dillon Varone <Dillon.Varone@amd.com>
Cc: Chuanyu Tseng <chuanyu.tseng@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Handling unused function parameter due to cause compiler warning
Reviewed-by: Clayton King <clayton.king@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rename the enum 'pixel_format' to 'dc_pixel_format' to avoid potential
name conflicts with the pixel_format struct defined in
include/video/pixel_format.h.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
There was previously a dc debug flag to indicate that tiling
changes should only be a medium update instead of full. The
function get_plane_info_type was refactored to not rely on dc
state, but in the process the logic was unintentionally changed,
which leads to screen corruption in some cases.
[How]
- add flag to tiling struct to avoid full update when necessary
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
A deadlock occurs when using inbox0 lock for cursor operations on
PSR-SU and Replays that does not when using the inbox1 locking path.
This is because of a priority inversion issue where inbox1 work
cannot be serviced while holding the HW lock from driver and sending
cursor notifications to DMUB.
Typically the lower priority of inbox1 for the lock command would
allow the PSR and Replay FSMs to complete their transition prior
to giving driver the lock but this is no longer the case with inbox0
having the highest priority in servicing.
[How]
This will reintroduce any synchronization bugs that were there
with Replay or PSR-SU touching the cursor at the same time as driver.
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY&HOW]
A single HUBP can be used to fetch 3DLUT and broadcast to a
single HUBP. Add logic to select the top pipe for a given
plane and use it's HUBP as the broadcast source for multiple
MPC's.
Reviewed-by: Ilya Bakoulin <ilya.bakoulin@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY&HOW]
DC currently has fragmented definitions of update types. This changes
consolidates them into a single interface, and adds expanded
functionality to accommodate all use cases.
- adds `dc_check_update_state_and_surfaces_for_stream` to determine
update type including state, surface, and stream changes.
- adds missing surface/stream update checks to
`dc_check_update_surfaces_for_stream`
- adds new update type `UPDATE_TYPE_ADDR_ONLY` to accomodate flows where
further distinction from `UPDATE_TYPE_FAST` was needed
- removes caller reliance on `enable_legacy_fast_update` to determine
which commit function to use, instead embedding it in the update type
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dillon Varone <Dillon.Varone@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Post-driver cases always use linear tiling yet gfx handling for this
case is improper, allowing for incorrect gfx structs to be populated and
used.
[How]
Query DC for the apporpriate linear tiling mode and populate the DCN
specific gfx version structs.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Carbones <Nicholas.Carbones@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch provides a bulk merge to align driver
support for DCN42 with Display Core version 3.2.373.
It includes upgrade for:
- clk_mgr
- dml2/dml21
- optc
- hubp
- mpc
- optc
- hwseq
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support for DCN 4.2 clock manager.
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHAT]
Silence warning by cleaning up unused code.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY]
We're checking surface and stream updates after they've been applied to
their respective states within `update_planes_and_stream_state`.
Medium updates under the HWSS V3 fast path that are not supported or
tested are getting implicitly if they don't trigger a DML validation
and getting updated in place on the dc->current_state context.
[HOW]
Fix this issue by moving up the fast path determination check prior
to `update_planes_and_stream_state`. This is how the V2 path works
and how the V3 path used to work prior to the refactors in this area.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can be called while preemption is disabled, for example by
dcn32_internal_validate_bw which is called with the FPU active.
Fixes "BUG: scheduling while atomic" messages I encounter on my Navi31
machine.
Signed-off-by: Natalie Vock <natalie.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY+HOW]
Adding visual confirm to visually track changes in refresh rate.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Muaaz Nisar <muanisar@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why&How]
Resolve type mismatch warnings by ensuring loop counters and compared
values use matching unsigned types (size_t or int) in array iteration.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why & how]
On D3 path during dc_set_power_state, we may be in idle_allowed=true,
at which point we will exit idle via dc_wake_and_execute_dmub_cmd_list
which doesn't update dc->idle_optimizations_allowed to false. This
would cause any future attempts to allow idle optimizations via the DC
helper to get skipped because the value is stale and not reflective of
the actual HW state.
Move dc_exit_ips_for_hw_access() to the top of the function.
Additionally ensure that dc_power_down_on_boot thread holds the DC
lock and only runs if there are 0 streams.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Ivan Lipski <ivan.lipski@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support for DCN 4.2 in Display Core
Signed-off-by: Roman Li <Roman.Li@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
This reverts commit 08a01ec306db ("drm/amd/display: Add Gfx Base Case For Linear Tiling Handling")
Reason for revert: Got blank screen issues while doing PNP
Reviewed-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Nicholas Carbones <Nicholas.Carbones@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why & How]
Adding mouse trigger in dc_stream to
recover from low refresh rate idle state
upon mouse movement without vsync interrupts.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Muaaz Nisar <muaaz.nisar@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This reverts commit ba448f9ed62cf5a89603a738e6de91fc6c42ab35.
It cause some regression.
Reviewed-by: Sreeja Golui <sreeja.golui@amd.com>
Signed-off-by: Muaaz Nisar <muanisar@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
KASAN reports a NULL instruction fetch (RIP=0x0) from
dc_stream_program_cursor_position():
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: 0010:0x0
Call Trace:
dc_stream_program_cursor_position+0x344/0x920 [amdgpu]
amdgpu_dm_atomic_commit_tail+...
[ +1.041013] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ +0.000027] #PF: supervisor instruction fetch in kernel mode
[ +0.000013] #PF: error_code(0x0010) - not-present page
[ +0.000012] PGD 0 P4D 0
[ +0.000017] Oops: Oops: 0010 [#1] SMP KASAN NOPTI
[ +0.000017] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Tainted: G E 6.18.0+ #3 PREEMPT(voluntary)
[ +0.000023] Tainted: [E]=UNSIGNED_MODULE
[ +0.000010] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020
[ +0.000016] Workqueue: events drm_mode_rmfb_work_fn
[ +0.000022] RIP: 0010:0x0
[ +0.000017] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ +0.000015] RSP: 0018:ffffc9000017f4c8 EFLAGS: 00010246
[ +0.000016] RAX: 0000000000000000 RBX: ffff88810afdda80 RCX: 1ffff110457000d1
[ +0.000014] RDX: 1ffffffff87b75bd RSI: 0000000000000000 RDI: ffff88810afdda80
[ +0.000014] RBP: ffffc9000017f538 R08: 0000000000000000 R09: ffff88822b800690
[ +0.000013] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc3dbac20
[ +0.000014] R13: 0000000000000000 R14: ffff88811ab80000 R15: dffffc0000000000
[ +0.000014] FS: 0000000000000000(0000) GS:ffff888434599000(0000) knlGS:0000000000000000
[ +0.000015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000013] CR2: ffffffffffffffd6 CR3: 000000010ee88000 CR4: 0000000000350ef0
[ +0.000014] Call Trace:
[ +0.000010] <TASK>
[ +0.000010] dc_stream_program_cursor_position+0x344/0x920 [amdgpu]
[ +0.001086] ? __pfx_mutex_lock+0x10/0x10
[ +0.000015] ? unwind_next_frame+0x18b/0xa70
[ +0.000019] amdgpu_dm_atomic_commit_tail+0x1124/0xfa20 [amdgpu]
[ +0.001040] ? ret_from_fork_asm+0x1a/0x30
[ +0.000018] ? filter_irq_stacks+0x90/0xa0
[ +0.000022] ? __pfx_amdgpu_dm_atomic_commit_tail+0x10/0x10 [amdgpu]
[ +0.001058] ? kasan_save_track+0x18/0x70
[ +0.000015] ? kasan_save_alloc_info+0x37/0x60
[ +0.000015] ? __kasan_kmalloc+0xc3/0xd0
[ +0.000013] ? __kmalloc_cache_noprof+0x1aa/0x600
[ +0.000016] ? drm_atomic_helper_setup_commit+0x788/0x1450
[ +0.000017] ? drm_atomic_helper_commit+0x7e/0x290
[ +0.000014] ? drm_atomic_commit+0x205/0x2e0
[ +0.000015] ? process_one_work+0x629/0xf80
[ +0.000016] ? worker_thread+0x87f/0x1570
[ +0.000020] ? srso_return_thunk+0x5/0x5f
[ +0.000014] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? _raw_spin_lock_irq+0x8a/0xf0
[ +0.000015] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ +0.000016] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __wait_for_common+0x204/0x460
[ +0.000015] ? sched_clock_noinstr+0x9/0x10
[ +0.000014] ? __pfx_schedule_timeout+0x10/0x10
[ +0.000014] ? local_clock_noinstr+0xe/0xd0
[ +0.000015] ? __pfx___wait_for_common+0x10/0x10
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __wait_for_common+0x204/0x460
[ +0.000014] ? __pfx_schedule_timeout+0x10/0x10
[ +0.000015] ? __kasan_kmalloc+0xc3/0xd0
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? wait_for_completion_timeout+0x1d/0x30
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_crtc_commit_wait+0x32/0x180
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_atomic_helper_wait_for_dependencies+0x46a/0x800
[ +0.000019] commit_tail+0x231/0x510
[ +0.000017] drm_atomic_helper_commit+0x219/0x290
[ +0.000015] ? __pfx_drm_atomic_helper_commit+0x10/0x10
[ +0.000016] drm_atomic_commit+0x205/0x2e0
[ +0.000014] ? __pfx_drm_atomic_commit+0x10/0x10
[ +0.000013] ? __pfx_drm_connector_free+0x10/0x10
[ +0.000014] ? __pfx___drm_printfn_info+0x10/0x10
[ +0.000017] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? drm_atomic_set_crtc_for_connector+0x49e/0x660
[ +0.000015] ? drm_atomic_set_fb_for_plane+0x155/0x290
[ +0.000015] drm_framebuffer_remove+0xa9b/0x1240
[ +0.000014] ? finish_task_switch.isra.0+0x15a/0x840
[ +0.000015] ? __switch_to+0x385/0xda0
[ +0.000015] ? srso_safe_ret+0x1/0x20
[ +0.000013] ? __pfx_drm_framebuffer_remove+0x10/0x10
[ +0.000016] ? kasan_print_address_stack_frame+0x221/0x280
[ +0.000015] drm_mode_rmfb_work_fn+0x14b/0x240
[ +0.000015] process_one_work+0x629/0xf80
[ +0.000012] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000019] worker_thread+0x87f/0x1570
[ +0.000013] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000014] ? __pfx_try_to_wake_up+0x10/0x10
[ +0.000017] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? kasan_print_address_stack_frame+0x227/0x280
[ +0.000017] ? __pfx_worker_thread+0x10/0x10
[ +0.000014] kthread+0x396/0x830
[ +0.000013] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ +0.000015] ? __pfx_kthread+0x10/0x10
[ +0.000012] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __kasan_check_write+0x14/0x30
[ +0.000014] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? recalc_sigpending+0x180/0x210
[ +0.000015] ? srso_return_thunk+0x5/0x5f
[ +0.000013] ? __pfx_kthread+0x10/0x10
[ +0.000014] ret_from_fork+0x31c/0x3e0
[ +0.000014] ? __pfx_kthread+0x10/0x10
[ +0.000013] ret_from_fork_asm+0x1a/0x30
[ +0.000019] </TASK>
[ +0.000010] Modules linked in: rfcomm(E) cmac(E) algif_hash(E) algif_skcipher(E) af_alg(E) snd_seq_dummy(E) snd_hrtimer(E) qrtr(E) xt_MASQUERADE(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) xt_mark(E) xt_tcpudp(E) nft_compat(E) nf_tables(E) x_tables(E) bnep(E) snd_hda_codec_alc882(E) snd_hda_codec_atihdmi(E) snd_hda_codec_realtek_lib(E) snd_hda_codec_hdmi(E) snd_hda_codec_generic(E) iwlmvm(E) snd_hda_intel(E) binfmt_misc(E) snd_hda_codec(E) snd_hda_core(E) mac80211(E) snd_intel_dspcfg(E) snd_intel_sdw_acpi(E) snd_hwdep(E) snd_pcm(E) libarc4(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_rawmidi(E) amd_atl(E) intel_rapl_msr(E) snd_seq(E) intel_rapl_common(E) iwlwifi(E) jc42(E) snd_seq_device(E) btusb(E) snd_timer(E) btmtk(E) btrtl(E) edac_mce_amd(E) eeepc_wmi(E) polyval_clmulni(E) btbcm(E) ghash_clmulni_intel(E) asus_wmi(E) ee1004(E) platform_profile(E) btintel(E) snd(E) nls_iso8859_1(E) aesni_intel(E) soundcore(E) i2c_piix4(E) cfg80211(E) sparse_keymap(E) wmi_bmof(E) bluetooth(E) k10temp(E) rapl(E)
[ +0.000300] i2c_smbus(E) ccp(E) joydev(E) input_leds(E) gpio_amdpt(E) mac_hid(E) sch_fq_codel(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) efi_pstore(E) nfnetlink(E) dmi_sysfs(E) autofs4(E) cdc_ether(E) usbnet(E) amdgpu(E) amdxcp(E) hid_generic(E) i2c_algo_bit(E) drm_ttm_helper(E) ttm(E) drm_exec(E) drm_panel_backlight_quirks(E) gpu_sched(E) drm_suballoc_helper(E) video(E) drm_buddy(E) usbhid(E) drm_display_helper(E) r8152(E) hid(E) mii(E) cec(E) ahci(E) rc_core(E) igc(E) libahci(E) wmi(E)
[ +0.000294] CR2: 0000000000000000
[ +0.000013] ---[ end trace 0000000000000000 ]---
The crash happens when we unconditionally call into the timing generator
manual trigger hook:
pipe_ctx->stream_res.tg->funcs->program_manual_trigger(...)
On some configurations the timing generator (tg), its funcs table, or the
program_manual_trigger callback can be NULL. Guard all of these before
calling the hook. If the first pipe matching the stream cannot trigger,
keep scanning to find another matching pipe with a valid hook.
The issue was originally found on Vg20/DCE 12.1
Mario successfully tested on Polaris 11/DCE 11.2
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Cc: Alexander Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Fixes: ba448f9ed62c ("drm/amd/display: mouse event trigger to boost RR when idle")
Suggested-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
Virtual encoders & hwss were grouped in a separate directory,
not aligned with dio and link component structure.
[how]
Moved virtual_link_encoder and virtual_stream_encoder to dc/dio/virtual/.
Moved virtual_link_hwss to dc/link/hwss/ and renamed to link_hwss_virtual.
Removed dc/virtual/ directory.
Updated all includes and build files (Makefiles)
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Bhuvanachandra Pinninti <bpinnint@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
DCN 3.6+ hardware supports CRC-32 polynomial in addition to the
legacy CRC-16. Enable 32-bit CRC values per color component for
improvement of precision in display validation.
[How]
When userspace sets crc_poly_mode (0=CRC-16, 1=CRC-32) via the debugfs
interface, the value is stored in dm_irq_params.crc_poly_mode. When CRC
source configuration triggers amdgpu_dm_crtc_configure_crc_source(),
crc_poly_mode is retrieved from dm_irq_params and passed to
dc_stream_configure_crc().
In the DC layer, dc_stream_configure_crc() sets crc_poly_mode into the
crc_params structure and passes it to optc35_configure_crc(). If the
hardware supports the OTG_CRC_POLY_SEL register, the register is
programmed to select CRC-16 or CRC-32 polynomial.
When reading CRC values, optc35_get_crc() checks whether CRC32 register
masks are available. If present, it reads 32-bit CRC values from
OTG_CRC0/1_DATA_R32/G32/B32 registers; otherwise, it falls back
to reading 16-bit CRC values from legacy OTG_CRC0/1_DATA_RG/B
registers.
Reviewed-by: ChiaHsuan Chung <chiahsuan.chung@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Post-driver cases always use linear tiling yet there is no dedicated
Gfx handling for this condition.
[How]
Add DcGfxBase/DalGfxBase to gfx version enums and set tiling to linear
when it is used. Also, enforce the use of proper tiling format as tiling
information is used.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Nicholas Carbones <ncarbone@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY]
To correctly control external panel replay fsm.
[HOW]
1. External panel replay is 1-A option only now.
2. Update cursor update and dirty rects commands for external
panel replay support.
3. Add external panel replay support flag in dc.
Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY+HOW]
Add trigger event to boost refresh rate on mouse movement.
Reviewed-by: Jun Lei <jun.lei@amd.com>
Signed-off-by: Muaaz Nisar <muanisar@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Existing version check that limits the sequence to clear update flags
should be performed for all asics. Exclude DCE asics for now.
Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[how]
- Consolidate memory QoS measurement functions into a single interface for
better maintainability and usability.
- Update function naming for improved clarity.
- Unify latency measurements into a single function call with update
programming sequence.
- Add `start_measuring_urgent_assertion_count` and
`get_urgent_assertion_count` interfaces.
- Add `start_measuring_prefetch_data_size` and `get_prefetch_data_size`
interfaces.
- Update start_measuring_unbounded_bandwidth implementation to measure 200
data returns in the middle of prefetch window.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
The update v3 path got refactored into new functions, which happened just
before the previous implementation was submitted, which resulted in the
optimizations not executing. This commit re-implements the same logic in
the new codepath.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[WHY]
Panel Replay is not an eDP-specific function.
[HOW]
Create new Panel Replay source files and move the Panel Replay
functions from the eDP files to the new files. Additionally, create
a new link_service construct function to assign the related
function pointers.
Reviewed-by: Robin Chen <robin.chen@amd.com>
Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Reenable new split implementation, previously partially reverted due
to issues with ODM on high-bandwidth displays 4k144Hz, resulting
in a corrupted gray screen.
Minimal flows require two separate commits, with extra intermediate
commit to enable seamless transitions, each followed by a swap. Since
new design requires commit to be run in execute and swap in cleanup
stage, an attempt was made to reorder them from CSCS (Commit-Swap-Commit-Swap)
to CCSS (Commit-Commit-Swap-Swap). Not only is this not viable, but
was implemented incorrectly as CCS, one swap missing.
[How]
* Change UPDATE_V3_FLOW_NEW_CONTEXT_MINIMAL_NEW/CURRENT to execute
and cleanup one commit, then run UPDATE_V3_FLOW_NEW_CONTEXT_SEAMLESS,
which closely matches old implementation where minimal flows fall back
to seamless.
* Fix uninitialized variable error.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
The evaluation for whether we need to use the DMUB HW lock isn't the
same as whether we need to unlock which results in a hang when the
fast path is used for ASIC without FAMS support.
[How]
Store a flag that indicates whether we should use the lock and use
that same flag to specify whether unlocking is needed.
Reviewed-by: Swapnil Patel <swapnil.patel@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Currently all of the preparation and execution of plane update is done
under a DC lock, blocking other code from accessing DC for longer than
strictly necessary.
[How]
Break the v3 update flow into 3 parts:
* prepare - locked, calculate update flow and modify DC state
* execute - unlocked, program hardware
* cleanup - locked, finalize DC state and free temp resources
Legacy v2 flow too compilicated to break down for now, link new API
with old by executing everything in slightly misnamed prepare stage.
V2:
Keep the new code structure, but point all users back at the old code,
until fully tested.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Dominik Kaszewski <dominik.kaszewski@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
In non-seamless pipe transitions, it can take several frames to process
a single flip. One of the reasons is the 2-step transition implementation
where first the minimal transition state is applied, then the final state
is applied, all within the same flip. This delay is noticeable to the user
in some video playback scenarios, which makes for a bad user experience.
[How]
- in applicable non-seamless cases, complete the flip with the minimal
state applied, start a counter, and create all new contexts as minimal
- if another pipe transition occurs while counting, reset the counter
- when the counter finishes, promote the current flip to a full update
and restore creation of optimized contexts
- when creating minimal states from new context, apply stream updates
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Signed-off-by: Chenyu Chen <chen-yu.chen@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can get called from an atomic context.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4470
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|