diff options
| author | Ruijing Dong <ruijing.dong@amd.com> | 2026-03-17 13:54:11 -0400 |
|---|---|---|
| committer | Alex Deucher <alexander.deucher@amd.com> | 2026-03-23 14:47:47 -0400 |
| commit | 2d300ebfc411205fa31ba7741c5821d381912381 (patch) | |
| tree | aab698fbb90b883c44c4dfb09216e502a2cf75ec /include/linux/tc_act/git@git.tavy.me:linux.git | |
| parent | aed3d041ab061ec8a64f50a3edda0f4db7280025 (diff) | |
drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3)
amdgpu_device_get_job_timeout_settings() passes a pointer directly
to the global amdgpu_lockup_timeout[] buffer into strsep().
strsep() destructively replaces delimiter characters with '\0'
in-place.
On multi-GPU systems, this function is called once per device.
When a multi-value setting like "0,0,0,-1" is used, the first
GPU's call transforms the global buffer into "0\00\00\0-1". The
second GPU then sees only "0" (terminated at the first '\0'),
parses a single value, hits the single-value fallthrough
(index == 1), and applies timeout=0 to all rings — causing
immediate false job timeouts.
Fix this by copying into a stack-local array before calling
strsep(), so the global module parameter buffer remains intact
across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH
(256) bytes, which is safe for the stack.
v2: wrap commit message to 72 columns, add Assisted-by tag.
v3: use stack array with strscpy() instead of kstrdup()/kfree()
to avoid unnecessary heap allocation (Christian).
This patch was developed with assistance from Claude (claude-opus-4-6).
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 94d79f51efecb74be1d88dde66bdc8bfcca17935)
Cc: stable@vger.kernel.org
Diffstat (limited to 'include/linux/tc_act/git@git.tavy.me:linux.git')
0 files changed, 0 insertions, 0 deletions
