<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/gpu_buddy.h, branch master</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>drm/buddy: Improve offset-aligned allocation handling</title>
<updated>2026-03-09T07:06:10+00:00</updated>
<author>
<name>Arunpravin Paneer Selvam</name>
<email>Arunpravin.PaneerSelvam@amd.com</email>
</author>
<published>2026-03-06T06:01:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=493740d790cce709d285cd1022d16d05439b7d5b'/>
<id>493740d790cce709d285cd1022d16d05439b7d5b</id>
<content type='text'>
Large alignment requests previously forced the buddy allocator to search by
alignment order, which often caused higher-order free blocks to be split even
when a suitably aligned smaller region already existed within them. This led
to excessive fragmentation, especially for workloads requesting small sizes
with large alignment constraints.

This change prioritizes the requested allocation size during the search and
uses an augmented RB-tree field (subtree_max_alignment) to efficiently locate
free blocks that satisfy both size and offset-alignment requirements. As a
result, the allocator can directly select an aligned sub-region without
splitting larger blocks unnecessarily.

A practical example is the VKCTS test
dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000, which repeatedly
allocates 8 KiB buffers with a 256 KiB alignment. Previously, such allocations
caused large blocks to be split aggressively, despite smaller aligned regions
being sufficient. With this change, those aligned regions are reused directly,
significantly reducing fragmentation.

This improvement is visible in the amdgpu VRAM buddy allocator state
(/sys/kernel/debug/dri/1/amdgpu_vram_mm). After the change, higher-order blocks
are preserved and the number of low-order fragments is substantially reduced.

Before:
  order- 5 free: 1936 MiB, blocks: 15490
  order- 4 free:  967 MiB, blocks: 15486
  order- 3 free:  483 MiB, blocks: 15485
  order- 2 free:  241 MiB, blocks: 15486
  order- 1 free:  241 MiB, blocks: 30948

After:
  order- 5 free:  493 MiB, blocks:  3941
  order- 4 free:  246 MiB, blocks:  3943
  order- 3 free:  123 MiB, blocks:  4101
  order- 2 free:   61 MiB, blocks:  4101
  order- 1 free:   61 MiB, blocks:  8018

By avoiding unnecessary splits, this change improves allocator efficiency and
helps maintain larger contiguous free regions under heavy offset-aligned
allocation workloads.

v2:(Matthew)
  - Update augmented information along the path to the inserted node.

v3:
  - Move the patch to gpu/buddy.c file.

v4:(Matthew)
  - Use the helper instead of calling _ffs directly
  - Remove gpu_buddy_block_order(block) &gt;= order check and drop order
  - Drop !node check as all callers handle this already
  - Return larger than any other possible alignment for __ffs64(0)
  - Replace __ffs with __ffs64

v5:(Matthew)
  - Drop subtree_max_alignment initialization at gpu_block_alloc()

Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patch.msgid.link/20260306060155.2114-1-Arunpravin.PaneerSelvam@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Large alignment requests previously forced the buddy allocator to search by
alignment order, which often caused higher-order free blocks to be split even
when a suitably aligned smaller region already existed within them. This led
to excessive fragmentation, especially for workloads requesting small sizes
with large alignment constraints.

This change prioritizes the requested allocation size during the search and
uses an augmented RB-tree field (subtree_max_alignment) to efficiently locate
free blocks that satisfy both size and offset-alignment requirements. As a
result, the allocator can directly select an aligned sub-region without
splitting larger blocks unnecessarily.

A practical example is the VKCTS test
dEQP-VK.memory.allocation.basic.size_8KiB.reverse.count_4000, which repeatedly
allocates 8 KiB buffers with a 256 KiB alignment. Previously, such allocations
caused large blocks to be split aggressively, despite smaller aligned regions
being sufficient. With this change, those aligned regions are reused directly,
significantly reducing fragmentation.

This improvement is visible in the amdgpu VRAM buddy allocator state
(/sys/kernel/debug/dri/1/amdgpu_vram_mm). After the change, higher-order blocks
are preserved and the number of low-order fragments is substantially reduced.

Before:
  order- 5 free: 1936 MiB, blocks: 15490
  order- 4 free:  967 MiB, blocks: 15486
  order- 3 free:  483 MiB, blocks: 15485
  order- 2 free:  241 MiB, blocks: 15486
  order- 1 free:  241 MiB, blocks: 30948

After:
  order- 5 free:  493 MiB, blocks:  3941
  order- 4 free:  246 MiB, blocks:  3943
  order- 3 free:  123 MiB, blocks:  4101
  order- 2 free:   61 MiB, blocks:  4101
  order- 1 free:   61 MiB, blocks:  8018

By avoiding unnecessary splits, this change improves allocator efficiency and
helps maintain larger contiguous free regions under heavy offset-aligned
allocation workloads.

v2:(Matthew)
  - Update augmented information along the path to the inserted node.

v3:
  - Move the patch to gpu/buddy.c file.

v4:(Matthew)
  - Use the helper instead of calling _ffs directly
  - Remove gpu_buddy_block_order(block) &gt;= order check and drop order
  - Drop !node check as all callers handle this already
  - Return larger than any other possible alignment for __ffs64(0)
  - Replace __ffs with __ffs64

v5:(Matthew)
  - Drop subtree_max_alignment initialization at gpu_block_alloc()

Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Link: https://patch.msgid.link/20260306060155.2114-1-Arunpravin.PaneerSelvam@amd.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/buddy: Move internal helpers to buddy.c</title>
<updated>2026-02-23T06:51:05+00:00</updated>
<author>
<name>Sanjay Yadav</name>
<email>sanjay.kumar.yadav@intel.com</email>
</author>
<published>2026-02-12T09:25:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=df8c7892e06efa5df2aa780a338f33a4f666370b'/>
<id>df8c7892e06efa5df2aa780a338f33a4f666370b</id>
<content type='text'>
Move gpu_buddy_block_state(), gpu_buddy_block_is_allocated(),
and gpu_buddy_block_is_split() from gpu_buddy.h to gpu_buddy.c
as static functions since they have no external callers.

Remove gpu_get_buddy() as it was an unused exported wrapper
around the internal __get_buddy().

No functional changes.

v2:
- Rebased after DRM buddy allocator moved to drivers/gpu/
- Keep gpu_buddy_block_is_free() in header since it's now
  used by drm_buddy.c
- Updated commit message

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Signed-off-by: Sanjay Yadav &lt;sanjay.kumar.yadav@intel.com&gt;
Reviewed-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Link: https://patch.msgid.link/20260212092527.718455-6-sanjay.kumar.yadav@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move gpu_buddy_block_state(), gpu_buddy_block_is_allocated(),
and gpu_buddy_block_is_split() from gpu_buddy.h to gpu_buddy.c
as static functions since they have no external callers.

Remove gpu_get_buddy() as it was an unused exported wrapper
around the internal __get_buddy().

No functional changes.

v2:
- Rebased after DRM buddy allocator moved to drivers/gpu/
- Keep gpu_buddy_block_is_free() in header since it's now
  used by drm_buddy.c
- Updated commit message

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Signed-off-by: Sanjay Yadav &lt;sanjay.kumar.yadav@intel.com&gt;
Reviewed-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Link: https://patch.msgid.link/20260212092527.718455-6-sanjay.kumar.yadav@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/buddy: Add kernel-doc for allocator structures and flags</title>
<updated>2026-02-23T06:50:34+00:00</updated>
<author>
<name>Sanjay Yadav</name>
<email>sanjay.kumar.yadav@intel.com</email>
</author>
<published>2026-02-12T09:25:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5cab6d386bd30c3bb4efceb05b25842a6f144693'/>
<id>5cab6d386bd30c3bb4efceb05b25842a6f144693</id>
<content type='text'>
Add missing kernel-doc for GPU buddy allocator flags,
gpu_buddy_block, and gpu_buddy. The documentation covers block
header fields, allocator roots, free trees, and allocation flags
such as RANGE, TOPDOWN, CONTIGUOUS, CLEAR, and TRIM_DISABLE.
Private members are marked with kernel-doc private markers
and documented with regular comments.

No functional changes.

v2:
- Corrected GPU_BUDDY_CLEAR_TREE and GPU_BUDDY_DIRTY_TREE index
  values (Arun)
- Rebased after DRM buddy allocator moved to drivers/gpu/
- Updated commit message

v3:
- Document reserved bits 8:6 in header layout (Arun)
- Fix checkpatch warning

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Signed-off-by: Sanjay Yadav &lt;sanjay.kumar.yadav@intel.com&gt;
Reviewed-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Link: https://patch.msgid.link/20260212092527.718455-5-sanjay.kumar.yadav@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add missing kernel-doc for GPU buddy allocator flags,
gpu_buddy_block, and gpu_buddy. The documentation covers block
header fields, allocator roots, free trees, and allocation flags
such as RANGE, TOPDOWN, CONTIGUOUS, CLEAR, and TRIM_DISABLE.
Private members are marked with kernel-doc private markers
and documented with regular comments.

No functional changes.

v2:
- Corrected GPU_BUDDY_CLEAR_TREE and GPU_BUDDY_DIRTY_TREE index
  values (Arun)
- Rebased after DRM buddy allocator moved to drivers/gpu/
- Updated commit message

v3:
- Document reserved bits 8:6 in header layout (Arun)
- Fix checkpatch warning

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Suggested-by: Matthew Auld &lt;matthew.auld@intel.com&gt;
Signed-off-by: Sanjay Yadav &lt;sanjay.kumar.yadav@intel.com&gt;
Reviewed-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Signed-off-by: Arunpravin Paneer Selvam &lt;Arunpravin.PaneerSelvam@amd.com&gt;
Link: https://patch.msgid.link/20260212092527.718455-5-sanjay.kumar.yadav@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: Move DRM buddy allocator one level up (part two)</title>
<updated>2026-02-06T01:38:35+00:00</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2026-02-05T22:52:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ba110db8e1bc206c13fd7d985e79b033f53bfdea'/>
<id>ba110db8e1bc206c13fd7d985e79b033f53bfdea</id>
<content type='text'>
Move the DRM buddy allocator one level up so that it can be used by GPU
drivers (example, nova-core) that have usecases other than DRM (such as
VFIO vGPU support). Modify the API, structures and Kconfigs to use
"gpu_buddy" terminology. Adapt the drivers and tests to use the new API.

The commit cannot be split due to bisectability, however no functional
change is intended. Verified by running K-UNIT tests and build tested
various configurations.

Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Reviewed-by: Dave Airlie &lt;airlied@redhat.com&gt;
[airlied: I've split this into two so git can find copies easier.
I've also just nuked drm_random library, that stuff needs to be done
elsewhere and only the buddy tests seem to be using it].
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the DRM buddy allocator one level up so that it can be used by GPU
drivers (example, nova-core) that have usecases other than DRM (such as
VFIO vGPU support). Modify the API, structures and Kconfigs to use
"gpu_buddy" terminology. Adapt the drivers and tests to use the new API.

The commit cannot be split due to bisectability, however no functional
change is intended. Verified by running K-UNIT tests and build tested
various configurations.

Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Reviewed-by: Dave Airlie &lt;airlied@redhat.com&gt;
[airlied: I've split this into two so git can find copies easier.
I've also just nuked drm_random library, that stuff needs to be done
elsewhere and only the buddy tests seem to be using it].
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>gpu: Move DRM buddy allocator one level up (part one)</title>
<updated>2026-02-06T01:34:02+00:00</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2026-02-05T22:52:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4a9671a03f2be13acde0cb15c5208767a9cc56e4'/>
<id>4a9671a03f2be13acde0cb15c5208767a9cc56e4</id>
<content type='text'>
Move the DRM buddy allocator one level up so that it can be used by GPU
drivers (example, nova-core) that have usecases other than DRM (such as
VFIO vGPU support). Modify the API, structures and Kconfigs to use
"gpu_buddy" terminology. Adapt the drivers and tests to use the new API.

The commit cannot be split due to bisectability, however no functional
change is intended. Verified by running K-UNIT tests and build tested
various configurations.

Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Reviewed-by: Dave Airlie &lt;airlied@redhat.com&gt;
[airlied: I've split this into two so git can find copies easier.
I've also just nuked drm_random library, that stuff needs to be done
elsewhere and only the buddy tests seem to be using it].
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move the DRM buddy allocator one level up so that it can be used by GPU
drivers (example, nova-core) that have usecases other than DRM (such as
VFIO vGPU support). Modify the API, structures and Kconfigs to use
"gpu_buddy" terminology. Adapt the drivers and tests to use the new API.

The commit cannot be split due to bisectability, however no functional
change is intended. Verified by running K-UNIT tests and build tested
various configurations.

Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Reviewed-by: Dave Airlie &lt;airlied@redhat.com&gt;
[airlied: I've split this into two so git can find copies easier.
I've also just nuked drm_random library, that stuff needs to be done
elsewhere and only the buddy tests seem to be using it].
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
