<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c, branch linux-rolling-stable</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>drm/vmwgfx: Return the correct value in vmw_translate_ptr functions</title>
<updated>2026-03-12T11:09:06+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2026-01-13T17:53:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=149f028772fa2879d9316b924ce948a6a0877e45'/>
<id>149f028772fa2879d9316b924ce948a6a0877e45</id>
<content type='text'>
[ Upstream commit 5023ca80f9589295cb60735016e39fc5cc714243 ]

Before the referenced fixes these functions used a lookup function that
returned a pointer. This was changed to another lookup function that
returned an error code with the pointer becoming an out parameter.

The error path when the lookup failed was not changed to reflect this
change and the code continued to return the PTR_ERR of the now
uninitialized pointer. This could cause the vmw_translate_ptr functions
to return success when they actually failed causing further uninitialized
and OOB accesses.

Reported-by: Kuzey Arda Bulut &lt;kuzeyardabulut@gmail.com&gt;
Fixes: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources")
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Reviewed-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patch.msgid.link/20260113175357.129285-1-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5023ca80f9589295cb60735016e39fc5cc714243 ]

Before the referenced fixes these functions used a lookup function that
returned a pointer. This was changed to another lookup function that
returned an error code with the pointer becoming an out parameter.

The error path when the lookup failed was not changed to reflect this
change and the code continued to return the PTR_ERR of the now
uninitialized pointer. This could cause the vmw_translate_ptr functions
to return success when they actually failed causing further uninitialized
and OOB accesses.

Reported-by: Kuzey Arda Bulut &lt;kuzeyardabulut@gmail.com&gt;
Fixes: a309c7194e8a ("drm/vmwgfx: Remove rcu locks from user resources")
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Reviewed-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patch.msgid.link/20260113175357.129285-1-ian.forbes@broadcom.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Validate command header size against SVGA_CMD_MAX_DATASIZE</title>
<updated>2025-11-07T04:59:40+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2025-10-21T19:01:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=32b415a9dc2c212e809b7ebc2b14bc3fbda2b9af'/>
<id>32b415a9dc2c212e809b7ebc2b14bc3fbda2b9af</id>
<content type='text'>
This data originates from userspace and is used in buffer offset
calculations which could potentially overflow causing an out-of-bounds
access.

Fixes: 8ce75f8ab904 ("drm/vmwgfx: Update device includes for DX device functionality")
Reported-by: Rohit Keshri &lt;rkeshri@redhat.com&gt;
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patch.msgid.link/20251021190128.13014-1-ian.forbes@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This data originates from userspace and is used in buffer offset
calculations which could potentially overflow causing an out-of-bounds
access.

Fixes: 8ce75f8ab904 ("drm/vmwgfx: Update device includes for DX device functionality")
Reported-by: Rohit Keshri &lt;rkeshri@redhat.com&gt;
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patch.msgid.link/20251021190128.13014-1-ian.forbes@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Fix a null-ptr access in the cursor snooper</title>
<updated>2025-10-06T15:57:17+00:00</updated>
<author>
<name>Zack Rusin</name>
<email>zack.rusin@broadcom.com</email>
</author>
<published>2025-09-17T15:36:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5ac2c0279053a2c5265d46903432fb26ae2d0da2'/>
<id>5ac2c0279053a2c5265d46903432fb26ae2d0da2</id>
<content type='text'>
Check that the resource which is converted to a surface exists before
trying to use the cursor snooper on it.

vmw_cmd_res_check allows explicit invalid (SVGA3D_INVALID_ID) identifiers
because some svga commands accept SVGA3D_INVALID_ID to mean "no surface",
unfortunately functions that accept the actual surfaces as objects might
(and in case of the cursor snooper, do not) be able to handle null
objects. Make sure that we validate not only the identifier (via the
vmw_cmd_res_check) but also check that the actual resource exists before
trying to do something with it.

Fixes unchecked null-ptr reference in the snooping code.

Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Fixes: c0951b797e7d ("drm/vmwgfx: Refactor resource management")
Reported-by: Kuzey Arda Bulut &lt;kuzeyardabulut@gmail.com&gt;
Cc: Broadcom internal kernel review list &lt;bcm-kernel-feedback-list@broadcom.com&gt;
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250917153655.1968583-1-zack.rusin@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Check that the resource which is converted to a surface exists before
trying to use the cursor snooper on it.

vmw_cmd_res_check allows explicit invalid (SVGA3D_INVALID_ID) identifiers
because some svga commands accept SVGA3D_INVALID_ID to mean "no surface",
unfortunately functions that accept the actual surfaces as objects might
(and in case of the cursor snooper, do not) be able to handle null
objects. Make sure that we validate not only the identifier (via the
vmw_cmd_res_check) but also check that the actual resource exists before
trying to do something with it.

Fixes unchecked null-ptr reference in the snooping code.

Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Fixes: c0951b797e7d ("drm/vmwgfx: Refactor resource management")
Reported-by: Kuzey Arda Bulut &lt;kuzeyardabulut@gmail.com&gt;
Cc: Broadcom internal kernel review list &lt;bcm-kernel-feedback-list@broadcom.com&gt;
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250917153655.1968583-1-zack.rusin@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Implement dma_fence_ops properly</title>
<updated>2025-06-18T02:49:33+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2025-05-30T18:35:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=db6a94b263541af5678e520fb3959b03ec88ad2b'/>
<id>db6a94b263541af5678e520fb3959b03ec88ad2b</id>
<content type='text'>
vmwgfx's fencing predates dma_fence and as a result dma_fence_ops was never
properly implemented, especially with respect to enabling signaling.

Because of this dma_fence callbacks don't work properly. This change
implements enable_signaling properly so that dma_fence callbacks now
work as expected.

It also removes vmwgfx's custom implementation of fence callbacks
and removes vmwgfx's custom dma_fence_ops::wait function which is no
longer necessary now that enable_signaling works.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250530183510.733175-2-ian.forbes@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vmwgfx's fencing predates dma_fence and as a result dma_fence_ops was never
properly implemented, especially with respect to enabling signaling.

Because of this dma_fence callbacks don't work properly. This change
implements enable_signaling properly so that dma_fence callbacks now
work as expected.

It also removes vmwgfx's custom implementation of fence callbacks
and removes vmwgfx's custom dma_fence_ops::wait function which is no
longer necessary now that enable_signaling works.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250530183510.733175-2-ian.forbes@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Update last_read_seqno under the fence lock</title>
<updated>2025-06-18T02:49:31+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2025-05-30T18:35:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c82f55f4aa57bf5ba412d55856fe50514b47b971'/>
<id>c82f55f4aa57bf5ba412d55856fe50514b47b971</id>
<content type='text'>
There was a possible race in vmw_update_seqno. Because of this race it
was possible for last_read_seqno to go backwards. Remove this function
and replace it with vmw_update_fences which now sets and returns the
last_read_seqno while holding the fence lock. This serialization via the
fence lock ensures that last_read_seqno is monotonic again.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250530183510.733175-1-ian.forbes@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There was a possible race in vmw_update_seqno. Because of this race it
was possible for last_read_seqno to go backwards. Remove this function
and replace it with vmw_update_fences which now sets and returns the
last_read_seqno while holding the fence lock. This serialization via the
fence lock ensures that last_read_seqno is monotonic again.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://lore.kernel.org/r/20250530183510.733175-1-ian.forbes@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Add seqno waiter for sync_files</title>
<updated>2025-03-10T18:31:43+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2025-02-28T20:06:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0039a3b35b10d9c15d3d26320532ab56cc566750'/>
<id>0039a3b35b10d9c15d3d26320532ab56cc566750</id>
<content type='text'>
Because sync_files are passive waiters they do not participate in
the processing of fences like the traditional vmw_fence_wait IOCTL.
If userspace exclusively uses sync_files for synchronization then
nothing in the kernel actually processes fence updates as interrupts
for fences are masked and ignored if the kernel does not indicate to the
SVGA device that there are active waiters.

This oversight results in a bug where the entire GUI can freeze waiting
on a sync_file that will never be signalled as we've masked the interrupts
to signal its completion. This bug is incredibly racy as any process which
interacts with the fencing code via the 3D stack can process the stuck
fences on behalf of the stuck process causing it to run again. Even a
simple app like eglinfo is enough to resume the stuck process. Usually
this bug is seen at a login screen like GDM because there are no other
3D apps running.

By adding a seqno waiter we re-enable interrupt based processing of the
dma_fences associated with the sync_file which is signalled as part of a
dma_fence_callback.

This has likely been broken since it was initially added to the kernel in
2017 but has gone unnoticed until mutter recently started using sync_files
heavily over the course of 2024 as part of their explicit sync support.

Fixes: c906965dee22 ("drm/vmwgfx: Add export fence to file descriptor support")
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250228200633.642417-1-ian.forbes@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because sync_files are passive waiters they do not participate in
the processing of fences like the traditional vmw_fence_wait IOCTL.
If userspace exclusively uses sync_files for synchronization then
nothing in the kernel actually processes fence updates as interrupts
for fences are masked and ignored if the kernel does not indicate to the
SVGA device that there are active waiters.

This oversight results in a bug where the entire GUI can freeze waiting
on a sync_file that will never be signalled as we've masked the interrupts
to signal its completion. This bug is incredibly racy as any process which
interacts with the fencing code via the 3D stack can process the stuck
fences on behalf of the stuck process causing it to run again. Even a
simple app like eglinfo is enough to resume the stuck process. Usually
this bug is seen at a login screen like GDM because there are no other
3D apps running.

By adding a seqno waiter we re-enable interrupt based processing of the
dma_fences associated with the sync_file which is signalled as part of a
dma_fence_callback.

This has likely been broken since it was initially added to the kernel in
2017 but has gone unnoticed until mutter recently started using sync_files
heavily over the course of 2024 as part of their explicit sync support.

Fixes: c906965dee22 ("drm/vmwgfx: Add export fence to file descriptor support")
Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250228200633.642417-1-ian.forbes@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Refactor cursor handling</title>
<updated>2025-03-10T18:30:33+00:00</updated>
<author>
<name>Zack Rusin</name>
<email>zack.rusin@broadcom.com</email>
</author>
<published>2025-03-07T12:57:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=965544150d1cadf0e8f5bb6c13c19697e46e1429'/>
<id>965544150d1cadf0e8f5bb6c13c19697e46e1429</id>
<content type='text'>
Refactor cursor handling to make the code maintainable again. Over the
last 12 years the svga device improved support for virtualized cursors
and at the same time the drm interfaces evolved quite a bit from
pre-atomic to current atomic ones. vmwgfx only added new code over
the years, instead of adjusting/refactoring the paths.

Export the cursor plane handling to its own file. Remove special
handling of the legacy cursor support to make it fit within the global
cursor plane mechanism.

Finally redo dirty tracking because memcmp never worked correctly
resulting in the cursor not being properly updated in the guest.

Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Reviewed-by: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307125836.3877138-2-zack.rusin@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Refactor cursor handling to make the code maintainable again. Over the
last 12 years the svga device improved support for virtualized cursors
and at the same time the drm interfaces evolved quite a bit from
pre-atomic to current atomic ones. vmwgfx only added new code over
the years, instead of adjusting/refactoring the paths.

Export the cursor plane handling to its own file. Remove special
handling of the legacy cursor support to make it fit within the global
cursor plane mechanism.

Finally redo dirty tracking because memcmp never worked correctly
resulting in the cursor not being properly updated in the guest.

Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Reviewed-by: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20250307125836.3877138-2-zack.rusin@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>fix missing vmalloc.h includes</title>
<updated>2024-04-26T03:55:49+00:00</updated>
<author>
<name>Kent Overstreet</name>
<email>kent.overstreet@linux.dev</email>
</author>
<published>2024-03-21T16:36:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0069455bcbf9ea73ffe4553ed6d2b4e4cad703de'/>
<id>0069455bcbf9ea73ffe4553ed6d2b4e4cad703de</id>
<content type='text'>
Patch series "Memory allocation profiling", v6.

Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.

Example output:
  root@moria-kvm:~# sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation

sysctl:
  /proc/sys/vm/mem_profiling

Runtime info:
  /proc/allocinfo

Notes:

[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n &amp;&amp; /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y &amp;&amp; allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)  &amp;&amp; CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) &amp;&amp; CONFIG_MEMCG_KMEM=y

Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:

                        kmalloc                 pgalloc
(1 baseline)            6.764s                  16.902s
(2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)

Memory overhead:
Kernel size:

   text           data        bss         dec         diff
(1) 26515311	      18890222    17018880    62424413
(2) 26524728	      19423818    16740352    62688898    264485
(3) 26524724	      19423818    16740352    62688894    264481
(4) 26524728	      19423818    16740352    62688898    264485
(5) 26541782	      18964374    16957440    62463596    39183

Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:           192 kB
PageExts:         262144 kB (256MB)
SlabExts:           9876 kB (9.6MB)
PcpuExts:            512 kB (0.5MB)

Total overhead is 0.2% of total memory.

Benchmarks:

Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
      baseline       disabled profiling           enabled profiling
avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
stdev 0.0137         0.0188                       0.0077


hackbench -l 10000
      baseline       disabled profiling           enabled profiling
avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
stdev 0.0933         0.0286                       0.0489

stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/

[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/


This patch (of 37):

The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.

[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
  Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
  Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
  Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
  Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Tested-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alex Gaynor &lt;alex.gaynor@gmail.com&gt;
Cc: Alice Ryhl &lt;aliceryhl@google.com&gt;
Cc: Andreas Hindborg &lt;a.hindborg@samsung.com&gt;
Cc: Benno Lossin &lt;benno.lossin@proton.me&gt;
Cc: "Björn Roy Baron" &lt;bjorn3_gh@protonmail.com&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Gary Guo &lt;gary@garyguo.net&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Wedson Almeida Filho &lt;wedsonaf@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "Memory allocation profiling", v6.

Overview:
Low overhead [1] per-callsite memory allocation profiling. Not just for
debug kernels, overhead low enough to be deployed in production.

Example output:
  root@moria-kvm:~# sort -rn /proc/allocinfo
   127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
    56373248     4737 mm/slub.c:2259 func:alloc_slab_page
    14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
    14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
    13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
    11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
     9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
     4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
     4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
     3940352      962 mm/memory.c:4214 func:alloc_anon_folio
     2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
     ...

Usage:
kconfig options:
 - CONFIG_MEM_ALLOC_PROFILING
 - CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
 - CONFIG_MEM_ALLOC_PROFILING_DEBUG
   adds warnings for allocations that weren't accounted because of a
   missing annotation

sysctl:
  /proc/sys/vm/mem_profiling

Runtime info:
  /proc/allocinfo

Notes:

[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n &amp;&amp; /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y &amp;&amp; allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)  &amp;&amp; CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y &amp;&amp;
    CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) &amp;&amp; CONFIG_MEMCG_KMEM=y

Performance overhead:
To evaluate performance we implemented an in-kernel test executing
multiple get_free_page/free_page and kmalloc/kfree calls with allocation
sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU
affinity set to a specific CPU to minimize the noise. Below are results
from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on
56 core Intel Xeon:

                        kmalloc                 pgalloc
(1 baseline)            6.764s                  16.902s
(2 default disabled)    6.793s  (+0.43%)        17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)        23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)        23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)       48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)       48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)       54.963s (+225.18%)

Memory overhead:
Kernel size:

   text           data        bss         dec         diff
(1) 26515311	      18890222    17018880    62424413
(2) 26524728	      19423818    16740352    62688898    264485
(3) 26524724	      19423818    16740352    62688894    264481
(4) 26524728	      19423818    16740352    62688898    264485
(5) 26541782	      18964374    16957440    62463596    39183

Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:           192 kB
PageExts:         262144 kB (256MB)
SlabExts:           9876 kB (9.6MB)
PcpuExts:            512 kB (0.5MB)

Total overhead is 0.2% of total memory.

Benchmarks:

Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
      baseline       disabled profiling           enabled profiling
avg   0.3543         0.3559 (+0.0016)             0.3566 (+0.0023)
stdev 0.0137         0.0188                       0.0077


hackbench -l 10000
      baseline       disabled profiling           enabled profiling
avg   6.4218         6.4306 (+0.0088)             6.5077 (+0.0859)
stdev 0.0933         0.0286                       0.0489

stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/

[2] https://lore.kernel.org/all/20240306182440.2003814-1-surenb@google.com/


This patch (of 37):

The next patch drops vmalloc.h from a system header in order to fix a
circular dependency; this adds it to all the files that were pulling it in
implicitly.

[kent.overstreet@linux.dev: fix arch/alpha/lib/memcpy.c]
  Link: https://lkml.kernel.org/r/20240327002152.3339937-1-kent.overstreet@linux.dev
[surenb@google.com: fix arch/x86/mm/numa_32.c]
  Link: https://lkml.kernel.org/r/20240402180933.1663992-1-surenb@google.com
[kent.overstreet@linux.dev: a few places were depending on sizes.h]
  Link: https://lkml.kernel.org/r/20240404034744.1664840-1-kent.overstreet@linux.dev
[arnd@arndb.de: fix mm/kasan/hw_tags.c]
  Link: https://lkml.kernel.org/r/20240404124435.3121534-1-arnd@kernel.org
[surenb@google.com: fix arc build]
  Link: https://lkml.kernel.org/r/20240405225115.431056-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-1-surenb@google.com
Link: https://lkml.kernel.org/r/20240321163705.3067592-2-surenb@google.com
Signed-off-by: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Tested-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alex Gaynor &lt;alex.gaynor@gmail.com&gt;
Cc: Alice Ryhl &lt;aliceryhl@google.com&gt;
Cc: Andreas Hindborg &lt;a.hindborg@samsung.com&gt;
Cc: Benno Lossin &lt;benno.lossin@proton.me&gt;
Cc: "Björn Roy Baron" &lt;bjorn3_gh@protonmail.com&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Gary Guo &lt;gary@garyguo.net&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Wedson Almeida Filho &lt;wedsonaf@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Fix possible null pointer derefence with invalid contexts</title>
<updated>2024-01-26T18:48:06+00:00</updated>
<author>
<name>Zack Rusin</name>
<email>zack.rusin@broadcom.com</email>
</author>
<published>2024-01-10T20:03:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=517621b7060096e48e42f545fa6646fc00252eac'/>
<id>517621b7060096e48e42f545fa6646fc00252eac</id>
<content type='text'>
vmw_context_cotable can return either an error or a null pointer and its
usage sometimes went unchecked. Subsequent code would then try to access
either a null pointer or an error value.

The invalid dereferences were only possible with malformed userspace
apps which never properly initialized the rendering contexts.

Check the results of vmw_context_cotable to fix the invalid derefs.

Thanks:
ziming zhang(@ezrak1e) from Ant Group Light-Year Security Lab
who was the first person to discover it.
Niels De Graef who reported it and helped to track down the poc.

Fixes: 9c079b8ce8bf ("drm/vmwgfx: Adapt execbuf to the new validation api")
Cc: &lt;stable@vger.kernel.org&gt; # v4.20+
Reported-by: Niels De Graef  &lt;ndegraef@redhat.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Cc: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Cc: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Cc: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Cc: Broadcom internal kernel review list &lt;bcm-kernel-feedback-list@broadcom.com&gt;
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Reviewed-by: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240110200305.94086-1-zack.rusin@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vmw_context_cotable can return either an error or a null pointer and its
usage sometimes went unchecked. Subsequent code would then try to access
either a null pointer or an error value.

The invalid dereferences were only possible with malformed userspace
apps which never properly initialized the rendering contexts.

Check the results of vmw_context_cotable to fix the invalid derefs.

Thanks:
ziming zhang(@ezrak1e) from Ant Group Light-Year Security Lab
who was the first person to discover it.
Niels De Graef who reported it and helped to track down the poc.

Fixes: 9c079b8ce8bf ("drm/vmwgfx: Adapt execbuf to the new validation api")
Cc: &lt;stable@vger.kernel.org&gt; # v4.20+
Reported-by: Niels De Graef  &lt;ndegraef@redhat.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Cc: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Cc: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Cc: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Cc: Broadcom internal kernel review list &lt;bcm-kernel-feedback-list@broadcom.com&gt;
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Maaz Mombasawala &lt;maaz.mombasawala@broadcom.com&gt;
Reviewed-by: Martin Krastev &lt;martin.krastev@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240110200305.94086-1-zack.rusin@broadcom.com
</pre>
</div>
</content>
</entry>
<entry>
<title>drm/vmwgfx: Add SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 to command array.</title>
<updated>2024-01-26T17:50:43+00:00</updated>
<author>
<name>Ian Forbes</name>
<email>ian.forbes@broadcom.com</email>
</author>
<published>2024-01-08T21:16:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f3e17b42b28d2b71f54cfcf4530690bcbdaf23ed'/>
<id>f3e17b42b28d2b71f54cfcf4530690bcbdaf23ed</id>
<content type='text'>
Without this definition device errors will display the command name
as (null) when debug logging is enabled.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240108211655.13187-1-ian.forbes@broadcom.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Without this definition device errors will display the command name
as (null) when debug logging is enabled.

Signed-off-by: Ian Forbes &lt;ian.forbes@broadcom.com&gt;
Signed-off-by: Zack Rusin &lt;zack.rusin@broadcom.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240108211655.13187-1-ian.forbes@broadcom.com
</pre>
</div>
</content>
</entry>
</feed>
