<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/x86/kernel/cpu/resctrl, branch v6.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>x86/resctrl: Fix event counts regression in reused RMIDs</title>
<updated>2023-01-10T18:51:59+00:00</updated>
<author>
<name>Peter Newman</name>
<email>peternewman@google.com</email>
</author>
<published>2022-12-20T16:41:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2a81160d29d65b5876ab3f824fda99ae0219f05e'/>
<id>2a81160d29d65b5876ab3f824fda99ae0219f05e</id>
<content type='text'>
When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.

resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.

Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.

Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221220164132.443083-1-peternewman@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.

resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.

Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.

Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221220164132.443083-1-peternewman@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Fix task CLOSID/RMID update race</title>
<updated>2023-01-10T18:47:30+00:00</updated>
<author>
<name>Peter Newman</name>
<email>peternewman@google.com</email>
</author>
<published>2022-12-20T16:11:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fe1f0714385fbcf76b0cbceb02b7277d842014fc'/>
<id>fe1f0714385fbcf76b0cbceb02b7277d842014fc</id>
<content type='text'>
When the user moves a running task to a new rdtgroup using the task's
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
task(s) CPUs.

x86 allows reordering loads with prior stores, so if the task starts
running between a task_curr() check that the CPU hoisted before the
stores in the CLOSID/RMID update then it can start running with the old
CLOSID/RMID until it is switched again because __rdtgroup_move_task()
failed to determine that it needs to be interrupted to obtain the new
CLOSID/RMID.

Refer to the diagram below:

CPU 0                                   CPU 1
-----                                   -----
__rdtgroup_move_task():
  curr &lt;- t1-&gt;cpu-&gt;rq-&gt;curr
                                        __schedule():
                                          rq-&gt;curr &lt;- t1
                                        resctrl_sched_in():
                                          t1-&gt;{closid,rmid} -&gt; {1,1}
  t1-&gt;{closid,rmid} &lt;- {2,2}
  if (curr == t1) // false
   IPI(t1-&gt;cpu)

A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.

In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr().  In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.

It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.

Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.

CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)

  # mkdir /sys/fs/resctrl/test
  # echo $$ &gt; /sys/fs/resctrl/test/tasks
  # perf bench sched messaging -g 40 -l 100000

task-clock time ranges collected using:

  # perf stat rmdir /sys/fs/resctrl/test

Baseline:                     1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms

  [ bp: Massage commit message. ]

Fixes: ae28d1aae48a ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be9471 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: &lt;stable@kernel.org&gt;
Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the user moves a running task to a new rdtgroup using the task's
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
task(s) CPUs.

x86 allows reordering loads with prior stores, so if the task starts
running between a task_curr() check that the CPU hoisted before the
stores in the CLOSID/RMID update then it can start running with the old
CLOSID/RMID until it is switched again because __rdtgroup_move_task()
failed to determine that it needs to be interrupted to obtain the new
CLOSID/RMID.

Refer to the diagram below:

CPU 0                                   CPU 1
-----                                   -----
__rdtgroup_move_task():
  curr &lt;- t1-&gt;cpu-&gt;rq-&gt;curr
                                        __schedule():
                                          rq-&gt;curr &lt;- t1
                                        resctrl_sched_in():
                                          t1-&gt;{closid,rmid} -&gt; {1,1}
  t1-&gt;{closid,rmid} &lt;- {2,2}
  if (curr == t1) // false
   IPI(t1-&gt;cpu)

A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.

In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr().  In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.

It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.

Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.

CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)

  # mkdir /sys/fs/resctrl/test
  # echo $$ &gt; /sys/fs/resctrl/test/tasks
  # perf bench sched messaging -g 40 -l 100000

task-clock time ranges collected using:

  # perf stat rmdir /sys/fs/resctrl/test

Baseline:                     1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms

  [ bp: Massage commit message. ]

Fixes: ae28d1aae48a ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be9471 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman &lt;peternewman@google.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: &lt;stable@kernel.org&gt;
Link: https://lore.kernel.org/r/20221220161123.432120-1-peternewman@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core</title>
<updated>2022-12-16T11:54:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-16T11:54:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=71a7507afbc3f27c346898f13ab9bfd918613c34'/>
<id>71a7507afbc3f27c346898f13ab9bfd918613c34</id>
<content type='text'>
Pull driver core updates from Greg KH:
 "Here is the set of driver core and kernfs changes for 6.2-rc1.

  The "big" change in here is the addition of a new macro,
  container_of_const() that will preserve the "const-ness" of a pointer
  passed into it.

  The "problem" of the current container_of() macro is that if you pass
  in a "const *", out of it can comes a non-const pointer unless you
  specifically ask for it. For many usages, we want to preserve the
  "const" attribute by using the same call. For a specific example, this
  series changes the kobj_to_dev() macro to use it, allowing it to be
  used no matter what the const value is. This prevents every subsystem
  from having to declare 2 different individual macros (i.e.
  kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
  the const value at build time, which having 2 macros would not do
  either.

  The driver for all of this have been discussions with the Rust kernel
  developers as to how to properly mark driver core, and kobject,
  objects as being "non-mutable". The changes to the kobject and driver
  core in this pull request are the result of that, as there are lots of
  paths where kobjects and device pointers are not modified at all, so
  marking them as "const" allows the compiler to enforce this.

  So, a nice side affect of the Rust development effort has been already
  to clean up the driver core code to be more obvious about object
  rules.

  All of this has been bike-shedded in quite a lot of detail on lkml
  with different names and implementations resulting in the tiny version
  we have in here, much better than my original proposal. Lots of
  subsystem maintainers have acked the changes as well.

  Other than this change, included in here are smaller stuff like:

   - kernfs fixes and updates to handle lock contention better

   - vmlinux.lds.h fixes and updates

   - sysfs and debugfs documentation updates

   - device property updates

  All of these have been in the linux-next tree for quite a while with
  no problems"

* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
  device property: Fix documentation for fwnode_get_next_parent()
  firmware_loader: fix up to_fw_sysfs() to preserve const
  usb.h: take advantage of container_of_const()
  device.h: move kobj_to_dev() to use container_of_const()
  container_of: add container_of_const() that preserves const-ness of the pointer
  driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
  driver core: fix up missed scsi/cxlflash class.devnode() conversion.
  driver core: fix up some missing class.devnode() conversions.
  driver core: make struct class.devnode() take a const *
  driver core: make struct class.dev_uevent() take a const *
  cacheinfo: Remove of_node_put() for fw_token
  device property: Add a blank line in Kconfig of tests
  device property: Rename goto label to be more precise
  device property: Move PROPERTY_ENTRY_BOOL() a bit down
  device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
  kernfs: fix all kernel-doc warnings and multiple typos
  driver core: pass a const * into of_device_uevent()
  kobject: kset_uevent_ops: make name() callback take a const *
  kobject: kset_uevent_ops: make filter() callback take a const *
  kobject: make kobject_namespace take a const *
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull driver core updates from Greg KH:
 "Here is the set of driver core and kernfs changes for 6.2-rc1.

  The "big" change in here is the addition of a new macro,
  container_of_const() that will preserve the "const-ness" of a pointer
  passed into it.

  The "problem" of the current container_of() macro is that if you pass
  in a "const *", out of it can comes a non-const pointer unless you
  specifically ask for it. For many usages, we want to preserve the
  "const" attribute by using the same call. For a specific example, this
  series changes the kobj_to_dev() macro to use it, allowing it to be
  used no matter what the const value is. This prevents every subsystem
  from having to declare 2 different individual macros (i.e.
  kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
  the const value at build time, which having 2 macros would not do
  either.

  The driver for all of this have been discussions with the Rust kernel
  developers as to how to properly mark driver core, and kobject,
  objects as being "non-mutable". The changes to the kobject and driver
  core in this pull request are the result of that, as there are lots of
  paths where kobjects and device pointers are not modified at all, so
  marking them as "const" allows the compiler to enforce this.

  So, a nice side affect of the Rust development effort has been already
  to clean up the driver core code to be more obvious about object
  rules.

  All of this has been bike-shedded in quite a lot of detail on lkml
  with different names and implementations resulting in the tiny version
  we have in here, much better than my original proposal. Lots of
  subsystem maintainers have acked the changes as well.

  Other than this change, included in here are smaller stuff like:

   - kernfs fixes and updates to handle lock contention better

   - vmlinux.lds.h fixes and updates

   - sysfs and debugfs documentation updates

   - device property updates

  All of these have been in the linux-next tree for quite a while with
  no problems"

* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
  device property: Fix documentation for fwnode_get_next_parent()
  firmware_loader: fix up to_fw_sysfs() to preserve const
  usb.h: take advantage of container_of_const()
  device.h: move kobj_to_dev() to use container_of_const()
  container_of: add container_of_const() that preserves const-ness of the pointer
  driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
  driver core: fix up missed scsi/cxlflash class.devnode() conversion.
  driver core: fix up some missing class.devnode() conversions.
  driver core: make struct class.devnode() take a const *
  driver core: make struct class.dev_uevent() take a const *
  cacheinfo: Remove of_node_put() for fw_token
  device property: Add a blank line in Kconfig of tests
  device property: Rename goto label to be more precise
  device property: Move PROPERTY_ENTRY_BOOL() a bit down
  device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
  kernfs: fix all kernel-doc warnings and multiple typos
  driver core: pass a const * into of_device_uevent()
  kobject: kset_uevent_ops: make name() callback take a const *
  kobject: kset_uevent_ops: make filter() callback take a const *
  kobject: make kobject_namespace take a const *
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Move MSR defines into msr-index.h</title>
<updated>2022-11-27T22:00:45+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2022-11-06T21:24:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=97fa21f65c3eb5bbab9b4734bed37fd624cddd86'/>
<id>97fa21f65c3eb5bbab9b4734bed37fd624cddd86</id>
<content type='text'>
msr-index.h should contain all MSRs for easier grepping for MSR numbers
when dealing with unchecked MSR access warnings, for example.

Move the resctrl ones. Prefix IA32_PQR_ASSOC with "MSR_" while at it.

No functional changes.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lore.kernel.org/r/20221106212923.20699-1-bp@alien8.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
msr-index.h should contain all MSRs for easier grepping for MSR numbers
when dealing with unchecked MSR access warnings, for example.

Move the resctrl ones. Prefix IA32_PQR_ASSOC with "MSR_" while at it.

No functional changes.

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lore.kernel.org/r/20221106212923.20699-1-bp@alien8.de
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: make struct class.devnode() take a const *</title>
<updated>2022-11-24T16:12:27+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2022-11-23T12:25:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ff62b8e6588fb07bedda7423622c140c4edd66a7'/>
<id>ff62b8e6588fb07bedda7423622c140c4edd66a7</id>
<content type='text'>
The devnode() in struct class should not be modifying the device that is
passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.

Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: x86@kernel.org
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: FUJITA Tomonori &lt;fujita.tomonori@lab.ntt.co.jp&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Justin Sanders &lt;justin@coraid.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Benjamin Gaignard &lt;benjamin.gaignard@collabora.com&gt;
Cc: Liam Mark &lt;lmark@codeaurora.org&gt;
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Brian Starkey &lt;Brian.Starkey@arm.com&gt;
Cc: John Stultz &lt;jstultz@google.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Cc: Maarten Lankhorst &lt;maarten.lankhorst@linux.intel.com&gt;
Cc: Maxime Ripard &lt;mripard@kernel.org&gt;
Cc: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Cc: Dmitry Torokhov &lt;dmitry.torokhov@gmail.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Sean Young &lt;sean@mess.org&gt;
Cc: Frank Haverkamp &lt;haver@linux.ibm.com&gt;
Cc: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Cornelia Huck &lt;cohuck@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Anton Vorontsov &lt;anton@enomsg.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Jaroslav Kysela &lt;perex@perex.cz&gt;
Cc: Takashi Iwai &lt;tiwai@suse.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Cc: Gautam Dawar &lt;gautam.dawar@xilinx.com&gt;
Cc: Dan Carpenter &lt;error27@gmail.com&gt;
Cc: Eli Cohen &lt;elic@nvidia.com&gt;
Cc: Parav Pandit &lt;parav@nvidia.com&gt;
Cc: Maxime Coquelin &lt;maxime.coquelin@redhat.com&gt;
Cc: alsa-devel@alsa-project.org
Cc: dri-devel@lists.freedesktop.org
Cc: kvm@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-block@vger.kernel.org
Cc: linux-input@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20221123122523.1332370-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The devnode() in struct class should not be modifying the device that is
passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.

Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: x86@kernel.org
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: FUJITA Tomonori &lt;fujita.tomonori@lab.ntt.co.jp&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Justin Sanders &lt;justin@coraid.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Cc: Benjamin Gaignard &lt;benjamin.gaignard@collabora.com&gt;
Cc: Liam Mark &lt;lmark@codeaurora.org&gt;
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Brian Starkey &lt;Brian.Starkey@arm.com&gt;
Cc: John Stultz &lt;jstultz@google.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Cc: Maarten Lankhorst &lt;maarten.lankhorst@linux.intel.com&gt;
Cc: Maxime Ripard &lt;mripard@kernel.org&gt;
Cc: Thomas Zimmermann &lt;tzimmermann@suse.de&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: Daniel Vetter &lt;daniel@ffwll.ch&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: Dennis Dalessandro &lt;dennis.dalessandro@cornelisnetworks.com&gt;
Cc: Dmitry Torokhov &lt;dmitry.torokhov@gmail.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: Sean Young &lt;sean@mess.org&gt;
Cc: Frank Haverkamp &lt;haver@linux.ibm.com&gt;
Cc: Jiri Slaby &lt;jirislaby@kernel.org&gt;
Cc: "Michael S. Tsirkin" &lt;mst@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Cc: Cornelia Huck &lt;cohuck@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Anton Vorontsov &lt;anton@enomsg.org&gt;
Cc: Colin Cross &lt;ccross@android.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Jaroslav Kysela &lt;perex@perex.cz&gt;
Cc: Takashi Iwai &lt;tiwai@suse.com&gt;
Cc: Hans Verkuil &lt;hverkuil-cisco@xs4all.nl&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Xie Yongji &lt;xieyongji@bytedance.com&gt;
Cc: Gautam Dawar &lt;gautam.dawar@xilinx.com&gt;
Cc: Dan Carpenter &lt;error27@gmail.com&gt;
Cc: Eli Cohen &lt;elic@nvidia.com&gt;
Cc: Parav Pandit &lt;parav@nvidia.com&gt;
Cc: Maxime Coquelin &lt;maxime.coquelin@redhat.com&gt;
Cc: alsa-devel@alsa-project.org
Cc: dri-devel@lists.freedesktop.org
Cc: kvm@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-block@vger.kernel.org
Cc: linux-input@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20221123122523.1332370-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Remove arch_has_empty_bitmaps</title>
<updated>2022-10-24T08:30:29+00:00</updated>
<author>
<name>Babu Moger</name>
<email>babu.moger@amd.com</email>
</author>
<published>2022-09-27T20:16:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2d4daa549c17b6ba4845a751c7a78d3b2419d78f'/>
<id>2d4daa549c17b6ba4845a751c7a78d3b2419d78f</id>
<content type='text'>
The field arch_has_empty_bitmaps is not required anymore. The field
min_cbm_bits is enough to validate the CBM (capacity bit mask) if the
architecture can support the zero CBM or not.

Suggested-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Babu Moger &lt;babu.moger@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Link: https://lore.kernel.org/r/166430979654.372014.615622285687642644.stgit@bmoger-ubuntu
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The field arch_has_empty_bitmaps is not required anymore. The field
min_cbm_bits is enough to validate the CBM (capacity bit mask) if the
architecture can support the zero CBM or not.

Suggested-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Babu Moger &lt;babu.moger@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Link: https://lore.kernel.org/r/166430979654.372014.615622285687642644.stgit@bmoger-ubuntu
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Fix min_cbm_bits for AMD</title>
<updated>2022-10-18T18:25:16+00:00</updated>
<author>
<name>Babu Moger</name>
<email>babu.moger@amd.com</email>
</author>
<published>2022-09-27T20:16:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=67bf6493449b09590f9f71d7df29efb392b12d25'/>
<id>67bf6493449b09590f9f71d7df29efb392b12d25</id>
<content type='text'>
AMD systems support zero CBM (capacity bit mask) for cache allocation.
That is reflected in rdt_init_res_defs_amd() by:

  r-&gt;cache.arch_has_empty_bitmaps = true;

However given the unified code in cbm_validate(), checking for:

  val == 0 &amp;&amp; !arch_has_empty_bitmaps

is not enough because of another check in cbm_validate():

  if ((zero_bit - first_bit) &lt; r-&gt;cache.min_cbm_bits)

The default value of r-&gt;cache.min_cbm_bits = 1.

Leading to:

  $ cd /sys/fs/resctrl
  $ mkdir foo
  $ cd foo
  $ echo L3:0=0 &gt; schemata
    -bash: echo: write error: Invalid argument
  $ cat /sys/fs/resctrl/info/last_cmd_status
    Need at least 1 bits in the mask

Initialize the min_cbm_bits to 0 for AMD. Also, remove the default
setting of min_cbm_bits and initialize it separately.

After the fix:

  $ cd /sys/fs/resctrl
  $ mkdir foo
  $ cd foo
  $ echo L3:0=0 &gt; schemata
  $ cat /sys/fs/resctrl/info/last_cmd_status
    ok

Fixes: 316e7f901f5a ("x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps")
Co-developed-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Babu Moger &lt;babu.moger@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
AMD systems support zero CBM (capacity bit mask) for cache allocation.
That is reflected in rdt_init_res_defs_amd() by:

  r-&gt;cache.arch_has_empty_bitmaps = true;

However given the unified code in cbm_validate(), checking for:

  val == 0 &amp;&amp; !arch_has_empty_bitmaps

is not enough because of another check in cbm_validate():

  if ((zero_bit - first_bit) &lt; r-&gt;cache.min_cbm_bits)

The default value of r-&gt;cache.min_cbm_bits = 1.

Leading to:

  $ cd /sys/fs/resctrl
  $ mkdir foo
  $ cd foo
  $ echo L3:0=0 &gt; schemata
    -bash: echo: write error: Invalid argument
  $ cat /sys/fs/resctrl/info/last_cmd_status
    Need at least 1 bits in the mask

Initialize the min_cbm_bits to 0 for AMD. Also, remove the default
setting of min_cbm_bits and initialize it separately.

After the fix:

  $ cd /sys/fs/resctrl
  $ mkdir foo
  $ cd foo
  $ echo L3:0=0 &gt; schemata
  $ cat /sys/fs/resctrl/info/last_cmd_status
    ok

Fixes: 316e7f901f5a ("x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps")
Co-developed-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Stephane Eranian &lt;eranian@google.com&gt;
Signed-off-by: Babu Moger &lt;babu.moger@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Make resctrl_arch_rmid_read() return values in bytes</title>
<updated>2022-09-23T12:25:05+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2022-09-02T15:48:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f7b1843eca6fe295ba0c71fc02a3291954078f2b'/>
<id>f7b1843eca6fe295ba0c71fc02a3291954078f2b</id>
<content type='text'>
resctrl_arch_rmid_read() returns a value in chunks, as read from the
hardware. This needs scaling to bytes by mon_scale, as provided by
the architecture code.

Now that resctrl_arch_rmid_read() performs the overflow and corrections
itself, it may as well return a value in bytes directly. This allows
the accesses to the architecture specific 'hw' structure to be removed.

Move the mon_scale conversion into resctrl_arch_rmid_read().
mbm_bw_count() is updated to calculate bandwidth from bytes.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-22-james.morse@arm.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
resctrl_arch_rmid_read() returns a value in chunks, as read from the
hardware. This needs scaling to bytes by mon_scale, as provided by
the architecture code.

Now that resctrl_arch_rmid_read() performs the overflow and corrections
itself, it may as well return a value in bytes directly. This allows
the accesses to the architecture specific 'hw' structure to be removed.

Move the mon_scale conversion into resctrl_arch_rmid_read().
mbm_bw_count() is updated to calculate bandwidth from bytes.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-22-james.morse@arm.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Add resctrl_rmid_realloc_limit to abstract x86's boot_cpu_data</title>
<updated>2022-09-23T12:24:16+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2022-09-02T15:48:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d80975e264c8f01518890f3d91ab5bada8fa7f5e'/>
<id>d80975e264c8f01518890f3d91ab5bada8fa7f5e</id>
<content type='text'>
resctrl_rmid_realloc_threshold can be set by user-space. The maximum
value is specified by the architecture.

Currently max_threshold_occ_write() reads the maximum value from
boot_cpu_data.x86_cache_size, which is not portable to another
architecture.

Add resctrl_rmid_realloc_limit to describe the maximum size in bytes
that user-space can set the threshold to.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-21-james.morse@arm.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
resctrl_rmid_realloc_threshold can be set by user-space. The maximum
value is specified by the architecture.

Currently max_threshold_occ_write() reads the maximum value from
boot_cpu_data.x86_cache_size, which is not portable to another
architecture.

Add resctrl_rmid_realloc_limit to describe the maximum size in bytes
that user-space can set the threshold to.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-21-james.morse@arm.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Rename and change the units of resctrl_cqm_threshold</title>
<updated>2022-09-23T12:23:41+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2022-09-02T15:48:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae2328b52962531c2d7c6b531022a3eb2d680f17'/>
<id>ae2328b52962531c2d7c6b531022a3eb2d680f17</id>
<content type='text'>
resctrl_cqm_threshold is stored in a hardware specific chunk size,
but exposed to user-space as bytes.

This means the filesystem parts of resctrl need to know how the hardware
counts, to convert the user provided byte value to chunks. The interface
between the architecture's resctrl code and the filesystem ought to
treat everything as bytes.

Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
still returns its value in chunks, so this needs converting to bytes.
As all the users have been touched, rename the variable to
resctrl_rmid_realloc_threshold, which describes what the value is for.

Neither r-&gt;num_rmid nor hw_res-&gt;mon_scale are guaranteed to be a power
of 2, so the existing code introduces a rounding error from resctrl's
theoretical fraction of the cache usage. This behaviour is kept as it
ensures the user visible value matches the value read from hardware
when the rmid will be reallocated.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-20-james.morse@arm.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
resctrl_cqm_threshold is stored in a hardware specific chunk size,
but exposed to user-space as bytes.

This means the filesystem parts of resctrl need to know how the hardware
counts, to convert the user provided byte value to chunks. The interface
between the architecture's resctrl code and the filesystem ought to
treat everything as bytes.

Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
still returns its value in chunks, so this needs converting to bytes.
As all the users have been touched, rename the variable to
resctrl_rmid_realloc_threshold, which describes what the value is for.

Neither r-&gt;num_rmid nor hw_res-&gt;mon_scale are guaranteed to be a power
of 2, so the existing code introduces a rounding error from resctrl's
theoretical fraction of the cache usage. This behaviour is kept as it
ensures the user visible value matches the value read from hardware
when the rmid will be reallocated.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Jamie Iles &lt;quic_jiles@quicinc.com&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Xin Hao &lt;xhao@linux.alibaba.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Cristian Marussi &lt;cristian.marussi@arm.com&gt;
Link: https://lore.kernel.org/r/20220902154829.30399-20-james.morse@arm.com
</pre>
</div>
</content>
</entry>
</feed>
