<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/cpu/resctrl/internal.h, branch v6.10</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86/resctrl: Simplify call convention for MSR update functions</title>
<updated>2024-04-24T11:47:00+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2024-03-08T21:38:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bd4955d4bc2182ccb660c9c30a4dd7f36feaf943'/>
<id>bd4955d4bc2182ccb660c9c30a4dd7f36feaf943</id>
<content type='text'>
The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(),
and mba_wrmsr_amd() all take three arguments:

  (struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)

struct msr_param contains pointers to both struct rdt_resource and struct
rdt_domain, thus only struct msr_param is necessary.

Pass struct msr_param as a single parameter. Clean up formatting and
fix some fir tree declaration ordering.

No functional change.

Suggested-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Maciej Wieczor-Retman &lt;maciej.wieczor-retman@intel.com&gt;
Link: https://lore.kernel.org/r/20240308213846.77075-3-tony.luck@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The per-resource MSR update functions cat_wrmsr(), mba_wrmsr_intel(),
and mba_wrmsr_amd() all take three arguments:

  (struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)

struct msr_param contains pointers to both struct rdt_resource and struct
rdt_domain, thus only struct msr_param is necessary.

Pass struct msr_param as a single parameter. Clean up formatting and
fix some fir tree declaration ordering.

No functional change.

Suggested-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Maciej Wieczor-Retman &lt;maciej.wieczor-retman@intel.com&gt;
Link: https://lore.kernel.org/r/20240308213846.77075-3-tony.luck@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Pass domain to target CPU</title>
<updated>2024-04-24T11:41:41+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2024-03-08T21:38:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e3ca96e479c91d6ee657d3caa5092a6a3a620f9f'/>
<id>e3ca96e479c91d6ee657d3caa5092a6a3a620f9f</id>
<content type='text'>
reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask()
to call rdt_ctrl_update() on potentially one CPU from each domain.

But this means rdt_ctrl_update() needs to figure out which domain to
apply changes to. Doing so requires a search of all domains in a resource,
which can only be done safely if cpus_lock is held. Both callers do hold
this lock, but there isn't a way for a function called on another CPU
via IPI to verify this.

Commit

  c0d848fcb09d ("x86/resctrl: Remove lockdep annotation that triggers
  false positive")

removed the incorrect assertions.

Add the target domain to the msr_param structure and call
rdt_ctrl_update() for each domain separately using
smp_call_function_single(). This means that rdt_ctrl_update() doesn't
need to search for the domain and get_domain_from_cpu() can safely
assert that the cpus_lock is held since the remaining callers do not use
IPI.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Maciej Wieczor-Retman &lt;maciej.wieczor-retman@intel.com&gt;
Link: https://lore.kernel.org/r/20240308213846.77075-2-tony.luck@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
reset_all_ctrls() and resctrl_arch_update_domains() use on_each_cpu_mask()
to call rdt_ctrl_update() on potentially one CPU from each domain.

But this means rdt_ctrl_update() needs to figure out which domain to
apply changes to. Doing so requires a search of all domains in a resource,
which can only be done safely if cpus_lock is held. Both callers do hold
this lock, but there isn't a way for a function called on another CPU
via IPI to verify this.

Commit

  c0d848fcb09d ("x86/resctrl: Remove lockdep annotation that triggers
  false positive")

removed the incorrect assertions.

Add the target domain to the msr_param structure and call
rdt_ctrl_update() for each domain separately using
smp_call_function_single(). This means that rdt_ctrl_update() doesn't
need to search for the domain and get_domain_from_cpu() can safely
assert that the cpus_lock is held since the remaining callers do not use
IPI.

Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Maciej Wieczor-Retman &lt;maciej.wieczor-retman@intel.com&gt;
Link: https://lore.kernel.org/r/20240308213846.77075-2-tony.luck@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Fix uninitialized memory read when last CPU of domain goes offline</title>
<updated>2024-04-03T07:30:01+00:00</updated>
<author>
<name>Reinette Chatre</name>
<email>reinette.chatre@intel.com</email>
</author>
<published>2024-04-01T18:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c3eeb1ffc6a88af9b002e22be0f70851759be03a'/>
<id>c3eeb1ffc6a88af9b002e22be0f70851759be03a</id>
<content type='text'>
Tony encountered this OOPS when the last CPU of a domain goes
offline while running a kernel built with CONFIG_NO_HZ_FULL:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0
    Oops: 0000 [#1] PREEMPT SMP NOPTI
    ...
    RIP: 0010:__find_nth_andnot_bit+0x66/0x110
    ...
    Call Trace:
     &lt;TASK&gt;
     ? __die()
     ? page_fault_oops()
     ? exc_page_fault()
     ? asm_exc_page_fault()
     cpumask_any_housekeeping()
     mbm_setup_overflow_handler()
     resctrl_offline_cpu()
     resctrl_arch_offline_cpu()
     cpuhp_invoke_callback()
     cpuhp_thread_fun()
     smpboot_thread_fn()
     kthread()
     ret_from_fork()
     ret_from_fork_asm()
     &lt;/TASK&gt;

The NULL pointer dereference is encountered while searching for another
online CPU in the domain (of which there are none) that can be used to
run the MBM overflow handler.

Because the kernel is configured with CONFIG_NO_HZ_FULL the search for
another CPU (in its effort to prefer those CPUs that aren't marked
nohz_full) consults the mask representing the nohz_full CPUs,
tick_nohz_full_mask. On a kernel with CONFIG_CPUMASK_OFFSTACK=y
tick_nohz_full_mask is not allocated unless the kernel is booted with
the "nohz_full=" parameter and because of that any access to
tick_nohz_full_mask needs to be guarded with tick_nohz_full_enabled().

Replace the IS_ENABLED(CONFIG_NO_HZ_FULL) with tick_nohz_full_enabled().
The latter ensures tick_nohz_full_mask can be accessed safely and can be
used whether kernel is built with CONFIG_NO_HZ_FULL enabled or not.

[ Use Ingo's suggestion that combines the two NO_HZ checks into one. ]

Fixes: a4846aaf3945 ("x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow")
Reported-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Link: https://lore.kernel.org/r/ff8dfc8d3dcb04b236d523d1e0de13d2ef585223.1711993956.git.reinette.chatre@intel.com
Closes: https://lore.kernel.org/lkml/ZgIFT5gZgIQ9A9G7@agluck-desk3/
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Tony encountered this OOPS when the last CPU of a domain goes
offline while running a kernel built with CONFIG_NO_HZ_FULL:

    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0
    Oops: 0000 [#1] PREEMPT SMP NOPTI
    ...
    RIP: 0010:__find_nth_andnot_bit+0x66/0x110
    ...
    Call Trace:
     &lt;TASK&gt;
     ? __die()
     ? page_fault_oops()
     ? exc_page_fault()
     ? asm_exc_page_fault()
     cpumask_any_housekeeping()
     mbm_setup_overflow_handler()
     resctrl_offline_cpu()
     resctrl_arch_offline_cpu()
     cpuhp_invoke_callback()
     cpuhp_thread_fun()
     smpboot_thread_fn()
     kthread()
     ret_from_fork()
     ret_from_fork_asm()
     &lt;/TASK&gt;

The NULL pointer dereference is encountered while searching for another
online CPU in the domain (of which there are none) that can be used to
run the MBM overflow handler.

Because the kernel is configured with CONFIG_NO_HZ_FULL the search for
another CPU (in its effort to prefer those CPUs that aren't marked
nohz_full) consults the mask representing the nohz_full CPUs,
tick_nohz_full_mask. On a kernel with CONFIG_CPUMASK_OFFSTACK=y
tick_nohz_full_mask is not allocated unless the kernel is booted with
the "nohz_full=" parameter and because of that any access to
tick_nohz_full_mask needs to be guarded with tick_nohz_full_enabled().

Replace the IS_ENABLED(CONFIG_NO_HZ_FULL) with tick_nohz_full_enabled().
The latter ensures tick_nohz_full_mask can be accessed safely and can be
used whether kernel is built with CONFIG_NO_HZ_FULL enabled or not.

[ Use Ingo's suggestion that combines the two NO_HZ checks into one. ]

Fixes: a4846aaf3945 ("x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow")
Reported-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Link: https://lore.kernel.org/r/ff8dfc8d3dcb04b236d523d1e0de13d2ef585223.1711993956.git.reinette.chatre@intel.com
Closes: https://lore.kernel.org/lkml/ZgIFT5gZgIQ9A9G7@agluck-desk3/
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU</title>
<updated>2024-02-16T18:18:33+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=978fcca954cb52249babbc14e53de53c88dd6433'/>
<id>978fcca954cb52249babbc14e53de53c88dd6433</id>
<content type='text'>
When a CPU is taken offline resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will be
called while the CPU that is going offline is still present in the CPU mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need
excluding.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-22-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a CPU is taken offline resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will be
called while the CPU that is going offline is still present in the CPU mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need
excluding.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-22-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Add helpers for system wide mon/alloc capable</title>
<updated>2024-02-16T18:18:33+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=30017b60706c2ba72a0a4da7d5ef8f5fa95a2f01'/>
<id>30017b60706c2ba72a0a4da7d5ef8f5fa95a2f01</id>
<content type='text'>
resctrl reads rdt_alloc_capable or rdt_mon_capable to determine whether any of
the resources support the corresponding features.  resctrl also uses the
static keys that affect the architecture's context-switch code to determine the
same thing.

This forces another architecture to have the same static keys.

As the static key is enabled based on the capable flag, and none of the
filesystem uses of these are in the scheduler path, move the capable flags
behind helpers, and use these in the filesystem code instead of the static key.

After this change, only the architecture code manages and uses the static keys
to ensure __resctrl_sched_in() does not need runtime checks.

This avoids multiple architectures having to define the same static keys.

Cases where the static key implicitly tested if the resctrl filesystem was
mounted all have an explicit check now.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-20-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
resctrl reads rdt_alloc_capable or rdt_mon_capable to determine whether any of
the resources support the corresponding features.  resctrl also uses the
static keys that affect the architecture's context-switch code to determine the
same thing.

This forces another architecture to have the same static keys.

As the static key is enabled based on the capable flag, and none of the
filesystem uses of these are in the scheduler path, move the capable flags
behind helpers, and use these in the filesystem code instead of the static key.

After this change, only the architecture code manages and uses the static keys
to ensure __resctrl_sched_in() does not need runtime checks.

This avoids multiple architectures having to define the same static keys.

Cases where the static key implicitly tested if the resctrl filesystem was
mounted all have an explicit check now.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-20-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Move alloc/mon static keys into helpers</title>
<updated>2024-02-16T18:18:32+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5db6a4a75c95f6967d57906ba7b82756d1985d63'/>
<id>5db6a4a75c95f6967d57906ba7b82756d1985d63</id>
<content type='text'>
resctrl enables three static keys depending on the features it has enabled.
Another architecture's context switch code may look different, any static keys
that control it should be buried behind helpers.

Move the alloc/mon logic into arch-specific helpers as a preparatory step for
making the rdt_enable_key's status something the arch code decides.

This means other architectures don't have to mirror the static keys.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-18-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
resctrl enables three static keys depending on the features it has enabled.
Another architecture's context switch code may look different, any static keys
that control it should be buried behind helpers.

Move the alloc/mon logic into arch-specific helpers as a preparatory step for
making the rdt_enable_key's status something the arch code decides.

This means other architectures don't have to mirror the static keys.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-18-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Make resctrl_mounted checks explicit</title>
<updated>2024-02-16T18:18:32+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=13e5769debf09588543db83836c524148873929f'/>
<id>13e5769debf09588543db83836c524148873929f</id>
<content type='text'>
The rdt_enable_key is switched when resctrl is mounted, and used to prevent
a second mount of the filesystem. It also enables the architecture's context
switch code.

This requires another architecture to have the same set of static keys, as
resctrl depends on them too. The existing users of these static keys are
implicitly also checking if the filesystem is mounted.

Make the resctrl_mounted checks explicit: resctrl can keep track of whether it
has been mounted once. This doesn't need to be combined with whether the arch
code is context switching the CLOSID.

rdt_mon_enable_key is never used just to test that resctrl is mounted, but does
also have this implication. Add a resctrl_mounted to all uses of
rdt_mon_enable_key.

This will allow the static key changing to be moved behind resctrl_arch_ calls.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-17-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The rdt_enable_key is switched when resctrl is mounted, and used to prevent
a second mount of the filesystem. It also enables the architecture's context
switch code.

This requires another architecture to have the same set of static keys, as
resctrl depends on them too. The existing users of these static keys are
implicitly also checking if the filesystem is mounted.

Make the resctrl_mounted checks explicit: resctrl can keep track of whether it
has been mounted once. This doesn't need to be combined with whether the arch
code is context switching the CLOSID.

rdt_mon_enable_key is never used just to test that resctrl is mounted, but does
also have this implication. Add a resctrl_mounted to all uses of
rdt_mon_enable_key.

This will allow the static key changing to be moved behind resctrl_arch_ calls.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-17-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()</title>
<updated>2024-02-16T18:18:32+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e557999f80a5ee4ec812f594ab42bb76c3ec4eb2'/>
<id>e557999f80a5ee4ec812f594ab42bb76c3ec4eb2</id>
<content type='text'>
Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter value. Allocating a
contended resource may involve sleeping.

__check_limbo() and mon_event_count() each make multiple calls to
resctrl_arch_rmid_read(), to avoid extra work on contended systems,
the allocation should be valid for multiple invocations of
resctrl_arch_rmid_read().

The memory or hardware allocated is not specific to a domain.

Add arch hooks for this allocation, which need calling before
resctrl_arch_rmid_read(). The allocated monitor is passed to
resctrl_arch_rmid_read(), then freed again afterwards. The helper
can be called on any CPU, and can sleep.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-16-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Depending on the number of monitors available, Arm's MPAM may need to
allocate a monitor prior to reading the counter value. Allocating a
contended resource may involve sleeping.

__check_limbo() and mon_event_count() each make multiple calls to
resctrl_arch_rmid_read(), to avoid extra work on contended systems,
the allocation should be valid for multiple invocations of
resctrl_arch_rmid_read().

The memory or hardware allocated is not specific to a domain.

Add arch hooks for this allocation, which need calling before
resctrl_arch_rmid_read(). The allocated monitor is passed to
resctrl_arch_rmid_read(), then freed again afterwards. The helper
can be called on any CPU, and can sleep.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-16-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow</title>
<updated>2024-02-16T18:18:32+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a4846aaf39455fe69fce3522b385319383666eef'/>
<id>a4846aaf39455fe69fce3522b385319383666eef</id>
<content type='text'>
The limbo and overflow code picks a CPU to use from the domain's list of online
CPUs. Work is then scheduled on these CPUs to maintain the limbo list and any
counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will either
penalise the work that CPU was dedicated to, or delay the processing of limbo
list or counters that may overflow. Perhaps indefinitely. Delaying the overflow
handling will skew the bandwidth values calculated by mba_sc, which expects to
be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any() that prefers
housekeeping CPUs. This helper will still return a nohz_full CPU if that is the
only option. The CPU to use is re-evaluated each time the limbo/overflow work
runs. This ensures the work will move off a nohz_full CPU once a housekeeping
CPU is available.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-13-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The limbo and overflow code picks a CPU to use from the domain's list of online
CPUs. Work is then scheduled on these CPUs to maintain the limbo list and any
counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will either
penalise the work that CPU was dedicated to, or delay the processing of limbo
list or counters that may overflow. Perhaps indefinitely. Delaying the overflow
handling will skew the bandwidth values calculated by mba_sc, which expects to
be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any() that prefers
housekeeping CPUs. This helper will still return a nohz_full CPU if that is the
only option. The CPU to use is re-evaluated each time the limbo/overflow work
runs. This ensures the work will move off a nohz_full CPU once a housekeeping
CPU is available.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-13-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid</title>
<updated>2024-02-16T18:18:32+00:00</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2024-02-13T18:44:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6eac36bb9eb0349c983313c71692c19d50b56878'/>
<id>6eac36bb9eb0349c983313c71692c19d50b56878</id>
<content type='text'>
MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be used
for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still be
dirty, and held in limbo.

Instead of allocating the first free CLOSID, on architectures where
CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is enabled, search
closid_num_dirty_rmid[] to find the cleanest CLOSID.

The CLOSID found is returned to closid_alloc() for the free list
to be updated.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-11-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be used
for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still be
dirty, and held in limbo.

Instead of allocating the first free CLOSID, on architectures where
CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is enabled, search
closid_num_dirty_rmid[] to find the cleanest CLOSID.

The CLOSID found is returned to closid_alloc() for the free list
to be updated.

Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Reviewed-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Reviewed-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Shaopeng Tan &lt;tan.shaopeng@fujitsu.com&gt;
Tested-by: Peter Newman &lt;peternewman@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Tested-by: Carl Worth &lt;carl@os.amperecomputing.com&gt; # arm64
Link: https://lore.kernel.org/r/20240213184438.16675-11-james.morse@arm.com
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
