<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/cpu, branch v5.11</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86/sgx: Maintain encl-&gt;refcount for each encl-&gt;mm_list entry</title>
<updated>2021-02-08T18:11:30+00:00</updated>
<author>
<name>Jarkko Sakkinen</name>
<email>jarkko@kernel.org</email>
</author>
<published>2021-02-07T22:14:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2ade0d60939bcd54197c133b03b460fe62a4ec47'/>
<id>2ade0d60939bcd54197c133b03b460fe62a4ec47</id>
<content type='text'>
This has been shown in tests:

[  +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100

This is essentially a use-after free, although SRCU notices it as
an SRCU cleanup in an invalid context.

== Background ==

SGX has a data structure (struct sgx_encl_mm) which keeps per-mm SGX
metadata.  This is separate from struct sgx_encl because, in theory,
an enclave can be mapped from more than one mm.  sgx_encl_mm includes
a pointer back to the sgx_encl.

This means that sgx_encl must have a longer lifetime than all of the
sgx_encl_mm's that point to it.  That's usually the case: sgx_encl_mm
is freed only after the mmu_notifier is unregistered in sgx_release().

However, there's a race.  If the process is exiting,
sgx_mmu_notifier_release() can be called in parallel with sgx_release()
instead of being called *by* it.  The mmu_notifier path keeps encl_mm
alive past when sgx_encl can be freed.  This inverts the lifetime rules
and means that sgx_mmu_notifier_release() can access a freed sgx_encl.

== Fix ==

Increase encl-&gt;refcount when encl_mm-&gt;encl is established. Release
this reference when encl_mm is freed. This ensures that encl outlives
encl_mm.

 [ bp: Massage commit message. ]

Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Jarkko Sakkinen &lt;jarkko@kernel.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/20210207221401.29933-1-jarkko@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This has been shown in tests:

[  +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100

This is essentially a use-after free, although SRCU notices it as
an SRCU cleanup in an invalid context.

== Background ==

SGX has a data structure (struct sgx_encl_mm) which keeps per-mm SGX
metadata.  This is separate from struct sgx_encl because, in theory,
an enclave can be mapped from more than one mm.  sgx_encl_mm includes
a pointer back to the sgx_encl.

This means that sgx_encl must have a longer lifetime than all of the
sgx_encl_mm's that point to it.  That's usually the case: sgx_encl_mm
is freed only after the mmu_notifier is unregistered in sgx_release().

However, there's a race.  If the process is exiting,
sgx_mmu_notifier_release() can be called in parallel with sgx_release()
instead of being called *by* it.  The mmu_notifier path keeps encl_mm
alive past when sgx_encl can be freed.  This inverts the lifetime rules
and means that sgx_mmu_notifier_release() can access a freed sgx_encl.

== Fix ==

Increase encl-&gt;refcount when encl_mm-&gt;encl is established. Release
this reference when encl_mm is freed. This ensures that encl outlives
encl_mm.

 [ bp: Massage commit message. ]

Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Haitao Huang &lt;haitao.huang@linux.intel.com&gt;
Signed-off-by: Jarkko Sakkinen &lt;jarkko@kernel.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/20210207221401.29933-1-jarkko@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/split_lock: Enable the split lock feature on another Alder Lake CPU</title>
<updated>2021-02-01T20:34:51+00:00</updated>
<author>
<name>Fenghua Yu</name>
<email>fenghua.yu@intel.com</email>
</author>
<published>2021-02-01T19:00:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8acf417805a5f5c69e9ff66f14cab022c2755161'/>
<id>8acf417805a5f5c69e9ff66f14cab022c2755161</id>
<content type='text'>
Add Alder Lake mobile processor to CPU list to enumerate and enable the
split lock feature.

Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20210201190007.4031869-1-fenghua.yu@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add Alder Lake mobile processor to CPU list to enumerate and enable the
split lock feature.

Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lkml.kernel.org/r/20210201190007.4031869-1-fenghua.yu@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-01-24T17:46:05+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-01-24T17:46:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=17b6c49da37f5d57d76bf352d32b0ac498e7c133'/>
<id>17b6c49da37f5d57d76bf352d32b0ac498e7c133</id>
<content type='text'>
Pull x86 fixes from Borislav Petkov:

 - Add a new Intel model number for Alder Lake

 - Differentiate which aspects of the FPU state get saved/restored when
   the FPU is used in-kernel and fix a boot crash on K7 due to early
   MXCSR access before CR4.OSFXSR is even set.

 - A couple of noinstr annotation fixes

 - Correct die ID setting on AMD for users of topology information which
   need the correct die ID

 - A SEV-ES fix to handle string port IO to/from kernel memory properly

* tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add another Alder Lake CPU to the Intel family
  x86/mmx: Use KFPU_387 for MMX string operations
  x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
  x86/topology: Make __max_die_per_package available unconditionally
  x86: __always_inline __{rd,wr}msr()
  x86/mce: Remove explicit/superfluous tracing
  locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP
  locking/lockdep: Cure noinstr fail
  x86/sev: Fix nonistr violation
  x86/entry: Fix noinstr fail
  x86/cpu/amd: Set __max_die_per_package on AMD
  x86/sev-es: Handle string port IO to kernel memory properly
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 fixes from Borislav Petkov:

 - Add a new Intel model number for Alder Lake

 - Differentiate which aspects of the FPU state get saved/restored when
   the FPU is used in-kernel and fix a boot crash on K7 due to early
   MXCSR access before CR4.OSFXSR is even set.

 - A couple of noinstr annotation fixes

 - Correct die ID setting on AMD for users of topology information which
   need the correct die ID

 - A SEV-ES fix to handle string port IO to/from kernel memory properly

* tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add another Alder Lake CPU to the Intel family
  x86/mmx: Use KFPU_387 for MMX string operations
  x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
  x86/topology: Make __max_die_per_package available unconditionally
  x86: __always_inline __{rd,wr}msr()
  x86/mce: Remove explicit/superfluous tracing
  locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP
  locking/lockdep: Cure noinstr fail
  x86/sev: Fix nonistr violation
  x86/entry: Fix noinstr fail
  x86/cpu/amd: Set __max_die_per_package on AMD
  x86/sev-es: Handle string port IO to kernel memory properly
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/topology: Make __max_die_per_package available unconditionally</title>
<updated>2021-01-14T11:18:36+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2021-01-14T09:36:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1eb8f690bcb565a6600f8b6dcc78f7b239ceba17'/>
<id>1eb8f690bcb565a6600f8b6dcc78f7b239ceba17</id>
<content type='text'>
Move it outside of CONFIG_SMP in order to avoid ifdeffery at the usage
sites.

Fixes: 76e2fc63ca40 ("x86/cpu/amd: Set __max_die_per_package on AMD")
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210114111814.5346-1-bp@alien8.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move it outside of CONFIG_SMP in order to avoid ifdeffery at the usage
sites.

Fixes: 76e2fc63ca40 ("x86/cpu/amd: Set __max_die_per_package on AMD")
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20210114111814.5346-1-bp@alien8.de
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mce: Remove explicit/superfluous tracing</title>
<updated>2021-01-12T20:10:59+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-01-06T14:36:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=737495361d4469477ffe45d51e6fc56f44f3cc6a'/>
<id>737495361d4469477ffe45d51e6fc56f44f3cc6a</id>
<content type='text'>
There's some explicit tracing left in exc_machine_check_kernel(),
remove it, as it's already implied by irqentry_nmi_enter().

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/20210106144017.719310466@infradead.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's some explicit tracing left in exc_machine_check_kernel(),
remove it, as it's already implied by irqentry_nmi_enter().

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lore.kernel.org/r/20210106144017.719310466@infradead.org

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/cpu/amd: Set __max_die_per_package on AMD</title>
<updated>2021-01-12T11:21:01+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>Yazen.Ghannam@amd.com</email>
</author>
<published>2021-01-11T10:04:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=76e2fc63ca40977af893b724b00cc2f8e9ce47a4'/>
<id>76e2fc63ca40977af893b724b00cc2f8e9ce47a4</id>
<content type='text'>
Set the maximum DIE per package variable on AMD using the
NodesPerProcessor topology value. This will be used by RAPL, among
others, to determine the maximum number of DIEs on the system in order
to do per-DIE manipulations.

 [ bp: Productize into a proper patch. ]

Fixes: 028c221ed190 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id")
Reported-by: Johnathan Smithinovic &lt;johnathan.smithinovic@gmx.at&gt;
Reported-by: Rafael Kitover &lt;rkitover@gmail.com&gt;
Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Johnathan Smithinovic &lt;johnathan.smithinovic@gmx.at&gt;
Tested-by: Rafael Kitover &lt;rkitover@gmail.com&gt;
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
Link: https://lkml.kernel.org/r/20210106112106.GE5729@zn.tnic
Link: https://lkml.kernel.org/r/20210111101455.1194-1-bp@alien8.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Set the maximum DIE per package variable on AMD using the
NodesPerProcessor topology value. This will be used by RAPL, among
others, to determine the maximum number of DIEs on the system in order
to do per-DIE manipulations.

 [ bp: Productize into a proper patch. ]

Fixes: 028c221ed190 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id")
Reported-by: Johnathan Smithinovic &lt;johnathan.smithinovic@gmx.at&gt;
Reported-by: Rafael Kitover &lt;rkitover@gmail.com&gt;
Signed-off-by: Yazen Ghannam &lt;Yazen.Ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Tested-by: Johnathan Smithinovic &lt;johnathan.smithinovic@gmx.at&gt;
Tested-by: Rafael Kitover &lt;rkitover@gmail.com&gt;
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
Link: https://lkml.kernel.org/r/20210106112106.GE5729@zn.tnic
Link: https://lkml.kernel.org/r/20210111101455.1194-1-bp@alien8.de
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'hyperv-fixes-signed-20210111' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux</title>
<updated>2021-01-11T19:28:58+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-01-11T19:28:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f1ee3e150bd9da2dd60a210926c86cffd4a336ea'/>
<id>f1ee3e150bd9da2dd60a210926c86cffd4a336ea</id>
<content type='text'>
Pull hyperv fixes from Wei Liu:

  - fix kexec panic/hang (Dexuan Cui)

  - fix occasional crashes when flushing TLB (Wei Liu)

* tag 'hyperv-fixes-signed-20210111' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: check cpu mask after interrupt has been disabled
  x86/hyperv: Fix kexec panic/hang issues
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull hyperv fixes from Wei Liu:

  - fix kexec panic/hang (Dexuan Cui)

  - fix occasional crashes when flushing TLB (Wei Liu)

* tag 'hyperv-fixes-signed-20210111' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  x86/hyperv: check cpu mask after interrupt has been disabled
  x86/hyperv: Fix kexec panic/hang issues
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Don't move a task to the same resource group</title>
<updated>2021-01-08T08:08:03+00:00</updated>
<author>
<name>Fenghua Yu</name>
<email>fenghua.yu@intel.com</email>
</author>
<published>2020-12-17T22:31:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a0195f314a25582b38993bf30db11c300f4f4611'/>
<id>a0195f314a25582b38993bf30db11c300f4f4611</id>
<content type='text'>
Shakeel Butt reported in [1] that a user can request a task to be moved
to a resource group even if the task is already in the group. It just
wastes time to do the move operation which could be costly to send IPI
to a different CPU.

Add a sanity check to ensure that the move operation only happens when
the task is not already in the resource group.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Shakeel Butt reported in [1] that a user can request a task to be moved
to a resource group even if the task is already in the group. It just
wastes time to do the move operation which could be costly to send IPI
to a different CPU.

Add a sanity check to ensure that the move operation only happens when
the task is not already in the resource group.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/962ede65d8e95be793cb61102cca37f7bb018e66.1608243147.git.reinette.chatre@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR</title>
<updated>2021-01-08T08:03:36+00:00</updated>
<author>
<name>Fenghua Yu</name>
<email>fenghua.yu@intel.com</email>
</author>
<published>2020-12-17T22:31:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ae28d1aae48a1258bd09a6f707ebb4231d79a761'/>
<id>ae28d1aae48a1258bd09a6f707ebb4231d79a761</id>
<content type='text'>
Currently, when moving a task to a resource group the PQR_ASSOC MSR is
updated with the new closid and rmid in an added task callback. If the
task is running, the work is run as soon as possible. If the task is not
running, the work is executed later in the kernel exit path when the
kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
is running is the right thing to do. Queueing work for a task that is
not running is unnecessary (the PQR_ASSOC MSR is already updated when
the task is scheduled in) and causing system resource waste with the way
in which it is implemented: Work to update the PQR_ASSOC register is
queued every time the user writes a task id to the "tasks" file, even if
the task already belongs to the resource group.

This could result in multiple pending work items associated with a
single task even if they are all identical and even though only a single
update with most recent values is needed. Specifically, even if a task
is moved between different resource groups while it is sleeping then it
is only the last move that is relevant but yet a work item is queued
during each move.

This unnecessary queueing of work items could result in significant
system resource waste, especially on tasks sleeping for a long time.
For example, as demonstrated by Shakeel Butt in [1] writing the same
task id to the "tasks" file can quickly consume significant memory. The
same problem (wasted system resources) occurs when moving a task between
different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue
with the way in which the queueing of work is done in that the task_struct
update is currently done after the work is queued, resulting in a race with
the register update possibly done before the data needed by the update is
available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way
right after the new closid and rmid are ready during the task movement,
only if the task is running. If a moved task is not running nothing
is done since the PQR_ASSOC MSR will be updated next time the task is
scheduled. This is the same way used to update the register when tasks
are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com

 [ bp: Massage commit message and drop the two update_task_closid_rmid()
   variants. ]

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reported-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, when moving a task to a resource group the PQR_ASSOC MSR is
updated with the new closid and rmid in an added task callback. If the
task is running, the work is run as soon as possible. If the task is not
running, the work is executed later in the kernel exit path when the
kernel returns to the task again.

Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task
is running is the right thing to do. Queueing work for a task that is
not running is unnecessary (the PQR_ASSOC MSR is already updated when
the task is scheduled in) and causing system resource waste with the way
in which it is implemented: Work to update the PQR_ASSOC register is
queued every time the user writes a task id to the "tasks" file, even if
the task already belongs to the resource group.

This could result in multiple pending work items associated with a
single task even if they are all identical and even though only a single
update with most recent values is needed. Specifically, even if a task
is moved between different resource groups while it is sleeping then it
is only the last move that is relevant but yet a work item is queued
during each move.

This unnecessary queueing of work items could result in significant
system resource waste, especially on tasks sleeping for a long time.
For example, as demonstrated by Shakeel Butt in [1] writing the same
task id to the "tasks" file can quickly consume significant memory. The
same problem (wasted system resources) occurs when moving a task between
different resource groups.

As pointed out by Valentin Schneider in [2] there is an additional issue
with the way in which the queueing of work is done in that the task_struct
update is currently done after the work is queued, resulting in a race with
the register update possibly done before the data needed by the update is
available.

To solve these issues, update the PQR_ASSOC MSR in a synchronous way
right after the new closid and rmid are ready during the task movement,
only if the task is running. If a moved task is not running nothing
is done since the PQR_ASSOC MSR will be updated next time the task is
scheduled. This is the same way used to update the register when tasks
are moved as part of resource group removal.

[1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/
[2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com

 [ bp: Massage commit message and drop the two update_task_closid_rmid()
   variants. ]

Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files")
Reported-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reported-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Signed-off-by: Reinette Chatre &lt;reinette.chatre@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: James Morse &lt;james.morse@arm.com&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/17aa2fb38fc12ce7bb710106b3e7c7b45acb9e94.1608243147.git.reinette.chatre@intel.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mtrr: Correct the range check before performing MTRR type lookups</title>
<updated>2021-01-06T12:01:13+00:00</updated>
<author>
<name>Ying-Tsun Huang</name>
<email>ying-tsun.huang@amd.com</email>
</author>
<published>2020-12-15T07:07:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=cb7f4a8b1fb426a175d1708f05581939c61329d4'/>
<id>cb7f4a8b1fb426a175d1708f05581939c61329d4</id>
<content type='text'>
In mtrr_type_lookup(), if the input memory address region is not in the
MTRR, over 4GB, and not over the top of memory, a write-back attribute
is returned. These condition checks are for ensuring the input memory
address region is actually mapped to the physical memory.

However, if the end address is just aligned with the top of memory,
the condition check treats the address is over the top of memory, and
write-back attribute is not returned.

And this hits in a real use case with NVDIMM: the nd_pmem module tries
to map NVDIMMs as cacheable memories when NVDIMMs are connected. If a
NVDIMM is the last of the DIMMs, the performance of this NVDIMM becomes
very low since it is aligned with the top of memory and its memory type
is uncached-minus.

Move the input end address change to inclusive up into
mtrr_type_lookup(), before checking for the top of memory in either
mtrr_type_lookup_{variable,fixed}() helpers.

 [ bp: Massage commit message. ]

Fixes: 0cc705f56e40 ("x86/mm/mtrr: Clean up mtrr_type_lookup()")
Signed-off-by: Ying-Tsun Huang &lt;ying-tsun.huang@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201215070721.4349-1-ying-tsun.huang@amd.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In mtrr_type_lookup(), if the input memory address region is not in the
MTRR, over 4GB, and not over the top of memory, a write-back attribute
is returned. These condition checks are for ensuring the input memory
address region is actually mapped to the physical memory.

However, if the end address is just aligned with the top of memory,
the condition check treats the address is over the top of memory, and
write-back attribute is not returned.

And this hits in a real use case with NVDIMM: the nd_pmem module tries
to map NVDIMMs as cacheable memories when NVDIMMs are connected. If a
NVDIMM is the last of the DIMMs, the performance of this NVDIMM becomes
very low since it is aligned with the top of memory and its memory type
is uncached-minus.

Move the input end address change to inclusive up into
mtrr_type_lookup(), before checking for the top of memory in either
mtrr_type_lookup_{variable,fixed}() helpers.

 [ bp: Massage commit message. ]

Fixes: 0cc705f56e40 ("x86/mm/mtrr: Clean up mtrr_type_lookup()")
Signed-off-by: Ying-Tsun Huang &lt;ying-tsun.huang@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201215070721.4349-1-ying-tsun.huang@amd.com
</pre>
</div>
</content>
</entry>
</feed>
