<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/process_32.c, branch v6.8</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86/resctl: fix scheduler confusion with 'current'</title>
<updated>2023-03-08T19:48:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-03-07T21:06:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7fef099702527c3b2c5234a2ea6a24411485a13a'/>
<id>7fef099702527c3b2c5234a2ea6a24411485a13a</id>
<content type='text'>
The implementation of 'current' on x86 is very intentionally special: it
is a very common thing to look up, and it uses 'this_cpu_read_stable()'
to get the current thread pointer efficiently from per-cpu storage.

And the keyword in there is 'stable': the current thread pointer never
changes as far as a single thread is concerned.  Even if when a thread
is preempted, or moved to another CPU, or even across an explicit call
'schedule()' that thread will still have the same value for 'current'.

It is, after all, the kernel base pointer to thread-local storage.
That's why it's stable to begin with, but it's also why it's important
enough that we have that special 'this_cpu_read_stable()' access for it.

So this is all done very intentionally to allow the compiler to treat
'current' as a value that never visibly changes, so that the compiler
can do CSE and combine multiple different 'current' accesses into one.

However, there is obviously one very special situation when the
currently running thread does actually change: inside the scheduler
itself.

So the scheduler code paths are special, and do not have a 'current'
thread at all.  Instead there are _two_ threads: the previous and the
next thread - typically called 'prev' and 'next' (or prev_p/next_p)
internally.

So this is all actually quite straightforward and simple, and not all
that complicated.

Except for when you then have special code that is run in scheduler
context, that code then has to be aware that 'current' isn't really a
valid thing.  Did you mean 'prev'? Did you mean 'next'?

In fact, even if then look at the code, and you use 'current' after the
new value has been assigned to the percpu variable, we have explicitly
told the compiler that 'current' is magical and always stable.  So the
compiler is quite free to use an older (or newer) value of 'current',
and the actual assignment to the percpu storage is not relevant even if
it might look that way.

Which is exactly what happened in the resctl code, that blithely used
'current' in '__resctrl_sched_in()' when it really wanted the new
process state (as implied by the name: we're scheduling 'into' that new
resctl state).  And clang would end up just using the old thread pointer
value at least in some configurations.

This could have happened with gcc too, and purely depends on random
compiler details.  Clang just seems to have been more aggressive about
moving the read of the per-cpu current_task pointer around.

The fix is trivial: just make the resctl code adhere to the scheduler
rules of using the prev/next thread pointer explicitly, instead of using
'current' in a situation where it just wasn't valid.

That same code is then also used outside of the scheduler context (when
a thread resctl state is explicitly changed), and then we will just pass
in 'current' as that pointer, of course.  There is no ambiguity in that
case.

The fix may be trivial, but noticing and figuring out what went wrong
was not.  The credit for that goes to Stephane Eranian.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Stephane Eranian &lt;eranian@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The implementation of 'current' on x86 is very intentionally special: it
is a very common thing to look up, and it uses 'this_cpu_read_stable()'
to get the current thread pointer efficiently from per-cpu storage.

And the keyword in there is 'stable': the current thread pointer never
changes as far as a single thread is concerned.  Even if when a thread
is preempted, or moved to another CPU, or even across an explicit call
'schedule()' that thread will still have the same value for 'current'.

It is, after all, the kernel base pointer to thread-local storage.
That's why it's stable to begin with, but it's also why it's important
enough that we have that special 'this_cpu_read_stable()' access for it.

So this is all done very intentionally to allow the compiler to treat
'current' as a value that never visibly changes, so that the compiler
can do CSE and combine multiple different 'current' accesses into one.

However, there is obviously one very special situation when the
currently running thread does actually change: inside the scheduler
itself.

So the scheduler code paths are special, and do not have a 'current'
thread at all.  Instead there are _two_ threads: the previous and the
next thread - typically called 'prev' and 'next' (or prev_p/next_p)
internally.

So this is all actually quite straightforward and simple, and not all
that complicated.

Except for when you then have special code that is run in scheduler
context, that code then has to be aware that 'current' isn't really a
valid thing.  Did you mean 'prev'? Did you mean 'next'?

In fact, even if then look at the code, and you use 'current' after the
new value has been assigned to the percpu variable, we have explicitly
told the compiler that 'current' is magical and always stable.  So the
compiler is quite free to use an older (or newer) value of 'current',
and the actual assignment to the percpu storage is not relevant even if
it might look that way.

Which is exactly what happened in the resctl code, that blithely used
'current' in '__resctrl_sched_in()' when it really wanted the new
process state (as implied by the name: we're scheduling 'into' that new
resctl state).  And clang would end up just using the old thread pointer
value at least in some configurations.

This could have happened with gcc too, and purely depends on random
compiler details.  Clang just seems to have been more aggressive about
moving the read of the per-cpu current_task pointer around.

The fix is trivial: just make the resctl code adhere to the scheduler
rules of using the prev/next thread pointer explicitly, instead of using
'current' in a situation where it just wasn't valid.

That same code is then also used outside of the scheduler context (when
a thread resctl state is explicitly changed), and then we will just pass
in 'current' as that pointer, of course.  There is no ambiguity in that
case.

The fix may be trivial, but noticing and figuring out what went wrong
was not.  The credit for that goes to Stephane Eranian.

Reported-by: Stephane Eranian &lt;eranian@google.com&gt;
Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Tested-by: Stephane Eranian &lt;eranian@google.com&gt;
Tested-by: Babu Moger &lt;babu.moger@amd.com&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/percpu: Move current_top_of_stack next to current_task</title>
<updated>2022-10-17T14:41:05+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2022-09-15T11:11:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c063a217bc0726c2560138229de5673dbb253a02'/>
<id>c063a217bc0726c2560138229de5673dbb253a02</id>
<content type='text'>
Extend the struct pcpu_hot cacheline with current_top_of_stack;
another very frequently used value.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111145.493038635@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Extend the struct pcpu_hot cacheline with current_top_of_stack;
another very frequently used value.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111145.493038635@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: Put hot per CPU variables into a struct</title>
<updated>2022-10-17T14:41:03+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2022-09-15T11:11:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e57ef2ed97c1d078973298658a8096644a1e9e09'/>
<id>e57ef2ed97c1d078973298658a8096644a1e9e09</id>
<content type='text'>
The layout of per-cpu variables is at the mercy of the compiler. This
can lead to random performance fluctuations from build to build.

Create a structure to hold some of the hottest per-cpu variables,
starting with current_task.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111145.179707194@infradead.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The layout of per-cpu variables is at the mercy of the compiler. This
can lead to random performance fluctuations from build to build.

Create a structure to hold some of the hottest per-cpu variables,
starting with current_task.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220915111145.179707194@infradead.org
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-24T01:42:07+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-24T01:42:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=de8ac81747fca15925f3488ead7804560cdea532'/>
<id>de8ac81747fca15925f3488ead7804560cdea532</id>
<content type='text'>
Pull core x86 updates from Borislav Petkov:

 - Remove all the code around GS switching on 32-bit now that it is not
   needed anymore

 - Other misc improvements

* tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bug: Use normal relative pointers in 'struct bug_entry'
  x86/nmi: Make register_nmi_handler() more robust
  x86/asm: Merge load_gs_index()
  x86/32: Remove lazy GS macros
  ELF: Remove elf_core_copy_kernel_regs()
  x86/32: Simplify ELF_CORE_COPY_REGS
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core x86 updates from Borislav Petkov:

 - Remove all the code around GS switching on 32-bit now that it is not
   needed anymore

 - Other misc improvements

* tag 'x86_core_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bug: Use normal relative pointers in 'struct bug_entry'
  x86/nmi: Make register_nmi_handler() more robust
  x86/asm: Merge load_gs_index()
  x86/32: Remove lazy GS macros
  ELF: Remove elf_core_copy_kernel_regs()
  x86/32: Simplify ELF_CORE_COPY_REGS
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/prctl: Remove pointless task argument</title>
<updated>2022-05-13T10:56:28+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2022-05-12T12:04:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f5c0b4f30416c670408a77be94703d04d22b57df'/>
<id>f5c0b4f30416c670408a77be94703d04d22b57df</id>
<content type='text'>
The functions invoked via do_arch_prctl_common() can only operate on
the current task and none of these function uses the task argument.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lore.kernel.org/r/87lev7vtxj.ffs@tglx
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The functions invoked via do_arch_prctl_common() can only operate on
the current task and none of these function uses the task argument.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lore.kernel.org/r/87lev7vtxj.ffs@tglx
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/32: Remove lazy GS macros</title>
<updated>2022-04-14T12:09:43+00:00</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2022-03-25T15:39:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3a24a60854d2ef19e0edcd11cdbbb4fabc655dde'/>
<id>3a24a60854d2ef19e0edcd11cdbbb4fabc655dde</id>
<content type='text'>
GS is always a user segment now.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lore.kernel.org/r/20220325153953.162643-4-brgerst@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
GS is always a user segment now.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lore.kernel.org/r/20220325153953.162643-4-brgerst@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/fpu: Move context switch and exit to user inlines into sched.h</title>
<updated>2021-10-20T13:27:27+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-10-15T01:16:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=63e81807c1f94e91b9d71c536112a40cd74bab85'/>
<id>63e81807c1f94e91b9d71c536112a40cd74bab85</id>
<content type='text'>
internal.h is a kitchen sink which needs to get out of the way to prepare
for the upcoming changes.

Move the context switch and exit to user inlines into a separate header,
which is all that code needs.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20211015011539.349132461@linutronix.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
internal.h is a kitchen sink which needs to get out of the way to prepare
for the upcoming changes.

Move the context switch and exit to user inlines into a separate header,
which is all that code needs.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20211015011539.349132461@linutronix.de
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/fpu: Remove pointless argument from switch_fpu_finish()</title>
<updated>2021-10-20T13:27:25+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-10-15T01:15:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9568bfb4f04bd9a280c592879ccd7a26a77c1390'/>
<id>9568bfb4f04bd9a280c592879ccd7a26a77c1390</id>
<content type='text'>
Unused since the FPU switching rework.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20211015011538.433135710@linutronix.de
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Unused since the FPU switching rework.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20211015011538.433135710@linutronix.de
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/dumpstack: Add log_lvl to __show_regs()</title>
<updated>2020-07-22T21:56:53+00:00</updated>
<author>
<name>Dmitry Safonov</name>
<email>dima@arista.com</email>
</author>
<published>2020-06-29T14:48:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=44e215352cf17333992d56941b5bf4af60a67609'/>
<id>44e215352cf17333992d56941b5bf4af60a67609</id>
<content type='text'>
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to __show_regs().
Keep the used log level intact to separate visible change.

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov &lt;dima@arista.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Petr Mladek &lt;pmladek@suse.com&gt;
Link: https://lkml.kernel.org/r/20200629144847.492794-3-dima@arista.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.

Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).

Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).

Add log_lvl parameter to __show_regs().
Keep the used log level intact to separate visible change.

[1]: https://lore.kernel.org/lkml/20190528002412.1625-1-dima@arista.com/
[2]: https://lore.kernel.org/linux-doc/20190724170249.9644-1-dima@arista.com/

Signed-off-by: Dmitry Safonov &lt;dima@arista.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Petr Mladek &lt;pmladek@suse.com&gt;
Link: https://lkml.kernel.org/r/20200629144847.492794-3-dima@arista.com

</pre>
</div>
</content>
</entry>
<entry>
<title>mm: don't include asm/pgtable.h if linux/mm.h is already included</title>
<updated>2020-06-09T16:39:13+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2020-06-09T04:32:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e31cf2f4ca422ac9b14ecc4a1295b8977a20f812'/>
<id>e31cf2f4ca422ac9b14ecc4a1295b8977a20f812</id>
<content type='text'>
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address &gt;&gt; PMD_SHIFT) &amp; (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes &lt;asm/pgtable.h&gt; to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include &lt;asm/pgtable.h&gt;
in the files that include &lt;linux/mm.h&gt;.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include &lt;linux/mm.h&gt;") ; do
		sed -i -e '/include &lt;asm\/pgtable.h&gt;/ d' $f
	done

Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Cain &lt;bcain@codeaurora.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Ungerer &lt;gerg@linux-m68k.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Ley Foon Tan &lt;ley.foon.tan@intel.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Nick Hu &lt;nickhu@andestech.com&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vincent Chen &lt;deanbo422@gmail.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address &gt;&gt; PMD_SHIFT) &amp; (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes &lt;asm/pgtable.h&gt; to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include &lt;asm/pgtable.h&gt;
in the files that include &lt;linux/mm.h&gt;.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include &lt;linux/mm.h&gt;") ; do
		sed -i -e '/include &lt;asm\/pgtable.h&gt;/ d' $f
	done

Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Cain &lt;bcain@codeaurora.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Greentime Hu &lt;green.hu@gmail.com&gt;
Cc: Greg Ungerer &lt;gerg@linux-m68k.org&gt;
Cc: Guan Xuetao &lt;gxt@pku.edu.cn&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Ley Foon Tan &lt;ley.foon.tan@intel.com&gt;
Cc: Mark Salter &lt;msalter@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Michal Simek &lt;monstr@monstr.eu&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Nick Hu &lt;nickhu@andestech.com&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Rich Felker &lt;dalias@libc.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Stafford Horne &lt;shorne@gmail.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Vincent Chen &lt;deanbo422@gmail.com&gt;
Cc: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yoshinori Sato &lt;ysato@users.sourceforge.jp&gt;
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
