<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/mm_types.h, branch v4.12.7</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries</title>
<updated>2017-08-11T15:33:51+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2017-08-02T20:31:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bd081d335910bc5a67000f12c94d2951a013e67f'/>
<id>bd081d335910bc5a67000f12c94d2951a013e67f</id>
<content type='text'>
commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.

Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.

He described the race as follows:

        CPU0                            CPU1
        ----                            ----
                                        user accesses memory using RW PTE
                                        [PTE now cached in TLB]
        try_to_unmap_one()
        ==&gt; ptep_get_and_clear()
        ==&gt; set_tlb_ubc_flush_pending()
                                        mprotect(addr, PROT_READ)
                                        ==&gt; change_pte_range()
                                        ==&gt; [ PTE non-present - no flush ]

                                        user writes using cached RW PTE
        ...

        try_to_unmap_flush()

The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.

For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue as there is a window where an
mprotect that limits access still allows access.  For munmap, it's
potentially a data integrity issue although the race is massive as an
munmap, mmap and return to userspace must all complete between the
window when reclaim drops the PTL and flushes the TLB.  However, it's
theoritically possible so handle this issue by flushing the mm if
reclaim is potentially currently batching TLB flushes.

Other instances where a flush is required for a present pte should be ok
as either the page lock is held preventing parallel reclaim or a page
reference count is elevated preventing a parallel free leading to
corruption.  In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch.  Other users such as gup as a race with
reclaim looks just at PTEs.  huge page variants should be ok as they
don't race with reclaim.  mincore only looks at PTEs.  userfault also
should be ok as if a parallel reclaim takes place, it will either fault
the page back in or read some of the data before the flush occurs
triggering a fault.

Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected.  His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.

[akpm@linux-foundation.org: tweak comments]
[akpm@linux-foundation.org: fix spello]
Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
Reported-by: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.

Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.

He described the race as follows:

        CPU0                            CPU1
        ----                            ----
                                        user accesses memory using RW PTE
                                        [PTE now cached in TLB]
        try_to_unmap_one()
        ==&gt; ptep_get_and_clear()
        ==&gt; set_tlb_ubc_flush_pending()
                                        mprotect(addr, PROT_READ)
                                        ==&gt; change_pte_range()
                                        ==&gt; [ PTE non-present - no flush ]

                                        user writes using cached RW PTE
        ...

        try_to_unmap_flush()

The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.

For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue as there is a window where an
mprotect that limits access still allows access.  For munmap, it's
potentially a data integrity issue although the race is massive as an
munmap, mmap and return to userspace must all complete between the
window when reclaim drops the PTL and flushes the TLB.  However, it's
theoritically possible so handle this issue by flushing the mm if
reclaim is potentially currently batching TLB flushes.

Other instances where a flush is required for a present pte should be ok
as either the page lock is held preventing parallel reclaim or a page
reference count is elevated preventing a parallel free leading to
corruption.  In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch.  Other users such as gup as a race with
reclaim looks just at PTEs.  huge page variants should be ok as they
don't race with reclaim.  mincore only looks at PTEs.  userfault also
should be ok as if a parallel reclaim takes place, it will either fault
the page back in or read some of the data before the flush occurs
triggering a fault.

Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected.  His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.

[akpm@linux-foundation.org: tweak comments]
[akpm@linux-foundation.org: fix spello]
Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
Reported-by: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mm: Introduce mmap_compat_base() for 32-bit mmap()</title>
<updated>2017-03-13T13:59:22+00:00</updated>
<author>
<name>Dmitry Safonov</name>
<email>dsafonov@virtuozzo.com</email>
</author>
<published>2017-03-06T14:17:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1b028f784e8c341e762c264f70dc0ca1418c8b7a'/>
<id>1b028f784e8c341e762c264f70dc0ca1418c8b7a</id>
<content type='text'>
mmap() uses a base address, from which it starts to look for a free space
for allocation.

The base address is stored in mm-&gt;mmap_base, which is calculated during
exec(). The address depends on task's size, set rlimit for stack, ASLR
randomization. The base depends on the task size and the number of random
bits which are different for 64-bit and 32bit applications.

Due to the fact, that the base address is fixed, its mmap() from a compat
(32bit) syscall issued by a 64bit task will return a address which is based
on the 64bit base address and does not fit into the 32bit address space
(4GB). The returned pointer is truncated to 32bit, which results in an
invalid address.

To solve store a seperate compat address base plus a compat legacy address
base in mm_struct. These bases are calculated at exec() time and can be
used later to address the 32bit compat mmap() issued by 64 bit
applications.

As a consequence of this change 32-bit applications issuing a 64-bit
syscall (after doing a long jump) will get a 64-bit mapping now. Before
this change 32-bit applications always got a 32bit mapping.

[ tglx: Massaged changelog and added a comment ]

Signed-off-by: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Cc: 0x7f454c46@gmail.com
Cc: linux-mm@kvack.org
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
mmap() uses a base address, from which it starts to look for a free space
for allocation.

The base address is stored in mm-&gt;mmap_base, which is calculated during
exec(). The address depends on task's size, set rlimit for stack, ASLR
randomization. The base depends on the task size and the number of random
bits which are different for 64-bit and 32bit applications.

Due to the fact, that the base address is fixed, its mmap() from a compat
(32bit) syscall issued by a 64bit task will return a address which is based
on the 64bit base address and does not fit into the 32bit address space
(4GB). The returned pointer is truncated to 32bit, which results in an
invalid address.

To solve store a seperate compat address base plus a compat legacy address
base in mm_struct. These bases are calculated at exec() time and can be
used later to address the 32bit compat mmap() issued by 64 bit
applications.

As a consequence of this change 32-bit applications issuing a 64-bit
syscall (after doing a long jump) will get a 64-bit mapping now. Before
this change 32-bit applications always got a 32bit mapping.

[ tglx: Massaged changelog and added a comment ]

Signed-off-by: Dmitry Safonov &lt;dsafonov@virtuozzo.com&gt;
Cc: 0x7f454c46@gmail.com
Cc: linux-mm@kvack.org
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/20170306141721.9188-4-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm/headers, sched/headers: Move task related MM types from &lt;linux/mm_types.&gt; to &lt;linux/mm_types_task.h&gt;</title>
<updated>2017-03-03T00:43:48+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-03T23:12:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9e7d2e44dd88ba7e29c165b6fca428e384afa5a8'/>
<id>9e7d2e44dd88ba7e29c165b6fca428e384afa5a8</id>
<content type='text'>
Separate all the MM types that are embedded directly in 'struct task_struct'
into the &lt;linux/mm_types_task.h&gt; header.

The goal is to include this header in &lt;linux/sched.h&gt;, not the full &lt;linux/mm_types.h&gt;
header, to reduce the size, complexity and coupling of &lt;linux/sched.h&gt;.

(This patch does not change &lt;linux/sched.h&gt; yet.)

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Separate all the MM types that are embedded directly in 'struct task_struct'
into the &lt;linux/mm_types_task.h&gt; header.

The goal is to include this header in &lt;linux/sched.h&gt;, not the full &lt;linux/mm_types.h&gt;
header, to reduce the size, complexity and coupling of &lt;linux/sched.h&gt;.

(This patch does not change &lt;linux/sched.h&gt; yet.)

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/headers: Move the 'init_mm' declaration from &lt;linux/sched.h&gt; to &lt;linux/mm_types.h&gt;</title>
<updated>2017-03-03T00:43:39+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-02T11:27:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=abe722a1c59728c6c0ea4e4d5efcfe397c8abebc'/>
<id>abe722a1c59728c6c0ea4e4d5efcfe397c8abebc</id>
<content type='text'>
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/headers, sched/headers: Prepare to split &lt;linux/mm_types_task.h&gt; out of &lt;linux/mm_types.h&gt;</title>
<updated>2017-03-02T07:42:37+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-03T23:12:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2e58f173ab89b29a1373088b8727133dbf7322b0'/>
<id>2e58f173ab89b29a1373088b8727133dbf7322b0</id>
<content type='text'>
We are going to separate all the MM types that are embedded directly in 'struct task_struct'
into the new &lt;linux/mm_types_task.h&gt; header.

Create a new &lt;linux/mm_types_task.h&gt; that only contains some includes from mm_types.h itself.

This should be trivially correct and easy to bisect to.

(This patch does not materially move the types yet.)

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We are going to separate all the MM types that are embedded directly in 'struct task_struct'
into the new &lt;linux/mm_types_task.h&gt; header.

Create a new &lt;linux/mm_types_task.h&gt; that only contains some includes from mm_types.h itself.

This should be trivially correct and easy to bisect to.

(This patch does not materially move the types yet.)

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/vmacache, sched/headers: Introduce 'struct vmacache' and move it from &lt;linux/sched.h&gt; to &lt;linux/mm_types&gt;</title>
<updated>2017-03-02T07:42:25+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-03T10:03:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=314ff7851fc8ea66cbf48eaa93d8ebfb5ca084a9'/>
<id>314ff7851fc8ea66cbf48eaa93d8ebfb5ca084a9</id>
<content type='text'>
The &lt;linux/sched.h&gt; header includes various vmacache related defines,
which are arguably misplaced.

Move them to mm_types.h and minimize the sched.h impact by putting
all task vmacache state into a new 'struct vmacache' structure.

No change in functionality.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The &lt;linux/sched.h&gt; header includes various vmacache related defines,
which are arguably misplaced.

Move them to mm_types.h and minimize the sched.h impact by putting
all task vmacache state into a new 'struct vmacache' structure.

No change in functionality.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: clarify mm_struct.mm_{users,count} documentation</title>
<updated>2017-02-28T02:43:48+00:00</updated>
<author>
<name>Vegard Nossum</name>
<email>vegard.nossum@oracle.com</email>
</author>
<published>2017-02-27T22:30:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b279ddc3382426f8e05068de3488a2993f68dc26'/>
<id>b279ddc3382426f8e05068de3488a2993f68dc26</id>
<content type='text'>
Clarify documentation relating to mm_users and mm_count, and switch to
kernel-doc syntax.

Link: http://lkml.kernel.org/r/20161218123229.22952-4-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Clarify documentation relating to mm_users and mm_count, and switch to
kernel-doc syntax.

Link: http://lkml.kernel.org/r/20161218123229.22952-4-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2016-12-18T19:12:53+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-12-18T19:12:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1bbb05f52055c8b2fc1cbb2ac272b011593172f9'/>
<id>1bbb05f52055c8b2fc1cbb2ac272b011593172f9</id>
<content type='text'>
Pull x86 fixes and cleanups from Thomas Gleixner:
 "This set of updates contains:

   - Robustification for the logical package managment. Cures the AMD
     and virtualization issues.

   - Put the correct start_cpu() return address on the stack of the idle
     task.

   - Fixups for the fallout of the nodeid &lt;-&gt; cpuid persistent mapping
     modifciations

   - Move the x86/MPX specific mm_struct member to the arch specific
     mm_context where it belongs

   - Cleanups for C89 struct initializers and useless function
     arguments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/floppy: Use designated initializers
  x86/mpx: Move bd_addr to mm_context_t
  x86/mm: Drop unused argument 'removed' from sync_global_pgds()
  ACPI/NUMA: Do not map pxm to node when NUMA is turned off
  x86/acpi: Use proper macro for invalid node
  x86/smpboot: Prevent false positive out of bounds cpumask access warning
  x86/boot/64: Push correct start_cpu() return address
  x86/boot/64: Use 'push' instead of 'call' in start_cpu()
  x86/smpboot: Make logical package management more robust
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull x86 fixes and cleanups from Thomas Gleixner:
 "This set of updates contains:

   - Robustification for the logical package managment. Cures the AMD
     and virtualization issues.

   - Put the correct start_cpu() return address on the stack of the idle
     task.

   - Fixups for the fallout of the nodeid &lt;-&gt; cpuid persistent mapping
     modifciations

   - Move the x86/MPX specific mm_struct member to the arch specific
     mm_context where it belongs

   - Cleanups for C89 struct initializers and useless function
     arguments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/floppy: Use designated initializers
  x86/mpx: Move bd_addr to mm_context_t
  x86/mm: Drop unused argument 'removed' from sync_global_pgds()
  ACPI/NUMA: Do not map pxm to node when NUMA is turned off
  x86/acpi: Use proper macro for invalid node
  x86/smpboot: Prevent false positive out of bounds cpumask access warning
  x86/boot/64: Push correct start_cpu() return address
  x86/boot/64: Use 'push' instead of 'call' in start_cpu()
  x86/smpboot: Make logical package management more robust
</pre>
</div>
</content>
</entry>
<entry>
<title>x86/mpx: Move bd_addr to mm_context_t</title>
<updated>2016-12-17T11:29:56+00:00</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2016-12-16T12:40:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cb02de96ec724b84373488dd349e53897ab432f5'/>
<id>cb02de96ec724b84373488dd349e53897ab432f5</id>
<content type='text'>
Currently bd_addr lives in mm_struct, which is otherwise architecture
independent. Architecture-specific data is supposed to live within
mm_context_t (itself contained in mm_struct).

Other x86-specific context like the pkey accounting data lives in
mm_context_t, and there's no readon the MPX data can't also live there.
So as to keep the arch-specific data togather, and to set a good example
for others, this patch moves bd_addr into x86's mm_context_t.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1481892055-24596-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently bd_addr lives in mm_struct, which is otherwise architecture
independent. Architecture-specific data is supposed to live within
mm_context_t (itself contained in mm_struct).

Other x86-specific context like the pkey accounting data lives in
mm_context_t, and there's no readon the MPX data can't also live there.
So as to keep the arch-specific data togather, and to set a good example
for others, this patch moves bd_addr into x86's mm_context_t.

Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1481892055-24596-1-git-send-email-mark.rutland@arm.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Add a user_ns owner to mm_struct and fix ptrace permission checks</title>
<updated>2016-11-22T17:49:48+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2016-10-14T02:23:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bfedb589252c01fa505ac9f6f2a3d5d68d707ef4'/>
<id>bfedb589252c01fa505ac9f6f2a3d5d68d707ef4</id>
<content type='text'>
During exec dumpable is cleared if the file that is being executed is
not readable by the user executing the file.  A bug in
ptrace_may_access allows reading the file if the executable happens to
enter into a subordinate user namespace (aka clone(CLONE_NEWUSER),
unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER).

This problem is fixed with only necessary userspace breakage by adding
a user namespace owner to mm_struct, captured at the time of exec, so
it is clear in which user namespace CAP_SYS_PTRACE must be present in
to be able to safely give read permission to the executable.

The function ptrace_may_access is modified to verify that the ptracer
has CAP_SYS_ADMIN in task-&gt;mm-&gt;user_ns instead of task-&gt;cred-&gt;user_ns.
This ensures that if the task changes it's cred into a subordinate
user namespace it does not become ptraceable.

The function ptrace_attach is modified to only set PT_PTRACE_CAP when
CAP_SYS_PTRACE is held over task-&gt;mm-&gt;user_ns.  The intent of
PT_PTRACE_CAP is to be a flag to note that whatever permission changes
the task might go through the tracer has sufficient permissions for
it not to be an issue.  task-&gt;cred-&gt;user_ns is always the same
as or descendent of mm-&gt;user_ns.  Which guarantees that having
CAP_SYS_PTRACE over mm-&gt;user_ns is the worst case for the tasks
credentials.

To prevent regressions mm-&gt;dumpable and mm-&gt;user_ns are not considered
when a task has no mm.  As simply failing ptrace_may_attach causes
regressions in privileged applications attempting to read things
such as /proc/&lt;pid&gt;/stat

Cc: stable@vger.kernel.org
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Fixes: 8409cca70561 ("userns: allow ptrace from non-init user namespaces")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
During exec dumpable is cleared if the file that is being executed is
not readable by the user executing the file.  A bug in
ptrace_may_access allows reading the file if the executable happens to
enter into a subordinate user namespace (aka clone(CLONE_NEWUSER),
unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER).

This problem is fixed with only necessary userspace breakage by adding
a user namespace owner to mm_struct, captured at the time of exec, so
it is clear in which user namespace CAP_SYS_PTRACE must be present in
to be able to safely give read permission to the executable.

The function ptrace_may_access is modified to verify that the ptracer
has CAP_SYS_ADMIN in task-&gt;mm-&gt;user_ns instead of task-&gt;cred-&gt;user_ns.
This ensures that if the task changes it's cred into a subordinate
user namespace it does not become ptraceable.

The function ptrace_attach is modified to only set PT_PTRACE_CAP when
CAP_SYS_PTRACE is held over task-&gt;mm-&gt;user_ns.  The intent of
PT_PTRACE_CAP is to be a flag to note that whatever permission changes
the task might go through the tracer has sufficient permissions for
it not to be an issue.  task-&gt;cred-&gt;user_ns is always the same
as or descendent of mm-&gt;user_ns.  Which guarantees that having
CAP_SYS_PTRACE over mm-&gt;user_ns is the worst case for the tasks
credentials.

To prevent regressions mm-&gt;dumpable and mm-&gt;user_ns are not considered
when a task has no mm.  As simply failing ptrace_may_attach causes
regressions in privileged applications attempting to read things
such as /proc/&lt;pid&gt;/stat

Cc: stable@vger.kernel.org
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Fixes: 8409cca70561 ("userns: allow ptrace from non-init user namespaces")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
