<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/proc/array.c, branch v3.2.97</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>ptrace: use fsuid, fsgid, effective creds for fs access checks</title>
<updated>2017-09-15T17:30:57+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2016-01-20T23:00:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1c8d42255f4c55f9550f72bbcda758aeff3fd7c2'/>
<id>1c8d42255f4c55f9550f72bbcda758aeff3fd7c2</id>
<content type='text'>
commit caaee6234d05a58c5b4d05e7bf766131b810a657 upstream.

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -&gt; /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: "Serge E. Hallyn" &lt;serge.hallyn@ubuntu.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
[bwh: Backported to 3.2:
 - Drop changes to kcmp, procfs map_files, procfs has_pid_permissions()
 - Keep using uid_t, gid_t and == operator for IDs
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit caaee6234d05a58c5b4d05e7bf766131b810a657 upstream.

By checking the effective credentials instead of the real UID / permitted
capabilities, ensure that the calling process actually intended to use its
credentials.

To ensure that all ptrace checks use the correct caller credentials (e.g.
in case out-of-tree code or newly added code omits the PTRACE_MODE_*CREDS
flag), use two new flags and require one of them to be set.

The problem was that when a privileged task had temporarily dropped its
privileges, e.g.  by calling setreuid(0, user_uid), with the intent to
perform following syscalls with the credentials of a user, it still passed
ptrace access checks that the user would not be able to pass.

While an attacker should not be able to convince the privileged task to
perform a ptrace() syscall, this is a problem because the ptrace access
check is reused for things in procfs.

In particular, the following somewhat interesting procfs entries only rely
on ptrace access checks:

 /proc/$pid/stat - uses the check for determining whether pointers
     should be visible, useful for bypassing ASLR
 /proc/$pid/maps - also useful for bypassing ASLR
 /proc/$pid/cwd - useful for gaining access to restricted
     directories that contain files with lax permissions, e.g. in
     this scenario:
     lrwxrwxrwx root root /proc/13020/cwd -&gt; /root/foobar
     drwx------ root root /root
     drwxr-xr-x root root /root/foobar
     -rw-r--r-- root root /root/foobar/secret

Therefore, on a system where a root-owned mode 6755 binary changes its
effective credentials as described and then dumps a user-specified file,
this could be used by an attacker to reveal the memory layout of root's
processes or reveal the contents of files he is not allowed to access
(through /proc/$pid/cwd).

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: "Serge E. Hallyn" &lt;serge.hallyn@ubuntu.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
[bwh: Backported to 3.2:
 - Drop changes to kcmp, procfs map_files, procfs has_pid_permissions()
 - Keep using uid_t, gid_t and == operator for IDs
 - Adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: pid/status: show all supplementary groups</title>
<updated>2013-01-03T03:33:43+00:00</updated>
<author>
<name>Artem Bityutskiy</name>
<email>artem.bityutskiy@linux.intel.com</email>
</author>
<published>2012-12-18T00:03:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4efbbb6ce0d1d46e57be495e791c109277f07a14'/>
<id>4efbbb6ce0d1d46e57be495e791c109277f07a14</id>
<content type='text'>
commit 8d238027b87e654be552eabdf492042a34c5c300 upstream.

We display a list of supplementary group for each process in
/proc/&lt;pid&gt;/status.  However, we show only the first 32 groups, not all of
them.

Although this is rare, but sometimes processes do have more than 32
supplementary groups, and this kernel limitation breaks user-space apps
that rely on the group list in /proc/&lt;pid&gt;/status.

Number 32 comes from the internal NGROUPS_SMALL macro which defines the
length for the internal kernel "small" groups buffer.  There is no
apparent reason to limit to this value.

This patch removes the 32 groups printing limit.

The Linux kernel limits the amount of supplementary groups by NGROUPS_MAX,
which is currently set to 65536.  And this is the maximum count of groups
we may possibly print.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8d238027b87e654be552eabdf492042a34c5c300 upstream.

We display a list of supplementary group for each process in
/proc/&lt;pid&gt;/status.  However, we show only the first 32 groups, not all of
them.

Although this is rare, but sometimes processes do have more than 32
supplementary groups, and this kernel limitation breaks user-space apps
that rely on the group list in /proc/&lt;pid&gt;/status.

Number 32 comes from the internal NGROUPS_SMALL macro which defines the
length for the internal kernel "small" groups buffer.  There is no
apparent reason to limit to this value.

This patch removes the 32 groups printing limit.

The Linux kernel limits the amount of supplementary groups by NGROUPS_MAX,
which is currently set to 65536.  And this is the maximum count of groups
we may possibly print.

Signed-off-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ptrace: s/tracehook_tracer_task()/ptrace_parent()/</title>
<updated>2011-06-22T17:26:29+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2011-06-17T14:50:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=06d984737bac0545fe20bb5447ee488b95adb531'/>
<id>06d984737bac0545fe20bb5447ee488b95adb531</id>
<content type='text'>
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: John Johansen &lt;john.johansen@canonical.com&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: John Johansen &lt;john.johansen@canonical.com&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: constify status array</title>
<updated>2011-05-27T00:12:36+00:00</updated>
<author>
<name>Mike Frysinger</name>
<email>vapier@gentoo.org</email>
</author>
<published>2011-05-26T23:25:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e130aa70f438855b4a0e13a5249951da001798d4'/>
<id>e130aa70f438855b4a0e13a5249951da001798d4</id>
<content type='text'>
No need for this local array to be writable, so mark it const.

Signed-off-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
No need for this local array to be writable, so mark it const.

Signed-off-by: Mike Frysinger &lt;vapier@gentoo.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: protect mm start_code/end_code in /proc/pid/stat</title>
<updated>2011-03-24T02:46:37+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees.cook@canonical.com</email>
</author>
<published>2011-03-23T23:42:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5883f57ca0008ffc93e09cbb9847a1928e50c6f3'/>
<id>5883f57ca0008ffc93e09cbb9847a1928e50c6f3</id>
<content type='text'>
While mm-&gt;start_stack was protected from cross-uid viewing (commit
f83ce3e6b02d5 ("proc: avoid information leaks to non-privileged
processes")), the start_code and end_code values were not.  This would
allow the text location of a PIE binary to leak, defeating ASLR.

Note that the value "1" is used instead of "0" for a protected value since
"ps", "killall", and likely other readers of /proc/pid/stat, take
start_code of "0" to mean a kernel thread and will misbehave.  Thanks to
Brad Spengler for pointing this out.

Addresses CVE-2011-0726

Signed-off-by: Kees Cook &lt;kees.cook@canonical.com&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Eugene Teo &lt;eugeneteo@kernel.sg&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Brad Spengler &lt;spender@grsecurity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While mm-&gt;start_stack was protected from cross-uid viewing (commit
f83ce3e6b02d5 ("proc: avoid information leaks to non-privileged
processes")), the start_code and end_code values were not.  This would
allow the text location of a PIE binary to leak, defeating ASLR.

Note that the value "1" is used instead of "0" for a protected value since
"ps", "killall", and likely other readers of /proc/pid/stat, take
start_code of "0" to mean a kernel thread and will misbehave.  Thanks to
Brad Spengler for pointing this out.

Addresses CVE-2011-0726

Signed-off-by: Kees Cook &lt;kees.cook@canonical.com&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Eugene Teo &lt;eugeneteo@kernel.sg&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Brad Spengler &lt;spender@grsecurity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>s390: remove task_show_regs</title>
<updated>2011-02-15T15:34:16+00:00</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2011-02-15T08:43:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=261cd298a8c363d7985e3482946edb4bfedacf98'/>
<id>261cd298a8c363d7985e3482946edb4bfedacf98</id>
<content type='text'>
task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/&lt;pid&gt;/status is a world readable file, it
is not a good idea to show the registers of a process. The only
correct fix is to remove task_show_regs.

Reported-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
task_show_regs used to be a debugging aid in the early bringup days
of Linux on s390. /proc/&lt;pid&gt;/status is a world readable file, it
is not a good idea to show the registers of a process. The only
correct fix is to remove task_show_regs.

Reported-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: use seq_puts()/seq_putc() where possible</title>
<updated>2011-01-13T16:03:16+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2011-01-13T01:00:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9d6de12f70d2fb1487c4f482a21fed25fe74e0fd'/>
<id>9d6de12f70d2fb1487c4f482a21fed25fe74e0fd</id>
<content type='text'>
For string without format specifiers, use seq_puts().
For seq_printf("\n"), use seq_putc('\n').

   text	   data	    bss	    dec	    hex	filename
  61866	    488	    112	  62466	   f402	fs/proc/proc.o
  61729	    488	    112	  62329	   f379	fs/proc/proc.o
  ----------------------------------------------------
  			   -139

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For string without format specifiers, use seq_puts().
For seq_printf("\n"), use seq_putc('\n').

   text	   data	    bss	    dec	    hex	filename
  61866	    488	    112	  62466	   f402	fs/proc/proc.o
  61729	    488	    112	  62329	   f379	fs/proc/proc.o
  ----------------------------------------------------
  			   -139

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: use unsigned long inside /proc/*/statm</title>
<updated>2011-01-13T16:03:16+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2011-01-13T01:00:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a2ade7b6ca37c808128810687cd56e8a44443e65'/>
<id>a2ade7b6ca37c808128810687cd56e8a44443e65</id>
<content type='text'>
/proc/*/statm code needlessly truncates data from unsigned long to int.
One needs only 8+ TB of RAM to make truncation visible.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Reviewed-by: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
/proc/*/statm code needlessly truncates data from unsigned long to int.
One needs only 8+ TB of RAM to make truncation visible.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Reviewed-by: WANG Cong &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials</title>
<updated>2010-07-29T22:16:17+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2010-07-29T11:45:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de09a9771a5346029f4d11e4ac886be7f9bfdd75'/>
<id>de09a9771a5346029f4d11e4ac886be7f9bfdd75</id>
<content type='text'>
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.

What happens is that get_task_cred() can race with commit_creds():

	TASK_1			TASK_2			RCU_CLEANER
	--&gt;get_task_cred(TASK_2)
	rcu_read_lock()
	__cred = __task_cred(TASK_2)
				--&gt;commit_creds()
				old_cred = TASK_2-&gt;real_cred
				TASK_2-&gt;real_cred = ...
				put_cred(old_cred)
				  call_rcu(old_cred)
		[__cred-&gt;usage == 0]
	get_cred(__cred)
		[__cred-&gt;usage == 1]
	rcu_read_unlock()
							--&gt;put_cred_rcu()
							[__cred-&gt;usage == 1]
							panic()

However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.

If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.

We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.

Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:

kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[&lt;ffffffff81069881&gt;]  [&lt;ffffffff81069881&gt;] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8  EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS:  00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
 ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
&lt;0&gt; ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
&lt;0&gt; ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
 [&lt;ffffffff810698cd&gt;] put_cred+0x13/0x15
 [&lt;ffffffff81069b45&gt;] commit_creds+0x16b/0x175
 [&lt;ffffffff8106aace&gt;] set_current_groups+0x47/0x4e
 [&lt;ffffffff8106ac89&gt;] sys_setgroups+0xf6/0x105
 [&lt;ffffffff81009b02&gt;] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 &lt;0f&gt; 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP  [&lt;ffffffff81069881&gt;] __put_cred+0xc/0x45
 RSP &lt;ffff88019e7e9eb8&gt;
---[ end trace df391256a100ebdd ]---

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.

What happens is that get_task_cred() can race with commit_creds():

	TASK_1			TASK_2			RCU_CLEANER
	--&gt;get_task_cred(TASK_2)
	rcu_read_lock()
	__cred = __task_cred(TASK_2)
				--&gt;commit_creds()
				old_cred = TASK_2-&gt;real_cred
				TASK_2-&gt;real_cred = ...
				put_cred(old_cred)
				  call_rcu(old_cred)
		[__cred-&gt;usage == 0]
	get_cred(__cred)
		[__cred-&gt;usage == 1]
	rcu_read_unlock()
							--&gt;put_cred_rcu()
							[__cred-&gt;usage == 1]
							panic()

However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.

If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.

We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.

Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:

kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[&lt;ffffffff81069881&gt;]  [&lt;ffffffff81069881&gt;] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8  EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS:  00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
 ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
&lt;0&gt; ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
&lt;0&gt; ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
 [&lt;ffffffff810698cd&gt;] put_cred+0x13/0x15
 [&lt;ffffffff81069b45&gt;] commit_creds+0x16b/0x175
 [&lt;ffffffff8106aace&gt;] set_current_groups+0x47/0x4e
 [&lt;ffffffff8106ac89&gt;] sys_setgroups+0xf6/0x105
 [&lt;ffffffff81009b02&gt;] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 &lt;0f&gt; 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP  [&lt;ffffffff81069881&gt;] __put_cred+0xc/0x45
 RSP &lt;ffff88019e7e9eb8&gt;
---[ end trace df391256a100ebdd ]---

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: get_nr_threads() doesn't need -&gt;siglock any longer</title>
<updated>2010-05-27T16:12:47+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2010-05-26T21:43:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7e49827cc937a742ae02078b483e3eb78f791a2a'/>
<id>7e49827cc937a742ae02078b483e3eb78f791a2a</id>
<content type='text'>
Now that task-&gt;signal can't go away get_nr_threads() doesn't need
-&gt;siglock to read signal-&gt;count.

Also, make it inline, move into sched.h, and convert 2 other proc users of
signal-&gt;count to use this (now trivial) helper.

Henceforth get_nr_threads() is the only valid user of signal-&gt;count, we
are ready to turn it into "int nr_threads" or, perhaps, kill it.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that task-&gt;signal can't go away get_nr_threads() doesn't need
-&gt;siglock to read signal-&gt;count.

Also, make it inline, move into sched.h, and convert 2 other proc users of
signal-&gt;count to use this (now trivial) helper.

Henceforth get_nr_threads() is the only valid user of signal-&gt;count, we
are ready to turn it into "int nr_threads" or, perhaps, kill it.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
