<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/compat.c, branch v2.6.32</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86, fs: Fix x86 procfs stack information for threads on 64-bit</title>
<updated>2009-11-04T12:25:03+00:00</updated>
<author>
<name>Stefani Seibold</name>
<email>stefani@seibold.net</email>
</author>
<published>2009-11-03T09:22:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=89240ba059ca468ae7a8346edf7f95082458c2fc'/>
<id>89240ba059ca468ae7a8346edf7f95082458c2fc</id>
<content type='text'>
This patch fixes two issues in the procfs stack information on
x86-64 linux.

The 32 bit loader compat_do_execve did not store stack
start. (this was figured out by Alexey Dobriyan).

The stack information on a x64_64 kernel always shows 0 kbyte
stack usage, because of a missing implementation of the KSTK_ESP
macro which always returned -1.

The new implementation now returns the right value.

Signed-off-by: Stefani Seibold &lt;stefani@seibold.net&gt;
Cc: Americo Wang &lt;xiyou.wangcong@gmail.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
LKML-Reference: &lt;1257240160.4889.24.camel@wall-e&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch fixes two issues in the procfs stack information on
x86-64 linux.

The 32 bit loader compat_do_execve did not store stack
start. (this was figured out by Alexey Dobriyan).

The stack information on a x64_64 kernel always shows 0 kbyte
stack usage, because of a missing implementation of the KSTK_ESP
macro which always returned -1.

The new implementation now returns the right value.

Signed-off-by: Stefani Seibold &lt;stefani@seibold.net&gt;
Cc: Americo Wang &lt;xiyou.wangcong@gmail.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
LKML-Reference: &lt;1257240160.4889.24.camel@wall-e&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: fix overflow in sys_mount() for in-kernel calls</title>
<updated>2009-09-24T12:40:15+00:00</updated>
<author>
<name>Vegard Nossum</name>
<email>vegard.nossum@gmail.com</email>
</author>
<published>2009-09-18T20:05:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eca6f534e61919b28fb21aafbd1c2983deae75be'/>
<id>eca6f534e61919b28fb21aafbd1c2983deae75be</id>
<content type='text'>
sys_mount() reads/copies a whole page for its "type" parameter.  When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).

(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code.  Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc.  checks.)

But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte.  Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?

[akpm@linux-foundation.org: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]

Reported-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Vegard Nossum &lt;vegard.nossum@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: al &lt;al@dizzy.pdmi.ras.ru&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
sys_mount() reads/copies a whole page for its "type" parameter.  When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).

(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code.  Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc.  checks.)

But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte.  Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?

[akpm@linux-foundation.org: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]

Reported-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Vegard Nossum &lt;vegard.nossum@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: al &lt;al@dizzy.pdmi.ras.ru&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fix compat_sys_utimensat()</title>
<updated>2009-09-23T14:39:30+00:00</updated>
<author>
<name>Suzuki Poulose</name>
<email>suzuki@in.ibm.com</email>
</author>
<published>2009-09-22T23:44:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d7d7561c908afa001ab0fc8212eee94731a213a6'/>
<id>d7d7561c908afa001ab0fc8212eee94731a213a6</id>
<content type='text'>
Compat utimensat() returns EINVAL when the tv_nsec is one of UTIME_OMIT or
UTIME_NOW and the tv_sec is set to non-zero.  As per man pages, the tv_sec
field should be ignored.

sys_utimensat() works fine in this case.

Test case:

#define _GNU_SOURCE
#define _ATFILE_SOURCE
#include &lt;stdio.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;stdlib.h&gt;

main(int argc, char *argv[])
{
	struct timespec ts[2];
	struct timespec *tsp;

	if (argc &lt; 2) {
		fprintf(stderr, "Usage : %s filename\n", argv[0]);
		exit (-1);
	}

	ts[0].tv_nsec = ts[1].tv_nsec = UTIME_NOW;
	ts[0].tv_sec = ts[1].tv_sec = 1;

	tsp = ts;

	if (utimensat(AT_FDCWD, argv[1],tsp,0) == -1)
		perror("utimensat");
	else
		fprintf(stdout, "utimensat success\n");
	return 0;
}
mjs22lp5:~ # cc -m64 utimensat-test.c -o utimensat_test64
mjs22lp5:~ # cc -m32 utimensat-test.c -o utimensat_test32
mjs22lp5:~ # ./utimensat_test32 /tmp/utimensat_test
utimensat: Invalid argument
mjs22lp5:~ # ./utimensat_test64 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # uname -r
2.6.31-rc8

With the patch :

mjs22lp5:~ # ./utimensat_test64 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # ./utimensat_test32 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # uname -r
2.6.31-rc8utimensat

Signed-off-by: Suzuki K P &lt;suzuki@in.ibm.com&gt;
Cc: Ulrich Drepper &lt;drepper@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Compat utimensat() returns EINVAL when the tv_nsec is one of UTIME_OMIT or
UTIME_NOW and the tv_sec is set to non-zero.  As per man pages, the tv_sec
field should be ignored.

sys_utimensat() works fine in this case.

Test case:

#define _GNU_SOURCE
#define _ATFILE_SOURCE
#include &lt;stdio.h&gt;
#include &lt;fcntl.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;stdlib.h&gt;

main(int argc, char *argv[])
{
	struct timespec ts[2];
	struct timespec *tsp;

	if (argc &lt; 2) {
		fprintf(stderr, "Usage : %s filename\n", argv[0]);
		exit (-1);
	}

	ts[0].tv_nsec = ts[1].tv_nsec = UTIME_NOW;
	ts[0].tv_sec = ts[1].tv_sec = 1;

	tsp = ts;

	if (utimensat(AT_FDCWD, argv[1],tsp,0) == -1)
		perror("utimensat");
	else
		fprintf(stdout, "utimensat success\n");
	return 0;
}
mjs22lp5:~ # cc -m64 utimensat-test.c -o utimensat_test64
mjs22lp5:~ # cc -m32 utimensat-test.c -o utimensat_test32
mjs22lp5:~ # ./utimensat_test32 /tmp/utimensat_test
utimensat: Invalid argument
mjs22lp5:~ # ./utimensat_test64 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # uname -r
2.6.31-rc8

With the patch :

mjs22lp5:~ # ./utimensat_test64 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # ./utimensat_test32 /tmp/utimensat_test
utimensat success
mjs22lp5:~ # uname -r
2.6.31-rc8utimensat

Signed-off-by: Suzuki K P &lt;suzuki@in.ibm.com&gt;
Cc: Ulrich Drepper &lt;drepper@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>exec: do not sleep in TASK_TRACED under -&gt;cred_guard_mutex</title>
<updated>2009-09-05T18:30:42+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2009-09-05T18:17:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a2a8474c3fff88d8dd52d05cb450563fb26fd26c'/>
<id>a2a8474c3fff88d8dd52d05cb450563fb26fd26c</id>
<content type='text'>
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

	"mm_for_maps: take -&gt;cred_guard_mutex to fix the race with exec"
	04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.

The tracee must not sleep in TASK_TRACED holding this mutex.  Even if we
remove -&gt;cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or the
tracee resumes.

With this patch do_execve() does not use -&gt;cred_guard_mutex directly and
we do not hold it throughout, instead:

	- introduce prepare_bprm_creds() helper, it locks the mutex
	  and calls prepare_exec_creds() to initialize bprm-&gt;cred.

	- install_exec_creds() drops the mutex after commit_creds(),
	  and thus before tracehook_report_exec()-&gt;ptrace_stop().

	  or, if exec fails,

	  free_bprm() drops this mutex when bprm-&gt;cred != NULL which
	  indicates install_exec_creds() was not called.

Reported-by: Tom Horsley &lt;tom.horsley@att.net&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Tom Horsley reports that his debugger hangs when it tries to read
/proc/pid_of_tracee/maps, this happens since

	"mm_for_maps: take -&gt;cred_guard_mutex to fix the race with exec"
	04b836cbf19e885f8366bccb2e4b0474346c02d

commit in 2.6.31.

But the root of the problem lies in the fact that do_execve() path calls
tracehook_report_exec() which can stop if the tracer sets PT_TRACE_EXEC.

The tracee must not sleep in TASK_TRACED holding this mutex.  Even if we
remove -&gt;cred_guard_mutex from mm_for_maps() and proc_pid_attr_write(),
another task doing PTRACE_ATTACH should not hang until it is killed or the
tracee resumes.

With this patch do_execve() does not use -&gt;cred_guard_mutex directly and
we do not hold it throughout, instead:

	- introduce prepare_bprm_creds() helper, it locks the mutex
	  and calls prepare_exec_creds() to initialize bprm-&gt;cred.

	- install_exec_creds() drops the mutex after commit_creds(),
	  and thus before tracehook_report_exec()-&gt;ptrace_stop().

	  or, if exec fails,

	  free_bprm() drops this mutex when bprm-&gt;cred != NULL which
	  indicates install_exec_creds() was not called.

Reported-by: Tom Horsley &lt;tom.horsley@att.net&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Roland McGrath &lt;roland@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>headers: smp_lock.h redux</title>
<updated>2009-07-12T19:22:34+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2009-07-11T18:08:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=405f55712dfe464b3240d7816cc4fe4174831be2'/>
<id>405f55712dfe464b3240d7816cc4fe4174831be2</id>
<content type='text'>
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cred_guard_mutex: do not return -EINTR to user-space</title>
<updated>2009-07-06T20:57:04+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2009-07-05T19:08:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=793285fcafce4719a05e0c99fa74b188157fe7fe'/>
<id>793285fcafce4719a05e0c99fa74b188157fe7fe</id>
<content type='text'>
do_execve() and ptrace_attach() return -EINTR if
mutex_lock_interruptible(-&gt;cred_guard_mutex) fails.

This is not right, change the code to return ERESTARTNOINTR.

Perhaps we should also change proc_pid_attr_write().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
do_execve() and ptrace_attach() return -EINTR if
mutex_lock_interruptible(-&gt;cred_guard_mutex) fails.

This is not right, change the code to return ERESTARTNOINTR.

Perhaps we should also change proc_pid_attr_write().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>trivial: fix comment typo in fs/compat.c</title>
<updated>2009-06-12T16:01:44+00:00</updated>
<author>
<name>Nikanth Karthikesan</name>
<email>knikanth@suse.de</email>
</author>
<published>2009-04-01T09:10:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ff677f8d10a7b7dea6fbfc48d5ceeb3018cabb23'/>
<id>ff677f8d10a7b7dea6fbfc48d5ceeb3018cabb23</id>
<content type='text'>
Fix a typo in fs/compat.c

Signed-off-by: Nikanth Karthikesan &lt;knikanth@suse.de&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a typo in fs/compat.c

Signed-off-by: Nikanth Karthikesan &lt;knikanth@suse.de&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Push BKL into do_mount()</title>
<updated>2009-06-12T01:36:08+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2009-05-08T17:31:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6fac98dd218653c6aff8a0f56305c424930cea2a'/>
<id>6fac98dd218653c6aff8a0f56305c424930cea2a</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>CRED: Rename cred_exec_mutex to reflect that it's a guard against ptrace</title>
<updated>2009-05-10T22:15:36+00:00</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2009-05-08T12:55:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5e751e992f3fb08ba35e1ca8095ec8fbf9eda523'/>
<id>5e751e992f3fb08ba35e1ca8095ec8fbf9eda523</id>
<content type='text'>
Rename cred_exec_mutex to reflect that it's a guard against foreign
intervention on a process's credential state, such as is made by ptrace().  The
attachment of a debugger to a process affects execve()'s calculation of the new
credential state - _and_ also setprocattr()'s calculation of that state.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Rename cred_exec_mutex to reflect that it's a guard against foreign
intervention on a process's credential state, such as is made by ptrace().  The
attachment of a debugger to a process affects execve()'s calculation of the new
credential state - _and_ also setprocattr()'s calculation of that state.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>do_execve() must not clear fs-&gt;in_exec if it was set by another thread</title>
<updated>2009-04-24T14:39:45+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2009-04-23T23:01:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8c652f96d3852b97a49c331cd0bb02d22f3cb31b'/>
<id>8c652f96d3852b97a49c331cd0bb02d22f3cb31b</id>
<content type='text'>
If do_execve() fails after check_unsafe_exec(), it clears fs-&gt;in_exec
unconditionally. This is wrong if we race with our sub-thread which
also does do_execve:

	Two threads T1 and T2 and another process P, all share the same
	-&gt;fs.

	T1 starts do_execve(BAD_FILE). It calls check_unsafe_exec(), since
	-&gt;fs is shared, we set LSM_UNSAFE but not -&gt;in_exec.

	P exits and decrements fs-&gt;users.

	T2 starts do_execve(), calls check_unsafe_exec(), now -&gt;fs is not
	shared, we set fs-&gt;in_exec.

	T1 continues, open_exec(BAD_FILE) fails, we clear -&gt;in_exec and
	return to the user-space.

	T1 does clone(CLONE_FS /* without CLONE_THREAD */).

	T2 continues without LSM_UNSAFE_SHARE while -&gt;fs is shared with
	another process.

Change check_unsafe_exec() to return res = 1 if we set -&gt;in_exec, and change
do_execve() to clear -&gt;in_exec depending on res.

When do_execve() suceeds, it is safe to clear -&gt;in_exec unconditionally.
It can be set only if we don't share -&gt;fs with another process, and since
we already killed all sub-threads either -&gt;in_exec == 0 or we are the
only user of this -&gt;fs.

Also, we do not need fs-&gt;lock to clear fs-&gt;in_exec.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If do_execve() fails after check_unsafe_exec(), it clears fs-&gt;in_exec
unconditionally. This is wrong if we race with our sub-thread which
also does do_execve:

	Two threads T1 and T2 and another process P, all share the same
	-&gt;fs.

	T1 starts do_execve(BAD_FILE). It calls check_unsafe_exec(), since
	-&gt;fs is shared, we set LSM_UNSAFE but not -&gt;in_exec.

	P exits and decrements fs-&gt;users.

	T2 starts do_execve(), calls check_unsafe_exec(), now -&gt;fs is not
	shared, we set fs-&gt;in_exec.

	T1 continues, open_exec(BAD_FILE) fails, we clear -&gt;in_exec and
	return to the user-space.

	T1 does clone(CLONE_FS /* without CLONE_THREAD */).

	T2 continues without LSM_UNSAFE_SHARE while -&gt;fs is shared with
	another process.

Change check_unsafe_exec() to return res = 1 if we set -&gt;in_exec, and change
do_execve() to clear -&gt;in_exec depending on res.

When do_execve() suceeds, it is safe to clear -&gt;in_exec unconditionally.
It can be set only if we don't share -&gt;fs with another process, and since
we already killed all sub-threads either -&gt;in_exec == 0 or we are the
only user of this -&gt;fs.

Also, we do not need fs-&gt;lock to clear fs-&gt;in_exec.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Roland McGrath &lt;roland@redhat.com&gt;
Acked-by: Hugh Dickins &lt;hugh@veritas.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
