<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs/proc, branch v3.16.40</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>sysctl: Drop reference added by grab_header in proc_sys_readdir</title>
<updated>2017-02-23T03:54:47+00:00</updated>
<author>
<name>Zhou Chengming</name>
<email>zhouchengming1@huawei.com</email>
</author>
<published>2017-01-06T01:32:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b66ea3bca021aea839c526d7643df085c5dadbc'/>
<id>0b66ea3bca021aea839c526d7643df085c5dadbc</id>
<content type='text'>
commit 93362fa47fe98b62e4a34ab408c4a418432e7939 upstream.

Fixes CVE-2016-9191, proc_sys_readdir doesn't drop reference
added by grab_header when return from !dir_emit_dots path.
It can cause any path called unregister_sysctl_table will
wait forever.

The calltrace of CVE-2016-9191:

[ 5535.960522] Call Trace:
[ 5535.963265]  [&lt;ffffffff817cdaaf&gt;] schedule+0x3f/0xa0
[ 5535.968817]  [&lt;ffffffff817d33fb&gt;] schedule_timeout+0x3db/0x6f0
[ 5535.975346]  [&lt;ffffffff817cf055&gt;] ? wait_for_completion+0x45/0x130
[ 5535.982256]  [&lt;ffffffff817cf0d3&gt;] wait_for_completion+0xc3/0x130
[ 5535.988972]  [&lt;ffffffff810d1fd0&gt;] ? wake_up_q+0x80/0x80
[ 5535.994804]  [&lt;ffffffff8130de64&gt;] drop_sysctl_table+0xc4/0xe0
[ 5536.001227]  [&lt;ffffffff8130de17&gt;] drop_sysctl_table+0x77/0xe0
[ 5536.007648]  [&lt;ffffffff8130decd&gt;] unregister_sysctl_table+0x4d/0xa0
[ 5536.014654]  [&lt;ffffffff8130deff&gt;] unregister_sysctl_table+0x7f/0xa0
[ 5536.021657]  [&lt;ffffffff810f57f5&gt;] unregister_sched_domain_sysctl+0x15/0x40
[ 5536.029344]  [&lt;ffffffff810d7704&gt;] partition_sched_domains+0x44/0x450
[ 5536.036447]  [&lt;ffffffff817d0761&gt;] ? __mutex_unlock_slowpath+0x111/0x1f0
[ 5536.043844]  [&lt;ffffffff81167684&gt;] rebuild_sched_domains_locked+0x64/0xb0
[ 5536.051336]  [&lt;ffffffff8116789d&gt;] update_flag+0x11d/0x210
[ 5536.057373]  [&lt;ffffffff817cf61f&gt;] ? mutex_lock_nested+0x2df/0x450
[ 5536.064186]  [&lt;ffffffff81167acb&gt;] ? cpuset_css_offline+0x1b/0x60
[ 5536.070899]  [&lt;ffffffff810fce3d&gt;] ? trace_hardirqs_on+0xd/0x10
[ 5536.077420]  [&lt;ffffffff817cf61f&gt;] ? mutex_lock_nested+0x2df/0x450
[ 5536.084234]  [&lt;ffffffff8115a9f5&gt;] ? css_killed_work_fn+0x25/0x220
[ 5536.091049]  [&lt;ffffffff81167ae5&gt;] cpuset_css_offline+0x35/0x60
[ 5536.097571]  [&lt;ffffffff8115aa2c&gt;] css_killed_work_fn+0x5c/0x220
[ 5536.104207]  [&lt;ffffffff810bc83f&gt;] process_one_work+0x1df/0x710
[ 5536.110736]  [&lt;ffffffff810bc7c0&gt;] ? process_one_work+0x160/0x710
[ 5536.117461]  [&lt;ffffffff810bce9b&gt;] worker_thread+0x12b/0x4a0
[ 5536.123697]  [&lt;ffffffff810bcd70&gt;] ? process_one_work+0x710/0x710
[ 5536.130426]  [&lt;ffffffff810c3f7e&gt;] kthread+0xfe/0x120
[ 5536.135991]  [&lt;ffffffff817d4baf&gt;] ret_from_fork+0x1f/0x40
[ 5536.142041]  [&lt;ffffffff810c3e80&gt;] ? kthread_create_on_node+0x230/0x230

One cgroup maintainer mentioned that "cgroup is trying to offline
a cpuset css, which takes place under cgroup_mutex.  The offlining
ends up trying to drain active usages of a sysctl table which apprently
is not happening."
The real reason is that proc_sys_readdir doesn't drop reference added
by grab_header when return from !dir_emit_dots path. So this cpuset
offline path will wait here forever.

See here for details: http://www.openwall.com/lists/oss-security/2016/11/04/13

Fixes: f0c3b5093add ("[readdir] convert procfs")
Reported-by: CAI Qian &lt;caiqian@redhat.com&gt;
Tested-by: Yang Shukui &lt;yangshukui@huawei.com&gt;
Signed-off-by: Zhou Chengming &lt;zhouchengming1@huawei.com&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 93362fa47fe98b62e4a34ab408c4a418432e7939 upstream.

Fixes CVE-2016-9191, proc_sys_readdir doesn't drop reference
added by grab_header when return from !dir_emit_dots path.
It can cause any path called unregister_sysctl_table will
wait forever.

The calltrace of CVE-2016-9191:

[ 5535.960522] Call Trace:
[ 5535.963265]  [&lt;ffffffff817cdaaf&gt;] schedule+0x3f/0xa0
[ 5535.968817]  [&lt;ffffffff817d33fb&gt;] schedule_timeout+0x3db/0x6f0
[ 5535.975346]  [&lt;ffffffff817cf055&gt;] ? wait_for_completion+0x45/0x130
[ 5535.982256]  [&lt;ffffffff817cf0d3&gt;] wait_for_completion+0xc3/0x130
[ 5535.988972]  [&lt;ffffffff810d1fd0&gt;] ? wake_up_q+0x80/0x80
[ 5535.994804]  [&lt;ffffffff8130de64&gt;] drop_sysctl_table+0xc4/0xe0
[ 5536.001227]  [&lt;ffffffff8130de17&gt;] drop_sysctl_table+0x77/0xe0
[ 5536.007648]  [&lt;ffffffff8130decd&gt;] unregister_sysctl_table+0x4d/0xa0
[ 5536.014654]  [&lt;ffffffff8130deff&gt;] unregister_sysctl_table+0x7f/0xa0
[ 5536.021657]  [&lt;ffffffff810f57f5&gt;] unregister_sched_domain_sysctl+0x15/0x40
[ 5536.029344]  [&lt;ffffffff810d7704&gt;] partition_sched_domains+0x44/0x450
[ 5536.036447]  [&lt;ffffffff817d0761&gt;] ? __mutex_unlock_slowpath+0x111/0x1f0
[ 5536.043844]  [&lt;ffffffff81167684&gt;] rebuild_sched_domains_locked+0x64/0xb0
[ 5536.051336]  [&lt;ffffffff8116789d&gt;] update_flag+0x11d/0x210
[ 5536.057373]  [&lt;ffffffff817cf61f&gt;] ? mutex_lock_nested+0x2df/0x450
[ 5536.064186]  [&lt;ffffffff81167acb&gt;] ? cpuset_css_offline+0x1b/0x60
[ 5536.070899]  [&lt;ffffffff810fce3d&gt;] ? trace_hardirqs_on+0xd/0x10
[ 5536.077420]  [&lt;ffffffff817cf61f&gt;] ? mutex_lock_nested+0x2df/0x450
[ 5536.084234]  [&lt;ffffffff8115a9f5&gt;] ? css_killed_work_fn+0x25/0x220
[ 5536.091049]  [&lt;ffffffff81167ae5&gt;] cpuset_css_offline+0x35/0x60
[ 5536.097571]  [&lt;ffffffff8115aa2c&gt;] css_killed_work_fn+0x5c/0x220
[ 5536.104207]  [&lt;ffffffff810bc83f&gt;] process_one_work+0x1df/0x710
[ 5536.110736]  [&lt;ffffffff810bc7c0&gt;] ? process_one_work+0x160/0x710
[ 5536.117461]  [&lt;ffffffff810bce9b&gt;] worker_thread+0x12b/0x4a0
[ 5536.123697]  [&lt;ffffffff810bcd70&gt;] ? process_one_work+0x710/0x710
[ 5536.130426]  [&lt;ffffffff810c3f7e&gt;] kthread+0xfe/0x120
[ 5536.135991]  [&lt;ffffffff817d4baf&gt;] ret_from_fork+0x1f/0x40
[ 5536.142041]  [&lt;ffffffff810c3e80&gt;] ? kthread_create_on_node+0x230/0x230

One cgroup maintainer mentioned that "cgroup is trying to offline
a cpuset css, which takes place under cgroup_mutex.  The offlining
ends up trying to drain active usages of a sysctl table which apprently
is not happening."
The real reason is that proc_sys_readdir doesn't drop reference added
by grab_header when return from !dir_emit_dots path. So this cpuset
offline path will wait here forever.

See here for details: http://www.openwall.com/lists/oss-security/2016/11/04/13

Fixes: f0c3b5093add ("[readdir] convert procfs")
Reported-by: CAI Qian &lt;caiqian@redhat.com&gt;
Tested-by: Yang Shukui &lt;yangshukui@huawei.com&gt;
Signed-off-by: Zhou Chengming &lt;zhouchengming1@huawei.com&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: Give dentry to inode_change_ok() instead of inode</title>
<updated>2017-02-23T03:53:52+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2016-05-26T14:55:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=50b070e8224f7bf86622ede1abee9fa3d3dc2f10'/>
<id>50b070e8224f7bf86622ede1abee9fa3d3dc2f10</id>
<content type='text'>
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.

inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
[bwh: Backported to 3.16:
 - Drop changes to orangefs, overlayfs
 - Adjust filenames, context
 - In nfsd, pass dentry to nfsd_sanitize_attrs()
 - Update ext3 as well]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.

inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
[bwh: Backported to 3.16:
 - Drop changes to orangefs, overlayfs
 - Adjust filenames, context
 - In nfsd, pass dentry to nfsd_sanitize_attrs()
 - Update ext3 as well]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "fs: Give dentry to inode_change_ok() instead of inode"</title>
<updated>2017-02-23T03:53:52+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2016-11-30T23:13:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1c608c2d1aefca2bf63497663e17cfb49e6b022c'/>
<id>1c608c2d1aefca2bf63497663e17cfb49e6b022c</id>
<content type='text'>
This reverts commit be9df699432235753c3824b0f5a27d46de7fdc9e, which was
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.  The backport
breaks fuse and makes a mess of xfs, which can be improved by picking
further upstream commits as I should have done in the first place.

Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit be9df699432235753c3824b0f5a27d46de7fdc9e, which was
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.  The backport
breaks fuse and makes a mess of xfs, which can be improved by picking
further upstream commits as I should have done in the first place.

Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs: Give dentry to inode_change_ok() instead of inode</title>
<updated>2016-11-20T01:17:38+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2016-05-26T14:55:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=be9df699432235753c3824b0f5a27d46de7fdc9e'/>
<id>be9df699432235753c3824b0f5a27d46de7fdc9e</id>
<content type='text'>
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.

inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
[bwh: Backported to 3.16:
 - Drop changes to orangefs, overlayfs
 - Adjust filenames, context
 - In fuse, pass dentry to fuse_do_setattr()
 - In nfsd, pass dentry to nfsd_sanitize_attrs()
 - In xfs, pass dentry to xfs_setattr_nonsize() and xfs_setattr_size()
 - Update ext3 as well]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 31051c85b5e2aaaf6315f74c72a732673632a905 upstream.

inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
[bwh: Backported to 3.16:
 - Drop changes to orangefs, overlayfs
 - Adjust filenames, context
 - In fuse, pass dentry to fuse_do_setattr()
 - In nfsd, pass dentry to nfsd_sanitize_attrs()
 - In xfs, pass dentry to xfs_setattr_nonsize() and xfs_setattr_size()
 - Update ext3 as well]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: prevent stacking filesystems on top</title>
<updated>2016-08-22T21:38:27+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2016-06-01T09:55:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a0b5c04dfca69e9728b1c454c6f9fde9f8f38613'/>
<id>a0b5c04dfca69e9728b1c454c6f9fde9f8f38613</id>
<content type='text'>
commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9 upstream.

This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem.  There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.

(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e54ad7f1ee263ffa5a2de9c609d58dfa27b21cd9 upstream.

This prevents stacking filesystems (ecryptfs and overlayfs) from using
procfs as lower filesystem.  There is too much magic going on inside
procfs, and there is no good reason to stack stuff on top of procfs.

(For example, procfs does access checks in VFS open handlers, and
ecryptfs by design calls open handlers from a kernel thread that doesn't
drop privileges or so.)

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: prevent accessing /proc/&lt;PID&gt;/environ until it's ready</title>
<updated>2016-06-15T20:29:31+00:00</updated>
<author>
<name>Mathias Krause</name>
<email>minipli@googlemail.com</email>
</author>
<published>2016-05-05T23:22:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eee69c92529e82e48e088f21137220c516c4b8ed'/>
<id>eee69c92529e82e48e088f21137220c516c4b8ed</id>
<content type='text'>
commit 8148a73c9901a8794a50f950083c00ccf97d43b3 upstream.

If /proc/&lt;PID&gt;/environ gets read before the envp[] array is fully set up
in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
read more bytes than are actually written, as env_start will already be
set but env_end will still be zero, making the range calculation
underflow, allowing to read beyond the end of what has been written.

Fix this as it is done for /proc/&lt;PID&gt;/cmdline by testing env_end for
zero.  It is, apparently, intentionally set last in create_*_tables().

This bug was found by the PaX size_overflow plugin that detected the
arithmetic underflow of 'this_len = env_end - (env_start + src)' when
env_end is still zero.

The expected consequence is that userland trying to access
/proc/&lt;PID&gt;/environ of a not yet fully set up process may get
inconsistent data as we're in the middle of copying in the environment
variables.

Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&amp;t=4363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause &lt;minipli@googlemail.com&gt;
Cc: Emese Revfy &lt;re.emese@gmail.com&gt;
Cc: Pax Team &lt;pageexec@freemail.hu&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8148a73c9901a8794a50f950083c00ccf97d43b3 upstream.

If /proc/&lt;PID&gt;/environ gets read before the envp[] array is fully set up
in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
read more bytes than are actually written, as env_start will already be
set but env_end will still be zero, making the range calculation
underflow, allowing to read beyond the end of what has been written.

Fix this as it is done for /proc/&lt;PID&gt;/cmdline by testing env_end for
zero.  It is, apparently, intentionally set last in create_*_tables().

This bug was found by the PaX size_overflow plugin that detected the
arithmetic underflow of 'this_len = env_end - (env_start + src)' when
env_end is still zero.

The expected consequence is that userland trying to access
/proc/&lt;PID&gt;/environ of a not yet fully set up process may get
inconsistent data as we're in the middle of copying in the environment
variables.

Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&amp;t=4363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause &lt;minipli@googlemail.com&gt;
Cc: Emese Revfy &lt;re.emese@gmail.com&gt;
Cc: Pax Team &lt;pageexec@freemail.hu&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Jarod Wilson &lt;jarod@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs/proc, core/debug: Don't expose absolute kernel addresses via wchan</title>
<updated>2015-12-13T17:49:51+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2015-09-30T13:59:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a9ba345c1aecd2b912762ad9d5c2432e3fbf1968'/>
<id>a9ba345c1aecd2b912762ad9d5c2432e3fbf1968</id>
<content type='text'>
commit b2f73922d119686323f14fbbe46587f863852328 upstream.

So the /proc/PID/stat 'wchan' field (the 30th field, which contains
the absolute kernel address of the kernel function a task is blocked in)
leaks absolute kernel addresses to unprivileged user-space:

        seq_put_decimal_ull(m, ' ', wchan);

The absolute address might also leak via /proc/PID/wchan as well, if
KALLSYMS is turned off or if the symbol lookup fails for some reason:

static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
                          struct pid *pid, struct task_struct *task)
{
        unsigned long wchan;
        char symname[KSYM_NAME_LEN];

        wchan = get_wchan(task);

        if (lookup_symbol_name(wchan, symname) &lt; 0) {
                if (!ptrace_may_access(task, PTRACE_MODE_READ))
                        return 0;
                seq_printf(m, "%lu", wchan);
        } else {
                seq_printf(m, "%s", symname);
        }

        return 0;
}

This isn't ideal, because for example it trivially leaks the KASLR offset
to any local attacker:

  fomalhaut:~&gt; printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35)
  ffffffff8123b380

Most real-life uses of wchan are symbolic:

  ps -eo pid:10,tid:10,wchan:30,comm

and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat:

  triton:~/tip&gt; strace -f ps -eo pid:10,tid:10,wchan:30,comm 2&gt;&amp;1 | grep wchan | tail -1
  open("/proc/30833/wchan", O_RDONLY)     = 6

There's one compatibility quirk here: procps relies on whether the
absolute value is non-zero - and we can provide that functionality
by outputing "0" or "1" depending on whether the task is blocked
(whether there's a wchan address).

These days there appears to be very little legitimate reason
user-space would be interested in  the absolute address. The
absolute address is mostly historic: from the days when we
didn't have kallsyms and user-space procps had to do the
decoding itself via the System.map.

So this patch sets all numeric output to "0" or "1" and keeps only
symbolic output, in /proc/PID/wchan.

( The absolute sleep address can generally still be profiled via
  perf, by tasks with sufficient privileges. )

Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Cc: Andrey Ryabinin &lt;ryabinin.a.a@gmail.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Kostya Serebryany &lt;kcc@google.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: kasan-dev &lt;kasan-dev@googlegroups.com&gt;
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20150930135917.GA3285@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[ kamal: backport to 3.16-stable: proc_pid_wchan context ]
Signed-off-by: Kamal Mostafa &lt;kamal@canonical.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b2f73922d119686323f14fbbe46587f863852328 upstream.

So the /proc/PID/stat 'wchan' field (the 30th field, which contains
the absolute kernel address of the kernel function a task is blocked in)
leaks absolute kernel addresses to unprivileged user-space:

        seq_put_decimal_ull(m, ' ', wchan);

The absolute address might also leak via /proc/PID/wchan as well, if
KALLSYMS is turned off or if the symbol lookup fails for some reason:

static int proc_pid_wchan(struct seq_file *m, struct pid_namespace *ns,
                          struct pid *pid, struct task_struct *task)
{
        unsigned long wchan;
        char symname[KSYM_NAME_LEN];

        wchan = get_wchan(task);

        if (lookup_symbol_name(wchan, symname) &lt; 0) {
                if (!ptrace_may_access(task, PTRACE_MODE_READ))
                        return 0;
                seq_printf(m, "%lu", wchan);
        } else {
                seq_printf(m, "%s", symname);
        }

        return 0;
}

This isn't ideal, because for example it trivially leaks the KASLR offset
to any local attacker:

  fomalhaut:~&gt; printf "%016lx\n" $(cat /proc/$$/stat | cut -d' ' -f35)
  ffffffff8123b380

Most real-life uses of wchan are symbolic:

  ps -eo pid:10,tid:10,wchan:30,comm

and procps uses /proc/PID/wchan, not the absolute address in /proc/PID/stat:

  triton:~/tip&gt; strace -f ps -eo pid:10,tid:10,wchan:30,comm 2&gt;&amp;1 | grep wchan | tail -1
  open("/proc/30833/wchan", O_RDONLY)     = 6

There's one compatibility quirk here: procps relies on whether the
absolute value is non-zero - and we can provide that functionality
by outputing "0" or "1" depending on whether the task is blocked
(whether there's a wchan address).

These days there appears to be very little legitimate reason
user-space would be interested in  the absolute address. The
absolute address is mostly historic: from the days when we
didn't have kallsyms and user-space procps had to do the
decoding itself via the System.map.

So this patch sets all numeric output to "0" or "1" and keeps only
symbolic output, in /proc/PID/wchan.

( The absolute sleep address can generally still be profiled via
  perf, by tasks with sufficient privileges. )

Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Cc: Andrey Ryabinin &lt;ryabinin.a.a@gmail.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Kostya Serebryany &lt;kcc@google.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: kasan-dev &lt;kasan-dev@googlegroups.com&gt;
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20150930135917.GA3285@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[ kamal: backport to 3.16-stable: proc_pid_wchan context ]
Signed-off-by: Kamal Mostafa &lt;kamal@canonical.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: actually make proc_fd_permission() thread-friendly</title>
<updated>2015-12-13T17:49:12+00:00</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2015-11-07T00:30:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=720ae734627d2fbf8d36c27c04602c42e718f5ac'/>
<id>720ae734627d2fbf8d36c27c04602c42e718f5ac</id>
<content type='text'>
commit 54708d2858e79a2bdda10bf8a20c80eb96c20613 upstream.

The commit 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
fixed the access to /proc/self/fd from sub-threads, but introduced another
problem: a sub-thread can't access /proc/&lt;tid&gt;/fd/ or /proc/thread-self/fd
if generic_permission() fails.

Change proc_fd_permission() to check same_thread_group(pid_task(), current).

Fixes: 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
Reported-by: "Jin, Yihua" &lt;yihua.jin@intel.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 54708d2858e79a2bdda10bf8a20c80eb96c20613 upstream.

The commit 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
fixed the access to /proc/self/fd from sub-threads, but introduced another
problem: a sub-thread can't access /proc/&lt;tid&gt;/fd/ or /proc/thread-self/fd
if generic_permission() fails.

Change proc_fd_permission() to check same_thread_group(pid_task(), current).

Fixes: 96d0df79f264 ("proc: make proc_fd_permission() thread-friendly")
Reported-by: "Jin, Yihua" &lt;yihua.jin@intel.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: Allow creating permanently empty directories that serve as mount points</title>
<updated>2015-07-15T09:01:06+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-05-11T21:44:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ece47480151b7310a05556aa24ea394212704dfe'/>
<id>ece47480151b7310a05556aa24ea394212704dfe</id>
<content type='text'>
commit eb6d38d5427b3ad42f5268da0f1dd31bb0af1264 upstream.

Add a new function proc_create_mount_point that when used to creates a
directory that can not be added to.

Add a new function is_empty_pde to test if a function is a mount
point.

Update the code to use make_empty_dir_inode when reporting
a permanently empty directory to the vfs.

Update the code to not allow adding to permanently empty directories.

Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit eb6d38d5427b3ad42f5268da0f1dd31bb0af1264 upstream.

Add a new function proc_create_mount_point that when used to creates a
directory that can not be added to.

Add a new function is_empty_pde to test if a function is a mount
point.

Update the code to use make_empty_dir_inode when reporting
a permanently empty directory to the vfs.

Update the code to not allow adding to permanently empty directories.

Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sysctl: Allow creating permanently empty directories that serve as mountpoints.</title>
<updated>2015-07-15T09:01:06+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-05-10T03:09:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=452452beb88f93d280c6b6540fd434f98930eab6'/>
<id>452452beb88f93d280c6b6540fd434f98930eab6</id>
<content type='text'>
commit f9bd6733d3f11e24f3949becf277507d422ee1eb upstream.

Add a magic sysctl table sysctl_mount_point that when used to
create a directory forces that directory to be permanently empty.

Update the code to use make_empty_dir_inode when accessing permanently
empty directories.

Update the code to not allow adding to permanently empty directories.

Update /proc/sys/fs/binfmt_misc to be a permanently empty directory.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f9bd6733d3f11e24f3949becf277507d422ee1eb upstream.

Add a magic sysctl table sysctl_mount_point that when used to
create a directory forces that directory to be permanently empty.

Update the code to use make_empty_dir_inode when accessing permanently
empty directories.

Update the code to not allow adding to permanently empty directories.

Update /proc/sys/fs/binfmt_misc to be a permanently empty directory.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
