linux-stable.git/fs/proc, branch linux-2.6.30.y

/proc/kcore: work around a BUG()

2009-10-05T15:28:05+00:00

Not upstream due to other fixes in .32


Works around a BUG() which is triggered when the kernel accesses holes in
vmalloc regions.

BUG: unable to handle kernel paging request at fa54c000
IP: [] read_kcore+0x260/0x31a
*pde = 3540b067 *pte = 00000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:03:00.0/ieee80211/phy0/rfkill0/state
Modules linked in: fuse sco bridge stp llc bnep l2cap bluetooth sunrpc nf_conntrack_ftp ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath uinput usb_storage arc4 ecb snd_hda_codec_realtek snd_hda_intel ath5k snd_hda_codec snd_hwdep iTCO_wdt snd_pcm iTCO_vendor_support pcspkr i2c_i801 mac80211 joydev snd_timer serio_raw r8169 snd soundcore mii snd_page_alloc ath cfg80211 ata_generic i915 drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]
Sep  4 12:45:16 tuxedu kernel: Pid: 2266, comm: cat Not tainted (2.6.31-rc8 #2) Joybook Lite U101
EIP: 0060:[] EFLAGS: 00010286 CPU: 0
EIP is at read_kcore+0x260/0x31a
EAX: f5e5ea00 EBX: fa54d000 ECX: 00000400 EDX: 00001000
ESI: fa54c000 EDI: f44ad000 EBP: e4533f4c ESP: e4533f24
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 2266, ti=e4532000 task=f09d19a0 task.ti=e4532000)
Stack:
00005000 00000000 f44ad000 09d9c000 00003000 fa54c000 00001000 f6d16f60
 e4520b80 fffffffb e4533f70 c04ef8eb e4533f98 00008000 09d97000 c04f661a
 e4520b80 09d97000 c04ef88c e4533f8c c04ba531 e4533f98 c04c0930 e4520b80
Call Trace:
[] ? proc_reg_read+0x5f/0x73
[] ? read_kcore+0x0/0x31a
[] ? proc_reg_read+0x0/0x73
[] ? vfs_read+0x82/0xe1
[] ? path_put+0x1a/0x1d
[] ? sys_read+0x40/0x62
[] ? sysenter_do_call+0x12/0x2d
Code: 39 f3 89 ca 0f 43 f3 89 fb 29 f2 29 f3 39 cf 0f 46 d3 29 55 dc 8d 1c 32 f6 40 0c 01 75 18 89 d1 89 f7 c1 e9 02 2b 7d ec 03 7d e0  a5 89 d1 83 e1 03 74 02 f3 a4 8b 00 83 7d dc 00 74 04 85 c0
EIP: [] read_kcore+0x260/0x31a SS:ESP 0068:e4533f24
CR2: 00000000fa54c000


To access vmalloc area which may have memory holes, copy_from_user is
useful.  So this:

 # cat /proc/kcore > /dev/null

will not panic.

This is a minimal fix, suitable for 2.6.30.x and 2.6.31.  More extensive
/proc/kcore changes are planned for 2.6.32.

Signed-off-by: KAMEZAWA Hiroyuki 
Tested-by: Nick Craig-Wood 
Cc: Pekka Enberg 
Reported-by: 
Signed-off-by: Andrew Morton 
Signed-off-by: Greg Kroah-Hartman

Fix idle time field in /proc/uptime

2009-10-05T15:28:03+00:00

commit 96830a57de1197519b62af6a4c9ceea556c18c3d upstream.

Git commit 79741dd changes idle cputime accounting, but unfortunately
the /proc/uptime file hasn't caught up.  Here the idle time calculation
from /proc/stat is copied over.

Signed-off-by: Michael Abbott 
Signed-off-by: Martin Schwidefsky 
Signed-off-by: Greg Kroah-Hartman

mm_for_maps: take ->cred_guard_mutex to fix the race with exec

2009-08-16T21:19:14+00:00

commit 704b836cbf19e885f8366bccb2e4b0474346c02d upstream.

The problem is minor, but without ->cred_guard_mutex held we can race
with exec() and get the new ->mm but check old creds.

Now we do not need to re-check task->mm after ptrace_may_access(), it
can't be changed to the new mm under us.

Strictly speaking, this also fixes another very minor problem. Unless
security check fails or the task exits mm_for_maps() should never
return NULL, the caller should get either old or new ->mm.

Signed-off-by: Oleg Nesterov 
Acked-by: Serge Hallyn 
Signed-off-by: James Morris 
Signed-off-by: Greg Kroah-Hartman

mm_for_maps: shift down_read(mmap_sem) to the caller

2009-08-16T21:19:13+00:00

commit 00f89d218523b9bf6b522349c039d5ac80aa536d upstream.

mm_for_maps() takes ->mmap_sem after security checks, this looks
strange and obfuscates the locking rules. Move this lock to its
single caller, m_start().

Signed-off-by: Oleg Nesterov 
Acked-by: Serge Hallyn 
Signed-off-by: James Morris 
Signed-off-by: Greg Kroah-Hartman

mm_for_maps: simplify, use ptrace_may_access()

2009-08-16T21:19:08+00:00

commit 13f0feafa6b8aead57a2a328e2fca6a5828bf286 upstream.

It would be nice to kill __ptrace_may_access(). It requires task_lock(),
but this lock is only needed to read mm->flags in the middle.

Convert mm_for_maps() to use ptrace_may_access(), this also simplifies
the code a little bit.

Also, we do not need to take ->mmap_sem in advance. In fact I think
mm_for_maps() should not play with ->mmap_sem at all, the caller should
take this lock.

With or without this patch, without ->cred_guard_mutex held we can race
with exec() and get the new ->mm but check old creds.

Signed-off-by: Oleg Nesterov 
Reviewed-by: Serge Hallyn 
Signed-off-by: James Morris 
Signed-off-by: Greg Kroah-Hartman

procfs: make errno values consistent when open pident vs exit(2) race occurs

2009-05-29T15:40:02+00:00

proc_pident_instantiate() has following call flow.

proc_pident_lookup()
  proc_pident_instantiate()
    proc_pid_make_inode()

And, proc_pident_lookup() has following error handling.

	const struct pid_entry *p, *last;
	error = ERR_PTR(-ENOENT);
	if (!task)
		goto out_no_task;

Then, proc_pident_instantiate should return ENOENT too when racing against
exit(2) occur.

EINAL has two bad reason.
  - it implies caller is wrong. bad the race isn't caller's mistake.
  - man 2 open don't explain EINVAL. user often don't handle it.

Note: Other proc_pid_make_inode() caller already use ENOENT properly.

Acked-by: Eric W. Biederman 
Cc: Alexey Dobriyan 
Signed-off-by: KOSAKI Motohiro 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

Convert obvious places to deactivate_locked_super()

2009-05-09T14:49:40+00:00

Signed-off-by: Al Viro

proc: avoid information leaks to non-privileged processes

2009-05-04T22:14:23+00:00

By using the same test as is used for /proc/pid/maps and /proc/pid/smaps,
only allow processes that can ptrace() a given process to see information
that might be used to bypass address space layout randomization (ASLR).
These include eip, esp, wchan, and start_stack in /proc/pid/stat as well
as the non-symbolic output from /proc/pid/wchan.

ASLR can be bypassed by sampling eip as shown by the proof-of-concept
code at http://code.google.com/p/fuzzyaslr/ As part of a presentation
(http://www.cr0.org/paper/to-jt-linux-alsr-leak.pdf) esp and wchan were
also noted as possibly usable information leaks as well.  The
start_stack address also leaks potentially useful information.

Cc: Stable Team 
Signed-off-by: Jake Edge 
Acked-by: Arjan van de Ven 
Acked-by: "Eric W. Biederman" 
Signed-off-by: Linus Torvalds

mm: fix Committed_AS underflow on large NR_CPUS environment

2009-05-02T22:36:10+00:00

The Committed_AS field can underflow in certain situations:

>         # while true; do cat /proc/meminfo  | grep _AS; sleep 1; done | uniq -c
>               1 Committed_AS: 18446744073709323392 kB
>              11 Committed_AS: 18446744073709455488 kB
>               6 Committed_AS:    35136 kB
>               5 Committed_AS: 18446744073709454400 kB
>               7 Committed_AS:    35904 kB
>               3 Committed_AS: 18446744073709453248 kB
>               2 Committed_AS:    34752 kB
>               9 Committed_AS: 18446744073709453248 kB
>               8 Committed_AS:    34752 kB
>               3 Committed_AS: 18446744073709320960 kB
>               7 Committed_AS: 18446744073709454080 kB
>               3 Committed_AS: 18446744073709320960 kB
>               5 Committed_AS: 18446744073709454080 kB
>               6 Committed_AS: 18446744073709320960 kB

Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does
not check for underflow.

But NR_CPUS proportional isn't good calculation.  In general,
possibility of lock contention is proportional to the number of online
cpus, not theorical maximum cpus (NR_CPUS).

The current kernel has generic percpu-counter stuff.  using it is right
way.  it makes code simplify and percpu_counter_read_positive() don't
make underflow issue.

Reported-by: Dave Hansen 
Signed-off-by: KOSAKI Motohiro 
Cc: Eric B Munson 
Cc: Mel Gorman 
Cc: Christoph Lameter 
Cc: 		[All kernel versions]
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

pagemap: require aligned-length, non-null reads of /proc/pid/pagemap

2009-05-02T22:36:09+00:00

The intention of commit aae8679b0ebcaa92f99c1c3cb0cd651594a43915
("pagemap: fix bug in add_to_pagemap, require aligned-length reads of
/proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a
multiple of 8 bytes, but now it allows to read 0 bytes, which actually
puts some data to user's buffer.  According to POSIX, if count is zero,
read() should return zero and has no other results.

Signed-off-by: Vitaly Mayatskikh 
Cc: Thomas Tuttle 
Acked-by: Matt Mackall 
Cc: Alexey Dobriyan 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds