<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/proc/generic.c, branch v2.6.26</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>netns: assign PDE-&gt;data before gluing entry into /proc tree</title>
<updated>2008-05-02T11:12:41+00:00</updated>
<author>
<name>Denis V. Lunev</name>
<email>den@openvz.org</email>
</author>
<published>2008-05-02T11:12:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=78e92b99ec4eb73755abd4e357b0b211eadafd88'/>
<id>78e92b99ec4eb73755abd4e357b0b211eadafd88</id>
<content type='text'>
In this unfortunate case, proc_mkdir_mode wrapper can't be used anymore and
this is no way to reuse proc_create_data due to nlinks assignment. So,
copy the code from proc_mkdir and assign PDE-&gt;data at the appropriate
moment.

Signed-off-by: Denis V. Lunev &lt;den@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In this unfortunate case, proc_mkdir_mode wrapper can't be used anymore and
this is no way to reuse proc_create_data due to nlinks assignment. So,
copy the code from proc_mkdir and assign PDE-&gt;data at the appropriate
moment.

Signed-off-by: Denis V. Lunev &lt;den@openvz.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: introduce proc_create_data to setup de-&gt;data</title>
<updated>2008-04-29T15:06:20+00:00</updated>
<author>
<name>Denis V. Lunev</name>
<email>den@openvz.org</email>
</author>
<published>2008-04-29T08:02:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=59b7435149eab2dd06dd678742faff6049cb655f'/>
<id>59b7435149eab2dd06dd678742faff6049cb655f</id>
<content type='text'>
This set of patches fixes an proc -&gt;open'less usage due to -&gt;proc_fops flip in
the most part of the kernel code.  The original OOPS is described in the
commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc:

    Typical PDE creation code looks like:

    	pde = create_proc_entry("foo", 0, NULL);
    	if (pde)
    		pde-&gt;proc_fops = &amp;foo_proc_fops;

    Notice that PDE is first created, only then -&gt;proc_fops is set up to
    final value. This is a problem because right after creation
    a) PDE is fully visible in /proc , and
    b) -&gt;proc_fops are proc_file_operations which do not have -&gt;open callback. So, it's
       possible to -&gt;read without -&gt;open (see one class of oopses below).

    The fix is new API called proc_create() which makes sure -&gt;proc_fops are
    set up before gluing PDE to main tree. Typical new code looks like:

    	pde = proc_create("foo", 0, NULL, &amp;foo_proc_fops);
    	if (!pde)
    		return -ENOMEM;

    Fix most networking users for a start.

    In the long run, create_proc_entry() for regular files will go.

In addition to this, proc_create_data is introduced to fix reading from
proc without PDE-&gt;data. The race is basically the same as above.

create_proc_entries is replaced in the entire kernel code as new method
is also simply better.

This patch:

The problem is the same as for de-&gt;proc_fops.  Right now PDE becomes visible
without data set.  So, the entry could be looked up without data.  This, in
most cases, will simply OOPS.

proc_create_data call is created to address this issue.  proc_create now
becomes a wrapper around it.

Signed-off-by: Denis V. Lunev &lt;den@openvz.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Alessandro Zummo &lt;a.zummo@towertech.it&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;bzolnier@gmail.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Bjorn Helgaas &lt;bjorn.helgaas@hp.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Dmitry Torokhov &lt;dtor@mail.ru&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Grant Grundler &lt;grundler@parisc-linux.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Cc: Haavard Skinnemoen &lt;hskinnemoen@atmel.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Jaroslav Kysela &lt;perex@suse.cz&gt;
Cc: Jeff Garzik &lt;jgarzik@pobox.com&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Jesper Nilsson &lt;jesper.nilsson@axis.com&gt;
Cc: Karsten Keil &lt;kkeil@suse.de&gt;
Cc: Kyle McMartin &lt;kyle@parisc-linux.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: Matthew Wilcox &lt;matthew@wil.cx&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@infradead.org&gt;
Cc: Mikael Starvik &lt;starvik@axis.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Osterlund &lt;petero2@telia.com&gt;
Cc: Pierre Peiffer &lt;peifferp@gmail.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Takashi Iwai &lt;tiwai@suse.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This set of patches fixes an proc -&gt;open'less usage due to -&gt;proc_fops flip in
the most part of the kernel code.  The original OOPS is described in the
commit 2d3a4e3666325a9709cc8ea2e88151394e8f20fc:

    Typical PDE creation code looks like:

    	pde = create_proc_entry("foo", 0, NULL);
    	if (pde)
    		pde-&gt;proc_fops = &amp;foo_proc_fops;

    Notice that PDE is first created, only then -&gt;proc_fops is set up to
    final value. This is a problem because right after creation
    a) PDE is fully visible in /proc , and
    b) -&gt;proc_fops are proc_file_operations which do not have -&gt;open callback. So, it's
       possible to -&gt;read without -&gt;open (see one class of oopses below).

    The fix is new API called proc_create() which makes sure -&gt;proc_fops are
    set up before gluing PDE to main tree. Typical new code looks like:

    	pde = proc_create("foo", 0, NULL, &amp;foo_proc_fops);
    	if (!pde)
    		return -ENOMEM;

    Fix most networking users for a start.

    In the long run, create_proc_entry() for regular files will go.

In addition to this, proc_create_data is introduced to fix reading from
proc without PDE-&gt;data. The race is basically the same as above.

create_proc_entries is replaced in the entire kernel code as new method
is also simply better.

This patch:

The problem is the same as for de-&gt;proc_fops.  Right now PDE becomes visible
without data set.  So, the entry could be looked up without data.  This, in
most cases, will simply OOPS.

proc_create_data call is created to address this issue.  proc_create now
becomes a wrapper around it.

Signed-off-by: Denis V. Lunev &lt;den@openvz.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: "J. Bruce Fields" &lt;bfields@fieldses.org&gt;
Cc: Alessandro Zummo &lt;a.zummo@towertech.it&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Bartlomiej Zolnierkiewicz &lt;bzolnier@gmail.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Bjorn Helgaas &lt;bjorn.helgaas@hp.com&gt;
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Dmitry Torokhov &lt;dtor@mail.ru&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Grant Grundler &lt;grundler@parisc-linux.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Cc: Haavard Skinnemoen &lt;hskinnemoen@atmel.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Jaroslav Kysela &lt;perex@suse.cz&gt;
Cc: Jeff Garzik &lt;jgarzik@pobox.com&gt;
Cc: Jeff Mahoney &lt;jeffm@suse.com&gt;
Cc: Jesper Nilsson &lt;jesper.nilsson@axis.com&gt;
Cc: Karsten Keil &lt;kkeil@suse.de&gt;
Cc: Kyle McMartin &lt;kyle@parisc-linux.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@polymtl.ca&gt;
Cc: Matthew Wilcox &lt;matthew@wil.cx&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@infradead.org&gt;
Cc: Mikael Starvik &lt;starvik@axis.com&gt;
Cc: Nadia Derbey &lt;Nadia.Derbey@bull.net&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Peter Osterlund &lt;petero2@telia.com&gt;
Cc: Pierre Peiffer &lt;peifferp@gmail.com&gt;
Cc: Russell King &lt;rmk@arm.linux.org.uk&gt;
Cc: Takashi Iwai &lt;tiwai@suse.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: remove -&gt;get_info infrastructure</title>
<updated>2008-04-29T15:06:19+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@sw.ru</email>
</author>
<published>2008-04-29T08:01:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8731f14d37825b54ad0c4c309cba2bc8fdf13a86'/>
<id>8731f14d37825b54ad0c4c309cba2bc8fdf13a86</id>
<content type='text'>
Now that last dozen or so users of -&gt;get_info were removed, ditch it too.
Everyone sane shouldd have switched to seq_file interface long ago.

P.S.: Co-existing 3 interfaces (-&gt;get_info/-&gt;read_proc/-&gt;proc_fops) for proc
      is long-standing crap, BTW, thus
      a) put -&gt;read_proc/-&gt;write_proc/read_proc_entry() users on death row,
      b) new such users should be rejected,
      c) everyone is encouraged to convert his favourite -&gt;read_proc user or
         I'll do it, lazy bastards.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that last dozen or so users of -&gt;get_info were removed, ditch it too.
Everyone sane shouldd have switched to seq_file interface long ago.

P.S.: Co-existing 3 interfaces (-&gt;get_info/-&gt;read_proc/-&gt;proc_fops) for proc
      is long-standing crap, BTW, thus
      a) put -&gt;read_proc/-&gt;write_proc/read_proc_entry() users on death row,
      b) new such users should be rejected,
      c) everyone is encouraged to convert his favourite -&gt;read_proc user or
         I'll do it, lazy bastards.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: drop several "PDE valid/invalid" checks</title>
<updated>2008-04-29T15:06:18+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2008-04-29T08:01:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5e971dce0b2f6896e02372512df0d1fb0bfe2d55'/>
<id>5e971dce0b2f6896e02372512df0d1fb0bfe2d55</id>
<content type='text'>
proc-misc code is noticeably full of "if (de)" checks when PDE passed is
always valid.  Remove them.

Addition of such check in proc_lookup_de() is for failed lookup case.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
proc-misc code is noticeably full of "if (de)" checks when PDE passed is
always valid.  Remove them.

Addition of such check in proc_lookup_de() is for failed lookup case.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: less special case in xlate code</title>
<updated>2008-04-29T15:06:17+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2008-04-29T08:01:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7cee4e00e0f8aa7290266382ea903a5a1b92c9a1'/>
<id>7cee4e00e0f8aa7290266382ea903a5a1b92c9a1</id>
<content type='text'>
If valid "parent" is passed to proc_create/remove_proc_entry(), then name of
PDE should consist of only one path component, otherwise creation or or
removal will fail.  However, if NULL is passed as parent then create/remove
accept full path as a argument.  This is arbitrary restriction -- all
infrastructure is in place.

So, patch allows the following to succeed:

	create_proc_entry("foo/bar", 0, pde_baz);
	remove_proc_entry("baz/foo/bar", &amp;proc_root);

Also makes the following to behave identically:

	create_proc_entry("foo/bar", 0, NULL);
	create_proc_entry("foo/bar", 0, &amp;proc_root);

Discrepancy noticed by Den Lunev (IIRC).

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If valid "parent" is passed to proc_create/remove_proc_entry(), then name of
PDE should consist of only one path component, otherwise creation or or
removal will fail.  However, if NULL is passed as parent then create/remove
accept full path as a argument.  This is arbitrary restriction -- all
infrastructure is in place.

So, patch allows the following to succeed:

	create_proc_entry("foo/bar", 0, pde_baz);
	remove_proc_entry("baz/foo/bar", &amp;proc_root);

Also makes the following to behave identically:

	create_proc_entry("foo/bar", 0, NULL);
	create_proc_entry("foo/bar", 0, &amp;proc_root);

Discrepancy noticed by Den Lunev (IIRC).

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: simplify locking in remove_proc_entry()</title>
<updated>2008-04-29T15:06:17+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2008-04-29T08:01:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f649d6d32605c7573884613289fb3b9fbd4f99a1'/>
<id>f649d6d32605c7573884613289fb3b9fbd4f99a1</id>
<content type='text'>
proc_subdir_lock protects only modifying and walking through PDE lists, so
after we've found PDE to remove and actually removed it from lists, there is
no need to hold proc_subdir_lock for the rest of operation.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
proc_subdir_lock protects only modifying and walking through PDE lists, so
after we've found PDE to remove and actually removed it from lists, there is
no need to hold proc_subdir_lock for the rest of operation.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: print more information when removing non-empty directories</title>
<updated>2008-04-29T15:06:17+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@sw.ru</email>
</author>
<published>2008-04-29T08:01:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e93b4ea20adb20f1f1f07f10ba5d7dd739d2843e'/>
<id>e93b4ea20adb20f1f1f07f10ba5d7dd739d2843e</id>
<content type='text'>
This usually saves one recompile to insert similar printk like below. :)

Sample nastygram:

remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar'
------------[ cut here ]------------
WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200()
Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor
Pid: 3034, comm: rmmod Tainted: G   M     2.6.25-rc1 #5

Call Trace:
 [&lt;ffffffff80231974&gt;] warn_on_slowpath+0x64/0x90
 [&lt;ffffffff80232a6e&gt;] printk+0x4e/0x60
 [&lt;ffffffff802d6c8a&gt;] remove_proc_entry+0x18a/0x200
 [&lt;ffffffff8045cd88&gt;] mutex_lock_nested+0x1c8/0x2d0
 [&lt;ffffffff8025f0f0&gt;] __try_stop_module+0x0/0x40
 [&lt;ffffffff8025effd&gt;] sys_delete_module+0x14d/0x200
 [&lt;ffffffff8045df3d&gt;] lockdep_sys_exit_thunk+0x35/0x67
 [&lt;ffffffff8031c307&gt;] __up_read+0x27/0xa0
 [&lt;ffffffff8045decc&gt;] trace_hardirqs_on_thunk+0x35/0x3a
 [&lt;ffffffff8020b6ab&gt;] system_call_after_swapgs+0x7b/0x80

---[ end trace 10ef850597e89c54 ]---

Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This usually saves one recompile to insert similar printk like below. :)

Sample nastygram:

remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar'
------------[ cut here ]------------
WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200()
Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor
Pid: 3034, comm: rmmod Tainted: G   M     2.6.25-rc1 #5

Call Trace:
 [&lt;ffffffff80231974&gt;] warn_on_slowpath+0x64/0x90
 [&lt;ffffffff80232a6e&gt;] printk+0x4e/0x60
 [&lt;ffffffff802d6c8a&gt;] remove_proc_entry+0x18a/0x200
 [&lt;ffffffff8045cd88&gt;] mutex_lock_nested+0x1c8/0x2d0
 [&lt;ffffffff8025f0f0&gt;] __try_stop_module+0x0/0x40
 [&lt;ffffffff8025effd&gt;] sys_delete_module+0x14d/0x200
 [&lt;ffffffff8045df3d&gt;] lockdep_sys_exit_thunk+0x35/0x67
 [&lt;ffffffff8031c307&gt;] __up_read+0x27/0xa0
 [&lt;ffffffff8045decc&gt;] trace_hardirqs_on_thunk+0x35/0x3a
 [&lt;ffffffff8020b6ab&gt;] system_call_after_swapgs+0x7b/0x80

---[ end trace 10ef850597e89c54 ]---

Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Cc: Arjan van de Ven &lt;arjan@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[NET]: Make /proc/net a symlink on /proc/self/net (v3)</title>
<updated>2008-03-07T19:08:40+00:00</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2008-03-07T19:08:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e9720acd728a46cb40daa52c99a979f7c4ff195c'/>
<id>e9720acd728a46cb40daa52c99a979f7c4ff195c</id>
<content type='text'>
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chances to get fixed.

The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.

The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.

# ls -l /proc/net
lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -&gt; self/net

In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.

Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
  screwup pointed out by Stephen.

  To get the correct nlink count the -&gt;getattr callback for /proc/net
  is overridden to read one from the net-&gt;proc_net entry.

  To make selinux still work the net-&gt;proc_net entry is initialized
  properly, i.e. with the "net" name and the proc_net parent.

Selinux fixes are
Acked-by:  Stephen Smalley &lt;sds@tycho.nsa.gov&gt;

Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Current /proc/net is done with so called "shadows", but current
implementation is broken and has little chances to get fixed.

The problem is that dentries subtree of /proc/net directory has
fancy revalidation rules to make processes living in different
net namespaces see different entries in /proc/net subtree, but
currently, tasks see in the /proc/net subdir the contents of any
other namespace, depending on who opened the file first.

The proposed fix is to turn /proc/net into a symlink, which points
to /proc/self/net, which in turn shows what previously was in
/proc/net - the network-related info, from the net namespace the
appropriate task lives in.

# ls -l /proc/net
lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -&gt; self/net

In other words - this behaves like /proc/mounts, but unlike
"mounts", "net" is not a file, but a directory.

Changes from v2:
* Fixed discrepancy of /proc/net nlink count and selinux labeling
  screwup pointed out by Stephen.

  To get the correct nlink count the -&gt;getattr callback for /proc/net
  is overridden to read one from the net-&gt;proc_net entry.

  To make selinux still work the net-&gt;proc_net entry is initialized
  properly, i.e. with the "net" name and the proc_net parent.

Selinux fixes are
Acked-by:  Stephen Smalley &lt;sds@tycho.nsa.gov&gt;

Changes from v1:
* Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: fix -&gt;open'less usage due to -&gt;proc_fops flip</title>
<updated>2008-02-08T17:22:24+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@sw.ru</email>
</author>
<published>2008-02-08T12:18:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=2d3a4e3666325a9709cc8ea2e88151394e8f20fc'/>
<id>2d3a4e3666325a9709cc8ea2e88151394e8f20fc</id>
<content type='text'>
Typical PDE creation code looks like:

	pde = create_proc_entry("foo", 0, NULL);
	if (pde)
		pde-&gt;proc_fops = &amp;foo_proc_fops;

Notice that PDE is first created, only then -&gt;proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc , and
b) -&gt;proc_fops are proc_file_operations which do not have -&gt;open callback. So, it's
   possible to -&gt;read without -&gt;open (see one class of oopses below).

The fix is new API called proc_create() which makes sure -&gt;proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:

	pde = proc_create("foo", 0, NULL, &amp;foo_proc_fops);
	if (!pde)
		return -ENOMEM;

Fix most networking users for a start.

In the long run, create_proc_entry() for regular files will go.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom

Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[&lt;c1188c1b&gt;] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
       00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
       00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
 [&lt;c106f7ce&gt;] seq_read+0x24/0x28a
 [&lt;c106f7aa&gt;] seq_read+0x0/0x28a
 [&lt;c106f7ce&gt;] seq_read+0x24/0x28a
 [&lt;c106f7aa&gt;] seq_read+0x0/0x28a
 [&lt;c10818b8&gt;] proc_reg_read+0x60/0x73
 [&lt;c1081858&gt;] proc_reg_read+0x0/0x73
 [&lt;c105a34f&gt;] vfs_read+0x6c/0x8b
 [&lt;c105a6f3&gt;] sys_read+0x3c/0x63
 [&lt;c10025f2&gt;] sysenter_past_esp+0x5f/0xa5
 [&lt;c10697a7&gt;] destroy_inode+0x24/0x33
 =======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff &lt;f0&gt; fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [&lt;c1188c1b&gt;] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Typical PDE creation code looks like:

	pde = create_proc_entry("foo", 0, NULL);
	if (pde)
		pde-&gt;proc_fops = &amp;foo_proc_fops;

Notice that PDE is first created, only then -&gt;proc_fops is set up to
final value. This is a problem because right after creation
a) PDE is fully visible in /proc , and
b) -&gt;proc_fops are proc_file_operations which do not have -&gt;open callback. So, it's
   possible to -&gt;read without -&gt;open (see one class of oopses below).

The fix is new API called proc_create() which makes sure -&gt;proc_fops are
set up before gluing PDE to main tree. Typical new code looks like:

	pde = proc_create("foo", 0, NULL, &amp;foo_proc_fops);
	if (!pde)
		return -ENOMEM;

Fix most networking users for a start.

In the long run, create_proc_entry() for regular files will go.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000024
printing eip: c1188c1b *pdpt = 000000002929e001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/block/sda/sda1/dev
Modules linked in: foo af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom

Pid: 24679, comm: cat Not tainted (2.6.24-rc3-mm1 #2)
EIP: 0060:[&lt;c1188c1b&gt;] EFLAGS: 00210002 CPU: 0
EIP is at mutex_lock_nested+0x75/0x25d
EAX: 000006fe EBX: fffffffb ECX: 00001000 EDX: e9340570
ESI: 00000020 EDI: 00200246 EBP: e9340570 ESP: e8ea1ef8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 24679, ti=E8EA1000 task=E9340570 task.ti=E8EA1000)
Stack: 00000000 c106f7ce e8ee05b4 00000000 00000001 458003d0 f6fb6f20 fffffffb
       00000000 c106f7aa 00001000 c106f7ce 08ae9000 f6db53f0 00000020 00200246
       00000000 00000002 00000000 00200246 00200246 e8ee05a0 fffffffb e8ee0550
Call Trace:
 [&lt;c106f7ce&gt;] seq_read+0x24/0x28a
 [&lt;c106f7aa&gt;] seq_read+0x0/0x28a
 [&lt;c106f7ce&gt;] seq_read+0x24/0x28a
 [&lt;c106f7aa&gt;] seq_read+0x0/0x28a
 [&lt;c10818b8&gt;] proc_reg_read+0x60/0x73
 [&lt;c1081858&gt;] proc_reg_read+0x0/0x73
 [&lt;c105a34f&gt;] vfs_read+0x6c/0x8b
 [&lt;c105a6f3&gt;] sys_read+0x3c/0x63
 [&lt;c10025f2&gt;] sysenter_past_esp+0x5f/0xa5
 [&lt;c10697a7&gt;] destroy_inode+0x24/0x33
 =======================
INFO: lockdep is turned off.
Code: 75 21 68 e1 1a 19 c1 68 87 00 00 00 68 b8 e8 1f c1 68 25 73 1f c1 e8 84 06 e9 ff e8 52 b8 e7 ff 83 c4 10 9c 5f fa e8 28 89 ea ff &lt;f0&gt; fe 4e 04 79 0a f3 90 80 7e 04 00 7e f8 eb f0 39 76 34 74 33
EIP: [&lt;c1188c1b&gt;] mutex_lock_nested+0x75/0x25d SS:ESP 0068:e8ea1ef8

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>proc: detect duplicate names on registration</title>
<updated>2008-02-08T17:22:23+00:00</updated>
<author>
<name>Zhang Rui</name>
<email>rui.zhang@intel.com</email>
</author>
<published>2008-02-08T12:18:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=94413d8807a3c511a3675be4ce27a4d16d6408ee'/>
<id>94413d8807a3c511a3675be4ce27a4d16d6408ee</id>
<content type='text'>
Print a warning if PDE is registered with a name which already exists in
target directory.

Bug report and a simple fix can be found here:
http://bugzilla.kernel.org/show_bug.cgi?id=8798

[\n fixlet and no undescriptive variable usage --adobriyan]
[akpm@linux-foundation.org: make printk comprehensible]
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Print a warning if PDE is registered with a name which already exists in
target directory.

Bug report and a simple fix can be found here:
http://bugzilla.kernel.org/show_bug.cgi?id=8798

[\n fixlet and no undescriptive variable usage --adobriyan]
[akpm@linux-foundation.org: make printk comprehensible]
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
Signed-off-by: Alexey Dobriyan &lt;adobriyan@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
