<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/netlink, branch v3.11-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>netlink: fix splat in skb_clone with large messages</title>
<updated>2013-06-28T05:44:16+00:00</updated>
<author>
<name>Pablo Neira</name>
<email>pablo@netfilter.org</email>
</author>
<published>2013-06-28T01:04:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3a36515f729458c8efa0c124c7262d5843ad5c37'/>
<id>3a36515f729458c8efa0c124c7262d5843ad5c37</id>
<content type='text'>
Since (c05cdb1 netlink: allow large data transfers from user-space),
netlink splats if it invokes skb_clone on large netlink skbs since:

* skb_shared_info was not correctly initialized.
* skb-&gt;destructor is not set in the cloned skb.

This was spotted by trinity:

[  894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001
[  894.991034] IP: [&lt;ffffffff81a212c4&gt;] skb_clone+0x24/0xc0
[...]
[  894.991034] Call Trace:
[  894.991034]  [&lt;ffffffff81ad299a&gt;] nl_fib_input+0x6a/0x240
[  894.991034]  [&lt;ffffffff81c3b7e6&gt;] ? _raw_read_unlock+0x26/0x40
[  894.991034]  [&lt;ffffffff81a5f189&gt;] netlink_unicast+0x169/0x1e0
[  894.991034]  [&lt;ffffffff81a601e1&gt;] netlink_sendmsg+0x251/0x3d0

Fix it by:

1) introducing a new netlink_skb_clone function that is used in nl_fib_input,
   that sets our special skb-&gt;destructor in the cloned skb. Moreover, handle
   the release of the large cloned skb head area in the destructor path.

2) not allowing large skbuffs in the netlink broadcast path. I cannot find
   any reasonable use of the large data transfer using netlink in that path,
   moreover this helps to skip extra skb_clone handling.

I found two more netlink clients that are cloning the skbs, but they are
not in the sendmsg path. Therefore, the sole client cloning that I found
seems to be the fib frontend.

Thanks to Eric Dumazet for helping to address this issue.

Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since (c05cdb1 netlink: allow large data transfers from user-space),
netlink splats if it invokes skb_clone on large netlink skbs since:

* skb_shared_info was not correctly initialized.
* skb-&gt;destructor is not set in the cloned skb.

This was spotted by trinity:

[  894.990671] BUG: unable to handle kernel paging request at ffffc9000047b001
[  894.991034] IP: [&lt;ffffffff81a212c4&gt;] skb_clone+0x24/0xc0
[...]
[  894.991034] Call Trace:
[  894.991034]  [&lt;ffffffff81ad299a&gt;] nl_fib_input+0x6a/0x240
[  894.991034]  [&lt;ffffffff81c3b7e6&gt;] ? _raw_read_unlock+0x26/0x40
[  894.991034]  [&lt;ffffffff81a5f189&gt;] netlink_unicast+0x169/0x1e0
[  894.991034]  [&lt;ffffffff81a601e1&gt;] netlink_sendmsg+0x251/0x3d0

Fix it by:

1) introducing a new netlink_skb_clone function that is used in nl_fib_input,
   that sets our special skb-&gt;destructor in the cloned skb. Moreover, handle
   the release of the large cloned skb head area in the destructor path.

2) not allowing large skbuffs in the netlink broadcast path. I cannot find
   any reasonable use of the large data transfer using netlink in that path,
   moreover this helps to skip extra skb_clone handling.

I found two more netlink clients that are cloning the skbs, but they are
not in the sendmsg path. Therefore, the sole client cloning that I found
seems to be the fib frontend.

Thanks to Eric Dumazet for helping to address this issue.

Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: netlink: virtual tap device management</title>
<updated>2013-06-24T23:39:05+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2013-06-21T17:38:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bcbde0d449eda7afa8f63280b165c8300dbd00e2'/>
<id>bcbde0d449eda7afa8f63280b165c8300dbd00e2</id>
<content type='text'>
Similarly to the networking receive path with ptype_all taps, we add
the possibility to register netdevices that are for ARPHRD_NETLINK to
the netlink subsystem, so that those can be used for netlink analyzers
resp. debuggers. We do not offer a direct callback function as out-of-tree
modules could do crap with it. Instead, a netdevice must be registered
properly and only receives a clone, managed by the netlink layer. Symbols
are exported as GPL-only.

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similarly to the networking receive path with ptype_all taps, we add
the possibility to register netdevices that are for ARPHRD_NETLINK to
the netlink subsystem, so that those can be used for netlink analyzers
resp. debuggers. We do not offer a direct callback function as out-of-tree
modules could do crap with it. Instead, a netdevice must be registered
properly and only receives a clone, managed by the netlink layer. Symbols
are exported as GPL-only.

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2013-06-19T23:49:39+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2013-06-19T23:49:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d98cae64e4a733ff377184d78aa0b1f2b54faede'/>
<id>d98cae64e4a733ff377184d78aa0b1f2b54faede</id>
<content type='text'>
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers to not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/wireless/ath/ath9k/Kconfig
	drivers/net/xen-netback/netback.c
	net/batman-adv/bat_iv_ogm.c
	net/wireless/nl80211.c

The ath9k Kconfig conflict was a change of a Kconfig option name right
next to the deletion of another option.

The xen-netback conflict was overlapping changes involving the
handling of the notify list in xen_netbk_rx_action().

Batman conflict resolution provided by Antonio Quartulli, basically
keep everything in both conflict hunks.

The nl80211 conflict is a little more involved.  In 'net' we added a
dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
Linus reported.  Meanwhile in 'net-next' the handlers were converted
to use pre and post doit handlers which use a flag to determine
whether to hold the RTNL mutex around the operation.

However, the dump handlers to not use this logic.  Instead they have
to explicitly do the locking.  There were apparent bugs in the
conversion of nl80211_dump_wiphy() in that we were not dropping the
RTNL mutex in all the return paths, and it seems we very much should
be doing so.  So I fixed that whilst handling the overlapping changes.

To simplify the initial returns, I take the RTNL mutex after we try
to allocate 'tb'.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: make compare exist all the time</title>
<updated>2013-06-13T07:45:48+00:00</updated>
<author>
<name>Gao feng</name>
<email>gaofeng@cn.fujitsu.com</email>
</author>
<published>2013-06-13T02:05:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ca15febfe98f7c681ac345fc1d2ee1b8decaa493'/>
<id>ca15febfe98f7c681ac345fc1d2ee1b8decaa493</id>
<content type='text'>
Commit da12c90e099789a63073fc82a19542ce54d4efb9
"netlink: Add compare function for netlink_table"
only set compare at the time we create kernel netlink,
and reset compare to NULL at the time we finially
release netlink socket, but netlink_lookup wants
the compare exist always.

So we should set compare after we allocate nl_table,
and never reset it. make comapre exist all the time.

Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit da12c90e099789a63073fc82a19542ce54d4efb9
"netlink: Add compare function for netlink_table"
only set compare at the time we create kernel netlink,
and reset compare to NULL at the time we finially
release netlink socket, but netlink_lookup wants
the compare exist always.

So we should set compare after we allocate nl_table,
and never reset it. make comapre exist all the time.

Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: fix error propagation in netlink_mmap()</title>
<updated>2013-06-11T09:52:47+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2013-06-11T09:52:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7cdbac71f911494aa7d0343be23c092ca84a5ed4'/>
<id>7cdbac71f911494aa7d0343be23c092ca84a5ed4</id>
<content type='text'>
Return the error if something went wrong instead of unconditionally
returning 0.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Return the error if something went wrong instead of unconditionally
returning 0.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: Add compare function for netlink_table</title>
<updated>2013-06-11T09:39:42+00:00</updated>
<author>
<name>Gao feng</name>
<email>gaofeng@cn.fujitsu.com</email>
</author>
<published>2013-06-06T06:49:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=da12c90e099789a63073fc82a19542ce54d4efb9'/>
<id>da12c90e099789a63073fc82a19542ce54d4efb9</id>
<content type='text'>
As we know, netlink sockets are private resource of
net namespace, they can communicate with each other
only when they in the same net namespace. this works
well until we try to add namespace support for other
subsystems which use netlink.

Don't like ipv4 and route table.., it is not suited to
make these subsytems belong to net namespace, Such as
audit and crypto subsystems,they are more suitable to
user namespace.

So we must have the ability to make the netlink sockets
in same user namespace can communicate with each other.

This patch adds a new function pointer "compare" for
netlink_table, we can decide if the netlink sockets can
communicate with each other through this netlink_table
self-defined compare function.

The behavior isn't changed if we don't provide the compare
function for netlink_table.

Signed-off-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As we know, netlink sockets are private resource of
net namespace, they can communicate with each other
only when they in the same net namespace. this works
well until we try to add namespace support for other
subsystems which use netlink.

Don't like ipv4 and route table.., it is not suited to
make these subsytems belong to net namespace, Such as
audit and crypto subsystems,they are more suitable to
user namespace.

So we must have the ability to make the netlink sockets
in same user namespace can communicate with each other.

This patch adds a new function pointer "compare" for
netlink_table, we can decide if the netlink sockets can
communicate with each other through this netlink_table
self-defined compare function.

The behavior isn't changed if we don't provide the compare
function for netlink_table.

Signed-off-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: allow large data transfers from user-space</title>
<updated>2013-06-07T23:26:34+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2013-06-03T09:46:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c05cdb1b864f548c0c3d8ae3b51264e6739a69b1'/>
<id>c05cdb1b864f548c0c3d8ae3b51264e6739a69b1</id>
<content type='text'>
I can hit ENOBUFS in the sendmsg() path with a large batch that is
composed of many netlink messages. Here that limit is 8 MBytes of
skbuff data area as kmalloc does not manage to get more than that.

While discussing atomic rule-set for nftables with Patrick McHardy,
we decided to put all rule-set updates that need to be applied
atomically in one single batch to simplify the existing approach.
However, as explained above, the existing netlink code limits us
to a maximum of ~20000 rules that fit in one single batch without
hitting ENOBUFS. iptables does not have such limitation as it is
using vmalloc.

This patch adds netlink_alloc_large_skb() which is only used in
the netlink_sendmsg() path. It uses alloc_skb if the memory
requested is &lt;= one memory page, that should be the common case
for most subsystems, else vmalloc for higher memory allocations.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I can hit ENOBUFS in the sendmsg() path with a large batch that is
composed of many netlink messages. Here that limit is 8 MBytes of
skbuff data area as kmalloc does not manage to get more than that.

While discussing atomic rule-set for nftables with Patrick McHardy,
we decided to put all rule-set updates that need to be applied
atomically in one single batch to simplify the existing approach.
However, as explained above, the existing netlink code limits us
to a maximum of ~20000 rules that fit in one single batch without
hitting ENOBUFS. iptables does not have such limitation as it is
using vmalloc.

This patch adds netlink_alloc_large_skb() which is only used in
the netlink_sendmsg() path. It uses alloc_skb if the memory
requested is &lt;= one memory page, that should be the common case
for most subsystems, else vmalloc for higher memory allocations.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: fix sk_buff head without data area</title>
<updated>2013-06-05T00:26:49+00:00</updated>
<author>
<name>Pablo Neira</name>
<email>pablo@netfilter.org</email>
</author>
<published>2013-06-03T09:28:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5e71d9d77c07fa7d4c42287a177f7b738d0cd4b9'/>
<id>5e71d9d77c07fa7d4c42287a177f7b738d0cd4b9</id>
<content type='text'>
Eric Dumazet spotted that we have to check skb-&gt;head instead
of skb-&gt;data as skb-&gt;head points to the beginning of the
data area of the skbuff. Similarly, we have to initialize the
skb-&gt;head pointer, not skb-&gt;data in __alloc_skb_head.

After this fix, netlink crashes in the release path of the
sk_buff, so let's fix that as well.

This bug was introduced in (0ebd0ac net: add function to
allocate sk_buff head without data area).

Reported-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Eric Dumazet spotted that we have to check skb-&gt;head instead
of skb-&gt;data as skb-&gt;head points to the beginning of the
data area of the skbuff. Similarly, we have to initialize the
skb-&gt;head pointer, not skb-&gt;data in __alloc_skb_head.

After this fix, netlink crashes in the release path of the
sk_buff, so let's fix that as well.

This bug was introduced in (0ebd0ac net: add function to
allocate sk_buff head without data area).

Reported-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: kconfig: move mmap i/o into netlink kconfig</title>
<updated>2013-05-01T19:02:42+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>dborkman@redhat.com</email>
</author>
<published>2013-05-01T01:37:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee1bec9b3b49ea05e97ef40a3d38b70628cb947a'/>
<id>ee1bec9b3b49ea05e97ef40a3d38b70628cb947a</id>
<content type='text'>
Currently, in menuconfig, Netlink's new mmaped IO is the very first
entry under the ``Networking support'' item and comes even before
``Networking options'':

  [ ]   Netlink: mmaped IO
  Networking options  ---&gt;
  ...

Lets move this into ``Networking options'' under netlink's Kconfig,
since this might be more appropriate. Introduced by commit ccdfcc398
(``netlink: mmaped netlink: ring setup'').

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, in menuconfig, Netlink's new mmaped IO is the very first
entry under the ``Networking support'' item and comes even before
``Networking options'':

  [ ]   Netlink: mmaped IO
  Networking options  ---&gt;
  ...

Lets move this into ``Networking options'' under netlink's Kconfig,
since this might be more appropriate. Introduced by commit ccdfcc398
(``netlink: mmaped netlink: ring setup'').

Signed-off-by: Daniel Borkmann &lt;dborkman@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netlink: Fix skb ref counting.</title>
<updated>2013-05-01T18:57:03+00:00</updated>
<author>
<name>Pravin B Shelar</name>
<email>pshelar@nicira.com</email>
</author>
<published>2013-04-29T20:52:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ae6164adeb559db1828d4abd917971b61130f72e'/>
<id>ae6164adeb559db1828d4abd917971b61130f72e</id>
<content type='text'>
Commit f9c2288837ba072b21dba955f04a4c97eaa77b1e (netlink:
implement memory mapped recvmsg) increamented skb-&gt;users
ref count twice for a dump op which does not look right.

Following patch fixes that.

CC: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit f9c2288837ba072b21dba955f04a4c97eaa77b1e (netlink:
implement memory mapped recvmsg) increamented skb-&gt;users
ref count twice for a dump op which does not look right.

Following patch fixes that.

CC: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
