<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/net/netfilter/ipvs, branch v4.2-rc2</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>ipv6: Set FLOWI_FLAG_KNOWN_NH at flowi6_flags</title>
<updated>2015-05-25T17:25:34+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-05-23T03:56:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=48e8aa6e3137692d38f20e8bfff100e408c6bc53'/>
<id>48e8aa6e3137692d38f20e8bfff100e408c6bc53</id>
<content type='text'>
The neighbor look-up used to depend on the rt6i_gateway (if
there is a gateway) or the rt6i_dst (if it is a RTF_CACHE clone)
as the nexthop address.  Note that rt6i_dst is set to fl6-&gt;daddr
for the RTF_CACHE clone where fl6-&gt;daddr is the one used to do
the route look-up.

Now, we only create RTF_CACHE clone after encountering exception.
When doing the neighbor look-up with a route that is neither a gateway
nor a RTF_CACHE clone, the daddr in skb will be used as the nexthop.

In some cases, the daddr in skb is not the one used to do
the route look-up.  One example is in ip_vs_dr_xmit_v6() where the
real nexthop server address is different from the one in the skb.

This patch is going to follow the IPv4 approach and ask the
ip6_pol_route() callers to set the FLOWI_FLAG_KNOWN_NH properly.

In the next patch, ip6_pol_route() will honor the FLOWI_FLAG_KNOWN_NH
and create a RTF_CACHE clone.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Tested-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The neighbor look-up used to depend on the rt6i_gateway (if
there is a gateway) or the rt6i_dst (if it is a RTF_CACHE clone)
as the nexthop address.  Note that rt6i_dst is set to fl6-&gt;daddr
for the RTF_CACHE clone where fl6-&gt;daddr is the one used to do
the route look-up.

Now, we only create RTF_CACHE clone after encountering exception.
When doing the neighbor look-up with a route that is neither a gateway
nor a RTF_CACHE clone, the daddr in skb will be used as the nexthop.

In some cases, the daddr in skb is not the one used to do
the route look-up.  One example is in ip_vs_dr_xmit_v6() where the
real nexthop server address is different from the one in the skb.

This patch is going to follow the IPv4 approach and ask the
ip6_pol_route() callers to set the FLOWI_FLAG_KNOWN_NH properly.

In the next patch, ip6_pol_route() will honor the FLOWI_FLAG_KNOWN_NH
and create a RTF_CACHE clone.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Tested-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Add rt6_get_cookie() function</title>
<updated>2015-05-25T17:25:34+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-05-23T03:56:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b197df4f0f3782782e9ea8996e91b65ae33e8dd9'/>
<id>b197df4f0f3782782e9ea8996e91b65ae33e8dd9</id>
<content type='text'>
Instead of doing the rt6-&gt;rt6i_node check whenever we need
to get the route's cookie.  Refactor it into rt6_get_cookie().
It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also
percpu rt6_info later.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of doing the rt6-&gt;rt6i_node check whenever we need
to get the route's cookie.  Refactor it into rt6_get_cookie().
It is a prep work to handle FLOWI_FLAG_KNOWN_NH and also
percpu rt6_info later.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv6: Remove external dependency on rt6i_dst and rt6i_src</title>
<updated>2015-05-25T17:25:32+00:00</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2015-05-23T03:55:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fd0273d7939f2ce3247f6aac5f6b9a0135d4cd39'/>
<id>fd0273d7939f2ce3247f6aac5f6b9a0135d4cd39</id>
<content type='text'>
This patch removes the assumptions that the returned rt is always
a RTF_CACHE entry with the rt6i_dst and rt6i_src containing the
destination and source address.  The dst and src can be recovered from
the calling site.

We may consider to rename (rt6i_dst, rt6i_src) to
(rt6i_key_dst, rt6i_key_src) later.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reviewed-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch removes the assumptions that the returned rt is always
a RTF_CACHE entry with the rt6i_dst and rt6i_src containing the
destination and source address.  The dst and src can be recovered from
the calling site.

We may consider to rename (rt6i_dst, rt6i_src) to
(rt6i_key_dst, rt6i_key_src) later.

Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Reviewed-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-05-23T05:22:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-05-23T05:22:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=36583eb54d46c36a447afd6c379839f292397429'/>
<id>36583eb54d46c36a447afd6c379839f292397429</id>
<content type='text'>
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_H_{EXTERNAL --&gt; OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zyncq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_H_{EXTERNAL --&gt; OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zyncq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Modify sk_alloc to not reference count the netns of kernel sockets.</title>
<updated>2015-05-11T14:50:18+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-05-09T02:10:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=26abe14379f8e2fa3fd1bcf97c9a7ad9364886fe'/>
<id>26abe14379f8e2fa3fd1bcf97c9a7ad9364886fe</id>
<content type='text'>
Now that sk_alloc knows when a kernel socket is being allocated modify
it to not reference count the network namespace of kernel sockets.

Keep track of if a socket needs reference counting by adding a flag to
struct sock called sk_net_refcnt.

Update all of the callers of sock_create_kern to stop using
sk_change_net and sk_release_kernel as those hacks are no longer
needed, to avoid reference counting a kernel socket.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that sk_alloc knows when a kernel socket is being allocated modify
it to not reference count the network namespace of kernel sockets.

Keep track of if a socket needs reference counting by adding a flag to
struct sock called sk_net_refcnt.

Update all of the callers of sock_create_kern to stop using
sk_change_net and sk_release_kernel as those hacks are no longer
needed, to avoid reference counting a kernel socket.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: Add a struct net parameter to sock_create_kern</title>
<updated>2015-05-11T14:50:17+00:00</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-05-09T02:08:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eeb1bd5c40edb0e2fd925c8535e2fdebdbc5cef2'/>
<id>eeb1bd5c40edb0e2fd925c8535e2fdebdbc5cef2</id>
<content type='text'>
This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is long overdue, and is part of cleaning up how we allocate kernel
sockets that don't reference count struct net.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipvs: fix memory leak in ip_vs_ctl.c</title>
<updated>2015-05-08T08:58:54+00:00</updated>
<author>
<name>Tommi Rantala</name>
<email>tt.rantala@gmail.com</email>
</author>
<published>2015-05-07T12:12:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f30bf2a5cac6c60ab366c4bc6db913597bf4d6ab'/>
<id>f30bf2a5cac6c60ab366c4bc6db913597bf4d6ab</id>
<content type='text'>
Fix memory leak introduced in commit a0840e2e165a ("IPVS: netns,
ip_vs_ctl local vars moved to ipvs struct."):

unreferenced object 0xffff88005785b800 (size 2048):
  comm "(-localed)", pid 1434, jiffies 4294755650 (age 1421.089s)
  hex dump (first 32 bytes):
    bb 89 0b 83 ff ff ff ff b0 78 f0 4e 00 88 ff ff  .........x.N....
    04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff8262ea8e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811fba74&gt;] __kmalloc_track_caller+0x244/0x430
    [&lt;ffffffff811b88a0&gt;] kmemdup+0x20/0x50
    [&lt;ffffffff823276b7&gt;] ip_vs_control_net_init+0x1f7/0x510
    [&lt;ffffffff8231d630&gt;] __ip_vs_init+0x100/0x250
    [&lt;ffffffff822363a1&gt;] ops_init+0x41/0x190
    [&lt;ffffffff82236583&gt;] setup_net+0x93/0x150
    [&lt;ffffffff82236cc2&gt;] copy_net_ns+0x82/0x140
    [&lt;ffffffff810ab13d&gt;] create_new_namespaces+0xfd/0x190
    [&lt;ffffffff810ab49a&gt;] unshare_nsproxy_namespaces+0x5a/0xc0
    [&lt;ffffffff810833e3&gt;] SyS_unshare+0x173/0x310
    [&lt;ffffffff8265cbd7&gt;] system_call_fastpath+0x12/0x6f
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
Signed-off-by: Tommi Rantala &lt;tt.rantala@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix memory leak introduced in commit a0840e2e165a ("IPVS: netns,
ip_vs_ctl local vars moved to ipvs struct."):

unreferenced object 0xffff88005785b800 (size 2048):
  comm "(-localed)", pid 1434, jiffies 4294755650 (age 1421.089s)
  hex dump (first 32 bytes):
    bb 89 0b 83 ff ff ff ff b0 78 f0 4e 00 88 ff ff  .........x.N....
    04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffffff8262ea8e&gt;] kmemleak_alloc+0x4e/0xb0
    [&lt;ffffffff811fba74&gt;] __kmalloc_track_caller+0x244/0x430
    [&lt;ffffffff811b88a0&gt;] kmemdup+0x20/0x50
    [&lt;ffffffff823276b7&gt;] ip_vs_control_net_init+0x1f7/0x510
    [&lt;ffffffff8231d630&gt;] __ip_vs_init+0x100/0x250
    [&lt;ffffffff822363a1&gt;] ops_init+0x41/0x190
    [&lt;ffffffff82236583&gt;] setup_net+0x93/0x150
    [&lt;ffffffff82236cc2&gt;] copy_net_ns+0x82/0x140
    [&lt;ffffffff810ab13d&gt;] create_new_namespaces+0xfd/0x190
    [&lt;ffffffff810ab49a&gt;] unshare_nsproxy_namespaces+0x5a/0xc0
    [&lt;ffffffff810833e3&gt;] SyS_unshare+0x173/0x310
    [&lt;ffffffff8265cbd7&gt;] system_call_fastpath+0x12/0x6f
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
Signed-off-by: Tommi Rantala &lt;tt.rantala@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: Pass socket pointer down through okfn().</title>
<updated>2015-04-07T19:25:55+00:00</updated>
<author>
<name>David Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-04-06T02:19:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7026b1ddb6b8d4e6ee33dc2bd06c0ca8746fa7ab'/>
<id>7026b1ddb6b8d4e6ee33dc2bd06c0ca8746fa7ab</id>
<content type='text'>
On the output paths in particular, we have to sometimes deal with two
socket contexts.  First, and usually skb-&gt;sk, is the local socket that
generated the frame.

And second, is potentially the socket used to control a tunneling
socket, such as one the encapsulates using UDP.

We do not want to disassociate skb-&gt;sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device.  We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On the output paths in particular, we have to sometimes deal with two
socket contexts.  First, and usually skb-&gt;sk, is the local socket that
generated the frame.

And second, is potentially the socket used to control a tunneling
socket, such as one the encapsulates using UDP.

We do not want to disassociate skb-&gt;sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device.  We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: Make nf_hookfn use nf_hook_state.</title>
<updated>2015-04-04T16:31:38+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-04-04T00:32:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=238e54c9cb9385a1ba99e92801f3615a2fb398b6'/>
<id>238e54c9cb9385a1ba99e92801f3615a2fb398b6</id>
<content type='text'>
Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pass the nf_hook_state all the way down into the hook
functions themselves.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ipv4: hash net ptr into fragmentation bucket selection</title>
<updated>2015-03-25T18:07:04+00:00</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2015-03-25T16:07:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b6a7719aedd7e5c0f2df7641aa47386111682df4'/>
<id>b6a7719aedd7e5c0f2df7641aa47386111682df4</id>
<content type='text'>
As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Flavio Leitner &lt;fbl@redhat.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.

Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Flavio Leitner &lt;fbl@redhat.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
</feed>
