<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/uapi/linux/netfilter, branch v3.16.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>netfilter: ctnetlink: make it safer when updating ct-&gt;status</title>
<updated>2017-08-26T01:14:38+00:00</updated>
<author>
<name>Liping Zhang</name>
<email>zlpnobody@gmail.com</email>
</author>
<published>2017-04-17T13:18:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=00834185e96c19fd8bf1106e5b4d045678d70394'/>
<id>00834185e96c19fd8bf1106e5b4d045678d70394</id>
<content type='text'>
commit 53b56da83d7899de375a9de153fd7f5397de85e6 upstream.

After converting to use rcu for conntrack hash, one CPU may update
the ct-&gt;status via ctnetlink, while another CPU may process the
packets and update the ct-&gt;status.

So the non-atomic operation "ct-&gt;status |= status;" via ctnetlink
becomes unsafe, and this may clear the IPS_DYING_BIT bit set by
another CPU unexpectedly. For example:
         CPU0                            CPU1
  ctnetlink_change_status        __nf_conntrack_find_get
      old = ct-&gt;status              nf_ct_gc_expired
          -                         nf_ct_kill
          -                      test_and_set_bit(IPS_DYING_BIT
      new = old | status;                 -
  ct-&gt;status = new; &lt;-- oops, _DYING_ is cleared!

Now using a series of atomic bit operation to solve the above issue.

Also note, user shouldn't set IPS_TEMPLATE, IPS_SEQ_ADJUST directly,
so make these two bits be unchangable too.

If we set the IPS_TEMPLATE_BIT, ct will be freed by nf_ct_tmpl_free,
but actually it is alloced by nf_conntrack_alloc.
If we set the IPS_SEQ_ADJUST_BIT, this may cause the NULL pointer
deference, as the nfct_seqadj(ct) maybe NULL.

Last, add some comments to describe the logic change due to the
commit a963d710f367 ("netfilter: ctnetlink: Fix regression in CTA_STATUS
processing"), which makes me feel a little confusing.

Fixes: 76507f69c44e ("[NETFILTER]: nf_conntrack: use RCU for conntrack hash")
Signed-off-by: Liping Zhang &lt;zlpnobody@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
[bwh: Backported to 3.16: IPS_UNCHANGEABLE_MASK was not previously defined and
 ctnetlink_update_status() is not needed]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 53b56da83d7899de375a9de153fd7f5397de85e6 upstream.

After converting to use rcu for conntrack hash, one CPU may update
the ct-&gt;status via ctnetlink, while another CPU may process the
packets and update the ct-&gt;status.

So the non-atomic operation "ct-&gt;status |= status;" via ctnetlink
becomes unsafe, and this may clear the IPS_DYING_BIT bit set by
another CPU unexpectedly. For example:
         CPU0                            CPU1
  ctnetlink_change_status        __nf_conntrack_find_get
      old = ct-&gt;status              nf_ct_gc_expired
          -                         nf_ct_kill
          -                      test_and_set_bit(IPS_DYING_BIT
      new = old | status;                 -
  ct-&gt;status = new; &lt;-- oops, _DYING_ is cleared!

Now using a series of atomic bit operation to solve the above issue.

Also note, user shouldn't set IPS_TEMPLATE, IPS_SEQ_ADJUST directly,
so make these two bits be unchangable too.

If we set the IPS_TEMPLATE_BIT, ct will be freed by nf_ct_tmpl_free,
but actually it is alloced by nf_conntrack_alloc.
If we set the IPS_SEQ_ADJUST_BIT, this may cause the NULL pointer
deference, as the nfct_seqadj(ct) maybe NULL.

Last, add some comments to describe the logic change due to the
commit a963d710f367 ("netfilter: ctnetlink: Fix regression in CTA_STATUS
processing"), which makes me feel a little confusing.

Fixes: 76507f69c44e ("[NETFILTER]: nf_conntrack: use RCU for conntrack hash")
Signed-off-by: Liping Zhang &lt;zlpnobody@gmail.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
[bwh: Backported to 3.16: IPS_UNCHANGEABLE_MASK was not previously defined and
 ctnetlink_update_status() is not needed]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: xt_bpf: add mising opaque struct sk_filter definition</title>
<updated>2014-11-27T11:22:07+00:00</updated>
<author>
<name>Pablo Neira</name>
<email>pablo@netfilter.org</email>
</author>
<published>2014-07-29T16:12:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=aa9a588aba7a34c832ecfb2c2189d2174d8fd6cf'/>
<id>aa9a588aba7a34c832ecfb2c2189d2174d8fd6cf</id>
<content type='text'>
commit e10038a8ec06ac819b7552bb67aaa6d2d6f850c1 upstream.

This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.

Fixes: e6f30c7 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e10038a8ec06ac819b7552bb67aaa6d2d6f850c1 upstream.

This structure is not exposed to userspace, so fix this by defining
struct sk_filter; so we skip the casting in kernelspace. This is safe
since userspace has no way to lurk with that internal pointer.

Fixes: e6f30c7 ("netfilter: x_tables: add xt_bpf match")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next</title>
<updated>2014-05-31T00:54:47+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2014-05-31T00:54:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=90d0e08e574d1aa8553ee6179fcf3bf2b333ca6d'/>
<id>90d0e08e574d1aa8553ee6179fcf3bf2b333ca6d</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

This small patchset contains three accumulated Netfilter/IPVS updates,
they are:

1) Refactorize common NAT code by encapsulating it into a helper
   function, similarly to what we do in other conntrack extensions,
   from Florian Westphal.

2) A minor format string mismatch fix for IPVS, from Masanari Iida.

3) Add quota support to the netfilter accounting infrastructure, now
   you can add quotas to accounting objects via the nfnetlink interface
   and use them from iptables. You can also listen to quota
   notifications from userspace. This enhancement from Mathieu Poirier.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

This small patchset contains three accumulated Netfilter/IPVS updates,
they are:

1) Refactorize common NAT code by encapsulating it into a helper
   function, similarly to what we do in other conntrack extensions,
   from Florian Westphal.

2) A minor format string mismatch fix for IPVS, from Masanari Iida.

3) Add quota support to the netfilter accounting infrastructure, now
   you can add quotas to accounting objects via the nfnetlink interface
   and use them from iptables. You can also listen to quota
   notifications from userspace. This enhancement from Mathieu Poirier.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: use new transaction infrastructure to handle sets</title>
<updated>2014-05-19T10:06:10+00:00</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2014-04-03T09:48:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=958bee14d0718ca7a5002c0f48a099d1d345812a'/>
<id>958bee14d0718ca7a5002c0f48a099d1d345812a</id>
<content type='text'>
This patch reworks the nf_tables API so set updates are included in
the same batch that contains rule updates. This speeds up rule-set
updates since we skip a dialog of four messages between kernel and
user-space (two on each direction), from:

 1) create the set and send netlink message to the kernel
 2) process the response from the kernel that contains the allocated name.
 3) add the set elements and send netlink message to the kernel.
 4) process the response from the kernel (to check for errors).

To:

 1) add the set to the batch.
 2) add the set elements to the batch.
 3) add the rule that points to the set.
 4) send batch to the kernel.

This also introduces an internal set ID (NFTA_SET_ID) that is unique
in the batch so set elements and rules can refer to new sets.

Backward compatibility has been only retained in userspace, this
means that new nft versions can talk to the kernel both in the new
and the old fashion.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch reworks the nf_tables API so set updates are included in
the same batch that contains rule updates. This speeds up rule-set
updates since we skip a dialog of four messages between kernel and
user-space (two on each direction), from:

 1) create the set and send netlink message to the kernel
 2) process the response from the kernel that contains the allocated name.
 3) add the set elements and send netlink message to the kernel.
 4) process the response from the kernel (to check for errors).

To:

 1) add the set to the batch.
 2) add the set elements to the batch.
 3) add the rule that points to the set.
 4) send batch to the kernel.

This also introduces an internal set ID (NFTA_SET_ID) that is unique
in the batch so set elements and rules can refer to new sets.

Backward compatibility has been only retained in userspace, this
means that new nft versions can talk to the kernel both in the new
and the old fashion.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nfnetlink_acct: Adding quota support to accounting framework</title>
<updated>2014-04-29T16:25:14+00:00</updated>
<author>
<name>Mathieu Poirier</name>
<email>mathieu.poirier@linaro.org</email>
</author>
<published>2014-04-21T00:57:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=683399eddb9fff742b1a14c5a5d03e12bfc0afff'/>
<id>683399eddb9fff742b1a14c5a5d03e12bfc0afff</id>
<content type='text'>
nfacct objects already support accounting at the byte and packet
level.  As such it is a natural extension to add the possiblity to
define a ceiling limit for both metrics.

All the support for quotas itself is added to nfnetlink acctounting
framework to stay coherent with current accounting object management.
Quota limit checks are implemented in xt_nfacct filter where
statistic collection is already done.

Pablo Neira Ayuso has also contributed to this feature.

Signed-off-by: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
nfacct objects already support accounting at the byte and packet
level.  As such it is a natural extension to add the possiblity to
define a ceiling limit for both metrics.

All the support for quotas itself is added to nfnetlink acctounting
framework to stay coherent with current accounting object management.
Quota limit checks are implemented in xt_nfacct filter where
statistic collection is already done.

Pablo Neira Ayuso has also contributed to this feature.

Signed-off-by: Mathieu Poirier &lt;mathieu.poirier@linaro.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: Add meta expression key for bridge interface name</title>
<updated>2014-04-24T08:37:28+00:00</updated>
<author>
<name>Tomasz Bursztyka</name>
<email>tomasz.bursztyka@linux.intel.com</email>
</author>
<published>2014-04-14T12:41:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f5efc696cc711021cc73e7543cc3038e58459707'/>
<id>f5efc696cc711021cc73e7543cc3038e58459707</id>
<content type='text'>
NFT_META_BRI_IIFNAME to get packet input bridge interface name
NFT_META_BRI_OIFNAME to get packet output bridge interface name

Such meta key are accessible only through NFPROTO_BRIDGE family, on a
dedicated nft meta module: nft_meta_bridge.

Suggested-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Tomasz Bursztyka &lt;tomasz.bursztyka@linux.intel.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NFT_META_BRI_IIFNAME to get packet input bridge interface name
NFT_META_BRI_OIFNAME to get packet output bridge interface name

Such meta key are accessible only through NFPROTO_BRIDGE family, on a
dedicated nft meta module: nft_meta_bridge.

Suggested-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Tomasz Bursztyka &lt;tomasz.bursztyka@linux.intel.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: nf_tables: implement proper set selection</title>
<updated>2014-04-02T19:32:57+00:00</updated>
<author>
<name>Patrick McHardy</name>
<email>kaber@trash.net</email>
</author>
<published>2014-03-28T10:19:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c50b960ccc5981627628302701e93e6aceccdb1c'/>
<id>c50b960ccc5981627628302701e93e6aceccdb1c</id>
<content type='text'>
The current set selection simply choses the first set type that provides
the requested features, which always results in the rbtree being chosen
by virtue of being the first set in the list.

What we actually want to do is choose the implementation that can provide
the requested features and is optimal from either a performance or memory
perspective depending on the characteristics of the elements and the
preferences specified by the user.

The elements are not known when creating a set. Even if we would provide
them for anonymous (literal) sets, we'd still have standalone sets where
the elements are not known in advance. We therefore need an abstract
description of the data charcteristics.

The kernel already knows the size of the key, this patch starts by
introducing a nested set description which so far contains only the maximum
amount of elements. Based on this the set implementations are changed to
provide an estimate of the required amount of memory and the lookup
complexity class.

The set ops have a new callback -&gt;estimate() that is invoked during set
selection. It receives a structure containing the attributes known to the
kernel and is supposed to populate a struct nft_set_estimate with the
complexity class and, in case the size is known, the complete amount of
memory required, or the amount of memory required per element otherwise.

Based on the policy specified by the user (performance/memory, defaulting
to performance) the kernel will then select the best suited implementation.

Even if the set implementation would allow to add more than the specified
maximum amount of elements, they are enforced since new implementations
might not be able to add more than maximum based on which they were
selected.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current set selection simply choses the first set type that provides
the requested features, which always results in the rbtree being chosen
by virtue of being the first set in the list.

What we actually want to do is choose the implementation that can provide
the requested features and is optimal from either a performance or memory
perspective depending on the characteristics of the elements and the
preferences specified by the user.

The elements are not known when creating a set. Even if we would provide
them for anonymous (literal) sets, we'd still have standalone sets where
the elements are not known in advance. We therefore need an abstract
description of the data charcteristics.

The kernel already knows the size of the key, this patch starts by
introducing a nested set description which so far contains only the maximum
amount of elements. Based on this the set implementations are changed to
provide an estimate of the required amount of memory and the lookup
complexity class.

The set ops have a new callback -&gt;estimate() that is invoked during set
selection. It receives a structure containing the attributes known to the
kernel and is supposed to populate a struct nft_set_estimate with the
complexity class and, in case the size is known, the complete amount of
memory required, or the amount of memory required per element otherwise.

Based on the policy specified by the user (performance/memory, defaulting
to performance) the kernel will then select the best suited implementation.

Even if the set implementation would allow to add more than the specified
maximum amount of elements, they are enforced since new implementations
might not be able to add more than maximum based on which they were
selected.

Signed-off-by: Patrick McHardy &lt;kaber@trash.net&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: add forceadd kernel support for hash set types</title>
<updated>2014-03-06T08:31:43+00:00</updated>
<author>
<name>Josh Hunt</name>
<email>johunt@akamai.com</email>
</author>
<published>2014-03-01T03:14:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=07cf8f5ae2657ac495b906c68ff3441ff8ba80ba'/>
<id>07cf8f5ae2657ac495b906c68ff3441ff8ba80ba</id>
<content type='text'>
Adds a new property for hash set types, where if a set is created
with the 'forceadd' option and the set becomes full the next addition
to the set may succeed and evict a random entry from the set.

To keep overhead low eviction is done very simply. It checks to see
which bucket the new entry would be added. If the bucket's pos value
is non-zero (meaning there's at least one entry in the bucket) it
replaces the first entry in the bucket. If pos is zero, then it continues
down the normal add process.

This property is useful if you have a set for 'ban' lists where it may
not matter if you release some entries from the set early.

Signed-off-by: Josh Hunt &lt;johunt@akamai.com&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adds a new property for hash set types, where if a set is created
with the 'forceadd' option and the set becomes full the next addition
to the set may succeed and evict a random entry from the set.

To keep overhead low eviction is done very simply. It checks to see
which bucket the new entry would be added. If the bucket's pos value
is non-zero (meaning there's at least one entry in the bucket) it
replaces the first entry in the bucket. If pos is zero, then it continues
down the normal add process.

This property is useful if you have a set for 'ban' lists where it may
not matter if you release some entries from the set early.

Signed-off-by: Josh Hunt &lt;johunt@akamai.com&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: kernel: uapi: fix MARKMASK attr ABI breakage</title>
<updated>2014-03-06T08:31:42+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2014-02-13T11:40:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=004088768b78f69002f03a341597217eb608fb2c'/>
<id>004088768b78f69002f03a341597217eb608fb2c</id>
<content type='text'>
commit 2dfb973c0dcc6d2211 (add markmask for hash:ip,mark data type)
inserted IPSET_ATTR_MARKMASK in-between other enum values, i.e.
changing values of all further attributes.  This causes 'ipset list'
segfault on existing kernels since ipset no longer finds
IPSET_ATTR_MEMSIZE (it has a different value on kernel side).

Jozsef points out it should be moved below IPSET_ATTR_MARK which
works since there is some extra reserved space after that value.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2dfb973c0dcc6d2211 (add markmask for hash:ip,mark data type)
inserted IPSET_ATTR_MARKMASK in-between other enum values, i.e.
changing values of all further attributes.  This causes 'ipset list'
segfault on existing kernels since ipset no longer finds
IPSET_ATTR_MEMSIZE (it has a different value on kernel side).

Jozsef points out it should be moved below IPSET_ATTR_MARK which
works since there is some extra reserved space after that value.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>netfilter: ipset: Prepare the kernel for create option flags when no extension is needed</title>
<updated>2014-03-06T08:31:42+00:00</updated>
<author>
<name>Jozsef Kadlecsik</name>
<email>kadlec@blackhole.kfki.hu</email>
</author>
<published>2014-02-13T11:19:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=af284ece87365f3a69723f5bcc1bcdb505b5eb5d'/>
<id>af284ece87365f3a69723f5bcc1bcdb505b5eb5d</id>
<content type='text'>
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
</pre>
</div>
</content>
</entry>
</feed>
