linux-stable.git/drivers/infiniband, branch linux-2.6.33.y

IB/cm: Bump reference count on cm_id before invoking callback

2011-03-21T19:45:39+00:00

commit 29963437a48475036353b95ab142bf199adb909e upstream.

When processing a SIDR REQ, the ib_cm allocates a new cm_id.  The
refcount of the cm_id is initialized to 1.  However, cm_process_work
will decrement the refcount after invoking all callbacks.  The result
is that the cm_id will end up with refcount set to 0 by the end of the
sidr req handler.

If a user tries to destroy the cm_id, the destruction will proceed,
under the incorrect assumption that no other threads are referencing
the cm_id.  This can lead to a crash when the cm callback thread tries
to access the cm_id.

This problem was noticed as part of a larger investigation with kernel
crashes in the rdma_cm when running on a real time OS.

Signed-off-by: Sean Hefty 
Acked-by: Doug Ledford 
Signed-off-by: Roland Dreier 
Signed-off-by: Greg Kroah-Hartman

RDMA/cma: Fix crash in request handlers

2011-03-21T19:45:38+00:00

commit 25ae21a10112875763c18b385624df713a288a05 upstream.

Doug Ledford and Red Hat reported a crash when running the rdma_cm on
a real-time OS.  The crash has the following call trace:

    cm_process_work
       cma_req_handler
          cma_disable_callback
          rdma_create_id
             kzalloc
             init_completion
          cma_get_net_info
          cma_save_net_info
          cma_any_addr
             cma_zero_addr
          rdma_translate_ip
             rdma_copy_addr
          cma_acquire_dev
             rdma_addr_get_sgid
             ib_find_cached_gid
             cma_attach_to_dev
          ucma_event_handler
             kzalloc
             ib_copy_ah_attr_to_user
          cma_comp

[ preempted ]

    cma_write
        copy_from_user
        ucma_destroy_id
           copy_from_user
           _ucma_find_context
           ucma_put_ctx
           ucma_free_ctx
              rdma_destroy_id
                 cma_exch
                 cma_cancel_operation
                 rdma_node_get_transport

        rt_mutex_slowunlock
        bad_area_nosemaphore
        oops_enter

They were able to reproduce the crash multiple times with the
following details:

    Crash seems to always happen on the:
            mutex_unlock(&conn_id->handler_mutex);
    as conn_id looks to have been freed during this code path.

An examination of the code shows that a race exists in the request
handlers.  When a new connection request is received, the rdma_cm
allocates a new connection identifier.  This identifier has a single
reference count on it.  If a user calls rdma_destroy_id() from another
thread after receiving a callback, rdma_destroy_id will proceed to
destroy the id and free the associated memory.  However, the request
handlers may still be in the process of running.  When control returns
to the request handlers, they can attempt to access the newly created
identifiers.

Fix this by holding a reference on the newly created rdma_cm_id until
the request handler is through accessing it.

Signed-off-by: Sean Hefty 
Acked-by: Doug Ledford 
Signed-off-by: Roland Dreier 
Signed-off-by: Greg Kroah-Hartman

IB/uverbs: Handle large number of entries in poll CQ

2011-03-21T19:44:24+00:00

commit 7182afea8d1afd432a17c18162cc3fd441d0da93 upstream.

In ib_uverbs_poll_cq() code there is a potential integer overflow if
userspace passes in a large cmd.ne.  The calls to kmalloc() would
allocate smaller buffers than intended, leading to memory corruption.
There iss also an information leak if resp wasn't all used.
Unprivileged userspace may call this function, although only if an
RDMA device that uses this function is present.

Fix this by copying CQ entries one at a time, which avoids the
allocation entirely, and also by moving this copying into a function
that makes sure to initialize all memory copied to userspace.

Special thanks to Jason Gunthorpe 
for his help and advice.

Signed-off-by: Dan Carpenter 

[ Monkey around with things a bit to avoid bad code generation by gcc
  when designated initializers are used.  - Roland ]

Signed-off-by: Roland Dreier 
Signed-off-by: Greg Kroah-Hartman

RDMA/cxgb3: Turn off RX coalescing for iWARP connections

2011-03-21T19:43:11+00:00

commit bec658ff31453a5726b1c188674d587a5d40c482 upstream.

The HW by default has RX coalescing on.  For iWARP connections, this
causes a 100ms delay in connection establishement due to the ingress
MPA Start message being stalled in HW.  So explicitly turn RX
coalescing off when setting up iWARP connections.

This was causing very bad performance for NP64 gather operations using
Open MPI, due to the way it sets up connections on larger jobs.

Signed-off-by: Steve Wise 
Signed-off-by: Roland Dreier 
Signed-off-by: Greg Kroah-Hartman

IPoIB: Fix world-writable child interface control sysfs attributes

2010-08-02T17:26:40+00:00

commit 7a52b34b07122ff5f45258d47f260f8a525518f0 upstream.

Sumeet Lahorani  reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.

Signed-off-by: Or Gerlitz 
Signed-off-by: Roland Dreier 
Signed-off-by: Greg Kroah-Hartman

IPoIB: Fix TX queue lockup with mixed UD/CM traffic

2010-04-26T14:47:59+00:00

commit f0dc117abdfa9a0e96c3d013d836460ef3cd08c7 upstream.

The IPoIB UD QP reports send completions to priv->send_cq, which is
usually left unarmed; it only gets armed when the number of
outstanding send requests reaches the size of the TX queue. This
arming is done only in the send path for the UD QP.  However, when
sending CM packets, the net queue may be stopped for the same reasons
but no measures are taken to recover the UD path from a lockup.

Consider this scenario: a host sends high rate of both CM and UD
packets, with a TX queue length of N.  If at some time the number of
outstanding UD packets is more than N/2 and the overall outstanding
packets is N-1, and CM sends a packet (making the number of
outstanding sends equal N), the TX queue will be stopped.  When all
the CM packets complete, the number of outstanding packets will still
be higher than N/2 so the TX queue will not be restarted.

Fix this by calling ib_req_notify_cq() when the queue is stopped in
the CM path.

Signed-off-by: Eli Cohen 
Signed-off-by: Roland Dreier 
Cc: maximilian attems 
Signed-off-by: Greg Kroah-Hartman

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

2010-02-11T22:01:25+00:00

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cm: Revert association of an RDMA device when binding to loopback

RDMA/cm: Revert association of an RDMA device when binding to loopback

2010-02-10T20:00:48+00:00

Revert the following change from commit 6f8372b6 ("RDMA/cm: fix
loopback address support")

   The defined behavior of rdma_bind_addr is to associate an RDMA
   device with an rdma_cm_id, as long as the user specified a non-
   zero address.  (ie they weren't just trying to reserve a port)
   Currently, if the loopback address is passed to rdma_bind_addr,
   no device is associated with the rdma_cm_id.  Fix this.

It turns out that important apps such as Open MPI depend on
rdma_bind_addr() NOT associating any RDMA device when binding to a
loopback address.  Open MPI is being updated to deal with this, but at
least until a new Open MPI release is available, maintain the previous
behavior: allow rdma_bind_addr() to succeed, but do not bind to a
device.

Signed-off-by: Sean Hefty 
Acked-by: Steve Wise 
Signed-off-by: Roland Dreier

Fix failure exit in ipathfs

2010-01-27T03:22:27+00:00

deactivate_locked_super() will be done by caller of fill_super, doing
it there as well is b0rken.

Signed-off-by: Al Viro

Merge branches 'misc' and 'mlx4' into for-next

2010-01-06T21:16:47+00:00