linux-stable.git/drivers/dma/dmaengine.c, branch linux-2.6.33.y

dmaengine: fix memleak in dma_async_device_unregister

2010-02-02T21:58:37+00:00

While debugging a dma driver I noticed a memleak after
unloading the driver module.

Caught by kmemleak.

Signed-off-by: Anatolij Gustschin 
Cc: Maciej Sosnowski 
Signed-off-by: Dan Williams

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu

2009-12-14T17:58:24+00:00

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c

async_tx: build-time toggling of async_{syndrome,xor}_val dma support

2009-11-20T06:21:03+00:00

ioat3.2 does not support asynchronous error notifications which makes
the driver experience latencies when non-zero pq validate results are
expected.  Provide a mechanism for turning off async_xor_val and
async_syndrome_val via Kconfig.  This approach is generally useful for
any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
to force the async_tx api to fall back to the synchronous path for
certain operations.

Signed-off-by: Dan Williams

dmaengine: include xor/pq validate in device_has_all_tx_types()

2009-11-20T06:21:03+00:00

A channel must include these capabilities to satisfy
ASYNC_TX_DISABLE_CHANNEL_SWITCH.

Signed-off-by: Dan Williams

this_cpu: Eliminate get/put_cpu

2009-10-03T10:48:23+00:00

There are cases where we can use this_cpu_ptr and as the result
of using this_cpu_ptr() we no longer need to determine the
currently executing cpu.

In those places no get/put_cpu combination is needed anymore.
The local cpu variable can be eliminated.

Preemption still needs to be disabled and enabled since the
modifications of the per cpu variables is not atomic. There may
be multiple per cpu variables modified and those must all
be from the same processor.

Acked-by: Maciej Sosnowski 
Acked-by: Dan Williams 
Acked-by: Tejun Heo 
cc: Eric Biederman 
cc: Stephen Hemminger 
cc: David L Stevens 
Signed-off-by: Christoph Lameter 
Signed-off-by: Tejun Heo

Merge branch 'dmaengine' into async-tx-next

2009-09-09T00:55:21+00:00

Conflicts:
	crypto/async_tx/async_xor.c
	drivers/dma/ioat/dma_v2.h
	drivers/dma/ioat/pci.c
	drivers/md/raid5.c

dmaengine: kill tx_list

2009-09-09T00:53:04+00:00

The tx_list attribute of struct dma_async_tx_descriptor is common to
most, but not all dma driver implementations.  None of the upper level
code (dmaengine/async_tx) uses it, so allow drivers to implement it
locally if they need it.  This saves sizeof(struct list_head) bytes for
drivers that do not manage descriptors with a linked list (e.g.: ioatdma
v2,3).

Signed-off-by: Dan Williams

dmaengine, async_tx: add a "no channel switch" allocator

2009-09-09T00:42:51+00:00

Channel switching is problematic for some dmaengine drivers as the
architecture precludes separating the ->prep from ->submit.  In these
cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify
the async_tx allocator to only return channels that support all of the
required asynchronous operations.

For example MD_RAID456=y selects support for asynchronous xor, xor
validate, pq, pq validate, and memcpy.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these
capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to
quickly locate compatible channels with the guarantee that dependency
chains will remain on one channel.  When
ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select
channels that lead to operation chains that need to cross channel
boundaries using the async_tx channel switch capability.

Signed-off-by: Dan Williams

Merge branch 'md-raid6-accel' into ioat3.2

2009-09-09T00:42:29+00:00

Conflicts:
	include/linux/dmaengine.h

async_tx: add support for asynchronous GF multiplication

2009-08-30T02:09:27+00:00

[ Based on an original patch by Yuri Tikhonov ]

This adds support for doing asynchronous GF multiplication by adding
two additional functions to the async_tx API:

 async_gen_syndrome() does simultaneous XOR and Galois field
    multiplication of sources.

 async_syndrome_val() validates the given source buffers against known P
    and Q values.

When a request is made to run async_pq against more than the hardware
maximum number of supported sources we need to reuse the previous
generated P and Q values as sources into the next operation.  Care must
be taken to remove Q from P' and P from Q'.  For example to perform a 5
source pq op with hardware that only supports 4 sources at a time the
following approach is taken:

p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))

p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4

Note: 4 is the minimum acceptable maxpq otherwise we punt to
synchronous-software path.

The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as
sources (in the above manner) and fill the remaining slots up to maxpq
with the new sources/coefficients.

Note1: Some devices have native support for P+Q continuation and can skip
this extra work.  Devices with this capability can advertise it with
dma_set_maxpq.  It is up to each driver how to handle the
DMA_PREP_CONTINUE flag.

Note2: The api supports disabling the generation of P when generating Q,
this is ignored by the synchronous path but is implemented by some dma
devices to save unnecessary writes.  In this case the continuation
algorithm is simplified to only reuse Q as a source.

Cc: H. Peter Anvin 
Cc: David Woodhouse 
Signed-off-by: Yuri Tikhonov 
Signed-off-by: Ilya Yanok 
Reviewed-by: Andre Noll 
Acked-by: Maciej Sosnowski 
Signed-off-by: Dan Williams