Skip to content

Commit

Permalink
NTB: Remove dma_sync_wait from ntb_async_rx
Browse files Browse the repository at this point in the history
The dma_sync_wait can hurt the performance of workloads mixed with both
large and small frames.  Large frames will be copied using the dma
engine.  Small frames will be copied by the cpu.  The dma_sync_wait
prevents the cpu and dma engine copying in parallel.

In the period where the cpu is copying, the dma engine is stopped.  The
dma engine is not doing any useful work to copy large frames during that
time, and the additional time to restart the dma engine for the next
large frame.  This will decrease the throughput for the portion of a
workload with large frames.

In the period where the dma engine is copying, the cpu is held up
waiting for dma to complete.  The small frames processing will be
delayed until the dma is complete.  The RX frames are completed
in-order, and the processing of small frames takes very little time, so
dma_sync_wait may have an insignificant impact on the respose time of
frames.  The more significant impact is to the system, because the delay
in dma_sync_wait is implemented as busy non-blocking wait.  This can
prevent the delayed core from doing any useful work, even if it could be
processing work for other drivers, unrelated to transport RX processing.

After applying the earlier patch to fix out-of-order RX acknoledgement,
the dma_sync_wait is no longer necessary.  Remove it, so that cpu memcpy
will proceed immediately for small frames, in parallel with ongoing dma
for large frames.  Do not hold up the cpu from doing work while dma is
in progress.  The prior fix will continue to ensure in-order completion
of the RX frames to the upper layer, and in-order delivery of the RX
acknoledgement.

Signed-off-by: Allen Hubbe <[email protected]>
Signed-off-by: Jon Mason <[email protected]>
  • Loading branch information
Allen Hubbe authored and jonmason committed Sep 7, 2015
1 parent d98ef99 commit 905921e
Showing 1 changed file with 3 additions and 9 deletions.
12 changes: 3 additions & 9 deletions drivers/ntb/ntb_transport.c
Original file line number Diff line number Diff line change
Expand Up @@ -1233,18 +1233,18 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset)
goto err;

if (len < copy_bytes)
goto err_wait;
goto err;

device = chan->device;
pay_off = (size_t)offset & ~PAGE_MASK;
buff_off = (size_t)buf & ~PAGE_MASK;

if (!is_dma_copy_aligned(device, pay_off, buff_off, len))
goto err_wait;
goto err;

unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);
if (!unmap)
goto err_wait;
goto err;

unmap->len = len;
unmap->addr[0] = dma_map_page(device->dev, virt_to_page(offset),
Expand Down Expand Up @@ -1287,12 +1287,6 @@ static void ntb_async_rx(struct ntb_queue_entry *entry, void *offset)
dmaengine_unmap_put(unmap);
err_get_unmap:
dmaengine_unmap_put(unmap);
err_wait:
/* If the callbacks come out of order, the writing of the index to the
* last completed will be out of order. This may result in the
* receive stalling forever.
*/
dma_sync_wait(chan, qp->last_cookie);
err:
ntb_memcpy_rx(entry, offset);
qp->rx_memcpy++;
Expand Down

0 comments on commit 905921e

Please sign in to comment.