Merge pull request #103 from dssgabriel/refactor-comm-modes
Refactor comm modes
cwpearson authored Jul 11, 2024
2 parents 374df92 + 329a99f commit 59d783d
Showing 15 changed files with 209 additions and 149 deletions.
46 changes: 25 additions & 21 deletions docs/api/core.rst
@@ -9,25 +9,25 @@ Core
- ``KokkosComm::``
- ``Kokkos::View``
* - ``MPI_Send``
- ``send`` or ``send<CommMode::Standard>``
- ``send`` or ``send(KokkosComm::DefaultCommMode{}, ...)``
- ✓
* - ``MPI_Rsend``
- ``send<CommMode::Ready>``
- ``send(KokkosComm::ReadyCommMode{}, ...)``
- ✓
* - ``MPI_Recv``
- ``recv``
- ✓
* - ``MPI_Ssend``
- ``send<CommMode::Synchronous>``
- ``send(KokkosComm::SynchronousCommMode{}, ...)``
- ✓
* - ``MPI_Isend``
- ``isend`` or ``isend<CommMode::Standard>``
- ``isend`` or ``isend(KokkosComm::DefaultCommMode{}, ...)``
- ✓
* - ``MPI_Irsend``
- ``isend<CommMode::Ready>``
- ``isend(KokkosComm::ReadyCommMode{}, ...)``
- ✓
* - ``MPI_Issend``
- ``isend<CommMode::Synchronous>``
- ``isend(KokkosComm::SynchronousCommMode{}, ...)``
- ✓
* - ``MPI_Reduce``
- ``reduce``
@@ -36,32 +36,34 @@ Core
Point-to-point
--------------

.. cpp:function:: template <KokkosComm::CommMode SendMode, KokkosExecutionSpace ExecSpace, KokkosView SendView> \
.. cpp:function:: template <CommunicationMode SendMode, KokkosExecutionSpace ExecSpace, KokkosView SendView> \
Req KokkosComm::isend(const ExecSpace &space, const SendView &sv, int dest, int tag, MPI_Comm comm)

Wrapper for ``MPI_Isend``, ``MPI_Irsend`` and ``MPI_Issend``.

:param mode: The communication mode to use
:param space: The execution space to operate in
:param sv: The data to send
:param dest: the destination rank
:param tag: the MPI tag
:param comm: the MPI communicator
:tparam SendMode: A CommMode_ to use. If unspecified, defaults to a synchronous ``MPI_Issend`` if ``KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE`` is defined, otherwise defaults to a standard ``MPI_Isend``.
:tparam SendMode: A communication mode to use, one of: ``KokkosComm::DefaultCommMode``, ``KokkosComm::StandardCommMode``, ``KokkosComm::SynchronousCommMode`` or ``KokkosComm::ReadyCommMode`` (modeled with the ``KokkosComm::CommunicationMode`` concept)
:tparam SendView: A Kokkos::View to send
:tparam ExecSpace: A Kokkos execution space to operate in
:returns: A KokkosComm::Req representing the asynchronous communication and any lifetime-extended views.

.. cpp:function:: template <KokkosComm::CommMode SendMode, KokkosExecutionSpace ExecSpace, KokkosView SendView> \
.. cpp:function:: template <typename SendMode, KokkosExecutionSpace ExecSpace, KokkosView SendView> \
void KokkosComm::send(const ExecSpace &space, const SendView &sv, int dest, int tag, MPI_Comm comm)

Wrapper for ``MPI_Send``, ``MPI_Rsend`` and ``MPI_Ssend``.

:param mode: The communication mode to use
:param space: The execution space to operate in
:param sv: The data to send
:param dest: the destination rank
:param tag: the MPI tag
:param comm: the MPI communicator
:tparam SendMode: A CommMode_ to use. If unspecified, defaults to a synchronous ``MPI_Ssend`` if ``KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE`` is defined, otherwise defaults to a standard ``MPI_Send``.
:tparam SendMode: A communication mode to use, one of: ``KokkosComm::DefaultCommMode``, ``KokkosComm::StandardCommMode``, ``KokkosComm::SynchronousCommMode`` or ``KokkosComm::ReadyCommMode`` (modeled with the ``KokkosComm::CommunicationMode`` concept)
:tparam SendView: A Kokkos::View to send
:tparam ExecSpace: A Kokkos execution space to operate in
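
The documented change replaces an enum template argument (``send<CommMode::Ready>(...)``) with an empty tag object passed as the first argument (``send(KokkosComm::ReadyCommMode{}, ...)``). The following is a minimal sketch of how such tag dispatch can select an MPI routine at compile time. It is not the library's implementation: the ``sketch_dispatch`` namespace, the ``send_routine`` function, and the ``always_false`` helper are hypothetical, and the function only returns the name of the routine rather than calling MPI.

```cpp
#include <cassert>  // used by the run-time check mentioned in the usage note
#include <string_view>
#include <type_traits>

// Hypothetical stand-in namespace; this is NOT the real KokkosComm API.
namespace sketch_dispatch {

// Empty tag types, one per supported send mode.
struct StandardCommMode {};
struct ReadyCommMode {};
struct SynchronousCommMode {};

// Assumption: KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE is not defined.
using DefaultCommMode = StandardCommMode;

// Dependent false value, so the static_assert below only fires
// when the function is instantiated with an unsupported type.
template <typename>
inline constexpr bool always_false = false;

// Returns the name of the MPI routine that a blocking send in the
// given mode would wrap. The tag argument carries no data; only its
// type matters.
template <typename Mode>
constexpr std::string_view send_routine(Mode /*mode*/) {
  if constexpr (std::is_same_v<Mode, StandardCommMode>) {
    return "MPI_Send";
  } else if constexpr (std::is_same_v<Mode, ReadyCommMode>) {
    return "MPI_Rsend";
  } else if constexpr (std::is_same_v<Mode, SynchronousCommMode>) {
    return "MPI_Ssend";
  } else {
    static_assert(always_false<Mode>, "unsupported communication mode");
  }
}

// Mode-less overload: forwards the default tag, mirroring a plain send(...).
constexpr std::string_view send_routine() {
  return send_routine(DefaultCommMode{});
}

// Compile-time checks of the mapping.
static_assert(send_routine(ReadyCommMode{}) == "MPI_Rsend");
static_assert(send_routine() == "MPI_Send");

}  // namespace sketch_dispatch
```

With this shape, a caller can store the mode in a variable (``auto mode = DefaultCommMode();``) and forward it through generic code, as the updated benchmarks in this commit do. A run-time check such as ``assert(sketch_dispatch::send_routine(sketch_dispatch::SynchronousCommMode{}) == "MPI_Ssend")`` holds as well.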

@@ -116,28 +118,30 @@ Collective
Related Types
-------------

.. _CommMode:
Communication Modes
^^^^^^^^^^^^^^^^^^^

.. cpp:enum-class:: KokkosComm::CommMode
Structures to specify the mode of an operation. Buffered mode is not supported.

A scoped enum to specify the mode of an operation. Buffered mode is not supported.
.. cpp:struct:: KokkosComm::StandardCommMode

.. cpp:enumerator:: KokkosComm::CommMode::Standard
Let the MPI implementation decide whether outgoing messages will be buffered. Send operations can be started whether or not a matching receive has been started. They may complete before a matching receive begins. Standard mode is non-local: successful completion of the send operation may depend on the occurrence of a matching receive.

Standard mode: the MPI implementation decides whether outgoing messages will be buffered. Send operations can be started whether or not a matching receive has been started. They may complete before a matching receive is started. Standard mode is non-local: successful completion of the send operation may depend on the occurrence of a matching receive.
.. cpp:struct:: KokkosComm::SynchronousCommMode

.. cpp:enumerator:: KokkosComm::CommMode::Ready
Send operations complete successfully only if a matching receive is started, and the receive operation has started to receive the message sent.

Ready mode: Send operations may be started only if the matching receive is already started.
.. cpp:struct:: KokkosComm::ReadyCommMode

.. cpp:enumerator:: KokkosComm::CommMode::Synchronous
Send operations may be started only if the matching receive is already started.

Synchronous mode: Send operations complete successfully only if a matching receive is started, and the receive operation has started to receive the message sent.
.. cpp:struct:: KokkosComm::DefaultCommMode

.. cpp:enumerator:: KokkosComm::CommMode::Default
The default mode is aliased as ``Standard`` but lets users override the behavior of operations at compile-time using the ``KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE`` pre-processor definition. The latter forces ``Synchronous`` mode for all "default-mode" operations, which can be helpful for debugging purposes, e.g., asserting that the communication scheme is correct.

Default mode is an alias for ``Standard`` mode, but lets users override the behavior of operations at compile-time using the ``KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE`` pre-processor define. This forces ``Synchronous`` mode for all "default-mode" operations, which can be useful for debugging purposes, e.g., for asserting that the communication scheme is correct.
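
To illustrate the debugging switch described above, here is a small sketch of the alias being flipped by the pre-processor definition. The namespace ``sketch_default`` and the helper ``default_is_synchronous`` are hypothetical names used only for this example, and the macro is defined in the source purely to keep the sketch self-contained; in a real build it would more likely be passed on the compiler command line (for example ``-DKOKKOSCOMM_FORCE_SYNCHRONOUS_MODE`` with GCC or Clang).

```cpp
#include <cassert>  // used by the run-time check mentioned in the usage note
#include <type_traits>

// Defined here only so the sketch stands alone; normally a build flag.
#define KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE

// Hypothetical stand-in namespace; this is NOT the real KokkosComm API.
namespace sketch_default {

struct StandardCommMode {};
struct SynchronousCommMode {};

// Same selection logic as the alias documented above.
#ifdef KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE
using DefaultCommMode = SynchronousCommMode;
#else
using DefaultCommMode = StandardCommMode;
#endif

// True when "default-mode" operations have been forced to synchronous mode.
constexpr bool default_is_synchronous() {
  return std::is_same_v<DefaultCommMode, SynchronousCommMode>;
}

// With the macro defined, the default tag is the synchronous tag.
static_assert(default_is_synchronous());

}  // namespace sketch_default
```

Because the switch is a type alias, code written against ``DefaultCommMode`` needs no source changes to run in synchronous mode; only a rebuild with the definition is required. A run-time ``assert(sketch_default::default_is_synchronous())`` passes for the sketch as written.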

Requests
^^^^^^^^

.. cpp:class:: KokkosComm::Req

22 changes: 12 additions & 10 deletions perf_tests/test_2dhalo.cpp
@@ -22,9 +22,9 @@

void noop(benchmark::State, MPI_Comm) {}

template <typename Space, typename View>
void send_recv(benchmark::State &, MPI_Comm comm, const Space &space, int nx, int ny, int rx, int ry, int rs,
const View &v) {
template <KokkosComm::CommunicationMode Mode, typename Space, typename View>
void send_recv(benchmark::State &, MPI_Comm comm, const Mode &mode, const Space &space, int nx, int ny, int rx, int ry,
int rs, const View &v) {
// 2D index of nbrs in minus and plus direction (periodic)
const int xm1 = (rx + rs - 1) % rs;
const int ym1 = (ry + rs - 1) % rs;
@@ -48,10 +48,10 @@ void send_recv(benchmark::State &, MPI_Comm comm, const Space &space, int nx, in

std::vector<KokkosComm::Req> reqs;
// std::cerr << get_rank(rx, ry) << " -> " << get_rank(xp1, ry) << "\n";
reqs.push_back(KokkosComm::isend(space, xp1_s, get_rank(xp1, ry), 0, comm));
reqs.push_back(KokkosComm::isend(space, xm1_s, get_rank(xm1, ry), 1, comm));
reqs.push_back(KokkosComm::isend(space, yp1_s, get_rank(rx, yp1), 2, comm));
reqs.push_back(KokkosComm::isend(space, ym1_s, get_rank(rx, ym1), 3, comm));
reqs.push_back(KokkosComm::isend(mode, space, xp1_s, get_rank(xp1, ry), 0, comm));
reqs.push_back(KokkosComm::isend(mode, space, xm1_s, get_rank(xm1, ry), 1, comm));
reqs.push_back(KokkosComm::isend(mode, space, yp1_s, get_rank(rx, yp1), 2, comm));
reqs.push_back(KokkosComm::isend(mode, space, ym1_s, get_rank(rx, ym1), 3, comm));

KokkosComm::recv(space, xm1_r, get_rank(xm1, ry), 0, comm);
KokkosComm::recv(space, xp1_r, get_rank(xp1, ry), 1, comm);
@@ -82,12 +82,14 @@ void benchmark_2dhalo(benchmark::State &state) {
const int ry = rank / rs;

if (rank < rs * rs) {
auto mode = KokkosComm::DefaultCommMode();
auto space = Kokkos::DefaultExecutionSpace();
// grid of elements, each with 3 properties, and a radius-1 halo
grid_type grid("", nx + 2, ny + 2, nprops);
while (state.KeepRunning()) {
do_iteration(state, MPI_COMM_WORLD, send_recv<Kokkos::DefaultExecutionSpace, grid_type>, space, nx, ny, rx, ry,
rs, grid);
do_iteration(state, MPI_COMM_WORLD,
send_recv<KokkosComm::DefaultCommMode, Kokkos::DefaultExecutionSpace, grid_type>, mode, space, nx,
ny, rx, ry, rs, grid);
}
} else {
while (state.KeepRunning()) {
@@ -113,4 +115,4 @@ void benchmark_2dhalo(benchmark::State &state) {
// clang-format on
}

BENCHMARK(benchmark_2dhalo)->UseManualTime()->Unit(benchmark::kMillisecond);
BENCHMARK(benchmark_2dhalo)->UseManualTime()->Unit(benchmark::kMillisecond);
17 changes: 10 additions & 7 deletions perf_tests/test_osu_latency_isendirecv.cpp
@@ -21,11 +21,11 @@
#include "test_utils.hpp"
#include "KokkosComm.hpp"

template <typename Space, typename View>
void osu_latency_Kokkos_Comm_isendirecv(benchmark::State &, MPI_Comm comm, const Space &space, int rank,
const View &v) {
template <KokkosComm::CommunicationMode Mode, typename Space, typename View>
void osu_latency_Kokkos_Comm_isendirecv(benchmark::State &, MPI_Comm comm, const Mode &mode, const Space &space,
int rank, const View &v) {
if (rank == 0) {
KokkosComm::Req sendreq = KokkosComm::isend(space, v, 1, 1, comm);
KokkosComm::Req sendreq = KokkosComm::isend(mode, space, v, 1, 1, comm);
sendreq.wait();
} else if (rank == 1) {
KokkosComm::Req recvreq = KokkosComm::irecv(v, 0, 1, comm);
@@ -53,13 +53,16 @@ void benchmark_osu_latency_KokkosComm_isendirecv(benchmark::State &state) {
state.SkipWithError("benchmark_osu_latency_KokkosComm needs exactly 2 ranks");
}

auto mode = KokkosComm::DefaultCommMode();
auto space = Kokkos::DefaultExecutionSpace();
using view_type = Kokkos::View<char *>;
view_type a("A", state.range(0));

while (state.KeepRunning()) {
do_iteration(state, MPI_COMM_WORLD, osu_latency_Kokkos_Comm_isendirecv<Kokkos::DefaultExecutionSpace, view_type>,
space, rank, a);
do_iteration(
state, MPI_COMM_WORLD,
osu_latency_Kokkos_Comm_isendirecv<KokkosComm::DefaultCommMode, Kokkos::DefaultExecutionSpace, view_type>, mode,
space, rank, a);
}
state.counters["bytes"] = a.size() * 2;
}
@@ -90,4 +93,4 @@ BENCHMARK(benchmark_osu_latency_MPI_isendirecv)
->UseManualTime()
->Unit(benchmark::kMicrosecond)
->RangeMultiplier(8)
->Range(1, 1 << 28);
->Range(1, 1 << 28);
14 changes: 9 additions & 5 deletions perf_tests/test_osu_latency_sendrecv.cpp
@@ -21,10 +21,11 @@
#include "test_utils.hpp"
#include "KokkosComm.hpp"

template <typename Space, typename View>
void osu_latency_Kokkos_Comm_sendrecv(benchmark::State &, MPI_Comm comm, const Space &space, int rank, const View &v) {
template <KokkosComm::CommunicationMode Mode, typename Space, typename View>
void osu_latency_Kokkos_Comm_sendrecv(benchmark::State &, MPI_Comm comm, const Mode &mode, const Space &space, int rank,
const View &v) {
if (rank == 0) {
KokkosComm::send(space, v, 1, 0, comm);
KokkosComm::send(mode, space, v, 1, 0, comm);
} else if (rank == 1) {
KokkosComm::recv(space, v, 0, 0, comm);
}
@@ -48,13 +49,16 @@ void benchmark_osu_latency_KokkosComm_sendrecv(benchmark::State &state) {
state.SkipWithError("benchmark_osu_latency_KokkosComm needs exactly 2 ranks");
}

auto mode = KokkosComm::DefaultCommMode();
auto space = Kokkos::DefaultExecutionSpace();
using view_type = Kokkos::View<char *>;
view_type a("A", state.range(0));

while (state.KeepRunning()) {
do_iteration(state, MPI_COMM_WORLD, osu_latency_Kokkos_Comm_sendrecv<Kokkos::DefaultExecutionSpace, view_type>,
space, rank, a);
do_iteration(
state, MPI_COMM_WORLD,
osu_latency_Kokkos_Comm_sendrecv<KokkosComm::DefaultCommMode, Kokkos::DefaultExecutionSpace, view_type>, mode,
space, rank, a);
}
state.counters["bytes"] = a.size() * 2;
}
15 changes: 9 additions & 6 deletions perf_tests/test_sendrecv.cpp
@@ -18,14 +18,14 @@

#include "KokkosComm.hpp"

template <typename Space, typename View>
void send_recv(benchmark::State &, MPI_Comm comm, const Space &space, int rank, const View &v) {
template <KokkosComm::CommunicationMode Mode, typename Space, typename View>
void send_recv(benchmark::State &, MPI_Comm comm, const Mode &mode, const Space &space, int rank, const View &v) {
if (0 == rank) {
KokkosComm::send(space, v, 1, 0, comm);
KokkosComm::send(mode, space, v, 1, 0, comm);
KokkosComm::recv(space, v, 1, 0, comm);
} else if (1 == rank) {
KokkosComm::recv(space, v, 0, 0, comm);
KokkosComm::send(space, v, 0, 0, comm);
KokkosComm::send(mode, space, v, 0, 0, comm);
}
}

@@ -39,15 +39,18 @@ void benchmark_sendrecv(benchmark::State &state) {

using Scalar = double;

auto mode = KokkosComm::DefaultCommMode();
auto space = Kokkos::DefaultExecutionSpace();
using view_type = Kokkos::View<Scalar *>;
view_type a("", 1000000);

while (state.KeepRunning()) {
do_iteration(state, MPI_COMM_WORLD, send_recv<Kokkos::DefaultExecutionSpace, view_type>, space, rank, a);
do_iteration(state, MPI_COMM_WORLD,
send_recv<KokkosComm::DefaultCommMode, Kokkos::DefaultExecutionSpace, view_type>, mode, space, rank,
a);
}

state.SetBytesProcessed(sizeof(Scalar) * state.iterations() * a.size() * 2);
}

BENCHMARK(benchmark_sendrecv)->UseManualTime()->Unit(benchmark::kMillisecond);
BENCHMARK(benchmark_sendrecv)->UseManualTime()->Unit(benchmark::kMillisecond);
2 changes: 1 addition & 1 deletion src/KokkosComm.hpp
@@ -25,7 +25,7 @@
#include "KokkosComm_alltoall.hpp"
#include "KokkosComm_barrier.hpp"
#include "KokkosComm_concepts.hpp"
#include "KokkosComm_comm_mode.hpp"
#include "KokkosComm_comm_modes.hpp"

#include <Kokkos_Core.hpp>

43 changes: 0 additions & 43 deletions src/KokkosComm_comm_mode.hpp

This file was deleted.

47 changes: 47 additions & 0 deletions src/KokkosComm_comm_modes.hpp
@@ -0,0 +1,47 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 4.0
// Copyright (2022) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions.
// See https://kokkos.org/LICENSE for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//@HEADER

#pragma once

// See section 3.4 of the MPI standard for a complete specification.
namespace KokkosComm {

// Standard mode: MPI implementation decides whether outgoing messages will
// be buffered. Send operations can be started whether or not a matching
// receive has been started. They may complete before a matching receive is
// started. Standard mode is non-local: successful completion of the send
// operation may depend on the occurrence of a matching receive.
struct StandardCommMode {};

// Ready mode: Send operations may be started only if the matching receive is
// already started.
struct ReadyCommMode {};

// Synchronous mode: Send operations complete successfully only if a matching
// receive is started, and the receive operation has started to receive the
// message sent.
struct SynchronousCommMode {};

// Default mode: lets the user override the behavior of send operations at
// compile-time. E.g., this can be set to mode "Synchronous" for debug
// builds by defining KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE.
#ifdef KOKKOSCOMM_FORCE_SYNCHRONOUS_MODE
using DefaultCommMode = SynchronousCommMode;
#else
using DefaultCommMode = StandardCommMode;
#endif

} // namespace KokkosComm
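
The documentation in this commit says the mode types are modeled with a ``KokkosComm::CommunicationMode`` concept, but its definition does not appear in this excerpt. As a rough illustration of the idea only, the sketch below expresses the same kind of constraint as a C++17 type trait: a type qualifies if it is one of the three tag structs. The names ``sketch_trait``, ``is_communication_mode`` and ``is_communication_mode_v`` are assumptions made for this example, and the real concept may be written differently.

```cpp
#include <cassert>  // used by the run-time checks mentioned in the usage note
#include <type_traits>

// Hypothetical stand-in namespace; this is NOT the real KokkosComm API.
namespace sketch_trait {

struct StandardCommMode {};
struct ReadyCommMode {};
struct SynchronousCommMode {};

// A type is a communication mode if it is exactly one of the tag structs.
template <typename T>
struct is_communication_mode
    : std::disjunction<std::is_same<T, StandardCommMode>,
                       std::is_same<T, ReadyCommMode>,
                       std::is_same<T, SynchronousCommMode>> {};

template <typename T>
inline constexpr bool is_communication_mode_v = is_communication_mode<T>::value;

// The tag structs satisfy the trait; unrelated types do not.
static_assert(is_communication_mode_v<ReadyCommMode>);
static_assert(!is_communication_mode_v<int>);

}  // namespace sketch_trait
```

A ``DefaultCommMode`` alias needs no separate entry in such a trait, since an alias names one of the listed structs rather than a new type. In C++20 code, a concept could wrap a trait like this one so that templates such as ``template <CommunicationMode Mode, ...>`` reject unsupported arguments with a clearer diagnostic.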
2 changes: 2 additions & 0 deletions src/KokkosComm_traits.hpp
@@ -22,6 +22,8 @@

#include "KokkosComm_concepts.hpp"

#include <type_traits>

namespace KokkosComm {

template <typename T>