Support NBAs to arrays inside loops
For NBAs that might execute a dynamic number of times in a single
evaluation (specifically: those that assign to array elements inside
loops), we introduce a new run-time VlNBACommitQueue data-structure
(currently a vector), which stores all pending updates and the necessary
info to reconstruct the LHS reference of the AstAssignDly at run-time.

Each variable that needs a commit queue has its own unique commit
queue.

All NBAs to a variable that requires a commit queue go through the
commit queue. This is necessary to preserve update order in sequential
code, e.g.:
 a[7] <= 10;
 for (int i = 1 ; i < 10; ++i) a[i] <= i;
 a[2] <= 10;
needs to end with array elements 1..9 being 1, 10, 3, 4, 5, 6, 7, 8, 9.
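
The ordering requirement above can be illustrated with a minimal sketch. This is not the VlNBACommitQueue added by this patch: `PendingWrite`, `SimpleCommitQueue`, and `runExample` are hypothetical names, and the sketch assumes a one-dimensional array of `int` with all three NBAs routed through the same queue.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a per-variable NBA commit queue
struct PendingWrite {
    std::size_t index;  // Array index of the deferred write
    int value;  // Value to store at commit time
};

class SimpleCommitQueue {
    std::vector<PendingWrite> m_pending;  // Pending writes, in program order

public:
    // Record a deferred write; nothing is stored to the array yet
    void enqueue(int value, std::size_t index) { m_pending.push_back({index, value}); }

    // Apply all deferred writes in the order they were enqueued
    template <typename T_Array>
    void commit(T_Array& target) {
        for (const PendingWrite& w : m_pending) {
            assert(w.index < target.size());
            target[w.index] = w.value;
        }
        m_pending.clear();
    }
};

// Models:  a[7] <= 10; for (int i = 1; i < 10; ++i) a[i] <= i; a[2] <= 10;
inline std::array<int, 10> runExample() {
    std::array<int, 10> a{};  // All elements start at 0
    SimpleCommitQueue queue;
    queue.enqueue(10, 7);
    for (int i = 1; i < 10; ++i) queue.enqueue(i, static_cast<std::size_t>(i));
    queue.enqueue(10, 2);
    queue.commit(a);  // End of the evaluation: the updates become visible
    return a;
}
```

Because the loop's write to a[7] is enqueued after the initial `a[7] <= 10`, it wins at commit time, while the trailing `a[2] <= 10` overrides the loop's write to a[2]. If only the loop went through the queue and the other two NBAs used a separate mechanism, this relative order could not be guaranteed.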

This enables supporting common forms of NBAs to arrays on the left hand
side of <= in non-suspendable/non-fork code. (Suspendable/fork
implementation is unclear to me so I left it unchanged, see verilator#5084).

Any NBA that does not need a commit queue (i.e., those that were
supported before) uses the same scheme as before, and this patch should
have no effect on the generated code for those NBAs.
gezalore committed May 2, 2024
1 parent 8044833 commit 8b56988
Showing 12 changed files with 877 additions and 70 deletions.
29 changes: 20 additions & 9 deletions docs/guide/warnings.rst
@@ -210,22 +210,33 @@ List Of Warnings
.. option:: BLKLOOPINIT

-.. TODO better example
-This indicates that the initialization of an array needs to use
-non-delayed assignments. This is done in the interest of speed; if
-delayed assignments were used, the simulator would have to copy large
-arrays every cycle. (In smaller loops, loop unrolling allows the
-delayed assignment to work, though it's a bit slower than a non-delayed
-assignment.) Here's an example
Indicates certain constructs where non-blocking assignments to unpacked
arrays (memories) are not supported inside loops. These typically appear in
initialization/reset code:

.. code-block:: sv
always @(posedge clk)
if (~reset_l)
for (i=0; i<`ARRAY_SIZE; i++)
-array[i] = 0; // Non-delayed for verilator
array[i] <= 0; // Non-blocking assignment inside loop
else
array[address] <= data;
While this is supported in typical synthesizable code (including the
example above), some complicated cases are not supported. Namely:

1. If the above loop is inside a suspendable process or fork statement.

2. If the variable is also the target of a '<=' non-blocking assignment
in a suspendable process or fork statement (in addition to a synthesizable
loop).

3. If the element type of the array is a compound type.

It might slightly improve run-time performance if you change the
non-blocking assignment inside the loop into a blocking assignment
(that is: use '=' instead of '<='), if possible.

This message is only seen on large or complicated loops because
Verilator generally unrolls small loops. You may want to try increasing
187 changes: 187 additions & 0 deletions include/verilated_types.h
@@ -415,8 +415,20 @@ class VlWriteMem final {

static int _vl_cmp_w(int words, WDataInP const lwp, WDataInP const rwp) VL_PURE;

template <std::size_t T_Words>
struct VlWide;

// Type trait to check if a type is VlWide
template <typename>
struct VlIsVlWide : public std::false_type {};

template <std::size_t T_Words>
struct VlIsVlWide<VlWide<T_Words>> : public std::true_type {};
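
A trait like this is typically combined with `std::enable_if` to pick between a native and a word-by-word implementation of the same operation. The following is a self-contained sketch of that dispatch; `Wide`, `IsWide`, and `bitAnd` are hypothetical stand-ins and not the Verilator types.

```cpp
#include <cassert>  // Used by the usage example
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a multi-word value (not the real VlWide)
template <std::size_t N_Words>
struct Wide {
    static constexpr std::size_t Words = N_Words;
    std::uint32_t m_storage[N_Words];
};

// Primary template: arbitrary types are not wide
template <typename>
struct IsWide : std::false_type {};

// Partial specialization: matches every instantiation of Wide
template <std::size_t N_Words>
struct IsWide<Wide<N_Words>> : std::true_type {};

// Overload selected for native integer types
template <typename T>
typename std::enable_if<!IsWide<T>::value, T>::type bitAnd(const T& a, const T& b) {
    return a & b;
}

// Overload selected for wide values: operate word by word
template <typename T>
typename std::enable_if<IsWide<T>::value, T>::type bitAnd(const T& a, const T& b) {
    T result{};
    for (std::size_t i = 0; i < T::Words; ++i) {
        result.m_storage[i] = a.m_storage[i] & b.m_storage[i];
    }
    return result;
}
```

For example, `bitAnd<std::uint32_t>(0xF0F0u, 0xFF00u)` resolves to the first overload, while calling `bitAnd` on two `Wide<2>` values resolves to the second. Exactly one overload is viable for any given `T`, so the call is never ambiguous.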

template <std::size_t T_Words>
struct VlWide final {
static constexpr size_t Words = T_Words;

// MEMBERS
// This should be the only data member, otherwise generated static initializers need updating
EData m_storage[T_Words]; // Contents of the packed array
@@ -1511,6 +1523,181 @@ std::string VL_TO_STRING(const VlUnpacked<T_Value, T_Depth>& obj) {
return obj.to_string();
}

//===================================================================
// Helper to apply the given indices to a target expression

template <size_t Curr, size_t Rank, typename T_Target>
struct VlApplyIndices final {
VL_ATTR_ALWINLINE
static auto& apply(T_Target& target, const size_t* indicesp) {
return VlApplyIndices<Curr + 1, Rank, decltype(target[indicesp[Curr]])>::apply(
target[indicesp[Curr]], indicesp);
}
};

template <size_t Rank, typename T_Target>
struct VlApplyIndices<Rank, Rank, T_Target> final {
VL_ATTR_ALWINLINE
static T_Target& apply(T_Target& target, const size_t*) { return target; }
};
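
To see how the recursion in the helper above unwinds, here is a standalone sketch of the same pattern applied to nested `std::array`s. `ApplyIndices`, `Grid`, and `setElement` are hypothetical names used only for illustration, and the sketch assumes C++14 or later for the deduced `auto&` return type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Recursive case: consume indicesp[Curr], then continue with the sub-array
template <std::size_t Curr, std::size_t Rank, typename T_Target>
struct ApplyIndices {
    static auto& apply(T_Target& target, const std::size_t* indicesp) {
        return ApplyIndices<Curr + 1, Rank, decltype(target[indicesp[Curr]])>::apply(
            target[indicesp[Curr]], indicesp);
    }
};

// Base case: all Rank indices are consumed, 'target' is the leaf element
template <std::size_t Rank, typename T_Target>
struct ApplyIndices<Rank, Rank, T_Target> {
    static T_Target& apply(T_Target& target, const std::size_t*) { return target; }
};

// Hypothetical helper: write 'value' to grid[row][col] through ApplyIndices
using Grid = std::array<std::array<int, 3>, 2>;

inline void setElement(Grid& grid, std::size_t row, std::size_t col, int value) {
    assert(row < grid.size() && col < grid[row].size());
    const std::size_t indices[2] = {row, col};
    ApplyIndices<0, 2, Grid>::apply(grid, indices) = value;
}
```

With `Rank` equal to 2, the chain of instantiations is `ApplyIndices<0, 2, Grid>`, then `ApplyIndices<1, 2, std::array<int, 3>&>`, then the base case `ApplyIndices<2, 2, int&>`. Reference collapsing makes `T_Target&` an lvalue reference at every level, so the final result is an assignable `int&`.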

//===================================================================
// Commit queue for NBAs - currently only for unpacked arrays
//
// This data-structure is used to handle non-blocking assignments
// that might execute a variable number of times in a single
// evaluation. It has 2 operations:
// - 'enqueue' will add an update to the queue
// - 'commit' will apply all enqueued updates to the target variable,
// in the order they were enqueued. This ensures the last NBA
// takes effect as it is expected.
// There are 2 specializations of this class below:
// - A version when a partial element update is not required,
// e.g, to handle:
// logic [31:0] array[N];
// for (int i = 0 ; i < N ; ++i) array[i] <= x;
// Here 'enqueue' takes the RHS ('x'), and the array indices ('i')
// as arguments.
// - A different version when a partial element update is required,
// e.g. for:
// logic [31:0] array[N];
// for (int i = 0 ; i < N ; ++i) array[i][3:1] <= y;
// Here 'enqueue' takes one additional argument, which is a bitmask
// derived from the bit selects (_[3:1]), which masks the bits that
// need to be updated, and additionally the RHS is widened to a full
// element size, with the bits inserted into the masked region.
template <typename T_Target, // Type of the variable this commit queue updates
bool Partial, // Whether partial element updates are necessary
// The following we could figure out from 'T_Target' using type traits, but passing
// explicitly to avoid template expansion, as Verilator already knows them
typename T_Element, // Non-array leaf element type of T_Target array
std::size_t T_Rank // Rank of T_Target (i.e.: how many dimensions it has)
>
class VlNBACommitQueue;

// Specialization for whole element updates only
template <typename T_Target, typename T_Element, std::size_t T_Rank>
class VlNBACommitQueue<T_Target, /* Partial: */ false, T_Element, T_Rank> final {
// TYPES
struct Entry final {
T_Element value;
size_t indices[T_Rank];
};

// STATE
std::vector<Entry> m_pending; // Pending updates, in program order

public:
// CONSTRUCTOR
VlNBACommitQueue() = default;
~VlNBACommitQueue() = default;

// METHODS
template <typename... Args>
void enqueue(const T_Element& value, Args... indices) {
m_pending.emplace_back(Entry{value, {indices...}});
}

// Note: T_Commit might be different from T_Target. Specifically, when the signal is a
// top-level IO port, T_Commit will be a native C array, while T_Target will be a VlUnpacked
template <typename T_Commit>
void commit(T_Commit& target) {
if (m_pending.empty()) return;
for (const Entry& entry : m_pending) {
VlApplyIndices<0, T_Rank, T_Commit>::apply(target, entry.indices) = entry.value;
}
m_pending.clear();
}
};

// With partial element updates
template <typename T_Target, typename T_Element, std::size_t T_Rank>
class VlNBACommitQueue<T_Target, /* Partial: */ true, T_Element, T_Rank> final {
// TYPES
struct Entry final {
T_Element value;
T_Element mask;
size_t indices[T_Rank];
};

// STATE
std::vector<Entry> m_pending; // Pending updates, in program order

// STATIC METHODS

// Binary & | ~ for elements to use for masking in partial updates. Sorry for the templates.
template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<!VlIsVlWide<T>::value, T>::type
bAnd(const T& a, const T& b) {
return a & b;
}

template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<VlIsVlWide<T>::value, T>::type
bAnd(const T& a, const T& b) {
T result;
for (size_t i = 0; i < T::Words; ++i) {
result.m_storage[i] = a.m_storage[i] & b.m_storage[i];
}
return result;
}

template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<!VlIsVlWide<T>::value, T>::type
bOr(const T& a, const T& b) {
return a | b;
}

template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<VlIsVlWide<T>::value, T>::type //
bOr(const T& a, const T& b) {
T result;
for (size_t i = 0; i < T::Words; ++i) {
result.m_storage[i] = a.m_storage[i] | b.m_storage[i];
}
return result;
}

template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<!VlIsVlWide<T>::value, T>::type
bNot(const T& a) {
return ~a;
}

template <typename T>
VL_ATTR_ALWINLINE static typename std::enable_if<VlIsVlWide<T>::value, T>::type
bNot(const T& a) {
T result;
for (size_t i = 0; i < T::Words; ++i) result.m_storage[i] = ~a.m_storage[i];
return result;
}

public:
// CONSTRUCTOR
VlNBACommitQueue() = default;
~VlNBACommitQueue() = default;

// METHODS
template <typename... Args>
void enqueue(const T_Element& value, const T_Element& mask, Args... indices) {
m_pending.emplace_back(Entry{value, mask, {indices...}});
}

// Note: T_Commit might be different from T_Target. Specifically, when the signal is a
// top-level IO port, T_Commit will be a native C array, while T_Target will be a VlUnpacked
template <typename T_Commit>
void commit(T_Commit& target) {
if (m_pending.empty()) return;
for (const Entry& entry : m_pending) { //
auto& ref = VlApplyIndices<0, T_Rank, T_Commit>::apply(target, entry.indices);
// Maybe inefficient, but it works for now ...
const auto oldValue = ref;
ref = bOr(bAnd(entry.value, entry.mask), bAnd(oldValue, bNot(entry.mask)));
}
m_pending.clear();
}
};
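
The masked merge performed by the partial-update commit can be worked through on a single 32-bit element. The sketch below models `array[i][3:1] <= y` under the assumption that the element fits in a `std::uint32_t`; `makeMask`, `mergeMasked`, and `partialUpdate` are hypothetical helpers, not functions from this patch.

```cpp
#include <cstdint>

// Hypothetical helper: mask with bits [msb:lsb] set, assuming lsb <= msb <= 31
constexpr std::uint32_t makeMask(unsigned msb, unsigned lsb) {
    return (msb >= 31 ? ~std::uint32_t{0} : ((std::uint32_t{1} << (msb + 1)) - 1u))
           & ~((std::uint32_t{1} << lsb) - 1u);
}

// Keep the old bits outside the mask, take the new bits inside the mask
constexpr std::uint32_t mergeMasked(std::uint32_t oldValue, std::uint32_t value,
                                    std::uint32_t mask) {
    return (value & mask) | (oldValue & ~mask);
}

// Models 'element[msb:lsb] <= rhs' at commit time: the RHS is first widened
// to the element width and shifted into position, then merged under the mask
constexpr std::uint32_t partialUpdate(std::uint32_t oldValue, std::uint32_t rhs,
                                      unsigned msb, unsigned lsb) {
    return mergeMasked(oldValue, rhs << lsb, makeMask(msb, lsb));
}
```

For the `[3:1]` select the mask is `0xE`. Writing `3'b010` into an element that currently holds `0xFFFFFFFF` gives `0xFFFFFFF5`: bits 3:1 become `010` and all other bits keep their old value. Because the old value is read at commit time rather than at enqueue time, two partial updates to different bit ranges of the same element compose correctly.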

//===================================================================
// Object that VlDeleter is capable of deleting

24 changes: 24 additions & 0 deletions src/V3AstNodeDType.h
@@ -959,6 +959,30 @@ class AstMemberDType final : public AstNodeDType {
return false;
}
};
class AstNBACommitQueueDType final : public AstNodeDType {
// @astgen ptr := m_subDTypep : AstNodeDType // Type of the corresponding variable
const bool m_partial; // Partial element update required

public:
AstNBACommitQueueDType(FileLine* fl, AstNodeDType* subDTypep, bool partial)
: ASTGEN_SUPER_NBACommitQueueDType(fl)
, m_partial{partial}
, m_subDTypep{subDTypep} {
dtypep(this);
}
ASTGEN_MEMBERS_AstNBACommitQueueDType;

AstNodeDType* subDTypep() const { return m_subDTypep; }
bool partial() const { return m_partial; }
bool similarDType(const AstNodeDType* samep) const override { return this == samep; }
AstBasicDType* basicp() const override { return nullptr; }
AstNodeDType* skipRefp() const override { return (AstNodeDType*)this; }
AstNodeDType* skipRefToConstp() const override { return (AstNodeDType*)this; }
AstNodeDType* skipRefToEnump() const override { return (AstNodeDType*)this; }
int widthAlignBytes() const override { return 1; }
int widthTotalBytes() const override { return 24; }
bool isCompound() const override { return true; }
};
class AstParamTypeDType final : public AstNodeDType {
// Parents: MODULE
// A parameter type statement; much like a var or typedef
17 changes: 17 additions & 0 deletions src/V3AstNodes.cpp
@@ -788,6 +788,22 @@ AstNodeDType::CTypeRecursed AstNodeDType::cTypeRecurse(bool compound, bool packe
info.m_type = "VlUnpacked<" + sub.m_type;
info.m_type += ", " + cvtToStr(adtypep->declRange().elements());
info.m_type += ">";
} else if (const auto* const adtypep = VN_CAST(dtypep, NBACommitQueueDType)) {
UASSERT_OBJ(!packed, this, "Unsupported type for packed struct or union");
compound = true;
const CTypeRecursed sub = adtypep->subDTypep()->cTypeRecurse(compound, false);
AstNodeDType* eDTypep = adtypep->subDTypep();
unsigned rank = 0;
while (AstUnpackArrayDType* const uaDTypep = VN_CAST(eDTypep, UnpackArrayDType)) {
eDTypep = uaDTypep->subDTypep()->skipRefp();
++rank;
}
info.m_type = "VlNBACommitQueue<";
info.m_type += sub.m_type;
info.m_type += adtypep->partial() ? ", true" : ", false";
info.m_type += ", " + eDTypep->cTypeRecurse(compound, false).m_type;
info.m_type += ", " + std::to_string(rank);
info.m_type += ">";
} else if (packed && (VN_IS(dtypep, PackArrayDType))) {
const AstPackArrayDType* const adtypep = VN_CAST(dtypep, PackArrayDType);
const CTypeRecursed sub = adtypep->subDTypep()->cTypeRecurse(false, true);
@@ -2683,6 +2699,7 @@ void AstCMethodHard::setPurity() {
{"commit", false},
{"delay", false},
{"done", false},
{"enqueue", false},
{"erase", false},
{"evaluate", false},
{"evaluation", false},
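
As an illustration of the type string assembled by the `NBACommitQueueDType` branch of `cTypeRecurse` earlier in this file, here is a simplified standalone model. `ArrayType`, `arrayCType`, and `commitQueueCType` are hypothetical; the real code walks AST data type nodes, and the exact element type strings it emits may differ from the ones assumed here.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of a data type: 'rank' unpacked dimensions
// (outermost first) wrapped around a leaf element C type
struct ArrayType {
    std::string elementCType;  // e.g. "IData"
    unsigned dims[4];  // Elements per dimension; only the first 'rank' are used
    unsigned rank;  // Number of unpacked dimensions, at most 4
};

// C type name of the array itself, e.g. "VlUnpacked<IData, 8>"
inline std::string arrayCType(const ArrayType& t) {
    assert(t.rank <= 4);
    std::string type = t.elementCType;
    // Wrap from the innermost dimension outwards
    for (unsigned i = t.rank; i > 0; --i) {
        type = "VlUnpacked<" + type + ", " + std::to_string(t.dims[i - 1]) + ">";
    }
    return type;
}

// C type name of the matching commit queue
inline std::string commitQueueCType(const ArrayType& t, bool partial) {
    std::string type = "VlNBACommitQueue<";
    type += arrayCType(t);
    type += partial ? ", true" : ", false";  // Ternary evaluated on its own, then appended
    type += ", " + t.elementCType;
    type += ", " + std::to_string(t.rank);
    type += ">";
    return type;
}
```

Under these assumptions, a one-dimensional array of 8 `IData` elements without partial updates yields `VlNBACommitQueue<VlUnpacked<IData, 8>, false, IData, 1>`. Appending the ternary result in its own statement matters: in an expression of the form `", " + cond ? a : b`, the `+` binds tighter than `?:`, so the condition tested would be the pointer sum rather than `cond`.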