Perpetuum generates Petri net code, and triggers transitions when it is their time. Erlang is good at handling process logic and responding to events. Let's bring them together!
The aim of this document is to describe how Erlang code can capture Petri nets efficiently and true to its nature. *For a more concise description, please see BITFIELDVEC.MD.
The average Perpetuum implementation in Erlang will have three layers of code, usually all contained in one process:
- Message reception, used to trigger Transitions
- The Petri Net driver; code generated by Perpetuum
- Transition handling; responding to Petri Nets
Message reception constitutes the interface of the process; the driver is a combination of code specific to the Petri Net that was input to the compiler and, quite possibly, generic code incorporated into it. Transition handling is where the application enforces its own logic, by communicating acceptance, rejection or deferral of possible transitions. In most use cases, message handling and transition handling would be integrated into one module and the generated code would be in another.
The main loop is not contained in the Petri Net driver, as is the case with the gen_server
. Nothing stops us from adding a gen_server
to take the role of the main loop, but the Petri Net is called from something outside its control.
The Petri Net driver is compiled into a module of its own. This means that we can still invoke routines like myPetriNet:init/1
which would return a new Petri Net instance with its initial marking. It is up to the context whether this is done for a separate process. The returned type is myPetriNet:pnc()
which is opaque to the callers of the module through a -opaque
declaration.
The state of a Petri Net is quite simply the state that is being carried around. Transitions and Places can also have some state attached, TODO:maybe.
We already concluded that events are properly modelled by transitions in a Petri net. There is a difference between transitions caused by outside events, and others that are triggered as soon as tokens are ready, but that has already been solved in the C interface for transition callbacks which have the power to accept, reject or defer transitions based on application logic. In Erlang, this could be defined as
TODO: Any application state should be merged in with these; see gen_server
for inspiration.
-type transreply() :: {noreply,pnc(),NewState} |
{reply,Reply,pnc(),NewState} |
{error,Reason} |
retry |
{retry,Event} |
{delay,int()}.
-spec handle_trans( pnc(), transition(), evdata(), State ) :: { pnc(), transreply() }.
-callback callback_trans( pnc(), transition(), time(), evdata(), State ) -> transreply().
The return values typed noreply
and {reply,Response}
both indicate success, whereas {error,Reason}
is a failure to perform the transition in the current marking; the marking and application state should only change for successful situations. The return values typed {delay,int()}
and retry
and {retry,Event}
indicate current inability to perform, but a later attempt may be more successful; the marking will not change, though a note of the delay time would usually be taken and enforced; the application logic may use the Event
to pass information about application updates that would warrent a retry.
The Petri Net driver receives attempts to trigger a transition through handle_trans/4
and when the current marking and possibly stored timeouts allow it, will pass the attempt on to the application logic through the callback_trans/5
callback function.
The handle_trans/4
function is used by the messaging interface to communicate a desire to trigger a transition. The trans_reply()
that it receives may indicate that a later retry is prudent, but it is up to the messaging interface to respond as it sees fit for the messaging protocol implemented.
The callback_trans/5
callback is used by the Petri Net driver to invoke application logic, and attempt a transition that is possible in the current marking. Application logic should be used to decide for or against it. The application logic can assume that any proposals made are in line with the Petri Net and its current marking.
Where we returned a maximum value for the delay in C, and hoped for another triggering event, we have explicitly modelled the desire for future triggering through the retry
replies.
To create a new instance, the customary invocation is through spawn/3
or spawn_link/3
. The function to call would be the main loop for message handling, which should in turn invoke init/1
in the generated Petri Net driver code. The result of this spawning action is to create the application logic including Petri Net driver in a new process. The new process is instantiated with the initial marking in its specification.
Events may be triggered on other processes; to that end, we can setup a gen_server
instance with the usual handle_call
and handle_cast
functions. These would represent the handling of events that tickle the Petri net in the hope to move it forward through the call_trans/4
function.
To trigger a transition from outside the Petri Net's own process, so from a process main loop that we might introduce, we can define functions that resemble call
and cast
:
-spec trigger( pid(), transition(), evdata() ) -> trans_reply().
-spec trigger( pid(), transition(), evdata(),Timeout ) -> trans_reply().
-spec tickle( transition(), evdata() ) -> none().
-spec tickle( transition(), evdata(), Timeout ) -> none().
The first is like call
, in that it ensures delivery and handling. When it is not handled, it will have a bad hair day — and make sure you do, too. The second is like cast
, in that it ships an attempted transition but merely shrugs when it fails. The choice between these depends on the application at hand. Note that tickle
is asynchronous and does not guarantee any order of transition handling; that is left to the sequence in the program invoking it. Note that both functions invoke handle_trans
but differ in their handling of the values returned.
The Timeout
parameter is the number of milliseconds that the transition may be held up. If it would be held up longer, it will return the timeout instruction immediately. A Timeout
of 0 indicates that no timeout at all is acceptable and is assumed as a default in trigger/3
; a Timeout
value of infinity
indicates that any timeout should be accepted and awaited and is assumed as a default in tickle/2
.
Perpetuum puts a lot of effort in making models that are small. Let us explain why before embarking on the implications that this has for our model.
Warning: Some lovely cruft is coming up!
There is a very good reason for keeping our data small and efficiently computing. Erlang is happy to manage large numbers of processes at the same time, and Petri Nets may model the flow of many of them. Small processes should stay that —small— in the interest of scalability. We use an integer representation that packs the Petri Net marking very tightly as a way to a highly scalable system that can smoothly run large numbers of Petri Net processes.
We used compact arrays to represent Petri Nets in C, but this is not very efficient in Erlang. Its data structures are more complex, and stack-based. We might use records or maps, naming all the individual additions separately and constantly reconstructing the data sets.
Ideally however, we would handle vectors, but these are not available in Erlang. Well, not directly. What Erlang does have are integers of arbitrary size. We can use those to contain a vector of bit fields!
The number of tokens in a place occupies a limited number of bits, and we might configure how many or make an estimate of, say, 7 or 8 bits each. It is quite possible to combine several of these small values in one arbitrary-sized integer and exploit the efficiency of machine-coded inner loops for large-integer addition and subtraction. To test for overflow or underflow in these vectorised bit fields, we might additionally use the bitwise operators band
, bor
and bnot
and bxor
.
When our transitions do not cause more than 7 bits of change, then 8-bit addition must never lead to an overflow; subtraction must never lead to an underflow. This can be detected by looking at what would be called "sign change" if the field is considered a signed bit field. When the sign changes, we can be sure that we are seeing overflow or underflow.
Underflow is great to detect efficiently, because it indicates that an attempted grab of tokens is not possible in the current marking. All we need for that is some trickery with bnot
and band
surrounding our subtraction.
Overflow is just as easy to detect efficiently, and it is just as informative. Depending on implementation choices made by Perpetuum while compiling a Petri Net to Erlang, overflow may be a sign of a Petri Net on the run, and may be considered fatal. Alternatively, it may be treated as a sign that the curent range is too small, and an expansion of all the bits is needed:
Bits = 8 * byte_size (InWord) div VectorWords,
OutWord = << X:(Bits+8) || <<X:Bits>> <= InWord >>.
Yes, I am mad.
The procedure described above needs a bit more time than the minimum; the bnot
and band
operations probably store intermediate values on the heap, and that would cause garbage collection to run more often.
When we make a small change, namely to have a bit between each word and require that it is always 0, then detection of overflow and underflow is much simpler:
<< <<S:1>> || <<S:1,_:(Bits-1)>> <= <<InWord>>, S/=0 >>.
We shall have to store everything, including all transition updates, in this format. The values can be shared between loop iterations, but probably not between processes; TODO: Mapping from the overall length of the marking vector?
Since Erlang will automatically make the choice to store a small integer on the process heap and the big number in explicitly allocated memory, it is nice to know how much leeway we have.
On 64-bit machines, we have 60 bits worth of storage in the process heap. One bit of that is (probably) lost to a sign, so we end up with 59 bits of unsigned integer data.
As an example, we can fit a Petri Net of 11 Places in this with 5 bits each. This means none of the transitions may not add or remove more than 4 bits of data, so the maximum transition multiplicity for the Petri Net would need to be 15. And each of these places could hold up to 15 tokens. Once this is no longer possible, we can scale up as described before (though we may then choose another algorithm than (Bits+8)
since we need not limit the mechanism to byte boundaries. There is no problem here; we can easily use (Bits+1)
since it is certain to solve our problem and any new overflows are going to map to new positions anyway.
We can in fact be a little more efficient by setting up a clever addition function that responds to overflow by making an answer in 1 bit extra, while keeping the input in its original size. Yummy. It would simply return the possibly renewed bit mask for the sign bits along with the new marking.
Note that the initial heap provided to an Erlang process is 233 words of 64 bits, so the amount of state until yet another round of garbage collection is fairly easily reached.
But it
looks like
there is no malloc()
and free()
overhead for small integer data, however. There simply is a bounce-up going from 59 bits in 1 word to 3 words and more. Meaning, we can start small, but when we outgrow 59 bits we can immediately scale up to 3 words.
A final note: In the above, we've not assumed that we can use the sign bit from the vector-holding integer. This is not always necessary, because band
and such work as expected on a 2's complement representation. We can simply reserve the first / highest word to one that matches the available bits in that position — which may be more or less, depending on rounding, but we can hardly expect to benefit from that without loss of efficiency and simplicity of our code.
The data we will store for a Petri Net involves its current marking. This is modified whenever a transition fires. We shall take this in phases:
Preparation takes a claim on the input vectors. We subtract this to see if it leads to underflow and if not, we have a lock that disables other transitions.
Rollback means that we undo preparation, by adding the transition-changes to all the places. We can do this, or ignore the outcome, if we detect underflow during preparation. We also do it when application logic decides to reject or defer a proposed transition. In general however, we re-inject tokens before input arcs by adding the transition's place-vector update. We only need to be aware of potential overflow when concurrent locks can be taken out, which is not our current intention.
Commit means that a prepared transition succeeded and will actually be effectuated. This does not consume input tokens anymore, but it will add output tokens. We may run into overflow, in which case we need to report an error or enlarge the size of all our integers or, in a more advanced model, those that experienced overflow. Note that one extra bit is enough to hold the total of a maximally filled place with a transition of maximal multiplicity, because both these numbers should individually fit in the size of the original place, and their added value will as a result always fit in one bit more.
The handling of inhibitor arcs is a special concern. In C, we used a trick with the counters that could have negative values, but that would be problematic (well, difficult is more accurate) for vectorised bit fields stored in arbitrary-size integers. What we can do easily is mask the entire integer for a place that inhibits a transition. This means that not just the sentinel bit is guarded after subtraction, but that we are also looking at the place's token count as a sentinal word for the transition. Needless to say that an inhibitor arc does not update that same word.
So, what do we do if a place has both an inhibitor arc and a normal arc to a transition? Well, we could arrange that in the commit update of places, but we don't have to. Such situations simply don't occur, because no transition can fire if it has both an inhibitor arc and a positive arc coming from the same place.
Then, how about multipliticity of arcs, either through annotation or repeated arcs? This is simpler than in the C model (where we had multiple arcs running in parallel, with special handling in certain cases). In Erlang, we can put the multiplicity into the preparation vector.
So, we have:
-type place() :: { atom(), int(), ... }.
-type trans() :: { atom(), int(), int(), ... }.
The place()
type has a name and a current marking; the trans()
type has a name, as well as prepare and commit vectors. Note the absense of what we had in C, a counter in a transition that marked how far it was from being fireable.
We probably need to add some more fields; at least the trans()
needs to store the initial timer value for a timeout.
The algorithm running over a marking basically applies all trans()
preparations to the place()
until one works. After a full round without success, it will return the Petri Net's collective state to its calling context. TODO: We don't have a main loop, so why define this?
Generated code will probably look somewhat like
receive
...
Msg={trans4,Args} when 0 == ((Marking - 13281239187) band 1239810239812) ->
...handle_trans( Msg )...
...
end
The Marking
in this example is just a plain integer variable, representing the vector of token counts from various places in different bits. In addition to these event-triggered transitions, there may be (unnamed) transitions that fire at will; they may be represented by if
clauses that should probably loop back, eventually ending up with the receive
above as their last option. Since this is non-deterministic in the semantics, any implementation would do, really. We should aim to trigger anonymous transitions too, for the sake of living up to some form of gut feeling of eventual transitioning that modelers are likely to assume (even if not entirely accurate as an expectation).
Note that Erlang is a (kind of) functional language, so it knows that Marking
will never change between received messages — and the same invariability applies to the when
guard as a whole. As a result, it only needs to evaluate the guard once, when receiving messages. We liberally assume that Erlang is indeed this clever. (In general however, guards may use variables bound by the message, in which case it would need to recompute for every propspective message to receive.)
The value is in the numbers — though we may find that we should generate these more dynamically if we intend to scale up our numbers. We should be careful to print numbers negatively when their highest bit is set, given the two's complement representation of the bit sets that we need. Dynamic may be as simple as a record or tuple field reference.
Generally, the number of bits starts at 60 (for a 64-bit architecture) including sign bit, and then starts off at 3*64 bits with 64-bit increments. The number of bits available for a place is the integer division of the number of bits in the integer by the number of places, minus 1 for the sentinel bit. So 60 bits and 19 places leave 2 bits per place, or room for markings up to 3 tokens per place.
The reason we can even contemplate these awkward implementation models is, of course, that we are generating them from graphics with a fully automated Perpetuum Compiler.
Nobody cares (?) about the debugging potential of generated code, as long as it can be trusted to be correct. This is an achievable goal, and the fact tht the complexity of the mapping remains hidden from the end user (and concentrated in the generated module with Petri Net logic) is just what we need.
It is sometimes desirable to set a timer to a transition that does not want to be disturbed for a while (perhaps because it is a timeout arc next to a productive arc). This is part of the Perpetuum model and it also turned out to be common in gen_server
patterns in Erlang.
Delays can easily be set with timer:after/2
, which sends a message of choice after a configurable time. The message would of course hint at the transition once more, indicating when it was first deferred. Further deferral would be possible, but not change that time of original attempt.
TODO: The remainder of this section appears useless because a transition may be tried more than once, possibly with different event data. The timeout:after/2
function furthermore outsources the management complexity of these recurring calls.
In terms of transition disabling, a timer can perhaps increase something in a marking or otherwise the mask, so it cannot fire. (This means that the recurring event must disable that stopper marking.)
One way of doing this, under the assumption of dynamic subtractions and masks, is through a bit that is always 1
and will be drawn into the mask for transitions that are waiting for a timer to expire. The integer sign is currently a bit of a rotten egg, at least conceptually, so that would be a very good candidate. This would not be a sentinel bit, but another that is always set in all transition vectors with a timer on them. The Marking
also has this bit set in all situations; there will be a sentinel bit just below it to avoid ever overflowing into it and making it wrap around (and extend the integer size while at it).
Another approach would be to extend the Marking or a secondary variable Timing
with a bit each for the transitions, indicating whether they have been delayed. The respective bit can then be sampled for immediate processing; it does not fit in the existing Marking
variable because that is an array of place information, while the timing information is bound to transitions; it would be possible to allocate a bit in the Marking to represent the transition, and make it work just like a sentinel bit; this saves a little on the computation but increases the size of the Marking
for no other reason. The bit could overlap the sentinel for a place of which it is the only departing transition arc, but most timeouts are alternatives to a well-behaved arc, so that would bring us nothing.
To work well, there should be no more than one timer active at the same time for any given transition. This is easily achieved when the timer demonstrates so clearly with one of its bits that it is indeed under influence of a timer.
A more elaborate model can use a map to store a transition's current timer as a time:TRef
type. This enables the use of timer:cancel()
on the timer in a branch that knows the timer may have been setup as its alternative.
TODO: This mechanism, using the sign bit on the transition dynamics to represent having a timer, will also work for anonymous transations that are more-or-less spontaneously triggered through if
guards. This is very useful; it means that maintenance in transitions can be a bit more laid back and not immediately spring to action. The fact that an explicit action may be triggered by a timer is however useful to keep in mind. The action might just clear the timed state of a transition, or it could handle transitions as any other; in that case, the original if
guards might also be replaced with different code, that triggers the anonymous transitions constantly, and repeatedly whenever one succeeds. Simplicity rules!
There is no reason why Petri Nets could not store their inner state in databases, or in a file system. Anything that serialises the integer values of the markings will work nicely to save and restore state. The functional nature of Erlang is very capable in managing two-phase transactions.
Memory-mapped files might be a good way to establish persistency as an operating system facility without the need to wait for sync()
to complete on small actions; all that remains is an atomically updating storage structure that is always in a valid state; this might be built on generations of the integer-stored bitfield vectors, with an atomically stored index variable to switch to the next generation.