Tiered Storage #1828
-
Another alternative is to use state rent rather than recent access to determine which key-value pairs are part of |
-
After a couple of passes trying to implement the writeup, one thing that can be refactored is the permission verification step. Currently, |
-
Terminology:

- `MState`: fast state available in memory
- `SState`: slow state assumed to require a disk read
- `ReadFromMState`: a state key permission that allows reads from `MState`
- `ReadFromSState`: a state key permission that allows reads from `MState` and `SState`

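The relationship between the two permissions can be sketched in Go as follows. This is a minimal illustration, not an existing API: the `Tier` and `Permission` types, their constant values, and the `Allows` method are all hypothetical names chosen for this example.

```go
package main

import "fmt"

// Tier identifies where a value is expected to live (hypothetical type).
type Tier uint8

const (
	MState Tier = iota // fast state available in memory
	SState             // slow state assumed to require a disk read
)

// Permission is a hypothetical read permission attached to a state key.
type Permission uint8

const (
	ReadFromMState Permission = iota + 1 // may read from MState only
	ReadFromSState                       // may read from MState and SState
)

// Allows reports whether p is sufficient to read a value living in tier t.
func (p Permission) Allows(t Tier) bool {
	switch p {
	case ReadFromSState:
		return true // covers both tiers
	case ReadFromMState:
		return t == MState
	default:
		return false // unknown permissions allow nothing
	}
}

func main() {
	fmt.Println(ReadFromMState.Allows(SState)) // false
	fmt.Println(ReadFromSState.Allows(MState)) // true
}
```

The point of the sketch is only that `ReadFromSState` is a superset of `ReadFromMState`.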
Goals
- `n` dimensions
- `MState` vs reads from `SState` (`n=6`)
- `MState` and reads from `SState`
- `MState` and reads from `SState`

Idea
We want to implement a version of `MState` such that, for any state-transition function, reads are first queried against `MState` before querying `SState`. Furthermore, on a cache miss, we want to load the missed KV-pair into `MState` and have it persist between blocks/transactions.

Originally, one way of implementing this was to add an additional field to `vm` called `mState`; state-transition functions would first query `vm.mState` prior to querying the view of the parent block. However, this introduces additional complexity, such as keeping `vm.mState` in sync with views and managing the size of `vm.mState`. Instead of directly implementing `MState` ourselves, we can make the following assumptions:

- Data recently read from `SState` is highly likely to be kept in memory
- `MState` can therefore just be an abstraction of the cache/memory of the running node

With the assumptions above, the question shifts from how we can implement our own version of `MState` to how we can keep track of which data was loaded into `MState`. The most straightforward way to keep track of this is to etch it into the values themselves. For any value, we would add a suffix representing the last time the KV-pair was loaded into `MState`; this suffix could be the number of the most recent accepted block which touched the KV-pair. This suffix would be used in the following two ways:

- Permission Validation: let $b_i$ represent the height of the block which is touching key $k$ and let $b_j$ represent the height of the last block which touched $k$. Let $\epsilon$ represent the maximum number of blocks for which we guarantee a KV-pair will be kept in `MState`. If we have the following:

  $$b_i - b_j \le \epsilon$$

  then $k$ is still in memory and so a permission like `ReadFromMState` is sufficient. However, if we have the following:

  $$b_i - b_j > \epsilon$$

  then $k$ is not guaranteed to be in `MState` and so we need a permission like `ReadFromSState`.
- Suffix Updates: for accepted blocks and for each key that we've read/written to, we would set the suffix of the value to the height of the accepted block

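The suffix scheme described above could look roughly like the following Go sketch. Everything here is an assumption made for illustration: the 8-byte big-endian encoding, the helper names `AppendSuffix`, `SplitSuffix`, and `InMState`, and the reading that a key counts as being in `MState` when at most $\epsilon$ blocks have passed since it was last touched.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// suffixLen is the assumed width, in bytes, of the block-height suffix.
const suffixLen = 8

// ErrValueTooShort is returned when stored bytes cannot contain a suffix.
var ErrValueTooShort = errors.New("value shorter than suffix")

// AppendSuffix returns a copy of value with height appended in big-endian form.
func AppendSuffix(value []byte, height uint64) []byte {
	out := make([]byte, len(value)+suffixLen)
	copy(out, value)
	binary.BigEndian.PutUint64(out[len(value):], height)
	return out
}

// SplitSuffix separates stored bytes into the raw value and the height of the
// last accepted block that touched it.
func SplitSuffix(stored []byte) ([]byte, uint64, error) {
	if len(stored) < suffixLen {
		return nil, 0, ErrValueTooShort
	}
	cut := len(stored) - suffixLen
	return stored[:cut], binary.BigEndian.Uint64(stored[cut:]), nil
}

// InMState reports whether current - lastTouched <= epsilon, i.e. whether a
// ReadFromMState-style permission would be enough. A suffix from the future is
// treated conservatively as "not in memory" (an assumption of this sketch).
func InMState(current, lastTouched, epsilon uint64) bool {
	if lastTouched > current {
		return false
	}
	return current-lastTouched <= epsilon
}

func main() {
	stored := AppendSuffix([]byte("balance"), 100)
	value, last, _ := SplitSuffix(stored)
	// Within 10 blocks of height 100 the key counts as in MState; at 150 it does not.
	fmt.Println(string(value), last, InMState(105, last, 10), InMState(150, last, 10))
	// prints: balance 100 true false
}
```

Keeping the suffix at the end of the value means the raw value can be recovered by slicing, without copying.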
Implementation Details
Currently, logic similar to tracking `MState` values can be found in the following:

- `Fetcher`: for a given set of transactions, `f` stores KV-pairs that have been loaded from persistent storage
- `chain.BuildBlock()`: this function also stores KV-pairs that have been loaded from persistent storage

However, these solutions persist only within a block; furthermore, they treat `ReadMState` operations the same as `ReadSState` operations, so the fetching models described above are merely a performance optimization and do not benefit the user. In the case of `Fetcher`, we would ideally like to do the following:

- When `f.runWorker()` is called, we still get the KV-pair from storage. However, we then add the additional step of validating the permission specified in the state key. If the permission is sufficient, then we call `f.set()`. Otherwise, we return an error.

With the above, when we now call `tx.preExecute()` and `tx.Execute()` in `processor`, the view being passed in (`tsv`) will only have the KV-pairs which `tx` gave permission for. To handle updating the suffixes of the values touched by $b_i$, we could query the cache of `f` after all transactions are executed and then update the suffixes of any values in the cache.

Side Note: Permission Piggybacking
One thing that we would need to handle is permission piggybacking; that is, the following case:

- `permission(t_1, k) = ReadFromSState`, `permission(t_2, k) = ReadFromMState`

We don't want a case where, in either pre-execution or execution, $t_2$ has access to `k`, as this would imply that $t_2$ has access to data that it did not pay for.

Testing
Ideally, we would start by defining the cases that new `Fetcher` unit tests should cover:

- `ErrNotFound`

Next, we could define the additional cases that `Processor` unit tests would need to cover:

- For keys touched by an accepted block `b`, their suffixes are updated to match the height of `b`

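To make these cases concrete, here is a toy Go model of the behaviour such tests might exercise. It is not the project's actual `Fetcher`: the `entry` type, `NewFetcher`, `Fetch`, `Commit`, and `ErrInsufficientPermission` are hypothetical names, and storage is a plain map. The sketch assumes permissions are checked per transaction against the suffix from the last accepted block, even on a cache hit, which is one possible way to prevent piggybacking.

```go
package main

import (
	"errors"
	"fmt"
)

type Permission uint8

const (
	ReadFromMState Permission = iota + 1
	ReadFromSState
)

var (
	ErrNotFound               = errors.New("not found")
	ErrInsufficientPermission = errors.New("insufficient permission")
)

// entry is a value plus the height of the last accepted block that touched it.
type entry struct {
	value       []byte
	lastTouched uint64
}

// Fetcher is a toy stand-in holding persistent storage and a per-block cache.
type Fetcher struct {
	storage map[string]entry
	cache   map[string]entry
	epsilon uint64
}

func NewFetcher(storage map[string]entry, epsilon uint64) *Fetcher {
	return &Fetcher{storage: storage, cache: make(map[string]entry), epsilon: epsilon}
}

// Fetch loads key for one transaction executing at the given block height.
// The permission is validated on every call, so a cache hit caused by another
// transaction does not grant access.
func (f *Fetcher) Fetch(key string, perm Permission, height uint64) ([]byte, error) {
	e, cached := f.cache[key]
	if !cached {
		var ok bool
		if e, ok = f.storage[key]; !ok {
			return nil, ErrNotFound
		}
	}
	inMState := height >= e.lastTouched && height-e.lastTouched <= f.epsilon
	if perm != ReadFromSState && !(perm == ReadFromMState && inMState) {
		return nil, ErrInsufficientPermission
	}
	f.cache[key] = e // only cached once the permission is known to be sufficient
	return e.value, nil
}

// Commit sets the suffix of every key touched in the accepted block to height
// and resets the per-block cache.
func (f *Fetcher) Commit(height uint64) {
	for key, e := range f.cache {
		e.lastTouched = height
		f.storage[key] = e
	}
	f.cache = make(map[string]entry)
}

func main() {
	f := NewFetcher(map[string]entry{"k": {value: []byte("v"), lastTouched: 1}}, 10)
	_, err1 := f.Fetch("k", ReadFromSState, 100) // t_1 pays for the slow read
	_, err2 := f.Fetch("k", ReadFromMState, 100) // t_2 must not piggyback on t_1
	f.Commit(100)
	_, err3 := f.Fetch("k", ReadFromMState, 101) // k is now recent enough
	fmt.Println(err1, err2, err3)
	// prints: <nil> insufficient permission <nil>
}
```

In this model, the second fetch in `main` is the piggybacking case: the key is already cached by $t_1$, yet $t_2$ is still rejected because its own permission is insufficient.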
cc: @aaronbuchwald @darioush @tsachiherman