diff --git a/state-network.md b/state-network.md index 673cd87..143dfc7 100644 --- a/state-network.md +++ b/state-network.md @@ -2,35 +2,29 @@ This document is the specification for the sub-protocol that supports on-demand availability of state data from the execution chain. -> 🚧 THE SPEC IS IN A STATE OF FLUX AND SHOULD BE CONSIDERED UNSTABLE 🚧 - ## Overview The execution state network is a [Kademlia](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) DHT that uses the [Portal Wire Protocol](./portal-wire-protocol.md) to establish an overlay network on top of the [Discovery v5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-wire.md) protocol. -State data from the execution chain consists of all account data from the main storage trie, all contract storage data from all of the individual contract storage tries, and the individul bytecodes for all contracts. +State data from the execution chain consists of all account data from the main storage trie, all contract storage data from all of the individual contract storage tries, and the individul bytecodes for all contracts across all historical state roots. This is traditionally referred to as an "archive node". ### Data -All of the execution layer state data is stored in two different formats. - -- Raw trie nodes -- Leaf data with merkle proof - -#### Types - The network stores the full execution layer state which emcompases the following: -- Account leaf nodes with accompanying trie proof. -- Contract storage leaf nodes with accompanying trie proof. +- Account trie nodes +- Contract storage trie nodes - Contract bytecode +The network is implemented as an "archive" node meaning that it stores all +tries for all historical blocks. + #### Retrieval -- Account trie leaf data by account address and state root. -- Contract storage leaf data by account address, state root, and slot number. -- Contract bytecode by address and code hash. +- Account trie nodes by their node hash. +- Contract storage trie nodes by their node hash. +- Contract bytecode by code hash. ## Specification @@ -38,48 +32,12 @@ The network stores the full execution layer state which emcompases the following ### Distance Function -The state network uses the following "ring geometry" distance function. - -```python -MODULO = 2**256 -MID = 2**255 - -def distance(node_id: uint256, content_id: uint256) -> uint256: - """ - A distance function for determining proximity between a node and content. - - Treats the keyspace as if it wraps around on both ends and - returns the minimum distance needed to traverse between two - different keys. - - Examples: - - >>> assert distance(10, 10) == 0 - >>> assert distance(5, 2**256 - 1) == 6 - >>> assert distance(2**256 - 1, 6) == 7 - >>> assert distance(5, 1) == 4 - >>> assert distance(1, 5) == 4 - >>> assert distance(0, 2**255) == 2**255 - >>> assert distance(0, 2**255 + 1) == 2**255 - 1 - """ - if node_id > content_id: - diff = node_id - content_id - else: - diff = content_id - node_id - - if diff > MID: - return MODULO - diff - else: - return diff - -``` - -This distance function is designed to preserve locality of leaf data within main account trie and the individual contract storage tries. The term "locality" in this context means that two trie nodes which are adjacent to each other in the trie will also be adjacent to each other in the DHT. +The state network uses the stock XOR distance metric defined in the portal wire protocol specification. ### Content ID Derivation Function -The derivation function for Content ID values is defined separately for each data type. +The state network uses the SHA256 Content ID derivation function from the portal wire protocol specification. ### Wire Protocol @@ -125,134 +83,237 @@ A node should track their own radius value and provide this value in all Ping or ### Data Types -#### Component Data Elements +#### OFFER/ACCEPT vs FINDCONTENT/FOUNDCONTENT payloads + +The data payloads for many content types in the history network differ between OFFER/ACCEPT and FINDCONTENT/FOUNDCONTENT. + +The OFFER/ACCEPT payloads need to be provable by their recipients. These proofs are useful during OFFER/ACCEPT because they verify that the offered data is indeed part of the canonical chain. + +The FINDCONTENT/FOUNDCONTENT payloads do not contain proofs because a piece of state can exist under many different state roots. All payloads can still be proved to be the correct requested data, however, it is the responsibility of the requesting party to anchor the returned data as canonical chain state data. + + +#### Helper Data Types + +##### Paths (Nibbles) + +A naive approach to storage of trie nodes would be to simply use the `node_hash` value of the trie node for storage. This scheme however results in stored data not being tied in any direct way to it's location in the trie. In a situation where a participant in the DHT wished to re-gossip data that they have stored, they would need to reconstruct a valid trie proof for that data in order to construct the appropriate OFFER/ACCEPT payload. We include the `path` metadata in state network content keys so that it is possible to reconstruct this proof. + +We define path as a sequences of "nibbles" which represent the path through the merkle patritia trie (MPT) to reach the trie node. + +``` +nibble := {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f} +NibblePair := Byte # 2 nibbles tightly packed into a single byte +Nibbles := Container(is_odd_length=bool, packed_nibbles=List(NibblePair, max_length=32)) +``` + +`NibblePair` is packed so that nibble that goes first in the path uses high bits, while later nibble uses lower bits. + +`Nibbles.packed_nibbles` is a sequence of bytes with each byte containing two nibbles. When encoding an odd length sequence of nibbles, `Nibbles.is_odd_length` boolean flag MUST be set to `True`, the high bits of the first byte MUST be left empty, first nibble MUST use low bits of the first byte, and remaning nibbles are tightly packed in remaining bytes. + +Examples: + +``` +[1, 2, a, b] -> Nibbles(is_odd_length=false, packed_nibbles=[0x12, 0xab]) +[1, 2, a, b, c] -> Nibbles(is_odd_length=true, packed_nibbles=[0x01, 0x2a, 0xbc]) +``` + +##### Merkle Patricia Trie (MPT) Proofs + +Merkle Patricia Trie (MPT) proofs consist of a list of `WitnessNode` objects that correspond to individual trie nodes from the MPT. Each node can be one of the different node types from the MPT (e.g.blank, branch, extension, leaf). When serialized, each `WitnessNode` is represented as an RLP serialized list of the component elements. The largest possible node type is the branch node which when serialized is a list of up to sixteen hashes in `Bytes32` (representing the hashes of each of the 16 nodes in that branch and level of the tree) plus the 4 elements of the node's value (balance, nonce, codehash, storageroot) represented as `Bytes32`. When combined with the RLP prefixes, this yields a possible maximum length of 667 bytes. We specify 1024 as the maximum length due to constraints in the SSZ spec for list lengths being a power of 2 (for easier merkleization.) -#### Proofs -Merkle Patricia Trie (MPT) proofs consist of a list of witness nodes that correspond to each trie node that consists of various data elements depending on the type of node (e.g.blank, branch, extension, leaf). When serialized, each witness node is represented as an RLP serialized list of the component elements with the largest possible node type being the branch node which when serialized is a list of up to sixteen hashes in `Bytes32` (representing the hashes of each of the 16 nodes in that branch and level of the tree) plus the 4 elements of the node's value (balance, nonce, codehash, storageroot) represented as `Bytes32`. When combined with the RLP prefixes, this yields a possible maximum length of 667 bytes. We specify 1024 as the maximum length due to constraints in the SSZ spec for list lengths being a power of 2 (for easier merkleization.) ``` WitnessNode := ByteList(1024) -MPTWitness := List(witness: WitnessNode, max_length=32) +Witness := List(WitnessNode, max_length=1024) +StateWitness := Container(key: Nibbles, proof: Witness) +StorageWitness := Container(key: Nibbles, proof: Witness, state_witness: StateWitness) ``` -#### Account Trie Proof +The `StateWitness.key` denotes the path to the trie node that is proven by the `StateWitness.proof`. The same applies to `StorageWitness.key/StorageWitness.proof`. + +The `StorageWitness.state_witness` MUST be for a leaf node in the account trie. The `StorageWitness.proof` MUST be anchored to the contract state root denoted by the account from the `StorageWitness.state_witness`. + +All `Witness` objects are subject to the following validity requirements. + +- A: Lexical Ordering +- B: No Extraneous Nodes + +###### A: Lexical Ordering -A leaf node from the main account trie and accompanying merkle proof against a recent `Header.state_root` +The sequence of nodes in the witness MUST be lexically ordered by their nibbles +path in the trie. This results in the state root node always occuring first and +node being proven last in the list of trie nodes. + +> This validity condition is to ensure that verifcation of the proof can be done +in a single pass. + +###### B: No Extraneous Nodes + +A witness MUST NOT contain any nodes that are not part of the set needed to for proving. + +> This validity condition is to protect against malicious or erroneous bloating of proof payloads. + + +#### Account Trie Node + +These data types represent a node from the main state trie. ``` -account_trie_proof_key := Container(address: Bytes20, state_root: Bytes32) +account_trie_node_key := Container(path: Nibbles, node_hash: Bytes32) selector := 0x20 -content := Container(witness: MPTWitness) -content_id := keccak(address) -content_key := selector + SSZ.serialize(account_trie_proof_key) +content_key := selector + SSZ.serialize(account_trie_node_key) ``` -#### Contract Storage Trie Proof +##### Account Trie Node: OFFER/ACCEPT -A leaf node from a contract storage trie and accompanying merkle proof against the `Account.storage_root`. +This type MUST be used when content offered via OFFER/ACCEPT. ``` -storage_trie_proof_key := Container(address: Bytes20, slot: uint256, state_root: Bytes32) -selector := 0x21 +content_for_offer := Container(proof: StateWitness, block_hash: Bytes32) +``` + -content := Container(witness: MPTWitness) -content_id := (keccak(address) + keccak(slot)) % 2**256 -content_key := selector + SSZ.serialize(storage_trie_proof_key) +##### Account Trie Node: FINDCONTENT/FOUNDCONTENT + +This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT. + +``` +content_for_retrieval := Container(node: WitnessNode) ``` -#### Contract Bytecode -The bytecode for a specific contract as referenced by `Account.code_hash` +#### Contract Trie Node + +These data types represent a node from an individual contract storage trie. ``` -contract_bytecode_key := Container(address: Bytes20, code_hash: Bytes32) -selector := 0x22 +storage_trie_node_key := Container(address: Address, path: Nibbles, node_hash: Bytes32) +selector := 0x21 -content := ByteList(24756) // Represents maximum possible size of contract bytecode -content_id := sha256(address + code_hash) -content_key := selector + SSZ.serialize(contract_bytecode_key) +content_key := selector + SSZ.serialize(storage_trie_node_key) ``` -## Gossip -### Overview +##### Contract Trie Node: OFFER/ACCEPT -A bridge node composes proofs for altered (i.e. created/modified/deleted) state data based on the latest block. -These proofs are tied to the latest block by the state root. -The bridge node gossips each proof to some (bounded-size) subset of its peers who are closest to the data based on the distance metric. +This type MUST be used when content offered via OFFER/ACCEPT. -### Terminology +``` +content_for_offer := Container(proof: StorageWitness, block_hash: Bytes32) +``` -We define the following terms when referring to state data. -> The diagrams below use a binary trie for visual simplicity. The same -> definitions naturally extend to the hexary patricia trie. +##### Contract Trie Node: FINDCONTENT/FOUNDCONTENT +This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT. ``` -0: X - / \ - / \ - / \ - / \ - / \ - / \ - / \ -1: 0 1 - / \ / \ - / \ / \ - / \ / \ -2: 0 1 0 1 - / \ / \ / \ / \ - / \ / \ / \ / \ -3: 0 1 0 1 0 1 0 1 - / \ / \ / \ / \ / \ / \ / \ / \ -4: 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 +content_for_retrieval := Container(node: WitnessNode) +``` + + +#### Contract Code + +These data types represent the bytecode for a contract. + +> NOTE: Because CREATE2 opcode allows for redeployment of new code at an existing address, we MUST randomly distribute contract code storage across the DHT keyspace to avoid hotspots developing in the network for any contract that has had many different code deployments. Were we to use the path based *high-bits* approach for computing the content-id, it would be possible for a single location in the network to accumulate a large number of contract code objects that all live in roughly the same space. +Problematic! + +``` +contract_code_key := Container(address: Address, code_hash: Bytes32) +selector := 0x22 + +content_key := selector + SSZ.serialize(contract_code_key) ``` -#### *"state root"* -The node labeled `X` in the diagram. +##### Contract Code: OFFER/ACCEPT -#### *"trie node"* +This types MUST be used when content offered via OFFER/ACCEPT. -Any of the individual nodes in the trie. +``` +content_for_offer := Container(code: ByteList, account_proof: StateWitness, block_hash: Bytes32) +``` -#### *"intermediate node"* -Any of the nodes in the trie which are computed from other nodes in the trie. The nodes in the diagram at levels 0, 1, 2, and 3 are all intermediate. +##### Contract Code: FINDCONTENT/FOUNDCONTENT -#### *"leaf node"* +This type MUST be used when content retrieved from another node via FINDCONTENT/FOUNDCONTENT. -Any node in the trie that represents a value stored in the trie. The nodes in the diagram at level 4 are leaf nodes. +``` +content_for_retrieval := Container(code: ByteList) +``` -#### *"leaf proof"* -The merkle proof which contains a leaf node and the intermediate trie nodes necessary to compute the state root of the trie. +## Gossip -### Gossip +As each new block is added to the chain, the state from that block must be gossiped into the network. +The state network defines a specific gossip algorithm which is referred to as "Recursive Gossip". +This section of the specification defines how this gossip mechanism works. -Each time a new block is added to their view of the chain, a set of merkle proofs which are all anchored to `Header.state_root` is generated which contains: +### Terminology + +The Merkle Patricia Trie (MPT) has three types of nodes: *"branch"*, *"extension"* and *"leaf"*. +The MPT also specifies the `nil` node, but it will never be sent or stored over network, so we will +ignore it for this spec. + +Similarly to other tree structure, the `leaf` node is the lowest node on a certain path and it's +where value is stored in the tree (strictly speaking, MPT allows value to be stored in `branch` +nodes as well, but Ethreum storage doesn't use this functionality). The `branch` and `extension` +nodes can be called `intermediate` nodes because there will always be a node that can only be +reached by passing through them. + +A *"merkle proof"* or *"proof"* is a collection of nodes from the trie sufficient to recompute the +state root and prove that *"target"* node is part of the trie defined by that state root. A proof is: + +- *"ordered"* + - the order of the nodes in the proof have to represent the path from root node to the target node + - first node must be the root node, followed by zero or more intermediate nodes, ending + with a target node + - it should be provable that any non-first node is part of the preceding node + - if root node is the target node, then the proof will only contain the root node + - the target node can be of any type (branch, extension or leaf) +- *"minimal"* + - it contains only the nodes from on a path from the root node to the target node + +### Overview -- Account trie Data: - - All of the intermediate and leaf trie nodes from the account trie necessary to prove new and modified accounts. -- Contract Storage trie data: - - All of the intermediate and leaf trie nodes from each contract storage trie necessary to prove new and modified storage slots. -- All contract bytecode for newly created contracts +We will give the overview of the process for the account trie. The same process should be applied +for the storage trie as well. -> TODO: Figure out language for defining which trie nodes from this proof the bridge node must initialize gossip. +The goal of the "recursive gossip" mechanism is to reduce the burden of responsibility placed on +bridge nodes for injecting new state data into the network while simultaniously spreading the +responsibility for gossiping new state data across the nodes in the network. -> TODO: Determine mechanism for contract code. +At each block we construct a proof for each new or modified value in the trie, which is stored in +the leaf nodes. The bridge node would search the DHT for nodes that are *interested* in storing +each leaf node and gossip the proof to those nodes. -The receiving DHT node will propagate the data to nearby nodes from their routing table. +> For example, let this be the proof: `[A, B, C, D, E, F]`, then `A` is the root note and `F` is +> the target node. This proof should be gossiped to nodes that are interested in storing node `F`. -### Updating cold Leaf Proofs +The recipients of this gossip are then responsible for gossiping the penultimate node of the proof +(parent of the target node, `E` in our example). To do so, they would strip the last (target) node +from the proof, search the DHT for nodes that are interested in `E` and gossip this proof to them. -Anytime the state root changes for either the main account trie or a contract storage trie, every leaf proof under that root will need to be updated. The primary gossip mechanism will ensure that leaf data that was added, modified, or removed will receive and updated proof. However, we need a mechanism for updating the leaf proofs for "cold" data that has not been changed. +This process repeats until it terminates at the state root, with the final round of gossip only +containing the `[A]` which is the state root node of the trie. -Each time a new block is added to the chain, the DHT nodes storing leaf proof data will need to perform a walk of the trie starting at the state root. This walk of the trie will be directed towards the slice of the trie dictated by the set of leaves that the node is storing. As the trie is walked it should be compared to the previous proof from the previous state root. This walk concludes once all of the in-range leaves can be proven with the new state root. +### Bridge Node Responsibilities +The bridge is responsible for creating and gossiping all following data and their proofs: -> TODO: reverse diffs and storing only the latest proof. +- account trie data: + - all of the new and modified account nodes from the state trie +- contract storage trie data: + - all of the new and modified storage slots from each modified contract storage trie + - proof has to include the proof for the account trie that corresponds to the same contract + - recursive gossip stops at the root of the storage trie +- all contract bytecode for newly created contracts + - proof has to include the proof for the account trie that corresponds to the same contract + - recursive gossip doesn't happen in this case -> TODO: gossiping proof updates to neighbors to reduce duplicate work. +A bridge should compute the content-id values for all proofs that are part of the initial round of +recursive gossip. These proofs should be sorted by proximity to its own node-id. Beginning with +the content that is *closest* to its own node-id it should proceed to GOSSIP each individual proof +to nodes interested in that content.