Table of Links
-
Key Concepts
2.1 Append-Only Log and 2.2 Virtual Machine State
2.3 Transactions As Curried Functions
2.6 Efficient Representations of State
-
3.1 VM Job Queue and Transaction Order Finality
A. Discrepancy Detection Security Parameters
2 Key Concepts
In this section, we introduce ideas, notation, and terminology used to describe and delineate the design space of smart contract systems, using some existing well-known systems as reference points. Applying these ideas will help us create a simpler, easier-tounderstand mathematical / mental model of what all blockchain systems do, and to discuss / analyze tradeoffs in current and future designs. These notions apply both to standard blockchains as well as to scaling solutions such as rollups.
2.1 Append-Only Log
The append-only logs used in Bitcoin and Ethereum v1 are Proof-of-Work (PoW) based designs, whereas those used in Ethereum v2, Cosmos, Polkadot, Oasis, etc are Proof-of-Stake (PoS) based designs.
At the level of abstraction needed here, the only thing that we care about is that the append operation for a PoW log is probabilistic, and an entry is not considered successful until about 6 additional entries have been made (the distance to the end of the chain is a security parameter here). Typically the idea for needing additional entries in PoW is called “probabilistic finality”.
PoS logs use committee elections and digital signatures instead of solving cryptographic puzzles, and once the consensus protocol completes and a new log entry is made, it is considered final. While it takes time for the consensus protocol to run, no additional entries are required and this is often referred to as “instant finality.”
We will refer to the general idea as “log finality,” whether probabilistic (with appropriate security parameter) or instant. Log finality is the mechanism upon which other notions of finality is built: log finality only says that some entry is successfully appended to the log, but says nothing about what that entry means.
2.2 Virtual Machine State
Since “state” can be encoded and represented in many ways, we need to start with mathematical necessities to describe what we mean by “state” in a way that is divorced from any particular representation. By “state”, we mean a function from a key set, its domain, to a value set, its range, i.e., State : Key → Value. In Bitcoin, Key would essentially be public keys, and Value would be numbers representing the quantity of Bitcoin tokens under the control of the corresponding private keys. In Ethereum-like systems, Key would be a union of several disjoint sets, e.g., public key hashes associated with Externally Owned Accounts (EOAs), contract addresses, contract address and 256-bit address tuples for naming persistent store locations, etc. Correspondingly, Value would be public keys, EOA balances, contracts’ code and balance, and 256-bit values from contracts’ persistent store, etc.[1]
2.3 Transactions As Curried Functions
Users of blockchain-based systems propose transactions that are recorded by being appended to the blockchain. We normally think of transactions as altering state. In a mathematical description, they are functions that transform state, i.e., State → State. Smart contract entry points take arguments— in Ethereum, the callData—and does not fit this type signature; however, by currying all the user-supplied arguments including authorization information (msg.sender, etc), all transactions can be modeled in this manner.[2]
This is a useful separation since most contract virtual machines implement standard ACID transaction semantics, and an explicit transaction abort reverting the contract state (failure atomicity) is just tc returning its input state, though gas fees for computation performed until the abort must still be paid.
2.4 Natural Names of State
Because users think of their transactions as being executed in a strict serial order and submit transactions based on their model of what the current system state (e.g., their EOA balance), a natural way to refer to any reachable state is the sequence of transactions that yields that state.
2.4.1 Names And Composition of Transformations
Note that for Bitcoin, Ethereum, and Arbitrum, the natural name is exactly what is recorded on the blockchain. Other smart contract systems may only implicitly record the execution order, e.g., as part of a zkSNARK proving correct execution.
2.4.2 Names Are Not Unique
Note that names are not unique. Many states have multiple names. If two transactions f and g commute at a given state s, i.e., the resultant state is (f; g)(s) = (g; f)(s), then there are (at least) two names for the resultant state. Of course, if both f and g were proposed, the system will eventually decide on some order. This is analogous to how 0+ 1+ 2 and 0 + 2 + 1 are both ways to name the value 3.
Note also that some states have no names in this nomenclature. For example, a state where an EOA controls more tokens than the total token supply is unreachable, since (absent a catastrophic bug) there are no sequences of transactions from the genesis state that would result in such a state. This is fine, since such unreachable states are of no interest.
Authors:
(1) Bennet Yee, Oasis Labs;
(2) Dawn Song, Oasis Labs;
(3) Patrick McCorry, Infura;
(4) Chris Buckland, Infura.
This paper is
[1] This is just one way to model Ethereum state; other models, e.g., as a set of mappings one for each key type, would be more precise.