VERKLE TREE → THE ‘VERGE’ PART OF ETHEREUM
By Sivadas Neelima, Intern Kerala Blockchain Academy
The “Verge” is one of the six stages in the roadmap proposed by Ethereum co-founder Vitalik Buterin for upgrading the network. It is intended to help with the scalability trilemma that the platform continues to face. The “Verge” introduces the “Verkle tree”, which is set to replace the Merkle tree and its Merkle proofs. Before going into the Verkle tree concept, it is essential to understand why such a mechanism is needed in a blockchain. So let us start with the basics of the blockchain itself.
A blockchain is a sequence of blocks, each recording a set of transactions that have been verified and validated. The number of blocks only increases as more participants/nodes join the network and transact. Every block contains all the details of its transactions, so as more blocks are added, the amount of data to store grows along with them!
To manage this enormous amount of data and keep the blockchain running, the “state” concept comes to the fore. The state holds account data in a specialized format based on the Merkle tree concept: all accounts are linked by hashes, which finally reduce to a single root hash stored on the blockchain. This root hash acts as a compact fingerprint of the entire state, and a Merkle proof (the short list of hashes linking one item to the root) gives a simple way to verify that the item belongs to it. Merkle proofs not only ease verification but also help reduce the disk space a node needs in order to check transactions.
But unknown to many, there lies a hidden concern: bulging state size. According to the Ethereum Foundation, the state together with its Merkle proofs has exceeded 135 gigabytes of storage space. To overcome this problem, the “Verge” stage of the Ethereum upgrade proposes the Verkle tree. Introduced by John Kuszmaul in 2018, the Verkle tree works similarly to the Merkle tree, but its proofs are shorter in size, and the proof calculation time is also considerably reduced.
We will try to understand the overall concept through a simple example.
Generally, the Merkle tree is represented as a binary tree where the branching factor of nodes is 2. In the example discussed here, we will explore a set of 9 transactions belonging to a block added in a blockchain network using a k-ary Merkle tree where k is the branching factor.
Did you know that Ethereum uses a hexary structure in its Merkle Patricia Trie?
Ethereum uses a variant of the Merkle tree called the Merkle Patricia Trie. In this structure, each parent node can have up to 16 child nodes. The average depth of the tree ranges from 10 to 15 levels, and a Merkle proof requires the 15 sibling nodes at each level.
Returning to our example, the branching factor selected is 3. The first step is to hash each transaction using a cryptographic hash function. The next step is to group the hashed transactions into subsets of three and hash each subset, repeating until a single root hash is computed. The root hash (the Merkle root) represents the entire block of transactions. The figure below shows how a Merkle tree is constructed.
As depicted in the figure, the hashes of the sibling nodes of T3, i.e. Hash(T1) and Hash(T2), together with the hashes of its parent's siblings, Hash(T456) and Hash(T789), are enough to check whether T3 belongs to the set of transactions. These four hashes form the Merkle proof for T3, while the root hash [T123456789] serves as the digest (fingerprint) of the entire set of transactions.
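The construction and the membership check for T3 can be sketched in a few lines of Python. This is a minimal illustration, not Ethereum's actual implementation: SHA-256 is assumed as the hash function (Ethereum uses Keccak-256), the transactions are stand-in byte strings, and the helper names `build_levels`, `prove`, and `verify` are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256 is an assumption made for this sketch.
    return hashlib.sha256(data).digest()

def build_levels(leaves, k):
    """Return every level of the tree, from the leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Hash each group of k neighbouring nodes into one parent node.
        levels.append([h(b"".join(prev[i:i + k])) for i in range(0, len(prev), k)])
    return levels

def prove(levels, index, k):
    """Collect, per level, the node's position in its group and its sibling hashes."""
    proof = []
    for level in levels[:-1]:
        start = (index // k) * k
        group = level[start:start + k]
        pos = index - start
        proof.append((pos, group[:pos] + group[pos + 1:]))
        index //= k
    return proof

def verify(leaf, proof, root):
    """Recompute the path from the leaf to the root using only the sibling hashes."""
    node = h(leaf)
    for pos, siblings in proof:
        node = h(b"".join(siblings[:pos] + [node] + siblings[pos:]))
    return node == root

txs = [f"T{i}".encode() for i in range(1, 10)]   # stand-ins for T1..T9
levels = build_levels(txs, k=3)
root = levels[-1][0]
proof_t3 = prove(levels, index=2, k=3)           # T3 sits at 0-based index 2
num_hashes = sum(len(siblings) for _, siblings in proof_t3)
print(num_hashes, verify(b"T3", proof_t3, root))
```

With k = 3 and nine transactions, the proof for T3 carries two sibling hashes per level over two levels, matching the four hashes named in the text. In general a k-ary Merkle proof needs k − 1 sibling hashes per level, which is why wider Merkle trees have larger proofs.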
Moving on, it’s time to explore the Verkle tree in the same setting. Unlike the Merkle tree, which uses a cryptographic hash function to combine child nodes, the Verkle tree uses a Vector Commitment (VC) for the same purpose. The Vector Commitment is computed using polynomials.
Consider a Verkle tree over the transactions T1, T2, T3, …, T9 with branching factor k = 3. The transactions are first divided into subsets of k: (T1, T2, T3), (T4, T5, T6), and (T7, T8, T9). A Vector Commitment is then computed over each subset, giving C1, C2, and C3. A membership proof πi is computed for each transaction Ti with respect to the Vector Commitment of its subset (note that i refers to the index position of the transaction). The Vector Commitment C4 is then computed over these three commitments, with membership proofs π10, π11, and π12 for C1, C2, and C3 respectively. Hence, C4 is the root commitment that forms the digest of the Verkle tree.
To see this more clearly, consider Figure 2. Here (T3, π3) in the yellow block means that transaction T3 exists in Commitment C1 and π3 is the proof of this existence. Similarly, (C1, π10) means that the Commitment C1 exists in the root Commitment C4 and π10 is the proof of this existence.
Vector commitments are cryptographic techniques that allow committing to an ordered sequence of k values (t1, t2, …, tk) in such a way that one can later reveal one or more values at specific positions and prove them consistent with the initial commitment (e.g., prove that ti is the i-th committed value). A Merkle tree is also a vector commitment, with the property that opening the i-th value requires a certain number of hashes as proof (the Merkle proof).
The Ethereum team plans to replace the hash functions used in the Merkle tree with the KZG polynomial commitment scheme, which is built on elliptic-curve cryptography.
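The algebra behind a polynomial-based vector commitment, and how it stacks into the two-level tree described above, can be sketched as follows. This is a deliberately insecure toy and not KZG itself: the evaluation point `S` is public here, whereas a real scheme hides it inside elliptic-curve points so that proofs cannot be forged. The modulus `P`, the integer encoding of the transactions, and all function names are assumptions made for illustration. The idea shown is that a value y sits at position i exactly when (x − i) divides p(x) − y, where p is the polynomial passing through the committed values.

```python
P = 2**61 - 1   # prime field modulus (arbitrary choice for this sketch)
S = 123456789   # evaluation point: secret in a real scheme, public in this toy

def interpolate(values):
    """Coefficients (lowest degree first) of the polynomial p with p(i) = values[i]."""
    n = len(values)
    coeffs = [0] * n
    for i, y in enumerate(values):
        basis, denom = [1], 1
        for j in range(n):
            if j == i:
                continue
            # Multiply the running Lagrange basis polynomial by (x - j).
            new = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                new[d] = (new[d] - j * c) % P
                new[d + 1] = (new[d + 1] + c) % P
            basis = new
            denom = denom * (i - j) % P
        scale = y * pow(denom, -1, P) % P   # modular inverse, Python 3.8+
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + scale * c) % P
    return coeffs

def evaluate(coeffs, x):
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % P
    return result

def commit(values):
    """Toy commitment: the interpolating polynomial evaluated at S."""
    return evaluate(interpolate(values), S)

def open_at(values, i):
    """Toy proof for position i: q(S), where q(x) = (p(x) - values[i]) / (x - i)."""
    shifted = interpolate(values)
    shifted[0] = (shifted[0] - values[i]) % P
    carry, out = 0, []
    for c in reversed(shifted):             # synthetic division by (x - i)
        carry = (carry * i + c) % P
        out.append(carry)
    remainder = out.pop()
    assert remainder == 0                   # the division is exact for a true value
    return evaluate(out[::-1], S)

def verify(commitment, i, value, proof):
    """Check the identity p(S) - value == q(S) * (S - i)."""
    return (commitment - value) % P == proof * (S - i) % P

txs = [101, 102, 103, 104, 105, 106, 107, 108, 109]   # stand-ins for T1..T9
groups = [txs[0:3], txs[3:6], txs[6:9]]
c1, c2, c3 = (commit(g) for g in groups)
c4 = commit([c1, c2, c3])                   # root commitment

pi3 = open_at(groups[0], 2)                 # T3 is at 0-based position 2 in its group
pi10 = open_at([c1, c2, c3], 0)             # C1 is at position 0 under the root
t3_ok = verify(c1, 2, txs[2], pi3) and verify(c4, 0, c1, pi10)
print(t3_ok)
```

The proof for T3 consists of one short element per level (π3 and π10, plus the intermediate commitment C1), regardless of how many siblings each node has. Positions are 0-based in the code, while the text numbers transactions from 1.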
Thus, for a tree with a billion pieces of data, a proof in a traditional binary Merkle tree would require about 1 kilobyte, but in a Verkle tree the proof would be less than 150 bytes. As more data is added, a Merkle tree grows deeper and its proofs grow with it, since every level contributes sibling hashes. In contrast, the proof size in the Verkle tree stays nearly constant, and the wider branching factor considerably reduces the depth of the tree.
So while a verifier has to process multiple sibling hashes at each level to reach the Merkle root, the Verkle tree simplifies the process by requiring only a single proof at each level to reach the root Commitment, thereby proving that a transaction or a piece of data is part of a given set.
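The Merkle side of this comparison can be checked with simple arithmetic. The sketch below assumes idealized, perfectly balanced trees with 32-byte hashes and ignores any encoding overhead; the function name `merkle_proof_bytes` is hypothetical. The Verkle figure of under 150 bytes is taken from the text and is not computed here.

```python
HASH_BYTES = 32  # assumed hash size

def merkle_proof_bytes(num_leaves, k):
    """Proof size for a balanced k-ary Merkle tree: (k - 1) sibling hashes per level."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= k
        depth += 1
    return depth * (k - 1) * HASH_BYTES

binary_size = merkle_proof_bytes(10**9, k=2)    # 30 levels, 1 sibling each
hexary_size = merkle_proof_bytes(10**9, k=16)   # 8 levels, 15 siblings each
print(binary_size, hexary_size)  # prints "960 3840"
```

Widening a Merkle tree shortens it but makes each level more expensive, so the proof gets larger overall. A Verkle tree avoids this trade-off because each level contributes one short proof rather than k − 1 sibling hashes.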
To sum up, the Verkle tree’s benefits for blockchains are as follows:
- Overcomes issues related to data storage: smaller proofs in Verkle trees address the data storage problem. For instance, in the example above, the Merkle proof for transaction T3 is [Hash(T1) + Hash(T2) + Hash(T456) + Hash(T789)], while the Verkle proof is [π3 + (C1, π10) + C4].
- Verification time is reduced, since shorter proofs are generated in less time, which in turn enables more participation from network validators.
- Enables the implementation of stateless blockchain networks.
- The Verkle tree’s proof-size efficiency helps reduce the storage a node needs, which in turn helps realize the stateless client concept.
- Stateless clients can validate execution blocks without possessing the full account state.
- Network synchronization becomes simpler.
- Promotes sustainable storage requirements.
- Reduced proof sizes help smaller devices to participate as validators.
- Faster access to information (with the root commitment in hand, blocks contain all the data required for further processing).
References:
- Kuszmaul, J. (2019). Verkle trees. Verkle Trees, 1.
- Campanelli, M., Fiore, D., Greco, N., Kolonelos, D., & Nizzardo, L. (2020). Vector Commitment Techniques and Applications to Verifiable Decentralized Storage. IACR Cryptol. ePrint Arch., 2020, 149.
- Catalano, D., & Fiore, D. (2013). Vector commitments and their applications. In Public-Key Cryptography–PKC 2013: 16th International Conference on Practice and Theory in Public-Key Cryptography, Nara, Japan, February 26–March 1, 2013. Proceedings 16 (pp. 55–72). Springer Berlin Heidelberg.