[cover photo by Felix Mittermeier]
Last time, we discussed a high-level overview of Merkle trees including the components, how they are constructed, and a couple different context dependent ways to traverse them. This time, we are going into depth about how we make Merkle trees, featuring the rust language.
So, we know what a tree-like structure looks like and the different parts thereof. Every tree begins as a vector of Leafs. Each of these Leafs are hashed to form leaf nodes, the first layer of the tree. I leverage the functional programming aspects of the rust programming language here, namely iterators and closures. The main idea is that you can iterate over sequences and call a closure (like an arrow function in JavaScript or lambda function in python) on each value using map and transform the element to type Node.
Now we need to evaluate the case where the number of leaf nodes is uneven, which would unbalance the tree. If this is the case, we’ll push an extra element to the vector of children nodes, this element will be a copy of the last element in the vector.
Next, we’ll loop over the set of children nodes two at a time. Here, we perform two operations: first we link them together as one another’s K:V pairs in a HashMap and then declare a new variable equal to the hash of both nodes. Then, we can increment our internal loop counter by 2. When we exit the first loop, we alter the outer counting variable, dividing the node count in half (since we have n/2 resulting hashes), and check for an uneven number of nodes. If the count is uneven, we can append an extra copy of the final node in the set to balance the tree once more. This loop will repeat until we reach a single, resulting hash at the end called the root.
While we iterate through the first set of nodes and hash two at a time, the number of remaining nodes to hash with be cut in half and in order to stop the loop at the root, we break when i
is less than the remaining count of nodes to be hashed minus one. In addition, we take the new parent nodes and make them elements of the children node vector (since they will be hashed next) and clear the parent node vector subsequently. At the end of the loop sequence here, we check for an uneven number of nodes as well, which would otherwise unbalance the tree if unaddressed. We use the same concept as above with children nodes, with an extra check to ensure that the remaining node is not the root.
If you want to learn more from the source code, come check it out on GitHub.
Follow me @Raisin_Labs on Twitter!