The concept of a Merkle tree was proposed in the early ‘80s by Ralph Merkle – a computer scientist renowned for his work on public-key cryptography.
A Merkle tree is a structure used to efficiently verify the integrity of data in a set. They’re particularly interesting in the context of peer-to-peer networks, where participants need to share and independently validate information.
Hash functions are at the core of Merkle tree structures, so we recommend you check out What is Hashing? before proceeding.
Suppose that you want to download a large file. With open-source software, you’d typically want to check that the hash of the file you downloaded matches one made public by the developers. If it does, you know that the file you have on your computer is exactly the same as theirs.
If the hashes don’t match, you have a problem. You’ve either downloaded a malicious file masquerading as the software, or it hasn’t downloaded correctly and, therefore, won’t work. If the latter is the case, you probably won’t be too happy if you’ve had to wait for some time for the file to download. Now, you need to restart the process and hope that it doesn’t corrupt again.
If only there were an easier way to go about this, you think. Fortunately, that’s where Merkle trees come in. With one of these, you would have your file broken up into chunks. If it was a 50GB file, you might divide it into one hundred pieces, such that each is 0.5GB in size. Then, it would be downloaded piece-by-piece. This is essentially what you do when you torrent files.
In this case, your source will have provided you with a hash known as the Merkle root. This single hash is a representation of every chunk of data that makes up your file. But the Merkle root makes it much easier to verify the data.
To keep it simple, let’s take an example where we use an 8GB file broken into eight pieces. Call the different fragments A through H. Each fragment is then passed through a hash function, giving us eight different hashes.
We pass each of our eight fragments through a hash function to get their hashes.
Okay, so we’ve got something that makes a bit more sense. We have the hash of all the fragments, so if one is faulty, we’ll know by comparing it with the source’s one, right? Possibly, but that’s also incredibly inefficient. If your file has thousands of fragments, are you really going to hash all of them and meticulously compare the results?
No. Instead we’re going to take each pair of hashes, combine them, then hash them together. So we hash hA + hB, hC + hD, hE + hF, and hG + hH. We end up with four hashes. Then we do another round of hashing with these to end up with two. Finally, we hash the remaining two to get to our master hash – the Merkle root (or root hash).
The structure looks like an upside-down tree. On the bottom row, we have the leaves, which are combined to produce the nodes and, finally, the root.
We now have the Merkle root that represents the file we downloaded. We can compare this root hash with the one provided by the source. If it matches, perfect! But if the hashes are different, we can be sure that the data was modified. In other words, one or more fragments have produced a different hash. So any slight modification of data will give us a totally different Merkle root.
Fortunately, there’s a handy way for us to check which fragment is faulty. In our case, let’s say it’s hE. You would start by asking a peer for the two hashes that produced the Merkle root (hABCD and hEFGH). Your value hABCD should match theirs since there’s no mistake in that subtree. But hEFGH won’t, so you know to check in there. You then request hEF and hGH, and compare them with yours. hGH will look fine, so you know that hEF is our culprit. Lastly, you compare the hashes of hE and hF. You now know that hE is incorrect, so you can redownload that chunk.
Summing it all up, a Merkle tree is created by dividing data into many pieces, which are then hashed repeatedly to form the Merkle root. You can then efficiently verify if something has gone wrong with a piece of data. As we’ll see in the next section, there are other interesting applications, too.
There are a handful of use cases for Merkle trees, but here we will focus on their importance in blockchains. Merkle trees are essential in Bitcoin and many other cryptocurrencies. They’re an integral component of every block, where they can be found in the block headers. To get the leaves for our tree, we use the transaction hash (the TXID) of every transaction included in the block.
The Merkle root serves a couple of purposes in this case. Let’s take a look at their applications in cryptocurrency mining and transaction verification.
A Bitcoin block is made up of two pieces. The first part is the block header, a fixed-size segment containing metadata for the block. The second part is a list of transactions whose size is variable, but tends to be much larger than the header.
Miners need to repeatedly hash data to produce an output that matches certain conditions to mine a valid block. They can make trillions of attempts before finding one. With each attempt, they change a random number in the block header (the nonce) to produce a different output. But much of the block remains the same. There can be thousands of transactions, and you’d still need to hash them every time.
A Merkle root streamlines the process considerably. When you start mining, you line up all of the transactions you want to include and construct a Merkle tree. You put the resulting root hash (32 bytes) in the block header. Then, when you’re mining, you only need to hash the block header, instead of the whole block.
This works because it’s tamper-proof. You effectively summarize all of the block’s transactions in a compact format. You can’t find a valid block header and later change the transaction list, because that would change the Merkle root. When the block is sent to other nodes, they calculate the root from the transaction list. If it doesn’t match the one in the header, they reject the block.
There’s another interesting property of Merkle roots that we can leverage. This one concerns the light clients (nodes that don’t hold a full copy of the blockchain). If you’re running a node on a device with limited resources, you don’t want to download and hash all of a block’s transactions. What you can do instead is simply request a Merkle proof – evidence provided by the full node that proves that your transaction is in a particular block. This is more commonly referred to as Simplified Payment Verification, or SPV, and was detailed by Satoshi Nakamoto in the Bitcoin whitepaper.
To check hD, we only need the hashes shown in red.
Consider the scenario where we want to know information about the transaction whose TXID is hD. If hC is provided to us, we can work out hCD. Then, we need hAB to calculate hABCD. Lastly, with hEFGH, we can check that the resulting Merkle root matches the one from the block header. If it does, it’s proof that the transaction was included in the block – it would be near-impossible to create the same hash with different data.
In the above example, we’ve only had to hash three times. Without a Merkle proof, we would have needed to do it seven times. Since blocks nowadays contain thousands of transactions, using Merkle proofs saves us a lot of time and computing resources.
Merkle trees have proven themselves highly useful in a range of computer science applications – as we’ve seen, they’re incredibly valuable in blockchains. In distributed systems, Merkle trees allow for easy verification of information without flooding the network with unnecessary data.
Without Merkle trees (and Merkle roots), Bitcoin and other cryptocurrencies’ blocks would not be nearly as compact as they are today. And while light clients are lacking on the privacy and security fronts, Merkle proofs enable users to check whether their transactions have been included in a block with minimal overhead.