One of the important issues that has come up in the course of the Olympic stress-net launch is the large amount of data that clients need to store; in just over three months of operation, and particularly during the last month, the amount of data in each Ethereum client’s blockchain folder has ballooned to an impressive 10-40 gigabytes, depending on which client you are using and whether or not compression is enabled. Although it is important to note that this is indeed a stress test scenario where users are incentivized to dump transactions on the blockchain by paying only free test ether as a transaction fee, and transaction rates are therefore several times higher than Bitcoin’s, it is nevertheless a legitimate concern for users, who in many cases do not have hundreds of gigabytes to spare on storing other people’s transaction histories.
First of all, let us begin by exploring why the current Ethereum client database is so large. Ethereum, unlike Bitcoin, has the property that every block contains something called the “state root”: the root hash of a specialized kind of Merkle tree which stores the entire state of the system: all account balances, contract storage, contract code and account nonces are inside.
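To give a feel for what a state root is, here is a minimal sketch in Python. The real structure is a Merkle Patricia trie keyed by account address; the simplified binary Merkle tree below (with hypothetical helper names, and SHA-256 standing in for Keccak-256) only illustrates the key property that the root hash commits to every account, so changing any balance, nonce, code or storage item changes the root.

```python
import hashlib

def h(data: bytes) -> bytes:
    # Simplification: Ethereum actually uses Keccak-256, not SHA-256.
    return hashlib.sha256(data).digest()

def state_root(accounts: dict) -> bytes:
    """accounts maps address -> (balance, nonce, code hash, storage root)."""
    leaves = [h(repr((addr, acct)).encode())
              for addr, acct in sorted(accounts.items())]
    if not leaves:
        return h(b"")
    while len(leaves) > 1:
        if len(leaves) % 2:                      # pad odd layers by repeating the last leaf
            leaves.append(leaves[-1])
        leaves = [h(leaves[i] + leaves[i + 1])   # hash sibling pairs up to a single root
                  for i in range(0, len(leaves), 2)]
    return leaves[0]
```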
The purpose of this is simple: it allows a node, given only the last block together with some assurance that this last block actually is the most recent block, to “synchronize” with the blockchain extremely quickly without processing any historical transactions, by simply downloading the rest of the tree nodes from the network (the proposed HashSearch wire protocol message will make this easier), checking that the tree is correct by verifying that all of the hashes match up, and then proceeding from there. In a fully decentralized context, this will likely be done via an advanced version of Bitcoin’s headers-first verification strategy, which will look roughly as follows (a rough code sketch follows the list):
- Download as many block headers as the client can get their hands on.
- Determine the header that is the head of the longest chain. From that header, go back 100 blocks for safety, and call the block at that position P100(H) (“the hundredth-generation grandparent of the head”).
- Download the state tree from the state root of P100(H), using the HashSearch wire protocol message (note that after the first one or two rounds, this can be parallelized across as many peers as desired). Verify that all parts of the tree match up.
- Continue normally from there.
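Here is a rough sketch of the loop described above, under stated assumptions: the network calls (`download_all_headers`, `request_node`) and the header fields are hypothetical placeholders for whatever wire messages and data structures a real client uses, and `parse_children` / `hash_of` stand in for trie node parsing.

```python
def headers_first_sync(network, parse_children, hash_of):
    # Step 1: grab every block header we can get our hands on.
    headers = network.download_all_headers()
    by_hash = {hd.hash: hd for hd in headers}

    # Step 2: take the head of the longest chain (block number used as a
    # proxy for chain length here), then back off 100 blocks for safety.
    head = max(headers, key=lambda hd: hd.number)
    for _ in range(100):
        head = by_hash[head.parent_hash]

    # Step 3: pull down the state tree under P100(H)'s state root, checking
    # every node's hash as it arrives. After the first round or two this
    # loop can be fanned out across as many peers as desired.
    missing = [head.state_root]
    while missing:
        node_hash = missing.pop()
        node = network.request_node(node_hash)   # HashSearch-style request
        assert hash_of(node) == node_hash        # every hash must match up
        missing.extend(parse_children(node))     # child hashes referenced by this node

    # Step 4: process new blocks on top of `head` as usual.
    return head
```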
For light clients, the state root is even more advantageous: they can immediately determine the exact balance and state of any account by simply asking the network for a particular branch of the tree, without needing to follow Bitcoin’s multi-step 1-of-N light client model of “ask for all transaction outputs, then ask for all transactions spending those outputs, and take the remainder.”
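As a concrete illustration of what such a light client does, here is a minimal branch-verification sketch. It reuses the simplified binary-tree hashing from the earlier sketch rather than the actual Merkle Patricia proof format, but the principle is the same: a handful of sibling hashes suffices to check one account against the state root in a block header.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_branch(leaf: bytes, branch, state_root: bytes) -> bool:
    """branch is a list of (sibling_hash, sibling_is_left) pairs, leaf to root."""
    node = h(leaf)
    for sibling, sibling_is_left in branch:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == state_root       # True iff the account really is in the state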
However, this state tree mechanism has one major drawback if implemented naively: Intermediate nodes in the tree greatly increase the amount of disk space required to store all the data. To see why, consider this diagram here:
The change in the tree during each individual block is quite small, and the magic of the tree as a data structure is that most of the data can simply be referenced, not copied. However, even so, for every change to the state that is made, a logarithmically large number of nodes (ie. ~5 at 1000 nodes, ~10 at 1000000 nodes, ~15 at 1000000000 nodes) need to be stored twice, one version for the old tree and one version for the new tree. Eventually, as a node processes every block, we can thus expect the total disk space utilization to be, in computer science terms, roughly O(n * log(n)), where n is the transaction load. In practical terms, the Ethereum blockchain is only 1.3 gigabytes, but the size of the database including all these extra nodes is 10-40 gigabytes.
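A quick back-of-the-envelope sketch shows how this adds up. The figures below (node size, trie depth as log base 16 of the account count) are illustrative assumptions, not measured values, but they land in the same 10-40 gigabyte ballpark.

```python
import math

def naive_db_growth(state_changes, accounts, node_bytes=512):
    """Bytes written if every rewritten trie node is kept forever."""
    trie_depth = math.log(accounts, 16)          # hex trie: roughly log16(accounts) levels
    return state_changes * trie_depth * node_bytes

# e.g. ten million state changes against a one-million-account state:
print(naive_db_growth(10_000_000, 1_000_000) / 1e9, "GB of live plus stale nodes")
```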
So, what can we do? One backward-looking fix is to simply go ahead and implement headers-first syncing, essentially resetting new users’ hard disk consumption to zero and allowing users to keep their hard disk consumption low by re-syncing every month or two, but that is a somewhat ugly solution. The alternative approach is to implement state tree pruning: essentially, use reference counting to track when nodes in the tree (here using “node” in the computer-science sense of “piece of data that is somewhere in a graph or tree structure”, not “computer on the network”) drop out of the tree, and at that point put them on “death row”: unless the node somehow becomes used again within the next X blocks (eg. X = 5000), after that number of blocks passes the node should be permanently deleted from the database. Essentially, we store the tree nodes that are part of the current state, and we also store recent history, but we do not store history older than 5000 blocks.
X should be set as low as possible to conserve space, but setting X too low compromises robustness: once this technique is implemented, a node cannot revert back more than X blocks without essentially completely restarting synchronization. Now, let’s see how this approach can be implemented fully, taking into account all of the corner cases (a condensed code sketch follows the list):
- When processing a block with number N, keep track of all nodes (in the state, transaction and receipt trees) whose reference count drops to zero. Place the hashes of these nodes into a “death row” database in some kind of data structure so that the list can later be recalled by block number (specifically, block number N + X), and mark the node database entry itself as being deletion-worthy at block N + X.
- If a node that is on death row gets re-instated (a practical example of this is account A acquiring a particular balance/nonce/code/storage combination f, then switching to a different value g, and then account B acquiring state f while the node for f is on death row), then increase its reference count back to one. If that node is deleted again at some future block M (with M > N), then put it back on the future block’s death row so that it is deleted at block M + X.
- When you get to processing block N + X, recall the list of hashes that you logged during block N. Check the node associated with each hash; if the node is still marked for deletion during that specific block (ie. not reinstated, and importantly not reinstated and then flagged for deletion again later), delete it. Delete the list of hashes in the death row database as well.
- Sometimes, the new head of a chain will not be on top of the previous head, and you will need to revert a block. For these cases, you will need to keep in the database a journal of all changes to reference counts (“journal” as in journaling file systems; essentially an ordered list of the changes made); when reverting a block, delete the death row list generated when producing that block, and undo the changes made according to the journal (and delete the journal when you’re done).
- When processing a block, delete the journal at block N - X; you are not capable of reverting more than X blocks anyway, so the journal is superfluous (and, if kept, would in fact defeat the whole point of pruning).
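The condensed sketch below pulls the rules above together, under stated assumptions: a simple key-value `db` with `get`/`put`/`delete`, and a caller that reports which node hashes each block’s state transition adds and releases. All names are hypothetical and the death row lists and journals are kept in memory here for brevity; a real client would integrate this with its own trie and database layers and persist everything.

```python
X = 5000  # how many blocks of history to keep around

class PruningStateDB:
    def __init__(self, db):
        self.db = db          # node_hash -> (node_data, ref_count, death_row_mark)
        self.death_row = {}   # block_number -> list of condemned node hashes
        self.journal = {}     # block_number -> list of (node_hash, prior_refs, prior_mark)

    def process_block(self, number, added, released):
        journal, condemned = [], []
        for node_hash, data in added:                 # new or re-instated nodes
            old = self.db.get(node_hash)
            refs, mark = (old[1], old[2]) if old else (0, None)
            journal.append((node_hash, refs, mark))
            self.db.put(node_hash, (data, refs + 1, None))   # re-instating clears any mark
        for node_hash in released:                    # nodes whose reference count drops
            data, refs, mark = self.db.get(node_hash)
            journal.append((node_hash, refs, mark))
            new_mark = number + X if refs - 1 == 0 else mark
            self.db.put(node_hash, (data, refs - 1, new_mark))
            if refs - 1 == 0:
                condemned.append(node_hash)           # to be deleted at block number + X
        self.death_row[number + X] = condemned
        self.journal[number] = journal
        # Carry out deletions condemned X blocks ago, if still marked for this block.
        for node_hash in self.death_row.pop(number, []):
            entry = self.db.get(node_hash)
            if entry and entry[1] == 0 and entry[2] == number:
                self.db.delete(node_hash)
        self.journal.pop(number - X, None)            # can't revert past X blocks anyway

    def revert_block(self, number):
        self.death_row.pop(number + X, None)          # drop the list this block generated
        for node_hash, refs, mark in reversed(self.journal.pop(number, [])):
            entry = self.db.get(node_hash)
            if entry:
                self.db.put(node_hash, (entry[0], refs, mark))   # restore prior counts
```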
Once this is done, the database should only be storing state nodes associated with the last X blocks, so you will still have all the information you need from those blocks but nothing more. On top of this, there are further optimizations. Particularly, after X blocks, transaction and receipt trees should be deleted entirely, and even blocks may arguably be deleted as well, although there is a strong argument for keeping some subset of “archive nodes” that store absolutely everything so as to help the rest of the network acquire the data that it needs.
Now, how much savings can this give us? As it turns out, quite a lot! Particularly, if we were to take the ultimate reckless route and go X = 0 (ie. lose absolutely all ability to handle even single-block forks, storing no history whatsoever), then the size of the database would essentially be the size of the state: a value which, even now (this data was grabbed at block 670000), is around 40 megabytes, the majority of which is made up of accounts like this one with storage slots filled to deliberately spam the network. At X = 100000, we would get essentially the current size of 10-40 gigabytes, since most of the growth happened in the last hundred thousand blocks, and the extra space required for storing the death row lists and journals would make up the rest of the difference. At every value in between, we can expect disk space growth to be linear (ie. X = 10000 would take us about ninety percent of the way there to near-zero).
Note that we may want to pursue a hybrid strategy: keeping every block but not every state tree node; in that case, we would need to add roughly 1.4 gigabytes to store the block data. It is important to note that the cause of the blockchain size is NOT fast block times; currently, the block headers of the last three months make up roughly 300 megabytes, and the rest is transactions of the last month, so at high levels of usage we can expect transactions to continue to dominate. That said, light clients will also need to prune block headers if they are to survive in low-memory circumstances.
The strategy described above has been implemented in a very early alpha form in pyeth; it will be rolled out properly to all clients in due time after the Frontier launch, as such storage bloat is only a medium-term and not a short-term scalability concern.