In my previous post explaining scalability, I mentioned how scaling can be done on both layer 1 and layer 2. ETH 2.0 aims to solve the scalability issues on layer 1 using sharding, working together with rollups on layer 2. In this post, I will try to explain what sharding is and how it is expected to help Ethereum scale up to 10,000-fold.
The question, again, is how to scale Ethereum without compromising on decentralization. As users and transactions increase, the blockchain only grows bigger, making it very hard for an average user to run a full node. Every full node in the Ethereum network needs to store the entire blockchain. Full nodes in archive mode now take up to 7.5 terabytes of storage, and full nodes in default mode take up to 830 GB. As a result, nodes end up being run by only a handful of people, which defeats the concept of a decentralized ledger. This is where sharding comes in.
Sharding
Sharding is a common concept in computer science where a database is split into multiple smaller ones in order to reduce the load. In the context of Ethereum, the state of Ethereum's blockchain is split into partitions called "shards". Ethereum will be split into 64 shards. Running a node then becomes a lot easier because of the lower hardware requirements: each node on the network only has to run the transactions of one of these 64 shards. In other words, the architecture of the blockchain itself is being changed. In practice, this means that:
- A node doesn't have to run all the transactions that come into the network, but only those that belong to the shard chain that it is running.
- The hardware requirement is a lot lower because the entire state of the blockchain need not be stored.
- As many more people can now run a node, there will be more network participation.
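To make the points above concrete, here is a minimal sketch of how work could be split across 64 shards. The rule used here (hash the sender's address and take it modulo 64) and the names `shard_of` and `transactions_for_node` are assumptions made purely for illustration; they are not how Ethereum actually maps state to shards.

```python
import hashlib

NUM_SHARDS = 64  # shard count mentioned in this post


def shard_of(address: str) -> int:
    """Map an account address to a shard (hypothetical rule: hash mod 64)."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def transactions_for_node(node_shard: int, transactions: list[dict]) -> list[dict]:
    """Keep only the transactions whose sender lives on this node's shard."""
    return [tx for tx in transactions if shard_of(tx["sender"]) == node_shard]


# Illustrative data: 1000 made-up transactions from made-up accounts.
txs = [{"sender": f"account-{i}", "amount": i} for i in range(1000)]
my_shard = shard_of("account-0")
mine = transactions_for_node(my_shard, txs)
print(f"shard {my_shard} processes {len(mine)} of {len(txs)} transactions")
```

A node following one shard only touches the slice returned by `transactions_for_node`, which is why its storage and processing needs are a fraction of those of a node that has to handle everything.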
A few questions arise as well:
Q: How are blocks on one shard chain sent to the others?
A: Each shard chain will have its own set of nodes running and validating it. Each shard has its own history and state, and any transaction that goes through will have been verified by all the nodes running that shard. Once a block is validated, all the nodes digitally sign it to attest that it is verified and pass it on to the other shards.
Q: How can nodes on one shard be sure about the authenticity of transactions on blocks validated by other shards?
A: The digital signatures of all the nodes of one shard cannot be spoofed. Instead of going through all the transactions themselves, nodes on another shard can just verify that the block is signed by the nodes of the shard that produced it. This is exactly what reduces the load on the blockchain.
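The idea of checking signatures instead of re-running transactions can be sketched roughly as follows. This is a toy stand-in: it uses HMAC tags from Python's standard library in place of real digital signatures, so the verifier here needs each node's secret key, which a real public-key scheme would not require. The names `sign_block` and `block_is_attested` and the node IDs are hypothetical.

```python
import hashlib
import hmac


def sign_block(secret_key: bytes, block_hash: bytes) -> bytes:
    """Toy 'signature': an HMAC tag over the block hash (stand-in only)."""
    return hmac.new(secret_key, block_hash, hashlib.sha256).digest()


def block_is_attested(
    block_hash: bytes,
    signatures: dict[str, bytes],
    committee_keys: dict[str, bytes],
) -> bool:
    """Return True only if every node of the shard has validly signed the block."""
    for node_id, key in committee_keys.items():
        signature = signatures.get(node_id)
        if signature is None:
            return False  # a node of the shard has not signed
        expected = sign_block(key, block_hash)
        if not hmac.compare_digest(signature, expected):
            return False  # signature does not match this block
    return True


# Illustrative shard with three made-up nodes.
keys = {"node-a": b"key-a", "node-b": b"key-b", "node-c": b"key-c"}
block_hash = hashlib.sha256(b"shard 7, block 42").digest()
sigs = {node: sign_block(key, block_hash) for node, key in keys.items()}

print(block_is_attested(block_hash, sigs, keys))
```

The check costs one comparison per signer, regardless of how many transactions the block contains, which is the saving described in the answer above.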
Q: How do these shards communicate with each other?
A: The Beacon Chain comes into play here. It contains the logic for keeping the shards secure and in sync. The chain follows Proof of Stake consensus, in which each node has to stake 32 ETH to become a validator. The Beacon Chain assigns each validator to a shard to work on (validate). Communication and syncing among the shards is done through the Beacon Chain.
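The assignment step can be illustrated with a simplified sketch. It assumes a seeded pseudo-random shuffle followed by round-robin placement; the actual Beacon Chain uses its own randomness source and shuffling algorithm, so treat `assign_validators` and its parameters as hypothetical.

```python
import random

NUM_SHARDS = 64          # shard count mentioned in this post
REQUIRED_STAKE_ETH = 32  # stake needed to become a validator


def assign_validators(stakes: dict[str, float], seed: int) -> dict[int, list[str]]:
    """Shuffle eligible validators with a seed and spread them over the shards."""
    # Only nodes that have staked enough ETH count as validators.
    eligible = sorted(name for name, stake in stakes.items() if stake >= REQUIRED_STAKE_ETH)
    # Seeded shuffle: every honest party with the same seed gets the same result.
    random.Random(seed).shuffle(eligible)
    assignment: dict[int, list[str]] = {shard: [] for shard in range(NUM_SHARDS)}
    for position, validator in enumerate(eligible):
        assignment[position % NUM_SHARDS].append(validator)
    return assignment


# Illustrative data: 640 made-up validators plus one node without enough stake.
stakes = {f"validator-{i}": 32 for i in range(640)}
stakes["underfunded-node"] = 10

assignment = assign_validators(stakes, seed=2022)
print(len(assignment[0]), "validators assigned to shard 0")
```

Shuffling before assigning means a validator cannot choose its shard, which makes it harder for a group of colluding validators to gather on a single shard.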
This illustration on Vitalik's blog helps understand the architecture of Sharding:
Simplification
Analogies always help in understanding concepts better. Let's imagine that a candidate has been selected for a very powerful position in the country, but before he is officially declared, a bunch of documents (100) need to be verified. A large group of people is appointed, and each person has to go through all the documents, verify each one of them, and then pass them to the next person. Every document is given to every person and validated. Yes, in the context of blockchain, the documents are transactions and the people are validators.
But what if the documents are divided into categories like income records, criminal records, educational records, career records, health records, etc.? Now we can also split the people into groups, each of which only has to verify documents of one category. A group handling income records doesn't have to verify criminal records because those have already been verified and signed by another group. In blockchain, these categories are shards, and the groups consist of validators running only that shard. Of course, it is not as easy as it sounds; there are a lot of complications, but we are not going down that rabbit hole in this post.
Shard chains are expected to be implemented by 2022. Vitalik is a huge fan of sharding and is confident that this (along with rollups) will be the best way for Ethereum to scale.