Syscoin Hack Ethereum Bridge Bounty - The Cut Off Problem

By art_of_bug | art_of_bug | 14 Aug 2019


Welcome back. Hacking production chains, a.k.a. mainnets, is the most fun, but when incentives allow, exploring testnets can be fun too. The following is our first submission to Syscoin's "hack the Ethereum bridge" bounty (do follow this link for information about the Syscoin Ethereum bridge as well; some understanding of it is useful for grasping this report). This vulnerability was never exploitable on the mainnet because the bridge is not yet active on the Syscoin mainnet, which was kind of the whole point of this bounty. Today we do not present any exploit source code, for two reasons: it's freaking complicated, and the report is super long already. We hope you enjoy it even without the code. Also, the proof of knowledge is trivial today because the submission was made publicly as an issue on Syscoin's GitHub. Finally, this issue has already been fixed in the latest source code.

Before we start, let us quickly comment on the latest news. Previously we were optimistic about the rumours that Particl was going to create a bug bounty program. It's now live, but sadly it's nothing more than a PR stunt. The awards are obviously too low to attract the attention of skilled bug hunters, so the whole program is probably only useful to please the crowd, and maybe in a couple of years it will allow Particl to claim that their system is so secure that no one claimed those bounties. We will know that's not the case.

The Cut Off

Bug type: netsplit, 51% attack
Bug severity: 7/10

Scenario 1

Attacker cost: very low

The attacker performs a setup, which requires a small amount of SYS and ETH. After that, the attacker starts the attack against the Syscoin network. The attack is not instant and needs to be repeated many times, but each attempt is very cheap and has a certain chance of disabling part of the network. A node disabled by this attack stops following the best chain of Syscoin. Any node connected to the network can be affected, regardless of whether the attacker is connected to it. If the affected node is a miner, it will stop mining.

Scenario 2

Attacker cost: low

In this case, to initiate the attack, the attacker needs to be able to mine a single block. If she wants to take over the network, she obviously needs to mine continuously after that. The attacker mines a special block, which disables all other miners on the network. Only the attacker can mine after that, and she can take over the network. Initially she will mine blocks very slowly, because we assume she has only a small amount of hashing power, but after each retargeting period (every 360 blocks) the difficulty goes down and the attacker produces blocks faster. Eventually the difficulty is so low that the attacker can mine blocks normally, one every minute. After this happens, the chain looks normal again. The attacker can then perform any kind of attack that a 51% miner could, for example double spending.

Description of Netsplit Attack

First we describe the network split attack, for which we assume the attacker has no hash power on her side.

For simplicity, we assume all the target nodes to be fully synced with both the Syscoin blockchain and the Ethereum blockchain through the relayer, i.e. fGethSynced is true. All other nodes that are not fully synced yet can be attacked later, once they finish synchronisation (or they fail to complete the sync because of the attack). We only consider fully synced target nodes below. Such nodes have finished the initial block download and set the flag such that they won't consider themselves in the initial block download again unless they are restarted.

Setup Phase

The attacker starts her setup by obtaining a small amount of an arbitrary existing Syscoin asset on the Ethereum blockchain. The attacker can either buy such an asset directly on Ethereum or transfer it from Syscoin. Then the attacker burns a minimal amount of the asset on the Ethereum blockchain through the burn method of a Token contract implementing SyscoinTransactionProcessor. Once the burn is executed on Ethereum, the attacker waits until her transaction is mined into a block. Then she burns a minimal amount of the asset again in the same way and again waits for a confirmation. The attacker repeats this 1,000 times (an arbitrary parameter of the attacker's choice). She does not announce these burns to anyone. The result of the setup is that there are 1,000 consecutive blocks on the Ethereum blockchain, each containing one attacker's transaction that burns the minimal amount of the asset. We denote these blocks as A1, A2, A3, ... A1000. We denote the height of block X on its chain as h(X).

At the time of writing, the median Ethereum transaction fee is on the order of 0.10 USD, so the cost of this setup is roughly 100 USD. The attacker can optionally save some money if she is willing to compromise on the speed of the attacking phase: she would use smaller fees for her Ethereum transactions, which could result in a nonconsecutive series of Ethereum blocks with her burning transactions. This is not a problem, and the attack can be executed with such a series. However, we consider an investment of 100 USD to be very reasonable, so from here on we assume a consecutive series.

It can happen that the attacker is unable to retrieve the burnt coins. So, to be sure we calculate with an upper bound on the attacker's cost, we include the price of all the assets burnt on the Ethereum blockchain in the attacker's cost. For example, if we consider the SYSX token, which is a one-to-one binding of the SYS token transferred to the Ethereum network, and if the minimal asset value that can be burnt stays at 3 tokens, the cost of burnt assets would be roughly 0.03 * 3 * 1,000 = 90 USD (0.03 USD being the approximate price of one token).
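
The two estimates above can be combined into one upper bound. The following is only a sketch of that arithmetic: estimateSetupCost and SetupCostCents are names we made up for illustration, amounts are in US cents to keep the numbers exact, and the inputs are the rough figures quoted in this report, not measurements.

```cpp
#include <cstdint>

// Upper bound on the setup cost, in US cents.
struct SetupCostCents {
    std::uint64_t ethFees;      // Ethereum fees for all burn transactions
    std::uint64_t burntAssets;  // value of the burnt tokens, assumed to be lost
    std::uint64_t total;        // sum of the two
};

// numBurns:        number of burn transactions (1,000 in this report)
// feePerTxCents:   Ethereum fee per burn (about 10 cents at the time of writing)
// minBurnTokens:   minimal burnable amount of the asset (3 tokens)
// tokenPriceCents: price of one token (about 3 cents, as implied by the text)
SetupCostCents estimateSetupCost(std::uint64_t numBurns,
                                 std::uint64_t feePerTxCents,
                                 std::uint64_t minBurnTokens,
                                 std::uint64_t tokenPriceCents) {
    const std::uint64_t fees = numBurns * feePerTxCents;
    const std::uint64_t burnt = numBurns * minBurnTokens * tokenPriceCents;
    return {fees, burnt, fees + burnt};
}
```

Calling estimateSetupCost(1000, 10, 3, 3) reproduces the 100 USD of fees and 90 USD of burnt assets, for a worst case of 190 USD in total.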

The attacker now waits for MAX_ETHEREUM_TX_ROOTS - 1 (= 39,999) blocks to be added to the Ethereum blockchain on top of block A1. The setup phase is now complete.

Attacking Phase

First, we describe the attack in its simplest form. Then we discuss how the attacker can increase the chances of each attempt in exchange for a (small) additional cost on her side. We assume the attacker operates a well-connected Ethereum node (it does not need to be a full node) or employs any other mechanism through which she can get information about newly created blocks on the Ethereum blockchain as soon as possible after they are added to the blockchain (within a few seconds is sufficient for the attack, even if inconsistent).

The attacker now repeatedly creates and propagates Syscoin mint transactions (nVersion = SYSCOIN_TX_VERSION_ALLOCATION_MINT) with SPV proofs of burning transactions from Ai. We denote these mint transactions as Mi, where i is the index of the corresponding burning transaction from Ai. The attacker increases the index i as she gets information about new blocks being created on Ethereum. The attacker releases Mi to the Syscoin network, trying to keep h(Ai) just inside the cut-off window of MAX_ETHEREUM_TX_ROOTS blocks below the Ethereum blockchain tip, so that Mi is valid when released but becomes invalid with the next Ethereum block.

For example, suppose that at the end of the setup phase the Ethereum blockchain tip height is 100,000. Thus h(A1) = 60,001, h(A2) = 60,002, h(A3) = 60,003, etc. Let's examine the state of a fully synced Syscoin node whose relayer is synced as well. At this moment, in CheckSyscoinMint, fGethSyncHeight = 100,000 and cutoffHeight = 60,000. We assume most of the nodes in the network are synced similarly, but we also allow nodes to be reasonably behind; those nodes would potentially have slightly lower values in fGethSyncHeight and cutoffHeight. The attacker now creates and propagates M1, which is a valid transaction at this moment (from the point of view of all nodes that are synced or reasonably behind) and is going to be mined in the next block by an arbitrary miner on the Syscoin network.
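
The pacing in this example can be sketched as follows. This is a minimal sketch, assuming a consecutive series of burn blocks. lowestAcceptedBurnHeight and nextMintIndex are hypothetical helpers of ours, not Syscoin functions, and the value 40,000 for MAX_ETHEREUM_TX_ROOTS is taken from the figures in this report.

```cpp
#include <cstdint>

// Assumed value, consistent with MAX_ETHEREUM_TX_ROOTS - 1 = 39,999 above.
constexpr std::uint32_t MAX_ETHEREUM_TX_ROOTS = 40000;

// Lowest burn height that a node seeing Ethereum tip `ethTip` still accepts.
// Returns 0 while the cut-off rule is not active yet.
std::uint32_t lowestAcceptedBurnHeight(std::uint32_t ethTip) {
    if (ethTip < MAX_ETHEREUM_TX_ROOTS) return 0;
    return ethTip - MAX_ETHEREUM_TX_ROOTS + 1;
}

// 1-based index i of the burn block A_i whose mint M_i is valid now but
// becomes invalid with the very next Ethereum block.
// Returns 0 when no prepared burn sits at that edge.
std::uint32_t nextMintIndex(std::uint32_t ethTip,
                            std::uint32_t heightOfA1,
                            std::uint32_t numBurns) {
    const std::uint32_t edge = lowestAcceptedBurnHeight(ethTip);
    if (edge < heightOfA1) return 0;  // too early, even A1 is not at the edge
    const std::uint32_t index = edge - heightOfA1 + 1;
    return index <= numBurns ? index : 0;  // 0 once all burns have expired
}
```

With the numbers from the example, nextMintIndex(100000, 60001, 1000) gives 1, and once the Ethereum tip moves to 100,001 it gives 2.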

Now 3 different scenarios can happen:

1) A new block on Ethereum is found before a new block on Syscoin is found AND the Syscoin miner that mines the next block receives the information from the relayer before the Syscoin block is found. When this happens, the miner is already working on a block template and trying to find a solution. If the solution is found, the new block will contain transaction M1, which is now invalid (from the point of view of this miner) because it is going to be rejected by CheckSyscoinMint's code:

    if(ethTxRootShouldExist){
        LOCK(cs_ethsyncheight);
        // cutoff is ~1 week of blocks is about 40K blocks
        cutoffHeight = fGethSyncHeight - MAX_ETHEREUM_TX_ROOTS;
        if(fGethSyncHeight >= MAX_ETHEREUM_TX_ROOTS && mintSyscoin.nBlockNumber <= (uint32_t)cutoffHeight) {
            errorMessage = "SYSCOIN_CONSENSUS_ERROR ERRCODE: 1001 - " + _("The block height is too old, your SPV proof is invalid. SPV Proof must be done within 40000 blocks of the burn transaction on Ethereum blockchain");
            bTxRootError = true;
            return false;
        }

Here mintSyscoin.nBlockNumber = h(A1) = 60,001 for M1, but fGethSyncHeight is 100,001 or greater, so cutoffHeight is at least 60,001 and M1 is caught by the second condition. Such a block would thus not be propagated to the network, and the miner will try again but will no longer put M1 into the block. In this case, the attacker's attempt with M1 was unsuccessful, and A1 and M1 are wasted.
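
To see the off-by-one behaviour in isolation, here is the same comparison reduced to a pure function. It is a simplified sketch and not the Syscoin code: passesCutoff is a name we made up, locking and error reporting are left out, and the constant value is assumed from the error message in the excerpt.

```cpp
#include <cstdint>

// Assumed value, matching the "within 40000 blocks" wording of the error.
constexpr std::uint32_t MAX_ETHEREUM_TX_ROOTS = 40000;

// Returns true when a mint that references a burn at `burnHeight` passes the
// cut-off rule on a node whose relayer reports `gethSyncHeight`.
bool passesCutoff(std::uint32_t gethSyncHeight, std::uint32_t burnHeight) {
    // The rule only applies once the Ethereum chain is long enough.
    if (gethSyncHeight < MAX_ETHEREUM_TX_ROOTS) return true;
    const std::uint32_t cutoffHeight = gethSyncHeight - MAX_ETHEREUM_TX_ROOTS;
    // The excerpt rejects when nBlockNumber <= cutoffHeight.
    return burnHeight > cutoffHeight;
}
```

For M1, passesCutoff(100000, 60001) is true while passesCutoff(100001, 60001) is false, so a single new Ethereum block is enough to flip the verdict of a node.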

2) A new block on Ethereum is found before a new block on Syscoin is found AND the Syscoin miner that mines the next block receives the information from the relayer after the Syscoin block is found; OR the Ethereum block was found roughly at the same time as the Syscoin block. In this case, the Syscoin miner creates a subjectively valid block and will propagate it to the network. However, because of the propagation delay, the validation delay and the asynchronicity of the Ethereum relayer, some of the nodes on the network could have been informed by their relayers about the new Ethereum block with height 100,001 before the Syscoin block containing M1 reaches them, or before their validation reaches the CheckSyscoinInputs call in ConnectBlock. Such nodes would reject the block (via the CheckSyscoinMint code as above). Because bTxRootError is set to true, no banning is involved for the rejected block.

In this case the attack attempt with M1 was successful and affected all nodes that rejected the block. Moreover, if a miner is affected, it stops mining. See the network takeover attack below for more details.

3) A new block on Ethereum is found well after a new block on Syscoin is propagated throughout the network, such that all nodes accept the block. In this case the attempt with M1 was unsuccessful and wasted.
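
The three scenarios boil down to a comparison of two timestamps per node. The sketch below is a toy model under our own assumptions: NodeTiming, Outcome and classifyAttempt are hypothetical names, times are in milliseconds on a common clock, and every node is reduced to the moment its relayer reports the new Ethereum block and the moment it runs the cut-off check on the Syscoin block containing M1 (for the miner, the moment it finds that block).

```cpp
#include <vector>

enum class Outcome {
    MinerDiscardsBlock,  // scenario 1: the miner itself rejects M1
    NetworkSplit,        // scenario 2: at least one other node rejects the block
    NoEffect             // scenario 3: every node accepts the block
};

struct NodeTiming {
    long long ethSeenAtMs;      // relayer reports the new Ethereum block
    long long cutoffCheckAtMs;  // node runs the cut-off check on the Syscoin block
};

// A node rejects the block if it already knew about the new Ethereum block
// when it ran the cut-off check.
bool rejects(const NodeTiming& node) {
    return node.ethSeenAtMs <= node.cutoffCheckAtMs;
}

Outcome classifyAttempt(const NodeTiming& miner,
                        const std::vector<NodeTiming>& otherNodes) {
    if (rejects(miner)) return Outcome::MinerDiscardsBlock;
    for (const NodeTiming& node : otherNodes) {
        if (rejects(node)) return Outcome::NetworkSplit;
    }
    return Outcome::NoEffect;
}
```

The model ignores everything else that happens during validation, but it shows why slower propagation and a jittery relayer both work in the attacker's favour: they widen the window in which some node's first timestamp lands before its second.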


The attacker now continues to attack with this scheme as long as she has unused burn blocks Ai for which she can create mint transactions Mi. Over time, the total number of affected nodes increases: successful attempts add to the number of affected nodes, and unsuccessful attempts do not decrease it.
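
Since an affected node stays affected, repeated attempts accumulate. As a rough illustration only, assume that every attempt cuts off a given node independently with the same chance p (we have not measured p, and real attempts are unlikely to be fully independent). The chance that the node is cut off after n attempts is then 1 - (1 - p)^n. chanceNodeCutOff is a hypothetical helper name.

```cpp
#include <cmath>

// Chance that a given node has been cut off at least once after `attempts`
// independent attempts, each succeeding against it with `perAttemptChance`.
// Both the independence and the fixed per-attempt chance are assumptions.
double chanceNodeCutOff(double perAttemptChance, unsigned attempts) {
    return 1.0 - std::pow(1.0 - perAttemptChance,
                          static_cast<double>(attempts));
}
```

Under these assumptions, even a hypothetical per-attempt chance of 1% gives more than 99% after 1,000 attempts, which is why the attacker can afford a low success rate per attempt.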

The cost of this phase is (at the time of writing) negligible: it only requires transaction fees on the Syscoin network. Blocks are empty on Syscoin, so fees are minimal, and the cumulative fee paid by the attacker for all 1,000 attempts in this stage is below 10 USD.

Impact

Once a node is affected by the attack, it stops following the best chain because it rejects a block that is valid for other nodes. The affected part of the network is cut off and stalls because all miners there are disabled. After the attacking block is rejected by an affected node, unaffected miners can mine another block on top of it. Such a block is propagated even to the already affected nodes, which recognise that its total chain work is greater than that of their best chain and try to connect it. This attempt, however, requires the previously rejected block to be connected first. By this point the Ethereum network has likely advanced even further, but even if it has not, the previously rejected block is rejected again, and therefore the attempt to connect the chain with the new block built on top of it fails as well.

Optimizations

One optimisation available to the attacker is to make sure that the Syscoin blocks with attacking transactions are propagated and validated more slowly (remember, we are trying to run into a race condition with the relayer). This can be done through transaction spam. Since Syscoin's block weight limit is 4 times higher than Bitcoin's, we expect that about 16,000 transactions can easily be put into every block, likely even more.

There are two factors here. First, the propagation time, which is affected by the block size and validation time of CheckBlock and AcceptBlock, which are executed before the block is propagated through the high bandwidth compact block mode. Second, the validation time of the part of ConnectBlock before CheckSyscoinMint is called. This means that specially constructed output scripts that are superslow to validate (e.g. non-segwit transactions with quadratic hashing problem) are of no use because the scripts are only executed in ConnectBlock after (or in parallel with) CheckSyscoinMint. So the attacker should rather use segwit transactions to increase the absolute block size and the total number of transactions.

It seems optimal for the attacker to construct as many small transactions as possible and, if possible, to make transaction inputs spread among as many blocks as possible to enforce as many disk operations as possible. But even if the attacker just naïvely fills the blocks with random transactions without additional complexity, validation of 16,000+ transactions and propagation of 8+ MB blocks should be slow enough to increase the chances of attacker's attempts.

Because of minimal fees on Syscoin, even producing so many transactions is still very cheap for the attacker, and the expected cost per block full of spam transactions is less than 0.1 USD. Using this optimisation, we believe it is possible to prolong block propagation to the whole network so that it takes several seconds, which makes the whole attack very practical. We haven't done any measurements, however, so this is only an estimate. Notably, the attack is still valid without this optimisation: only the chance of a single attempt is affected, so the attacker may just need more attempts before she succeeds in splitting the network.

Other Factors

There are a couple of other factors that influence the attacker's success rate. First, we assumed the attacker has no access to any hash rate, so all the blocks related to the attack are mined by honest miners. If the attacker has significant hash rate on her side, she can choose the moment at which she releases the blocks with the attacking transactions to her advantage. This is especially true if the attacker has a very fast notification channel for new blocks from the Ethereum network. However, in this case it is better for the attacker to perform the second attack described below.

Another factor is the network topology, especially the high bandwidth compact block subgraph. We have assumed the attacker is not powerful enough to influence the network topology, although it is partially possible to influence the subgraph if the attacker has significant hash rate on her side. We consider this beyond the scope of this report, but again, the second way of attacking is better if the attacker has some hash rate.

Finally, the quality of the Ethereum relayer can be a big factor in this attack. In the description above we assumed that the relayer works perfectly and is very reliable and fast for all nodes, but from our tests and from our analysis of the current implementation of the relayer, it seems that this is not the case. The implementation contains various delays, which can cause a node to be informed about new Ethereum blocks many seconds after they appear in the network. This actually makes things easier for the attacker, because it causes great variance in whether each node considers the attacker's transactions valid. With the current implementation of the relayer, network splits are therefore much more likely.

Description of Network Takeover Attack

Setup Phase

In the previous attack, the attacker needed quite a large setup. In this case, the attacker makes a similar setup, except that there will be no waste. The attacker prepares Syscoin mint transactions (nVersion = SYSCOIN_TX_VERSION_ALLOCATION_MINT) as she did in the previous attack. This time, however, she waits a little longer, so that some of the transactions become invalid because they no longer pass the cut off rule. The transactions are not propagated to the network. If the attacker wants to just stall the blockchain, only one invalid transaction is needed. If the attacker wants to completely take over the whole network, she will need more such transactions.

Attacking Phase

The attacking phase is also much simpler here. To initiate it, the attacker just needs to mine a block in which she includes a single invalid transaction from the setup phase. The attacker propagates the block to the network. Such a block is invalid because the attacker's transaction does not pass the cut off rule as described above. However, this is only detected in ConnectBlock, so the high bandwidth compact block relay mode, which is executed at the end of AcceptBlock, will make sure that the block is sent to the whole network.

For every node that receives the block the following happens. The node will go through CheckBlock and AcceptBlock successfully, then it goes to ActivateBestChain and subsequently to ConnectBlock. All the validation succeeds except for CheckSyscoinMint at the end of ConnectBlock. We already mentioned that the invalid transaction in a block is rejected in a way that does not lead to any bans. Moreover, the block itself is not marked as invalid. This is crucial together with the fact that AcceptBlock was passed successfully.

This causes the rejected, but not invalidated, block to be put into setBlockIndexCandidates. This is an ordered set of candidates for the new blockchain tip. Whenever a node tries to connect a new branch that ends with such a tip, invalid blocks on the new branch and blocks built on top of them are removed from the set. But in our case, when the block is just rejected, the rejected block stays in the set. This means that the tip of the chain of every node is going to stay where it was before this block was received. Miners will not attempt to prolong the chain with the attacker's invalid block; they will try to mine on top of its previous block, which they still consider the tip of their best chain. However, when any miner creates a new block, it will call ProcessNewBlock and go through the normal validation flow, which causes the newly minted, perfectly valid block to be inserted into setBlockIndexCandidates.

Crucially, both these blocks, the newly minted block and the attacker's block, will have the same difficulty. The setBlockIndexCandidates set is (inversely) ordered first by the difficulty of blocks and second by the time the node received them. But the attacker's block came in first. This causes FindMostWorkChain, called from ActivateBestChain, to always pick the attacker's block as the next chain tip candidate. As a consequence, no node, including miners, is going to advance its chain once it receives the attacker's block with height one over its current best block.
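
The ordering can be illustrated with a small model. This is a sketch modelled on the comparator that Bitcoin Core uses for setBlockIndexCandidates (cumulative chain work first, then arrival order); it is not copied from Syscoin. BlockIndex, WorkComparator and pickBestCandidate are simplified stand-ins of ours, and chain work is reduced to a 64-bit number.

```cpp
#include <cstdint>
#include <set>

struct BlockIndex {
    std::uint64_t chainWork;   // cumulative work up to and including this block
    std::uint32_t sequenceId;  // arrival order, lower means received earlier
};

// Sorts worse candidates first, so the best candidate is the last element.
struct WorkComparator {
    bool operator()(const BlockIndex& a, const BlockIndex& b) const {
        if (a.chainWork != b.chainWork) return a.chainWork < b.chainWork;
        // Equal work: the block received earlier ranks higher.
        return a.sequenceId > b.sequenceId;
    }
};

using CandidateSet = std::set<BlockIndex, WorkComparator>;

// Picks the last element of the ordered set as the next tip candidate.
// Precondition: `candidates` is not empty.
BlockIndex pickBestCandidate(const CandidateSet& candidates) {
    return *candidates.rbegin();
}
```

If the attacker's block (sequenceId 1) and an honest block (sequenceId 2) carry the same chain work, pickBestCandidate keeps returning the attacker's block. Only a candidate with strictly more chain work displaces it.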

Finally, even restarting the node won't help. But it won't help in a surprising way: when the blocks are loaded from disk, the attacker's blocks will be included in the main chain because crucial parts of the CheckSyscoinMint validation are skipped before the relayer is synced. But then, if such a restarted node creates a new block on top of the attacker's invalid block, the rest of the network will simply reject it as an invalid chain.

Attacker Can Mine

So how does the attacker proceed after she makes the whole network stuck? She just needs to mine more than one block on top of the last valid block before she propagates her new chain. When a chain with better cumulative work is presented to a stuck node, it will get unstuck. This is because a better block with more work will go into setBlockIndexCandidates, and it will be selected as the best one according to the ordering instead of the invalid block that was there before. In order to maintain her position as the only block producer, the attacker needs to mine another invalid block on top of every longer chain that she presents to the network. By doing that, the attacker makes sure that the chain grows with her new blocks, but all nodes are always stuck due to the last block being invalid. The attacker can decide how many valid blocks she wants to mine before she includes the invalid one and propagates the blocks to the network. This comes with the overhead of mining the invalid block, so for example if the attacker mines nine valid blocks and then one invalid block on top of them, her efficiency is 90%. This puts the attacker in the exclusive position of the only block producer.
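
The efficiency figure is simply the share of mined blocks that end up in the chain. As a small sketch (miningEfficiency is a hypothetical helper, and we assume exactly one invalid block per released batch):

```cpp
// Share of the attacker's mined blocks that extend the chain when every
// released batch consists of `validBlocks` valid blocks plus one invalid
// block that keeps the other nodes stuck.
double miningEfficiency(unsigned validBlocks) {
    return static_cast<double>(validBlocks) /
           static_cast<double>(validBlocks + 1);
}
```

Larger batches waste less work on invalid blocks, at the price of releasing new blocks to the network less often.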

Cost

The cost of this type of attack is simply the ability to mine blocks plus the negligible cost of creating Syscoin mint transactions. The cost thus mostly depends on the Syscoin difficulty: currently, mining a Syscoin block takes about 1.2% of the work of mining a Bitcoin block (Syscoin is merge-mined with Bitcoin). Note that the attacker does not need to initiate the attack at any specific time; she just needs to mine her block eventually, for example within one or two days or even weeks. Such an amount of hashing power is cheaply available on NiceHash, and the actual cost very much depends on how long the attacker is willing to wait. Should the attacker then wish to fully take over the network, she should invest in a reasonable amount of hashing power, so that she mines the blocks remaining in the current retargeting period (360 blocks) in reasonable time. As the difficulty lowers, the attack becomes cheaper and cheaper.




art_of_bug
art_of_bug

We are a research group focused on exposing bugs in the design and implementation of blockchain projects. We only honour responsible disclosure with projects that honour responsible development.

