
Measuring Smart Contract Execution Speeds Across Blockchains


Update: Since writing this I've added Juno to these measurements and am tracking smart contract performance on that chain here.


Execution time of example programs on various blockchains.

There is a good amount of talk about blockchain performance in terms of transactions per second but surprisingly little about smart contract execution speeds. One would imagine this to be an important metric.

For CalHacks this weekend Alex, Alec, Dev and I built a system for measuring smart contract execution times across blockchains. We measured the three most popular smart contract platforms: Ethereum, Solana, and the Substrate Polkadot parachain. We used Cloudflare Workers as a centralized point of comparison because its geographic distribution and compute limits are similar to those of smart contracts.

Our results show that, due to relatively tight limits on smart contract execution time, network and mining time always dominate. Despite considerable effort we were unable to find a program that notably impacted the execution time of a smart contract while remaining within smart contract execution limits. These observations suggest three things:

  1. Once a smart contract developer has written a functional smart contract there is little payoff to optimizing the code for performance as network and mining latency will dominate.
  2. Smart contract developers concerned about performance should look primarily at transaction throughput and latency when choosing a platform to deploy their contracts.
  3. Even blockchains like Solana which bill themselves as being high performance are much, much slower than their centralized counterparts.

This project was done as part of a weekend hackathon. There is still work to be done to make these results more rigorous. I personally would avoid making any decisions with money involved based entirely on these results. The source code can be found here.

Methodology

We designed a simple programming language capable of running some small performance benchmarks. We then implemented an interpreter for that language on the Ethereum, Solana, and Polkadot blockchains in the form of a smart contract. To perform a measurement we then submit the same program to each chain and time its execution.

We measured the performance of three programs written in this language:

  1. An inefficient, recursive Fibonacci number generator computing the 12th Fibonacci number.
  2. A program designed to “thrash the cache” by repeatedly making modifications to disparate memory locations.
  3. A simple program consisting of two instructions to measure cold start times.

To compute execution time we measured the time between when the transaction to run the smart contract was sent and when it was confirmed by the blockchain. Due to budgetary constraints our testing was done on test networks.

We understand that this is an imperfect proxy for actual code execution time. That said, we imagine that most users of a smart contract benchmarking system care primarily about total transaction time. This is the delay that users of their smart contracts will experience, and it is also the time that we measure.
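The measurement described above can be sketched roughly as follows. This is only an illustration, not the project's actual harness: `send_and_confirm` is a hypothetical callable standing in for whatever chain client submits the transaction and blocks until it is confirmed, and the `time.sleep` call in the usage lines is a stand-in for a real network.

```python
import time
from statistics import median
from typing import Callable, List


def time_transaction(send_and_confirm: Callable[[], object]) -> float:
    """Return wall-clock seconds from submission until confirmation.

    `send_and_confirm` is a hypothetical callable: it should submit the
    transaction and return only once the chain reports it as confirmed.
    """
    start = time.monotonic()  # monotonic clock, unaffected by system clock changes
    send_and_confirm()
    return time.monotonic() - start


def benchmark(send_and_confirm: Callable[[], object], runs: int = 5) -> List[float]:
    """Repeat the measurement, since a single sample can be noisy."""
    return [time_transaction(send_and_confirm) for _ in range(runs)]


if __name__ == "__main__":
    # Stand-in for a real chain client: sleeping imitates confirmation latency.
    samples = benchmark(lambda: time.sleep(0.01), runs=3)
    print(f"median latency: {median(samples):.3f}s")
```

Because the timer wraps the whole round trip, the number it reports includes network and confirmation time as well as contract execution, which is exactly the quantity discussed above.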

To provide a non-blockchain point of comparison we also wrote a runtime on top of Cloudflare Workers. Like smart contracts, Cloudflare Workers run in geographically distributed locations and feature reasonably strict limitations on runtime resource consumption.

Results

Solana was the fastest blockchain that we measured. It was still two orders of magnitude slower than Cloudflare Workers.

While Solana was faster than Polkadot and Ethereum in our benchmarks it also presented the most restrictive computational limits. The following plot shows the largest Fibonacci number computable on each blockchain before computational limits were exceeded. Once again we include Cloudflare Workers as a non-blockchain baseline.

We were unable to write a program that significantly changed smart contract execution times. This appeared to be because compute limits on smart contracts were so restrictive that we could not submit programs where the execution time would be significant relative to the network and mining times.

The benchmarking language

To provide a single interface for performance measurements we designed and implemented a 17-instruction programming language we called Arcesco. For each platform we then implemented a runtime for Arcesco and timed the execution of a standard suite of programs.

Each runtime takes assembled Arcesco bytecode through stdin and prints the execution result to stdout. An example invocation might look like this:

cat program.bc | assembler | runtime

This unified runtime interface means that very different runtimes can be plugged in and run the same way. As a testament to the simplicity of runtime implementations, we were able to implement five different runtimes over the course of the weekend.

Arcesco is designed as a simple stack machine for which an interpreter is as easy as possible to implement. An example Arcesco program that computes the 10th Fibonacci number looks like this:

    pi 10
    call fib
    exit
fib:
    copy
    pi 3
    jlt done
    copy
    pi 1
    sub
    call fib
    rot 1
    pi 2
    sub
    call fib
    add
done:
    ret
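For readers less used to stack machines, the program above reads roughly like the Python function below. This is a hand translation rather than anything generated by our tooling, and it assumes that `jlt` jumps when the second-from-top value is less than the top value and that `sub` computes second-from-top minus top.

```python
def fib(n: int) -> int:
    # copy / pi 3 / jlt done: values below 3 are returned unchanged
    if n < 3:
        return n
    # the two "call fib" instructions, with "rot 1" keeping n available
    # for the second call, followed by "add"
    return fib(n - 1) + fib(n - 2)


print(fib(10))  # 89
```

With this base case the sequence runs 1, 2, 3, 5, 8, and so on, so `fib(10)` is 89, the 10th term of that sequence.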

To simplify the job of Arcesco interpreters we have written a very simple bytecode compiler for Arcesco which replaces labels with relative jumps and encodes each instruction in 40 bits. The entire pipeline for the above program looks like this:

  text          |  assembled    |  bytecode
----------------|---------------|--------------------
                |               |
    pi 10       |    pi 10      |    0x010a000000
    call fib    |    call 2     |    0x0e02000000
    exit        |    exit       |    0x1100000000
fib:            |               |
    copy        |    copy       |    0x0200000000
    pi 3        |    pi 3       |    0x0103000000
    jlt done    |    jlt 10     |    0x0b0a000000
    copy        |    copy       |    0x0200000000
    pi 1        |    pi 1       |    0x0101000000
    sub         |    sub        |    0x0400000000
    call fib    |    call -6    |    0x0efaffffff
    rot 1       |    rot 1      |    0x0d01000000
    pi 2        |    pi 2       |    0x0102000000
    sub         |    sub        |    0x0400000000
    call fib    |    call -10   |    0x0ef6ffffff
    add         |    add        |    0x0300000000
done:           |               |
    ret         |    ret        |    0x0f00000000
                |               |
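The two steps in the table can be sketched as a small two-pass assembler. This is a minimal illustration rather than the project's actual compiler. It assumes what the table suggests: jump offsets are counted in instructions relative to the jumping instruction, and the immediate is a little-endian signed 32-bit integer. The opcode numbers are taken from the instruction reference in this post; `assemble`, `OPCODES` and `TAKES_LABEL` are names made up for this sketch.

```python
import struct

# Opcode numbers follow the instruction reference in this post.
OPCODES = {
    "pi": 1, "copy": 2, "add": 3, "sub": 4, "mul": 5, "div": 6, "mod": 7,
    "jump": 8, "jeq": 9, "jneq": 10, "jlt": 11, "jgt": 12, "rot": 13,
    "call": 14, "ret": 15, "pop": 16, "exit": 17,
}
TAKES_LABEL = {"jump", "jeq", "jneq", "jlt", "jgt", "call"}


def assemble(source: str) -> bytes:
    labels, instructions = {}, []
    # Pass 1: record the instruction index that each label points at.
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())
    # Pass 2: turn labels into relative offsets and pack five bytes each.
    out = bytearray()
    for index, (name, *args) in enumerate(instructions):
        if name in TAKES_LABEL:
            immediate = labels[args[0]] - index
        else:
            immediate = int(args[0]) if args else 0
        # "<Bi": one unsigned opcode byte, then a little-endian signed int32
        out += struct.pack("<Bi", OPCODES[name], immediate)
    return bytes(out)


FIB = """
    pi 10
    call fib
    exit
fib:
    copy
    pi 3
    jlt done
    copy
    pi 1
    sub
    call fib
    rot 1
    pi 2
    sub
    call fib
    add
done:
    ret
"""

print(assemble(FIB).hex())
```

Under those assumptions the output matches the bytecode column of the table, five bytes per instruction.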

Each bytecode instruction is five bytes. The first byte is the instruction's opcode and the next four are its immediate. Even instructions without immediates are encoded this way to simplify instruction decoding in interpreters. We understand this to be a small performance trade-off, but as much as possible we were optimizing for ease of interpretation.

0        8                              40
+--------+-------------------------------+
| opcode |    immediate                  |
+--------+-------------------------------+
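Decoding this layout takes a single `struct` call in Python. The sketch below assumes the immediate is a little-endian signed 32-bit integer, which is what the example bytecode in this post suggests (`0x0efaffffff` for `call -6`); `decode` is a name made up for this illustration.

```python
import struct
from typing import Tuple


def decode(instruction: bytes) -> Tuple[int, int]:
    """Split one five-byte instruction into (opcode, immediate)."""
    if len(instruction) != 5:
        raise ValueError("Arcesco instructions are exactly five bytes")
    # "<Bi": one unsigned opcode byte, then a little-endian signed int32
    opcode, immediate = struct.unpack("<Bi", instruction)
    return opcode, immediate


print(decode(bytes.fromhex("0efaffffff")))  # (14, -6), i.e. "call -6"
print(decode(bytes.fromhex("010a000000")))  # (1, 10), i.e. "pi 10"
```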

The result of this is that an interpreter for Arcesco bytecode is just a simple while loop and switch statement. Each bytecode instruction being the same size and format makes decoding instructions very simple.

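# bytecode: the assembled program as bytes
stack, pc = [], 0
while True:
    opcode = bytecode[pc]
    immediate = int.from_bytes(bytecode[pc + 1:pc + 5], "little", signed=True)
    pc += 5
    match opcode:
        case 1:  # pi
            stack.append(immediate)
        case 17:  # exit
            break
        # etc..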

This makes it very simple to implement an interpreter for Arcesco bytecode which is essential for smart contracts where larger programs are more expensive and less auditable.

A complete reference for the Arcesco instruction set is below.

opcode | instruction  | explanation
-----------------------------------
1      | pi   <value> | push immediate - pushes VALUE to the stack
2      | copy         | duplicates the value on top of the stack
3      | add          | pops two values off the stack and adds them pushing
                        the result back onto the stack.
4      | sub          | like add but subtracts.
5      | mul          | like add but multiplies.
6      | div          | like add but divides.
7      | mod          | like add but modulus.
8      | jump <label> | moves program execution to LABEL
9      | jeq  <label> | moves program execution to LABEL if the top two
                        stack values are equal. Pops those values from the
                        stack.
10     | jneq <label> | like jeq but not equal.
11     | jlt  <label> | like jeq but less than.
12     | jgt  <label> | like jeq but greater than.
13     | rot  <value> | swaps stack item VALUE items from the top with the
                        stack item VALUE-1 items from the top. VALUE must
                        be >= 1.
14     | call <label> | moves program execution to LABEL and places the
                        current PC on the runtime's call stack
15     | ret          | sets PC to the value on top of the call stack and
                        pops that value.
16     | pop          | pops the value on top of the stack.
17     | exit         | terminates program execution. The value at the top
                        of the stack is the program's return value.
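To make the reference concrete, here is a sketch of a complete interpreter in Python. It is not one of the five runtimes from the hackathon, and it fills in details the table leaves open with assumptions: immediates are little-endian signed 32-bit integers, jump offsets are counted in instructions relative to the jumping instruction, `ret` resumes at the instruction after the saved `call`, binary operations and comparisons use the second-from-top value as the left operand, and `div` is integer division. The function name `run` is made up for this sketch.

```python
import struct
from typing import List


def run(bytecode: bytes) -> int:
    """Interpret Arcesco bytecode and return the program's exit value."""
    # Decode every fixed-width instruction up front into (opcode, immediate).
    program = [struct.unpack_from("<Bi", bytecode, i)
               for i in range(0, len(bytecode), 5)]
    stack: List[int] = []
    calls: List[int] = []
    pc = 0
    binary = {
        3: lambda a, b: a + b,
        4: lambda a, b: a - b,
        5: lambda a, b: a * b,
        6: lambda a, b: a // b,  # assumption: integer (floor) division
        7: lambda a, b: a % b,
    }
    compare = {
        9: lambda a, b: a == b,
        10: lambda a, b: a != b,
        11: lambda a, b: a < b,
        12: lambda a, b: a > b,
    }
    while True:
        op, imm = program[pc]
        if op == 1:                  # pi
            stack.append(imm)
        elif op == 2:                # copy
            stack.append(stack[-1])
        elif op in binary:           # add, sub, mul, div, mod
            b, a = stack.pop(), stack.pop()
            stack.append(binary[op](a, b))
        elif op == 8:                # jump
            pc += imm
            continue
        elif op in compare:          # jeq, jneq, jlt, jgt
            b, a = stack.pop(), stack.pop()
            if compare[op](a, b):
                pc += imm
                continue
        elif op == 13:               # rot: swap items imm and imm-1 from the top
            i, j = -1 - imm, -imm
            stack[i], stack[j] = stack[j], stack[i]
        elif op == 14:               # call: remember where we came from
            calls.append(pc)
            pc += imm
            continue
        elif op == 15:               # ret: the pc += 1 below steps past the call
            pc = calls.pop()
        elif op == 16:               # pop
            stack.pop()
        elif op == 17:               # exit
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode {op}")
        pc += 1


# The Fibonacci example from this post, as assembled bytecode.
FIB_HEX = (
    "010a000000" "0e02000000" "1100000000" "0200000000"
    "0103000000" "0b0a000000" "0200000000" "0101000000"
    "0400000000" "0efaffffff" "0d01000000" "0102000000"
    "0400000000" "0ef6ffffff" "0300000000" "0f00000000"
)
print(run(bytes.fromhex(FIB_HEX)))  # 89
```

The call stack is kept separate from the data stack, as the `call` and `ret` entries describe, which is what lets the recursive example leave its result on the data stack across a return.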

Future work

We would like to implement a fuzzer that integrates with this project to generate slow bytecode. This might reveal additional information about the performance characteristics of different blockchains' smart contract execution systems.

Reflections on smart contract development

Despite a lot of hype about smart contracts we found that writing them was quite painful.

Solana was far and away the most pleasant to work with as its solana-test-validator program made local development easy. Solana’s documentation was also approachable and centralized. The process of actually executing a Solana smart contract after it was deployed was very low level and required a pretty good understanding of the entire stack before it could be done.

Ethereum comes in at a nice second. The documentation was reasonably approachable and the sheer size of the Ethereum community meant that there was almost too much information. Unlike Solana though, we were unable to set up a functional local development environment which meant that the code -> compile -> test feedback loop was slow. Working on Ethereum felt like working on a large C++ project where you spend much of your time waiting for things to compile.

Polkadot was an abject nightmare to work with. The documentation was massively confusing and what tutorials did exist failed to explain how one might interface with a smart contract outside of some silly web UI. This was surprising given that Polkadot has a $43 billion market cap and was regularly featured in “best smart contract” articles that we read at the beginning of this hackathon.

We had a lot of fun working on this project. From the outside, it can be very hard to tell truth from marketing fiction in the blockchain space. It was fun to dig into the technical details for a weekend.


☕️☕️

If you found this interesting and would like to buy us a coffee you can send some Solana to EzKjo8nGv37m5Qp7P5N1pTZsHG4J8C15rSwUWu9nAwfX. If anything ends up getting sent to that wallet it'll be my first time seeing crypto being used outside of an investment context which sounds interesting.