SBN

Amarna: Static analysis for Cairo programs

By Filipe Casal

We are open-sourcing Amarna, our new static analyzer and linter for the Cairo programming language. Cairo is a programming language powering several trading exchanges with millions of dollars in assets (such as dYdX, driven by StarkWare) and is the programming language for StarkNet contracts. But, not unlike other languages, it has its share of weird features and footguns. So we will first provide a brief overview of the language, its ecosystem, and some pitfalls in the language that developers should be aware of. We will then present Amarna and discuss how it works, and what it finds, and what we plan to implement down the line.

Introduction to Cairo

Why do we need Cairo?

The purpose of Cairo, and similar languages such as Noir and Leo, is to write “provable programs,” where one party runs the program and creates a proof that it returns a certain output when given a certain input.

Suppose we want to outsource a program’s computation to some (potentially dishonest) server and need to guarantee that the result is correct. With Cairo, we can obtain a proof that the program output the correct result; we need only to verify the proof rather than recomputing the function ourselves (which would defeat the purpose of outsourcing the computation in the first place).

In summary, we take the following steps:

  • Write the function we want to compute.
  • Run the function on the worker machine with the concrete inputs, obtain the result, and generate a proof of validity for the computation.
  • Validate the computation by validating the proof.

The Cairo programming language

As we just explained, the Cairo programming model involves two key roles: the prover, who runs the program and creates a proof that the program returns a certain output, and the verifier, who verifies the proofs created by the prover.

However, in practice, Cairo programmers will not actually generate or verify the proofs themselves. Instead, the ecosystem includes these three pillars:

The fact registry is a database that stores program facts, or values computed from hashes of programs and of their outputs; creating a program fact is a way to bind a program to its output.

This is the basic workflow in Cairo:

  • A user writes a program and submits its trace to the SHARP (via the Cairo playground or the command cairo-sharp).
  • The SHARP creates a STARK proof for the program trace and submits it to the proof verifier contract.
  • The proof verifier contract validates the proof, and, if valid, writes the program fact to the fact registry.
  • Any other user can now query the fact registry contract to check whether that program fact is valid.

There are two other things to keep in mind:

  • Memory in Cairo is write-once: after a value is written to memory, it cannot be changed.
  • The assert statement assert a = b will behave differently depending on whether a is initialized: if a is uninitialized, the assert statement assigns b to a;  if a is initialized, the assert statement asserts that a and b are equal.

Although the details of Cairo’s syntax and keywords are interesting, we will not cover these topics in this post. The official Cairo documentation and Perama’s notes on Cairo are a good starting point for more information.

Setting up and running Cairo code

Now that we’ve briefly outlined the Cairo language in general, let’s discuss how to set up and run Cairo code. Consider the following simple Cairo program. This function computes the Pedersen hash function of a pair of numbers, (input, 1), and outputs the result in the console:

# validate_hash.cairo
%builtins output pedersen

from starkware.cairo.common.cairo_builtins import HashBuiltin
from starkware.cairo.common.hash import hash2
from starkware.cairo.common.serialize import serialize_word

func main{output_ptr:felt*, pedersen_ptr : HashBuiltin*}():
    alloc_locals
    local input
    %{ ids.input = 4242 %}

    # computes the Pedersen hash of the tuple (input, 1)
    let (hash) = hash2{hash_ptr=pedersen_ptr}(input, 1)

    # prints the computed hash
    serialize_word(hash)

    return ()
end

To set up the Cairo tools, we use a Python virtual environment:

$ mkvirtualenv cairo-venv
(cairo-venv)$ pip3 install cairo-lang

Then, we compile the program:

# compile the validate_hash.cairo file,
# writing the output to compiled.json
$ cairo-compile validate_hash.cairo --output compiled.json

Finally, we run the program, which will output the following value:

# run the program
$ cairo-run --program=compiled.json --print_output --layout small
Program output:
  1524309693207128500197192682807522353121026753660881687114217699526941127707

This value is the field element corresponding to the Pedersen hash of (4242, 1).

Now, suppose that we change the input from 4242 to some hidden value and instead provide the verifier with the following output:

$ cairo-run --program=compiled.json --print_output --layout small
Program output:
  1134422549749907873058035660235532262290291351787221961833544516346461369884

Why would the verifier believe us? Well, we can prove that we know the hidden value that will make the program return that output!

To generate the proof, we need to compute the hash of the program to generate the program fact. This hash does not depend on the input value, as the assignment is inside a hint (a quirk of Cairo that we’ll discuss later in this post):

# compute the program's hash
$ cairo-hash-program --program compiled.json                                                
0x3c034247e8bf20ce12c878793cd47c5faa6f5470114a33ac62a90b43cfbb494
 
# compute program fact
from web3 import Web3

def compute_fact(program_hash, program_output):
    fact = Web3.solidityKeccak(['uint256', 'bytes32'],
                               [program_hash, Web3.solidityKeccak(['uint256[]'], [program_output])])

    h = hex(int.from_bytes(fact, 'big'))
    return h

# hash and output computed above
program_hash = 0x3c034247e8bf20ce12c878793cd47c5faa6f5470114a33ac62a90b43cfbb494
program_output = [1134422549749907873058035660235532262290291351787221961833544516346461369884]

print(compute_fact(program_hash, program_output))
# 0xe7551a607a2f15b078c9ae76d2641e60ed12f2943e917e0b1d2e84dc320897f3

Then, we can check the validity of the program fact by using the fact registry contract and calling the isValid function with the program fact as input:

The result of calling the isValid function to check the validity of the program fact                                                                         .

To recap, we ran the program, and the SHARP created a proof that can be queried in the fact registry to check its validity, proving that we actually know the input that would cause the program to output this value.

Now, I can actually tell you that the input I used was 71938042130017, and you can go ahead and check that the result matches.

You can read more about the details of this process in Cairo’s documentation for blockchain developers and more about the fact registry in this article by StarkWare.

Cairo features and footguns

Cairo has several quirks and footguns that can trip up new Cairo programmers. We will describe three Cairo features that are easily misused, leading to security issues: Cairo hints, the interplay between recursion and underconstrained structures, and non-deterministic jumps.

Hints

Hints are special Cairo statements that basically enable the prover to write arbitrary Python code. Yes, the Python code written in a Cairo hint is literally exec’d!

Hints are written inside %{ %}. We already used them in the first example to assign a value to the input variable:

%builtins output

from starkware.cairo.common.serialize import serialize_word

func main{output_ptr:felt*}():

    # arbitrary python code
    %{
       import os
       os.system('whoami')
    %}

    # prints 1
    serialize_word(1)

    return ()
end

 

 
$ cairo-compile hints.cairo --output compiled.json               
$ cairo-run --program=compiled.json --print_output --layout small
fcasal
Program output:
  1

Because Cairo can execute arbitrary Python code in hints, you should not run arbitrary Cairo code on your own machine—doing so can grant full control of your machine to the person who wrote the code.

Hints are commonly used to write code that is only executed by the prover. The proof verifier does not even know that a hint exists because hints do not change the program hash. The following function from the Cairo playground computes the square root of a positive integer, n:

func sqrt(n) -> (res):
    alloc_locals
    local res

    # Set the value of res using a python hint.
    %{
        import math

        # Use the ids variable to access the value
        # of a Cairo variable.
        ids.res = int(math.sqrt(ids.n))
    %}

    # The following line guarantees that 
    # `res` is a square root of `n`
    assert n = res * res
    return (res)
end

The program computes the square root of n by using the Python math library inside the hint. But at verification time, this code does not run, and the verifier needs to check that the result is actually a square root. So, the function includes a check to verify that n equals res * res before the function returns the result.

Underconstrained structures

Cairo lacks support for while and for loops, leaving programmers to use good old recursion for iteration. Let’s consider the “Dynamic allocation” challenge from the Cairo playground. The challenge asks us to write a function that, given a list of elements, will square those elements and return a new list containing those squared elements:

%builtins output

from starkware.cairo.common.alloc import alloc
from starkware.cairo.common.serialize import serialize_word

# Fills `new_array` with the squares of the
# first `length` elements in `array`.
func _inner_sqr_array(array : felt*, new_array : felt*,
                                        length : felt):
    # recursion base case
    if length == 0:
        return ()
    end

    # recursive case: the first element of the new_array will
    # be the first element of the array squared
    # recall that the assert will assign to the
    # `new_array` array at position 0
    # since it has not been initialized
    assert [new_array] = [array] * [array]

    # recursively call, advancing the arrays
    # and subtracting 1 to the array length
    _inner_sqr_array(array=array + 1, 
                     new_array=new_array + 1,
                     length=length - 1)
    return ()
end

func sqr_array(array : felt*, length : felt) ->
                                          (new_array : felt*):
    alloc_locals
    # allocates an arbitrary length array
    let (local res_array) = alloc()

    # fills the newly allocated array with the squares
    # of the elements of array
    _inner_sqr_array(array, res_array, length)
    return (res_array)
end

func main{output_ptr : felt*}():
    alloc_locals

    # Allocate a new array.
    let (local array) = alloc()

    # Fill the new array with field elements.
    assert [array] = 1
    assert [array + 1] = 2
    assert [array + 2] = 3
    assert [array + 3] = 4

    let (new_array) = sqr_array(array=array, length=4)

    # prints the array elements
    serialize_word([new_array])
    serialize_word([new_array + 1])
    serialize_word([new_array + 2])
    serialize_word([new_array + 3])

    return ()
end

Running this code will output the numbers 1, 4, 9, and 16 as expected.

But what happens if an error (or an off-by-one bug) occurs and causes the sqr_array function to be called with a zero length?

func main{output_ptr : felt*}():
    alloc_locals
    # Allocate a new array.
    let (local array) = alloc()
    # Fill the new array with field elements.
    assert [array] = 1
    assert [array + 1] = 2
    assert [array + 2] = 3
    assert [array + 3] = 4

    let (new_array) = sqr_array(array=array, length=0)
    serialize_word([new_array])
    serialize_word([new_array + 1])
    serialize_word([new_array + 2])
    serialize_word([new_array + 3])

    return ()
end

Basically, the following happens:

  1. The sqr_array function will allocate res_array and call _inner_sqr_array(array, res_array, 0).
  2. _inner_sqr_array will compare the length with 0 and return immediately.
  3. sqr_array will return the allocated (but never written to) res_array.

So what happens when you call serialize_word on the first element of new_array?
Well, it depends… Running the code as-is will result in an error because the value of new_array is unknown:

The error that occurs after running the above code as-is                                                                                     .

However, remember that usually you won’t be running code; you’ll be verifying proofs that a program outputs some value. And I can actually provide you proof that this program can output any four values that you would like! You can compute all of this yourself to confirm that I’m not cheating:

$ cairo-compile recursion.cairo --output compiled.json    
$ cairo-hash-program --program compiled.json             
0x1eb05e1deb7ea9dd7bd266abf8aa8a07bf9a62146b11c0bd1da8bb844ff2479

The following fact binds this program with the output [1, 3, 3, 7]:

# hash and output computed above
program_hash = 0x01eb05e1deb7ea9dd7bd266abf8aa8a07bf9a62146b11c0bd1da8bb844ff2479
program_output = [1, 3, 3, 7]

print(compute_fact(program_hash, program_output))
# 0x4703704b8f7411d5195e907c2eba54af809cb05eebc65eb9a9423964409a8a4d

This fact is valid according to the fact registry contract:

The fact registry’s verification of the program fact                                                           .

So what is happening here?

Well, since the returned array is only allocated and never written to (because its length is 0, the recursion stops as soon as it starts), the prover can write to the array in a hint, and the hint code won’t affect the program’s hash!

The “evil” sqr_array function is actually the following:

func sqr_array(array : felt*, length : felt) -> 
                                           (new_array : felt*):
    alloc_locals
    let (local res_array) = alloc()

    %{  # write on the result array if the length is 0
        if ids.length == 0:
            data = [1, 3, 3, 7]
            for idx, d in enumerate(data):
                memory[ids.res_array + idx] = d
    %}

    _inner_sqr_array(array, res_array, length)
    return (res_array)
end

In a nutshell, if there is some bug that makes the length of the array 0, a malicious prover could create any arbitrary result he wants.

You might also ask yourself why, in general, a malicious prover can’t simply add a hint at the end of the program to change the output in any way he wishes. Well, he can, as long as that memory hasn’t been written to before; this is because memory in Cairo is write-once, so you can only write one value to each memory cell.

This pattern of creating the final result array is necessary due to the way memory works in Cairo, but it also carries the risk of a security issue: a simple off-by-one mistake in tracking the length of this array can allow a malicious prover to arbitrarily control the array memory.

Nondeterministic jumps

Nondeterministic jumps are another code pattern that can seem unnatural to a programmer reading Cairo for the first time. They combine hints and conditional jumps to redirect the program’s control flow with some value. This value can be unknown to the verifier, as the prover can set it in a hint.

For example, we can write a program that checks whether two elements, x and y, are equal in the following contrived manner:

func are_equal(x, y) -> (eq):
    # sets the ap register to True or False depending on 
    # the equality of x and y
    %{ memory[ap] = ids.x == ids.y %}
    
    # jump to the label equal if the elements were equal
    jmp equal if [ap] != 0; ap++

    # case x != y
    not_equal:    
    return (0)
    
    # case x == y
    equal:    
    return (1)
end

Running this program will return the expected result (0 for different values and 1 for equal values):

func main{output_ptr : felt*}():

    let (res) = are_equal(1, 2)
    serialize_word(res) # -> 0

    let (res) = are_equal(42, 42)
    serialize_word(res) # -> 1

    return()
end

However, this function is actually vulnerable to a malicious prover. Notice how the jump instruction depends only on the value written in the hint:

    %{ memory[ap] = ids.x == ids.y %}
    jmp equal if [ap] != 0; ap++

And we know that hints are fully controllable by the prover! This means that the prover can write any other code in that hint. In fact, there are no guarantees that the prover actually checked whether x and y are equal, or even that x and y were used in any way. Since there are no other checks in place, the function could return whatever the prover wants it to.

As we saw previously, the program hash does not consider code in hints; therefore, a verifier can’t know whether the correct hint was executed. The malicious prover can provide proofs for any possible output values of the program ((0, 0), (1, 1), (0, 1), or (1, 0)) by changing the hint code and submitting each proof to the SHARP.

So how do we fix it?

Whenever we see nondeterministic jumps, we need to make sure that the jumps are valid, and the verifier needs to validate the jumps in each label:

func are_equal(x, y) -> (eq):
    %{ memory[ap] = ids.x == ids.y %}
    jmp equal if [ap] != 0; ap++

    # case x != y
    not_equal:
    # we are in the not_equal case
    # so we can't have equal x and y
    if x == y:
	# add unsatisfiable assert
        assert x = x + 1
    end
    return (0)
    
    # case x == y
    equal:
    # we are in the equal case
    # so x and y must equal
    assert x = y
    return (1)
end

In this case, the function is simple enough that the code needs only an if statement:

func are_equal(x, y) -> (eq):
    if x == y:
        return (1)
    else:
        return (0)
    end
end

Amarna, our Cairo static analyzer

While auditing Cairo code, we noticed there was essentially no language support of any form, except for syntax highlighting in VScode. Then, as we found issues in the code, we wanted to make sure that similar patterns were not present elsewhere in the codebase.

We decided to build Amarna, a static analyzer for Cairo, to give us the ability to create our own rules and search code patterns of interest to us—not necessarily security vulnerabilities, but any security-sensitive operations that need to be analyzed or need greater attention when reviewing code.

Amarna exports its static analysis results to the SARIF format, allowing us to easily integrate them into VSCode with VSCode’s SARIF Viewer extension and to view warnings underlined in the code:

Cairo code with an underlined dead store (left) and the SARIF Viewer extension showing the results from Amarna (right)                                                                   .

How does Amarna work?

The Cairo compiler is written in Python using lark, a parsing toolkit, to define a grammar and to construct its syntax tree. Using the lark library, it is straightforward to build visitors to a program’s abstract syntax tree. From here, writing a rule is a matter of encoding what you want to find in the tree.

The first rule we wrote was to highlight all uses of arithmetic operations +, -, *, and /. Of course, not all uses of division are insecure, but with these operations underlined, the developer is reminded that Cairo arithmetic works over a finite field and that division is not integer division, as it is in other programming languages. Field arithmetic underflows and overflows are other issues that developers need to be aware of. By highlighting all arithmetic expressions, Amarna helps developers and reviewers to quickly zoom in on locations in the codebase that could be problematic in this regard.

The rule to detect all divisions is very simple: it basically just creates the result object with the file position and adds it to the analysis results:

class ArithmeticOperationsRule(GenericRule):
    """
    Check arithmetic operations:
        - reports ALL multiplications and divisions
        - reports ONLY addition and subtraction that
          do not involve a register like [ap - 1]
    """

    RULE_TEXT = "Cairo arithmetic is defined over a finite field and has potential for overflows."
    RULE_PREFIX = "arithmetic-"

    def expr_div(self, tree: Tree) -> None:
        result = create_result(
            self.fname,
            self.RULE_PREFIX + tree.data,
            self.RULE_TEXT,
            getPosition(tree)
        )
        self.results.append(result)

As we looked for more complex code patterns, we developed three classes of rules:

  • Local rules analyze each file independently. The rule described above, to find all arithmetic operations in a file, is an example of a local rule.
  • Gatherer rules analyze each file independently and gather data to be used by post-processing rules. For example, we have rules to gather all declared functions and all called functions.
  • Post-processing rules run after all files are analyzed and use the data gathered by the gatherer rules. For example, after a gatherer rule finds all declared functions and all called functions in a file, a post-processing rule can find all unused functions by identifying functions that are declared but never called.

So what does Amarna find?

So far, we have implemented 10 rules, whose impacts range from informational rules that help us audit code (marked as Info) to potentially security-sensitive code patterns (marked as Warning):

 

# Rule What it finds Impact Precision
1 Arithmetic operations All uses of arithmetic operations +, -, *, and / Info High
2 Unused arguments Function arguments that are not used in the function they appear in Warning High
3 Unused imports Unused imports Info High
4 Mistyped decorators Mistyped code decorators Info High
5 Unused functions Functions that are never called Info Medium
6 Error codes Function calls that have a return value that must be checked Info High
7 Inconsistent assert usage Asserts that use the same constant in different ways (e.g., assert_le(amount, BOUND) and assert_le(amount, BOUND - 1)) Warning High
8 Dead stores Variables that are assigned values but are not used before a return statement Info Medium
9 Potential unchecked overflows Function calls that ignore the returned overflow flags (e.g., uint256_add) Warning High
10 Caller address return value Function calls to the get_caller_address function Info High

While most of these rules fall into the informational category, they can definitely have security implications: for example, failing to check the return code of a function can be quite serious (imagine if the function is a signature verification); the Error codes rule will find some of these instances.

The Unused arguments rule will find function arguments that are not used in the function they appear in, a common pattern in general purpose programming language linters; this generally indicates that there was some intention of using the argument, but it was never actually used, which might also have security implications. The rule would have found this bug in an OpenZeppelin contract a few months ago which was due to an unchecked nonce, passed as an argument to the execute function.

Going forward

As Cairo is still a developing ecosystem, enumerating all vulnerable patterns can be difficult. We plan to add more rules moving forward, and in the medium/long term, we plan to add more complex analysis features, such as data-flow analysis.

In the meantime, if you have any ideas for vulnerable code patterns, we are more than happy to review feature requests, new rules, bug fixes, issues, and other contributions from the community.

*** This is a Security Bloggers Network syndicated blog from Trail of Bits Blog authored by fcasal. Read the original post at: https://blog.trailofbits.com/2022/04/20/amarna-static-analysis-for-cairo-programs/