This is a study note of How Cairo Works. It covers both low-level and high-level language features of Cairo.
1 Introduction to Cairo
In Cairo, the basic data type
felt (field element) is an integer in the range
0 <= x <P, where
P is a 256-bit prime number. All the computations are done modulo
P. In Cairo, the divisioin
x / y satisfies
(x / y) * y == x, therefore
2 * (P + 1) / 2 = P + 1 which is
1 (mod P),
1 / 2 = (P + 1 ) / 2. Additionally, multiply a number by
3 doesn’t always result in a number divisible by
The goal of a Cairo program is to prove that some computation is correct. A nondeterministic computation method is often used to first guess a value (the nondeterministic part), then verify other properites.
Cairo uses a read-only nondeterministic memory model. The value for each memory cell is chosen by the prover, but is immutable during the program execution. We use the syntax
[x] to represent the value of the memory at address
x. The memory can be thought as a
write-once memory. The
assert [x] == y can be interpreted either as
write value y to address x or
verify that the value of address x is y. They mean the same thing in read-only nondeterministic memory model. The memory addresses accessed by a program must be continuous. The memory offsets must be in the range of
The only values that may change over time are three register values:
ap: the allocation pointer points to a yet-unused memory cell.
fp: the frame pointer points to the frame of the current function. The addresses of all the function’s arguments and local variables are relative to the value of
fp. When a function starts,
fpis equal to
ap. But unlike
ap, the value of
fpremains the same throughout the scope of a function.
pc: the program counter points to the current instruction.
A simple Cairo instruction takes the form of an assertion for equality, so named assert-equal instruction. For example:
[ap] = [ap - 1] * [fp]; ap++ states that the production of two cells is written to the value of the next unused cell.
; ap++ is not a separate instruction, it is part of the assert-euqal instruction.
pc are pointers to memory cells. The values of memory cells are
2 The Program Counter and Reference
2.1 The Program Counter
Each instruction takes 1 or 2 field elements. When an immediate value is used in the instruction, it takes 2 field elements. The program counter (pc) keeps the address of the current insturciton. Usually it advances by 1 or 2 – the size of the instruciton.
jmp instruction may be used to jump to a different instruction in three flavors:
jmp abs 17:
pc = 17.
jmp rel 17:
pc = pc + 17
jmp label: a relative jump.
jmp rel and
jmp lable can be used with an
if condition for conditonal jump.
let x = <expr> syntax defines a reference to the
expr’s value and substitue
x accordingly when it is used later. A reference can hold any expression and may depende on
fp. For example:
let x =[[fp + 3] + 1].
assert <compound-expr> = <compound-expr> asserts the equality between the values of two compound expressions. The Cairo compiles it into multiple instructions and
ap may advance an unknown number of steps. Hence, you should avoid using
fp in compound expression and use the referece, temporary, or local variable. For example,
let x = [ap - 1] and
let y = [ap - 2], then
assert x * x = x + 5 * y. Here the
ap chanes dynamically during the execution of the compound expression.
References are not variables: the scope of each definition is defined according to a static analysis of the order in which the instructions will be executed. It will follow a basic flow from jumps and conditional jumps, but if there are colliding definitions for the same reference, the reference will be revoked.
If there is a label or a call instruction between the definition of a reference that depends on
ap and its usage, the reference may be revoked, since the compiler may not be able to compute the change of
ap for jump and call. Sometime the compiler will not automatically detect a jump and the reference will not be revoked. Using the reference in such cases may result in an undefined behavior. A reference depends on
fp is never revoked.
ref_name is a typed reference (
T*) with value
val and type
T.member_name is a member definition, compiles to
[ref_name + T.member_name]. The
member_name is the memory offset. The result is
[val + offset]. You may omit the type when define a typed reference like:
let ptr = cast([fp], MyStruct*) – assume
[fp] contains a point to a struct.
A temporary variable is a syntactic sugar for a reference based on the
ap register. For a simple expression,
tempvar var_name = <expr> is equivalent to
[ap] = <expr>; ap++ and
let var_name = [ap - 1]. A temporary variable is a reference that may be revoked. In practice,
let is used in pointer operations and function call;
tempvar is used for value expression.
Local variables are based on the
fp register. The first local variable will be a reference to
[fp + 0], the second one to
[fp + 1] and so on. The Cairo compiler auto-generates a constant
SIZEOF_LOCALS with is equal to the accumulated size of locals within the same scope. The instruction
alloc_locals is transformed to
ap += SIZEOF_LOCALS.
Note that unless the local variable is defined and initialized in the same line, the
local directive itself does not translate to a Cairo instruction (this is another difference from
tempvar) – it simply translates to a reference definition. This is one of the reasons you must increase the value of
ap manually. The type of a local variable must be explicitly stated (otherwise,
felt is used), and it is not deduced from the type of the initialization value.
You can specify a typed local variable in two different ways:
local x : T* = <expr> and
local y : T = <expr>. The first one allocates one cell whose value is a pointer to a struct of type
[[fp + 0] + T.a]. The second one allocates
y refers to the address of the structure
fp + 1 rather than
[fp + 1].
[fp + 1 + T.a]. Use
y is an error. For example,
tempvar z = y will faile, since it compiles to
[ap] = fp, which is not a valid instruction in Cairo. Nevertheless, defining a variable called
__fp__ will allow the code to compile.
Tuples are represented as a comma-separated list of elements enclosed in parentheses. For example:
(3, x). Cairo requires a trailing comma for single-element tuples, to distinguish them from regular parentheses. For example
(5,) is a single-element tuple. Tuple elements are accessed with the tuple expression followed by brackets containing a zero-based index to the element. The index must be known at compile time.
In order to represent an array (an ordered collection of homogeneous elements) in Cairo, one may use a pointer to the beginning of the array. The standard library function
alloc() may be used to “dynamically” allocate a new array. For example,
let (local struct_array : MyStruct*) = alloc() allocates a new memory segment and treats it as a pointer to
MyStruct. The expression
struct_array[n] is used to access the
n-th element of the array, where
n=0 is the first element.
struct_array[index] is compiled to
[struct_array + index * MyStruct.SIZE], and is of type
3.1 Function Execution
A function is a reusable unit of code that receives arguments and returns a value. Cairo provides two low-level instructions
call addr and
ret to facilitate resuable code. The high-level syntax are
return (). The
call is similar to
jmp label where a function is considered a label.
When a function starts, the
fp is initialized to the current value of
ap. When a function calls another function, the
fp changes for the inner call and restores when the inner call returns. Under the hood,
call addr works as the following:
3.2 Access Register Values
Cairo provides two functions to retrieve the values of the three registers.
When Cairo needs to use the address
fp in a compund expression, it will try to replace
fp with a variable named
__fp__ that contains the value of
fp. For example, in the following code, line B requires line A while line C doesn’t.
3.3 Function Arguments and Return Values
Function arguments are written to the stack before the
call instruction. Because a
call pushes the next
pc and current
fp into the stack, the function arguments are available at
[fp - 3],
[fp - 4], and so on in reverse order. For each argument, the Cairo compiler creates a reference
argname to its value and a constant
Args.argname with its offset
(0, 1, 2, ...). The use of
argname is replaced by
[fp - (2 + n_args) + Args.argname]. Cairo supports the named argment call (a syntactic sugar) that can use compound expression:
The function writes to the stack its return values just before the
ret instruction. The retun values will be available to caller at
[ap - 1],
[ap - 2] and so on. The Cairo compiler creates constants named
foo.Return.w for the return values.
A hint is a block of Python code that will be executed by the prover right before the next instruction.
To access Cairo data, use
memory[ap] to access
ids.x for constant, reference and function arguments. A hint is attached to the next instruction and executed before the next instruction.
5 Program Input & Output
A Cairo program may have a secret program input and a public program output.
You can specify a program input file using
--program_input=input_filename.json flag. Then you can use hints to access the input file content using the variable
To create a program output, you first need to add
%builtins output directive. The directive makes the funciton
main() get one argument and return one value. The argument, conventionally called
output_ptr, is a pointer to a block of memory to which it may write its outputs.
main() should return the value of the pointer after writing, signifying where the chunk of output memory ends.
The program uses a different layout such as
--layout=small that requires the number of steps to be divisible by
512. You can set it via
Cairo programs are themselves kept in memory, in what is called the program segment. This segment is of fixed length and contains the numeric representation of the Cairo program. The program counter
pc starts at the beginning of the program segment.
A Cairo program also requires an execution segment. This is where the register
fp start, and where data generated during the run of a Cairo program (variables, return addresses for function calls, etc.) is stored. The length of the execution segment is variable as it may depend on the program input.
Every builtin has its own variable-length continuous area in memory that is located in its own segment.
7 Builtins and Implicit Arguments
Builtins are predefined optimized low-level execution units which are added to the Cairo CPU board to perform predefined computations which are expensive to perform in vanilla Cairo (e.g., range-checks, Pedersen hash, ECDSA, …).
The CPU communicates with builtins thourgh memeory: each builtin has a segment and applies constraints on the memory cells in the segment.
Implicit argument is a syntactic sugar that adds arguments and return values to the funciton When you use the high-level
return statement, you don’t have to explicitly return the implicit argument. The Cairo compiler returns the current binding of the implicit argument.
There are three ways to call a function with implicit argments:
yis bind to
x. After the call,
yhas the return value of
- Implicitly using an implicit argument with the same name in caller.
- Implicitly using a
withstatement in caller.
Because an implicit argument is implemented as a reference, it might be revoked. The solution is to use a local variable to keep the implicit argument. When there is no branches,
alloc_locals can add the local variables automatically. Adding
tempvar at the end of every branch also works.
Cairo supports a few possible layouts. Each layout specifies which of the different builtins exist and how many instances of that builtin can be used. This is measured as the ratio between the number of instructions (
n_steps) and the number of available builtin instances. The default
plain layout has no builtins.
%builtins directive specifies which builtins are used by the program. Each builtin adds an argument to main() and requires a return value.