This is a study note on the Solidity language documentation. It covers contracts and the details of the Solidity language.

1 Layout of a Solidity Source File

A Solidity source file has the following parts:

  • License Identifier. Every source file should start with a comment indicating its license. The license comment is included in the bytecode metadata. For example: // SPDX-License-Identifier: MIT. The Software Package Data Exchange (SPDX) is an open standard for software bill of materials information, including components, licenses, copyrights and references. SPDX maintains a list of software licenses. If your source code is not open-source, use the special value UNLICENSED.
  • Pragma. A pragma is a directive local to a source file that enables certain compiler features or checks. A source file should carry a version pragma such as pragma solidity ^0.8.2;. The version expression follows the same syntax as npm.
  • Importing other files. import * as symbolName from "filename" or import "filename" as symbolName imports all global symbols from the file into the current global scope under the prefix symbolName. To import specific symbols, use import {symbol1 as alias, symbol2} from "filename". The filename is an import path. The Solidity compiler maintains an internal database, a virtual filesystem (VFS), where each source unit is assigned a unique source unit name used for path resolution.
  • Comments. Comments can be single-line // or multi-line /*...*/. Additionally, Solidity uses NatSpec comments that include /// and /**...*/. It is recommended that Solidity contracts are fully annotated using NatSpec for all public interfaces.
  • Contract. It is similar to a class in other OO languages. A contract can contain state variables, functions, function modifiers, events, errors, struct types and enum types. Contracts can inherit from other contracts.

Solidity has a Style Guide for writing Solidity code.
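
The parts listed above can be seen together in one small file. This is only a sketch: the contract name Counter, its members, and the commented-out import path are made up for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

// Hypothetical import, shown only for its syntax:
// import {Helper as H} from "./Helper.sol";

/// @title A minimal counter (illustrative example)
/// @notice Stores a number that anyone can increment
contract Counter {
    /// @notice Current value of the counter
    uint256 public count;

    // Emitted after each increment
    event Incremented(uint256 newCount);

    /// @notice Increase the counter by one
    function increment() external {
        count += 1;
        emit Incremented(count);
    }
}
```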

2 Types

There are no undefined or null values in Solidity. A newly declared variable has a default value that depends on its type. To handle unexpected values, either revert the transaction or return a tuple whose second element is a bool denoting success.

Solidity has two kinds of types: value types and reference types. The mapping type is a special reference type.

2.1 Value Types

You always get an independent copy whenever a value type is used.

  • bool: two values of true and false. Operators are !, &&, ||, == and !=. && and || support short-circuiting evaluation.
  • int/uint:
    • signed and unsigned integers of various sizes in steps of 8. From int8, uint8 to int256 (int), uint256 (uint).
    • Operators include
      • comparisons <=, <, ==, !=, >=, >;
      • bit operators &, |, ^ (bitwise exclusive or), ~ (bitwise negation);
      • shift operators << and >>;
      • arithmetic operators +, -, *, /, %, **.
    • You can use type(X).min and type(X).max to access the minimum and maximum value representable by the type X.
    • Operations can be checked (the default) or unchecked {...}.
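  • fixed/ufixed: signed and unsigned fixed point numbers of various sizes. They support comparison and arithmetic operators. Fixed point numbers are not fully supported yet: they can be declared, but cannot be assigned to or from.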
  • address/address payable
    • holds a 20 byte value and has a balance property.
    • address payable has two additional members: transfer and send.
    • There are rules for converting between address and uint160, bytes20, and integer literals. Addresses support comparisons.
  • Contract types: every contract defines its own type. Contracts can be explicitly converted to and from the address type.
    • The data representation of a contract is identical to that of the address type and this type is also used in the ABI.
    • Contracts do not support any operators.
    • The members of contract types are the external functions of the contract including any state variables marked as public.
  • Fixed-size byte arrays: bytes1, bytes2, …, bytes32.
    • Operators are comparison, bit operators, shift operators, and index access.
    • .length returns the fixed length of the byte array (read-only).
  • Literals
    • Address literals: Hexadecimal literals that pass the address checksum test.
    • Rational and Integer Literals
    • String literals: marked with either double or single-quotes and can also be split into multiple consecutive parts. String literals can only contain printable ASCII characters.
    • Unicode literals: unicode'Hello'.
    • Hexadecimal literals: hex'12FF'.
  • Enums: one way to create a user-defined type in Solidity. They are explicitly convertible to and from all integer types but implicit conversion is not allowed.
  • User-defined value types. type C is V defines a new type C for the built-in value type V. C.wrap and C.unwrap convert between the two.
  • Function types are the types of functions.
    • It is defined as function (<parameter types>) {internal|external} [pure|view|payable] [returns (<return types>)]. The return types cannot be empty - if the function type should not return anything, the whole returns (<return types>) part has to be omitted.
    • External functions have two members: .address for the contract they belong to, and .selector for the ABI function selector.
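
A few of the value types above in use. This is a minimal sketch: the names Price, ValueTypesDemo, Status and all function names are invented here, and the pragma assumes a compiler of at least 0.8.8 because of the user-defined value type.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.8;

// User-defined value type wrapping uint256 (illustrative name).
type Price is uint256;

contract ValueTypesDemo {
    enum Status { Pending, Active, Closed }

    // type(X).max gives the largest value of an integer type.
    function maxUint8() external pure returns (uint8) {
        return type(uint8).max; // 255
    }

    // Checked arithmetic (the default) reverts on overflow.
    function checkedAdd(uint8 a, uint8 b) external pure returns (uint8) {
        return a + b;
    }

    // Inside unchecked { ... } the result wraps around instead.
    function wrappingAdd(uint8 a, uint8 b) external pure returns (uint8) {
        unchecked {
            return a + b;
        }
    }

    // Enums need an explicit conversion to and from integers.
    function statusToUint(Status s) external pure returns (uint8) {
        return uint8(s);
    }

    // wrap/unwrap convert between the user-defined type and its underlying type.
    function doubled(Price p) external pure returns (Price) {
        return Price.wrap(Price.unwrap(p) * 2);
    }

    // An internal function type used as a parameter.
    function applyFn(
        function (uint256) internal pure returns (uint256) f,
        uint256 x
    ) internal pure returns (uint256) {
        return f(x);
    }

    function square(uint256 x) internal pure returns (uint256) {
        return x * x;
    }

    function squareOf(uint256 x) external pure returns (uint256) {
        return applyFn(square, x);
    }
}
```

With these definitions, checkedAdd(200, 100) reverts, while wrappingAdd(200, 100) returns 44 (300 modulo 256).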

2.2 Reference Types

Values of reference type can be modified through multiple different names. Reference types comprise structs, arrays and mappings. You have to explicitly provide the data area where the reference type value is stored:

  • memory: whose lifetime is limited to an external function call.
  • storage: the location where the state variables are stored, where the lifetime is limited to the lifetime of a contract.
  • calldata: special data location that contains the function arguments. Calldata is non-modifiable, non-persistent.

Data locations are not only relevant for persistency of data, but also for the semantics of assignments:

  • Assignments between storage and memory (or from calldata) always create an independent copy.
  • Assignments from memory to memory only create references. This means that changes to one memory variable are also visible in all other memory variables that refer to the same data.
  • Assignments from storage to a local storage variable also only assign a reference.
  • All other assignments to storage always copy. Examples for this case are assignments to state variables or to members of local variables of storage struct type, even if the local variable itself is just a reference.
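
The copy-versus-reference rules above can be sketched as follows. The contract and function names are hypothetical; each function isolates one of the four cases.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract DataLocationDemo {
    uint256[] public numbers; // state variable, lives in storage

    constructor() {
        numbers.push(1);
        numbers.push(2);
    }

    function memoryCopy() external view returns (uint256 stored, uint256 copied) {
        // storage -> memory creates an independent copy
        uint256[] memory copy = numbers;
        copy[0] = 99;
        return (numbers[0], copy[0]); // (1, 99) right after deployment
    }

    function memoryAlias() external pure returns (uint256) {
        uint256[] memory a = new uint256[](2);
        uint256[] memory b = a; // memory -> memory only copies the reference
        b[0] = 7;
        return a[0]; // 7
    }

    function storageAlias() external {
        uint256[] storage ref = numbers; // a local storage variable is a reference
        ref[0] = 42; // writes through to the state variable
    }

    function storageCopy(uint256[] calldata input) external {
        numbers = input; // assignment to a state variable copies from calldata
    }
}
```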

Solidity has the following reference types:

  • Arrays
    • can have a fixed-size or a dynamic size
    • bytes and string are special arrays.
    • Memory arrays with dynamic length can be created using the new operator. Unlike storage arrays, you cannot resize memory arrays.
    • Array members: length; push and pop (dynamic storage arrays and bytes only); slicing with x[start:end] (calldata arrays only).
  • Struct
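
A short sketch of arrays and structs, assuming an invented Item struct and contract name. It shows push and pop on a storage array, a memory array created with new, and a calldata slice.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract ArrayStructDemo {
    struct Item {
        string name;
        uint256 quantity;
    }

    Item[] public items;          // dynamic storage array of structs
    uint256[3] public fixedSlots; // fixed-size array

    function addItem(string calldata name, uint256 quantity) external {
        items.push(Item(name, quantity)); // push appends to a storage array
    }

    function removeLast() external {
        items.pop(); // removes the last element; reverts if the array is empty
    }

    function squares(uint256 n) external pure returns (uint256[] memory out) {
        out = new uint256[](n); // memory array, its size is fixed once created
        for (uint256 i = 0; i < n; i++) {
            out[i] = i * i;
        }
    }

    // Slices work on calldata arrays; this reverts if data has fewer than 2 elements.
    function firstTwo(uint256[] calldata data) external pure returns (uint256[] calldata) {
        return data[0:2];
    }
}
```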

2.3 Mapping Types

Mapping types use the syntax mapping(KeyType => ValueType) and variables of mapping type are declared using the syntax mapping(KeyType => ValueType) VariableName. The KeyType can be any built-in value type, bytes, string, or any contract or enum type. Mappings can only have a data location of storage. They cannot be used as parameters or return parameters of contract functions that are publicly visible. You can mark state variables of mapping type as public and Solidity creates a getter for you. You cannot iterate over mappings.
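
Because mappings cannot be iterated, a common workaround is to keep the keys in a separate array. The sketch below assumes that pattern; BalanceBook and its members are illustrative names, not part of any standard.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract BalanceBook {
    // public mapping: the compiler generates the getter balances(address)
    mapping(address => uint256) public balances;

    // keys are tracked separately so they can be enumerated
    address[] private holders;
    mapping(address => bool) private known;

    function deposit() external payable {
        if (!known[msg.sender]) {
            known[msg.sender] = true;
            holders.push(msg.sender);
        }
        balances[msg.sender] += msg.value;
    }

    function holderCount() external view returns (uint256) {
        return holders.length;
    }

    // Iterates over the key array, not over the mapping itself.
    function total() external view returns (uint256 sum) {
        for (uint256 i = 0; i < holders.length; i++) {
            sum += balances[holders[i]];
        }
    }
}
```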

2.4 Special Operators

The ternary operator is used in expressions of the form <expression> ? <trueExpression> : <falseExpression>.

There are compound operators such as a--, a += 7.

delete a assigns the initial value for the type to a. delete a[x] resets the item at index x of the array to its initial value and leaves all other elements and the length of the array untouched.
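
A minimal sketch of the ternary operator and of delete on an array element versus a whole dynamic array. The contract and function names are made up for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract OperatorDemo {
    uint256[] public data;

    constructor() {
        data.push(1);
        data.push(2);
        data.push(3);
    }

    function larger(uint256 a, uint256 b) external pure returns (uint256) {
        return a > b ? a : b; // ternary operator
    }

    function clearFirst() external {
        delete data[0]; // element becomes 0, the length stays 3
    }

    function clearAll() external {
        delete data; // for a dynamic array this sets the length to 0
    }

    function length() external view returns (uint256) {
        return data.length;
    }
}
```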

3 Units and Globally Available Variables

The Ether units are wei, gwei and ether.

The time units are seconds, minutes, hours, days and weeks.

Block and transaction properties: blockhash(), gasleft(), block (basefee, coinbase, difficulty, number, timestamp), msg (data, sender, sig, value) and tx (gasprice, origin).

There are ABI encoding and decoding functions.

Error handling functions are assert, require, revert.

Math/cryptographic functions are addmod, mulmod, keccak256, sha256, ripemd160, ecrecover.

Others: this (the current contract), selfdestruct, type(C).name, type(C).creationCode, type(C).runtimeCode, type(I).interfaceId.
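
Some of these units and globals in context. This is a sketch with invented names (GlobalsDemo, deadline, and so on); the one-week deadline is an arbitrary choice.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract GlobalsDemo {
    address public owner = msg.sender;                   // deployer of the contract
    uint256 public deadline = block.timestamp + 1 weeks; // time unit suffix

    // Unit suffixes are plain multipliers on number literals.
    function unitsHold() external pure returns (bool) {
        return 1 ether == 1e18 && 1 gwei == 1e9 && 1 hours == 60 minutes;
    }

    function deposit() external payable {
        require(block.timestamp < deadline, "too late");
        require(msg.value >= 1 gwei, "too small");
    }

    function hashOf(string calldata text) external pure returns (bytes32) {
        return keccak256(abi.encodePacked(text));
    }

    function selfAddress() external view returns (address) {
        return address(this); // this is the current contract
    }
}
```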

4 Expression and Control Structures

Control structures: if, else, while, do, for, break, continue, return.

For external function calls and contract creation calls, you can use try and catch.

Functions of other contracts have to be called externally. For an external call, all function arguments have to be copied to memory. It is a message call as part of the overall transaction.

Function call arguments can be given by name, in any order, if they are enclosed in { }. The names of unused parameters (especially return parameters) can be omitted.

A contract can create other contracts using the new keyword. The full code of the contract being created has to be known when the creating contract is compiled so recursive creation-dependencies are not possible. If you specify the option salt (a bytes32 value), then contract creation will use a different mechanism (the nonce is not used) to come up with the address of the new contract.
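
Named arguments and contract creation, with and without a salt, might look like this. Child, Factory and the function names are assumptions for the sketch.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract Child {
    uint256 public value;

    constructor(uint256 initial) {
        value = initial;
    }
}

contract Factory {
    // Address is derived from the factory address and its nonce.
    function create(uint256 initial) external returns (Child) {
        return new Child(initial);
    }

    // With a salt, the address is derived from the salt instead of the nonce.
    function createWithSalt(bytes32 salt, uint256 initial) external returns (Child) {
        return new Child{salt: salt}(initial);
    }

    function combine(uint256 high, uint256 low) internal pure returns (uint256) {
        return high * 100 + low;
    }

    // Arguments given by name, in a different order than declared.
    function named() external pure returns (uint256) {
        return combine({low: 1, high: 2}); // 201
    }
}
```

Calling createWithSalt twice with the same salt and constructor argument reverts the second time, because the target address is already occupied.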

Scoping in Solidity follows the widespread scoping rules of C99.

Solidity uses state-reverting exceptions to handle errors. Such an exception undoes all changes made to the state in the current call (and all its sub-calls) and flags an error to the caller.

When exceptions happen in a sub-call, they “bubble up” (i.e., exceptions are rethrown) automatically unless they are caught in a try/catch statement. Exceptions to this rule are send and the low-level functions call, delegatecall and staticcall: they return false as their first return value in case of an exception instead of “bubbling up”.
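
The two ways of observing a failed sub-call, try/catch and a low-level call, can be sketched as below. Risky and Caller are hypothetical contracts written for this example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract Risky {
    function mayFail(bool fail) external pure returns (uint256) {
        require(!fail, "failed on purpose");
        return 1;
    }
}

contract Caller {
    Risky public risky = new Risky();

    // try/catch works on external calls; the revert does not bubble up.
    function tryIt(bool fail)
        external
        view
        returns (bool ok, uint256 value, string memory reason)
    {
        try risky.mayFail(fail) returns (uint256 result) {
            return (true, result, "");
        } catch Error(string memory message) {
            return (false, 0, message); // revert or require with a reason string
        } catch {
            return (false, 0, "unknown failure");
        }
    }

    // Low-level calls report failure through their first return value.
    function lowLevel(bool fail) external view returns (bool ok) {
        (ok, ) = address(risky).staticcall(
            abi.encodeWithSelector(Risky.mayFail.selector, fail)
        );
    }
}
```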

5 Contracts

A contract defines an execution context. A contract contains persistent data in state variables and functions to access these variables.

5.1 Creating Contracts

Contracts can be created “from outside” via Ethereum transactions or “from within” via contracts. When a contract is created, its optional constructor is executed once. After the constructor has executed, the deployed code does not include the constructor code or internal functions only called from the constructor.

5.2 Visibility and State Mutability

State variable visibility includes:

  • public: the compiler automatically generates a getter function for each public state variable. Within the same contract, this.x invokes the getter while x accesses the variable directly from storage.
  • internal: the default access level for state variables; they can only be accessed within the contract and derived contracts.
  • private: can only be accessed within the contract itself, not from derived or outside contracts.

Function Visibility:

  • external: cannot be called internally as f(), but this.f() works.
  • public: can be called internally or via message calls.
  • internal: within the current contract or derived contracts.
  • private: only within the current contract.

Function modifiers are used to change the behaviour of functions in a declarative way. Modifiers are inheritable properties of contracts and may be overridden by derived contracts, but only if they are marked virtual. If you want to access a modifier m defined in a contract C, you can use C.m to reference it without virtual lookup. Multiple modifiers are applied to a function by specifying them in a whitespace-separated list and are evaluated in the order presented.
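
Visibility levels and modifiers together, as a sketch. The contract Owned and the modifiers onlyOwner and nonZero are illustrative names chosen here.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract Owned {
    address public owner;     // public: getter owner() is generated
    uint256 internal counter; // internal: visible in derived contracts
    uint256 private secret;   // private: this contract only

    constructor() {
        owner = msg.sender;
    }

    modifier onlyOwner() {
        require(msg.sender == owner, "not owner");
        _; // the body of the modified function runs here
    }

    modifier nonZero(uint256 x) {
        require(x != 0, "zero");
        _;
    }

    // Modifiers are evaluated left to right: onlyOwner first, then nonZero.
    function setSecret(uint256 x) external onlyOwner nonZero(x) {
        secret = x;
        bump(); // internal call
    }

    function bump() internal {
        counter += 1;
    }

    function ownerViaGetter() external view returns (address) {
        return this.owner(); // external call to the generated getter
    }
}
```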

State variables can be declared as constant or immutable. In both cases, the variables cannot be modified after the contract has been constructed. For constant variables, the value has to be fixed at compile-time, while for immutable, it can still be assigned at construction time. It is also possible to define constant variables at the file level. The compiler does not reserve a storage slot for these variables, and every occurrence is replaced by the respective value. The only supported types are string (only for constants) and value types.
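
A minimal sketch of constant and immutable, including a file-level constant. MAX_SUPPLY, Token and the member names are invented for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

// File-level constant, fixed at compile time.
uint256 constant MAX_SUPPLY = 1_000_000;

contract Token {
    string public constant NAME = "Demo"; // constants may be strings
    address public immutable deployer;    // assigned once, in the constructor
    uint256 public immutable cap;

    constructor(uint256 cap_) {
        require(cap_ <= MAX_SUPPLY, "cap too high");
        deployer = msg.sender;
        cap = cap_;
    }
}
```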

5.3 Functions

Functions can be defined inside and outside of contracts. Functions outside of a contract, also called “free functions”, always have implicit internal visibility. Their code is included in all contracts that call them, similar to internal library functions.

Functions can be declared view to promise not to modify the state. The following statements are considered to modify the state:

  • Writing to state variables
  • Emitting events
  • Creating other contracts
  • Using selfdestruct
  • Sending Ether
  • Calling any function not marked view or pure
  • Using low-level calls
  • Using inline assembly that contains certain opcodes

Functions that neither read from nor modify the state can be declared pure. Reading from the state includes reading state variables, accessing <address>.balance, accessing any member of block, tx, msg (except msg.sig and msg.data), calling any function not marked pure, and using inline assembly that contains certain opcodes.
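
The difference between a free function, a state-modifying function, a view function and a pure function, in a small sketch. The names clamp and Gauge are assumptions for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

// Free function: defined outside any contract, implicitly internal.
function clamp(uint256 x, uint256 hi) pure returns (uint256) {
    return x > hi ? hi : x;
}

contract Gauge {
    uint256 public level;

    // Writes a state variable, so it can be neither view nor pure.
    function set(uint256 x) external {
        level = clamp(x, 100);
    }

    // Reads state but does not modify it.
    function twice() external view returns (uint256) {
        return level * 2;
    }

    // Touches no state at all.
    function sum(uint256 a, uint256 b) external pure returns (uint256) {
        return a + b;
    }
}
```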

Special Functions

  • Receive Ether function: A contract can have at most one receive() external payable { ... } (without the function keyword). The receive function is executed on a call to the contract with empty calldata. This is the function that is executed on plain Ether transfers (e.g. via .send() or .transfer()).
  • Fallback function: A contract can have at most one fallback function, declared using either fallback () external [payable] or fallback (bytes calldata input) external [payable] returns (bytes memory output) (both without the function keyword). The fallback function is executed on a call to the contract if none of the other functions match the given function signature, or if no data was supplied at all and there is no receive Ether function. The fallback function always receives data, but in order to also receive Ether it must be marked payable.
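
A sketch of a contract that defines both special functions. Sink and the two event names are hypothetical; the events exist only to make it visible which function was hit.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract Sink {
    event Received(address sender, uint256 amount);
    event FallbackCalled(bytes data, uint256 amount);

    // Runs on calls with empty calldata, e.g. plain Ether transfers.
    receive() external payable {
        emit Received(msg.sender, msg.value);
    }

    // Runs when calldata is non-empty and matches no function.
    // Marked payable so that it can also accept Ether.
    fallback() external payable {
        emit FallbackCalled(msg.data, msg.value);
    }
}
```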

5.4 Events

Solidity events give an abstraction on top of the EVM’s logging functionality. Applications can subscribe and listen to these events through the RPC interface of an Ethereum client.

When you emit an event, its arguments are stored in the transaction’s log, a special data structure in the blockchain. The log and its event data are not accessible from within contracts.

You can add the attribute indexed to up to three parameters which adds them to a special data structure known as “topics” instead of the data part of the log. All parameters without the indexed attribute are ABI-encoded into the data part of the log.
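
An event with two indexed parameters might be declared and emitted as follows. Ledger is an invented contract; the initial balance of 1000 is arbitrary.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract Ledger {
    // from and to become topics; amount is ABI-encoded into the data part.
    event Transfer(address indexed from, address indexed to, uint256 amount);

    mapping(address => uint256) public balanceOf;

    constructor() {
        balanceOf[msg.sender] = 1000;
    }

    function transfer(address to, uint256 amount) external {
        balanceOf[msg.sender] -= amount; // checked: reverts on underflow
        balanceOf[to] += amount;
        emit Transfer(msg.sender, to, amount);
    }
}
```

Indexed parameters let a client filter logs by sender or recipient without decoding the data part.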

5.5 Errors and the revert Statement

Errors in Solidity provide a convenient and gas-efficient way to explain to the user why an operation failed. They can be defined inside and outside of contracts (including interfaces and libraries). They have to be used together with the revert statement which causes all changes in the current call to be reverted and passes the error data back to the caller.
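
A custom error used with revert, as a sketch. InsufficientBalance and Vault are illustrative names, and the pragma assumes a compiler of at least 0.8.4, where custom errors were introduced.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Errors can be declared at file level and carry parameters.
error InsufficientBalance(uint256 available, uint256 required);

contract Vault {
    mapping(address => uint256) public balances;

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) external {
        uint256 available = balances[msg.sender];
        if (amount > available) {
            // Reverts all changes and passes the error data to the caller.
            revert InsufficientBalance(available, amount);
        }
        balances[msg.sender] = available - amount; // update state before sending
        (bool ok, ) = payable(msg.sender).call{value: amount}("");
        require(ok, "transfer failed");
    }
}
```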

5.6 Inheritance

Solidity supports multiple inheritance and polymorphism.

Polymorphism means that a function call (internal and external) always executes the function of the same name (and parameter types) in the most derived contract in the inheritance hierarchy. This has to be explicitly enabled on each function in the hierarchy using the virtual and override keywords. It is possible to call functions further up in the inheritance hierarchy internally by explicitly specifying the contract using ContractName.functionName() or using super.functionName() if you want to call the function one level higher up in the flattened inheritance hierarchy.

When a contract inherits from other contracts, only a single contract is created on the blockchain, and the code from all the base contracts is compiled into the created contract. This means that all internal calls to functions of base contracts also just use internal function calls.

State variable shadowing is considered an error.

Both functions and modifiers can override other functions/modifiers.

Contracts need to be marked as abstract when at least one of their functions is not implemented. Contracts may be marked as abstract even though all functions are implemented.
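
The interplay of abstract, virtual, override and super can be sketched with a three-level hierarchy. Animal, Dog and LoudDog are made-up names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

// Abstract because sound() has no implementation.
abstract contract Animal {
    function sound() public pure virtual returns (string memory);

    // Calls the most derived implementation of sound().
    function describe() public pure returns (string memory) {
        return sound();
    }
}

contract Dog is Animal {
    // virtual again, so that further derived contracts may override it.
    function sound() public pure virtual override returns (string memory) {
        return "woof";
    }
}

contract LoudDog is Dog {
    function sound() public pure override returns (string memory) {
        // super.sound() resolves to Dog.sound()
        return string(abi.encodePacked(super.sound(), "!"));
    }
}
```

Calling describe() on a deployed LoudDog returns "woof!", because the call dispatches to the most derived sound().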

5.7 Interfaces

Interfaces are similar to abstract contracts, but they cannot have any functions implemented. They are basically limited to what the Contract ABI can represent.

All functions declared in interfaces are implicitly virtual. Starting from Solidity 0.8.8, functions that override them do not need the override keyword, except when the function is defined in multiple bases.
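
An interface, one implementation and a client that only knows the interface, as a sketch. ICounter, Counter and Client are hypothetical names. The override keyword is written out here, which is always permitted.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

interface ICounter {
    function increment() external;
    function count() external view returns (uint256);
}

contract Counter is ICounter {
    // The getter of a public state variable can implement an external function.
    uint256 public override count;

    function increment() external override {
        count += 1;
    }
}

contract Client {
    // Works with any contract that implements ICounter.
    function bump(ICounter c) external returns (uint256) {
        c.increment();
        return c.count();
    }

    function id() external pure returns (bytes4) {
        return type(ICounter).interfaceId;
    }
}
```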

5.8 Libraries

Libraries are similar to contracts, but their purpose is that they are deployed only once at a specific address and their code is reused through the DELEGATECALL feature of the EVM. This means that if library functions are called, their code is executed in the context of the calling contract, i.e. this points to the calling contract, and especially the storage from the calling contract can be accessed.

Library functions can only be called directly (i.e. without the use of DELEGATECALL) if they do not modify the state (i.e. if they are view or pure functions), because libraries are assumed to be stateless.

5.9 Using for

The directive using A for B; can be used to attach functions A as member functions to any type B. These functions will receive the object they are called on as their first parameter (like the self variable in Python).

It is valid either at file level or inside a contract, at contract level.
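
A library attached to a type with using for, as a minimal sketch. MathLib, clampTo and UsesLib are invented names. The library function is internal here, so its code is compiled into the calling contract rather than reached through DELEGATECALL.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

library MathLib {
    // The first parameter receives the value the function is called on.
    function clampTo(uint256 self, uint256 hi) internal pure returns (uint256) {
        return self > hi ? hi : self;
    }
}

contract UsesLib {
    using MathLib for uint256; // contract-level directive

    function limited(uint256 x) external pure returns (uint256) {
        return x.clampTo(10); // same as MathLib.clampTo(x, 10)
    }
}
```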

6 Inline Assembly

You can interleave Solidity statements with inline assembly in a language called Yul. An inline assembly block is marked by assembly { ... }.
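
Two small assembly blocks, as a sketch. The contract and function names are assumptions; add and extcodesize are Yul built-ins that map to EVM opcodes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.2;

contract AsmDemo {
    // The EVM add opcode wraps on overflow; no check is inserted here.
    function wrappingSum(uint256 a, uint256 b) external pure returns (uint256 result) {
        assembly {
            result := add(a, b)
        }
    }

    // Size of the code stored at an address (0 for accounts without code).
    function codeSize(address target) external view returns (uint256 size) {
        assembly {
            size := extcodesize(target)
        }
    }
}
```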