Solidity Inline Assembly Syntax



Assembly parses comments, literals and identifiers exactly as Solidity does, so you can use the usual // and /* */ comments. Inline assembly is marked by assembly { … }, with the assembly code going inside the curly braces. The following elements can be used there; later sections of this tutorial describe them in detail:

  • blocks, in which local variables are scoped
  • assignments in functional style
  • identifiers (labels, assembly-local variables, and externals if used as inline assembly)
  • assignments in “instruction style”
  • variable declarations
  • labels
  • opcodes in “instruction style”
  • opcodes in functional style
  • literals
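
As a rough illustration of several of the elements listed above (comments, a literal, variable declarations, a nested block, functional-style opcodes and assignments), consider the sketch below. The contract and function names are hypothetical, and the pragma assumes a 0.4.24 or later 0.4.x compiler so that the public and pure keywords are available.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract SyntaxTour {
    function tour(uint x) public pure returns (uint r) {
        assembly {
            // single-line comment, as in Solidity
            /* block comment, as in Solidity */
            let a := 0x10            // variable declaration with a hexadecimal literal
            {
                let b := add(a, x)   // functional-style opcode; b is scoped to this block
                r := mul(b, 2)       // functional-style assignment to the Solidity variable r
            }                        // b is no longer accessible here
        }
    }
}
```

With these definitions, tour(1) evaluates (16 + 1) * 2 and returns 34.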

Solidity Inline Assembly Opcodes

Although this tutorial is not a complete description of the EVM, the list below covers the opcodes you may need when writing inline assembly in Solidity.

If an opcode takes arguments (always from the top of the stack), they are given in parentheses.

Keep in mind that in instruction style the argument order appears reversed compared to functional style (explained later in this tutorial). Opcodes marked with - do not push an item onto the stack, those marked with * are special, and all others push exactly one item onto the stack.

In the list below, mem[a…b) denotes the bytes of memory starting at position a up to (excluding) position b, and storage[p] denotes the contents of storage at position p.

The opcodes pushi and jumpdest cannot be used directly.

In the grammar, opcodes are represented as pre-defined identifiers.

mul(a, b) a * b
sub(a, b) a - b
div(a, b) a / b
add(a, b) a + b
stop - stop execution, identical to return(0, 0)
mod(a, b) a % b
sdiv(a, b) a / b, for signed numbers in two’s complement
not(a) ~a, every bit of a is negated
smod(a, b) a % b, for signed numbers in two’s complement
exp(a, b) a to the power of b
gt(a, b) 1 if a is greater than b, 0 otherwise
lt(a, b) 1 if a is less than b, 0 otherwise
sgt(a, b) 1 if a is greater than b, 0 otherwise, for signed numbers in two’s complement
slt(a, b) 1 if a is less than b, 0 otherwise, for signed numbers in two’s complement
iszero(a) 1 if a equals 0, 0 otherwise
eq(a, b) 1 if a equals b, 0 otherwise
xor(a, b) bitwise xor of a and b
and(a, b) bitwise and of a and b
or(a, b) bitwise or of a and b
addmod(a, b, m) (a + b) % m with arbitrary precision arithmetic
byte(n, a) nth byte of a, where the most significant byte is the 0th byte
mulmod(a, b, m) (a * b) % m with arbitrary precision arithmetic
signextend(i, a) sign extend from the (i*8+7)th bit, counting from the least significant bit
sha3(position, n) keccak(mem[position…(position+n)))
keccak256(position, n) keccak(mem[position…(position+n)))
jumpi(label, condition) - jump to label if condition is nonzero
jump(label) - jump to label / code position
pop(a) - remove the element pushed by a
pc current position in code
swap1 … swap16 * swap topmost and ith stack slot below it
dup1 … dup16 copy ith stack slot to the top (counting from the top)
mload(position) mem[position…(position+32))
mstore8(position, v) - mem[position] := v & 0xff (only modifies a single byte)
mstore(position, v) - mem[position…(position+32)) := v
sstore(position, v) - storage[position] := v
msize size of memory, i.e. largest accessed memory index
sload(position) storage[position]
address address of the current contract / execution context
gas gas still available to execution
caller call sender (excluding delegatecall)
balance(address) wei balance at address
calldataload(position) call data starting from position (32 bytes)
callvalue wei sent together with the current call
calldatacopy(p2, p1, size) - copy size bytes from call data at position p1 to mem at position p2
codesize size of the code of the current contract / execution context
calldatasize size of call data in bytes
codecopy(p2, p1, size) - copy size bytes from code at position p1 to mem at position p2
extcodecopy(address, p2, p1, size) - like codecopy(p2, p1, size) but take code at address
returndatasize size of the last returndata
returndatacopy(p2, p1, size) - copy size bytes from returndata at position p1 to mem at position p2
create(wei, position, size) create new contract with code mem[position…(position+size)), send wei, and return the new address
extcodesize(address) size of the code at address
call(gas, address, wei, input, insize, output, outsize) call contract at address with input mem[input…(input+insize)), providing gas and wei, and output area mem[output…(output+outsize)); returns 0 on error (e.g. out of gas) and 1 on success
callcode(gas, address, wei, input, insize, output, outsize) identical to call, but only use the code from address and stay in the context of the current contract otherwise
delegatecall(gas, address, input, insize, output, outsize) identical to callcode, but also keep caller and callvalue
return(position, size) - end execution, return data mem[position…(position+size))
create2(wei, n, position, size) create new contract with code mem[position…(position+size)) at address keccak256(<address> . n . keccak256(mem[position…(position+size)))), send wei, and return the new address
revert(position, size) - end execution, revert state changes, return data mem[position…(position+size))
invalid - end execution with invalid instruction
staticcall(gas, address, input, insize, output, outsize) identical to call(gas, address, 0, input, insize, output, outsize) but do not allow state modifications
selfdestruct(address) - end execution, destroy current contract, and send funds to address
log0(position, size) - log without topics and data mem[position…(position+size))
log1(position, size, topic1) - log with topic topic1 and data mem[position…(position+size))
log2(position, size, topic1, topic2) - log with topics topic1, topic2 and data mem[position…(position+size))
log3(position, size, topic1, topic2, topic3) - log with topics topic1, topic2, topic3 and data mem[position…(position+size))
log4(position, size, topic1, topic2, topic3, topic4) - log with topics topic1, topic2, topic3, topic4 and data mem[position…(position+size))
origin sender of the transaction
blockhash(blockNum) hash of block number blockNum (only for the last 256 blocks, excluding the current one)
gasprice gas price of the transaction
timestamp current block’s timestamp in seconds since the epoch
difficulty current block’s difficulty
coinbase current mining beneficiary
number number of the current block
gaslimit current block’s gas limit
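
To make a few of the entries above concrete, here is a minimal sketch that calls three of the listed opcodes in functional style. The contract, function and variable names are made up for this example, and the pragma assumes a 0.4.24 or later 0.4.x compiler.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract OpcodeDemo {
    function demo(uint a, uint b, uint m)
        public pure returns (uint sum, uint top, uint isZero)
    {
        assembly {
            sum := addmod(a, b, m)   // (a + b) % m without intermediate overflow
            top := byte(0, a)        // most significant byte of a
            isZero := iszero(b)      // 1 if b equals 0, 0 otherwise
        }
    }
}
```

For instance, demo(5, 7, 10) should yield sum = 2, top = 0 (the most significant byte of 5 is zero) and isZero = 0.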

Solidity Inline Assembly Literals

You can use integer constants by typing them in decimal or hexadecimal notation; an appropriate PUSHi instruction is generated automatically.

The following line creates code that adds 2 and 3, resulting in 5, and then computes the bitwise and of that result with the string “abc”. Strings are stored left-aligned and cannot be longer than 32 bytes.

assembly { 2 3 add "abc" and }
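
The sketch below shows the same kinds of literals in functional style, so that the resulting values can be inspected through return variables. The contract and function names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract LiteralDemo {
    function literals() public pure returns (uint n, bytes32 s) {
        assembly {
            n := add(2, 0x03)   // decimal and hexadecimal literals; n is 5
            s := "abc"          // left-aligned: 0x616263 followed by 29 zero bytes
        }
    }
}
```

Because the string occupies the highest-order bytes of the 32-byte word while 5 occupies the lowest, the bitwise and of the two values is 0.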

Solidity Inline Assembly Functional Style

You can type one opcode after another, in the same way they will end up in bytecode. For example, adding 4 to the contents of memory at position 0x40 is written as:

4 0x40 mload add 0x40 mstore

Since it is often hard to see what the actual arguments of certain opcodes are, Solidity inline assembly also provides a notation called “functional style”, in which the same code is written as follows:

mstore(0x40, add(mload(0x40), 4))

Functional-style expressions cannot use instruction style internally, i.e. 3 1 mstore(0x40, add) is not valid assembly; it must be written as mstore(0x40, add(1, 3)). For opcodes that take no arguments, the parentheses can be omitted.

Note that the order of arguments in functional style is the reverse of the order in instruction style. In functional style, the first argument ends up on top of the stack.
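
The argument order is easiest to see with a non-commutative opcode such as sub. The following sketch uses only functional style and mentions the instruction-style equivalent in a comment; the contract and function names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract ArgumentOrder {
    function difference() public pure returns (uint r) {
        assembly {
            // Instruction style would read: 3 10 sub
            // (3 is pushed first, so 10 ends up on top of the stack).
            r := sub(10, 3)   // the first argument is the stack top: 10 - 3 = 7
        }
    }
}
```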


Solidity Inline Assembly Access to External Variables and Functions

Solidity variables and other identifiers are accessible simply by using their name. For memory variables, this pushes the address and not the value onto the stack. Storage variables are different: a value in storage might not occupy a full storage slot, so its “address” is composed of a slot and a byte-offset inside that slot. To retrieve the slot pointed to by the variable x, use x_slot, and to retrieve the byte-offset, use x_offset.

In assignments (described later in this tutorial), you can even assign to local Solidity variables.

Functions outside of inline assembly are accessible too: the assembly pushes their entry label (with virtual function resolution applied).

The calling semantics in Solidity are as follows:

  • the caller pushes the return label, a1, a2, …, an
  • the call returns with r1, r2, …, rm

This feature is still cumbersome to use, because the stack offset changes during the call, and references to local variables will therefore be wrong.

Example

pragma solidity ^0.4.11;

contract Cont {
    uint c;
    function func(uint a) returns (uint b) {
        assembly {
            b := mul(a, sload(c_slot)) // the offset is ignored, it is zero
        }
    }
}
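
The example above ignores the byte-offset because it is zero. As a hedged sketch of a case where the offset matters, the contract below packs two uint128 values into one storage slot and reads the second one by hand. The contract, variable and function names are hypothetical, a 0.4.24 or later 0.4.x compiler is assumed, and the shift is expressed with div and exp because no shift opcode appears in the list earlier in this tutorial.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract PackedRead {
    uint128 low = 1;    // slot 0, byte-offset 0
    uint128 high = 2;   // also slot 0, byte-offset 16

    function readHigh() public view returns (uint r) {
        assembly {
            let word := sload(high_slot)                      // the whole 32-byte slot
            let shifted := div(word, exp(256, high_offset))   // drop the bytes below high
            r := and(shifted, sub(exp(2, 128), 1))            // keep only 128 bits
        }
    }
}
```

Under these assumptions, readHigh() returns 2.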

 



Solidity Inline Assembly Labels

EVM assembly has another problem: jump and jumpi use absolute addresses, which can change easily. Solidity inline assembly therefore provides labels to make jumps easier to use. Note that labels are a low-level feature; it is possible to write efficient assembly without them, using only assembly functions, loops, if and switch instructions (described further below). The following code computes an element of the Fibonacci series.

Example

{
    let x := calldataload(4)
    let y := 1
    let z := y
loop:
    jumpi(loopend, eq(x, 0))
    y add swap1
    x := sub(x, 1)
    jump(loop)
loopend:
    mstore(0, y)
    return(0, 0x20)
}

 


Note that automatic access to stack variables only works if the assembler knows the current stack height. This fails if the jump source and the jump target have different stack heights. Such jumps are still fine to use, but you should not access any stack variables (including assembly variables) in that case.

Furthermore, the stack height analyzer goes through the code opcode by opcode, not according to the control flow. As a result, in the following example the assembler gets the wrong impression of the stack height at label one:

Example

{
    let a := 8
    jump(two)
    one:
        // Here the stack height is 2 (because a and 7 were pushed),
        // but the assembler thinks it is 1 because it reads
        // from top to bottom.
        // Accessing the stack variable a here would lead to errors.
        a := 9
        jump(three)
    two:
        7 // pushing something to the stack
        jump(one)
    three:
}

 



Solidity Inline Assembly Declaring Assembly-Local Variables

The let keyword declares variables that are visible only in inline assembly, and in fact only in the current {…} block. The let instruction creates a new stack slot that is reserved for the variable and automatically removed again when the end of the block is reached. You have to provide an initial value for the variable, which can be just 0, but it can also be a complex functional-style expression.

Example

pragma solidity ^0.4.0;

contract Cont {
    function func(uint a) returns (uint b) {
        assembly {
            let c := add(a, 1)
            mstore(0x80, c)
            {
                let d := add(sload(c), 1)
                b := d
            } // d is "deallocated" here
            b := add(b, c)
        } // c is "deallocated" here
    }
}

 



Solidity Inline Assembly Assignments

You can assign to assembly-local variables and to function-local variables. Take care: when you assign to a variable that points to memory or storage, you only change the pointer, not the data.

There are two kinds of assignments: functional-style and instruction-style. For functional-style assignments ( variable := value ), you provide a value in a functional-style expression that results in exactly one stack value. For instruction-style assignments ( =: variable ), the value is simply taken from the top of the stack. In both forms, the colon points to the name of the variable. The assignment is performed by replacing the variable’s value on the stack with the new value.

Example

{
    let x := 0 // functional-style assignment as part of a variable declaration
    let y := add(x, 2)
    sload(10)
    =: x // instruction-style assignment, puts the result of sload(10) into x
}
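
The remark about pointers can be illustrated with a memory variable. In the sketch below, the assignment copies only the pointer, and the data is changed separately with mstore. The contract, function and variable names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract PointerAssignment {
    function alias() public pure returns (bytes memory out) {
        bytes memory data = new bytes(32);
        assembly {
            out := data                    // copies the pointer; no bytes are copied
            mstore(add(out, 0x20), 0x2a)   // writes the 32 content bytes after the length slot
        }
        // out and data now refer to the same memory area,
        // so the write above is visible through both names.
    }
}
```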

 



Solidity Inline Assembly If

As in most other languages, the if statement executes code only when a certain condition is met. There is no “else” part, though; consider using “switch” if you need multiple alternatives.

Example

{
    if eq(value, 0) { revert(0, 0) }
}

 


Note: The curly braces around the body are required.
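
For context, the if statement above could sit inside a complete function like the following sketch, which rejects a zero argument. The contract and function names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract Guard {
    function nonZero(uint value) public pure returns (uint) {
        assembly {
            // No "else" branch exists; execution simply continues when value is nonzero.
            if iszero(value) { revert(0, 0) }
        }
        return value;
    }
}
```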


Solidity Inline Assembly Switch

A switch statement can be used as a very basic version of “if/else”. It takes the value of an expression and compares it to several constants. The branch corresponding to the matching constant is taken. Unlike the error-prone behavior of some programming languages, control flow does not continue from one case to the next. There can be a fallback case, called default.

Example

{
    let a := 0
    switch calldataload(4)
    case 0 {
        a := calldataload(0x24)
    }
    default {
        a := calldataload(0x44)
    }
    sstore(0, div(a, 2))
}

 


Note: The list of cases does not need to be surrounded with curly braces, but the body of each case does.
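
A complete function makes the absence of fall-through easier to observe. The following is a small sketch with hypothetical names and arbitrary result values, assuming a 0.4.24 or later 0.4.x compiler.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract SwitchDemo {
    function classify(uint x) public pure returns (uint r) {
        assembly {
            switch x
            case 0 { r := 100 }    // taken only when x equals 0
            case 1 { r := 200 }    // taken only when x equals 1
            default { r := 300 }   // taken for every other value
        }
    }
}
```

Here classify(0) returns 100 rather than 300, because only the matching branch is executed.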


Solidity Inline Assembly Loops

Assembly supports a basic for-style loop. A for-style loop has a header containing an initialization part, a condition under which the loop keeps running, and a post-iteration part. The condition has to be a functional-style expression, while the other two are blocks. If the initialization part declares any variables, their scope extends into the body of the loop (including the condition and the post-iteration part).

The following example computes the sum of an area in memory.

Example

{
    let a := 0
    for { let n := 0 } lt(n, 0x100) { n := add(n, 0x20) } {
        a := add(a, mload(n))
    }
}

 


To make a for loop behave like a while loop, simply leave the initialization and post-iteration parts empty.

Example

{
    let a := 0
    let n := 0
    for { } lt(n, 0x100) { } {     // while(n < 0x100)
        a := add(a, mload(n))
        n := add(n, 0x20)
    }
}
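
As a further sketch, the same loop shape can sum the elements of a Solidity memory array, assuming the usual layout in which the first 32-byte slot holds the length and the elements follow. The contract, function and variable names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract ArraySum {
    function sum(uint[] memory data) public pure returns (uint total) {
        assembly {
            let len := mload(data)                              // first slot: array length
            let limit := add(add(data, 0x20), mul(len, 0x20))   // first byte past the last element
            for { let ptr := add(data, 0x20) } lt(ptr, limit) { ptr := add(ptr, 0x20) } {
                total := add(total, mload(ptr))                 // wraps around on overflow
            }
        }
    }
}
```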

 



Solidity Inline Assembly Functions

Assembly allows the definition of low-level functions. These take their arguments (and a return PC) from the stack and also put the results onto the stack. Calling a function looks the same way as executing a functional-style opcode.

Functions can be defined anywhere and are visible in the block they are declared in. Inside a function, you cannot access local variables defined outside of that function. There is no explicit return statement.

If you call a function that returns multiple values, you have to assign them to a tuple using a, b := f(x) or let a, b := f(x).

The following example implements the power function by square-and-multiply.

Example

{
    function powerSwitch(baseRes, exponentSwitch) -> result {
        switch exponentSwitch
        case 0 { result := 1 }
        case 1 { result := baseRes }
        default {
            result := powerSwitch(mul(baseRes, baseRes), div(exponentSwitch, 2))
            switch mod(exponentSwitch, 2)
                case 1 { result := mul(baseRes, result) }
        }
    }
}
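
Functions with multiple return values can be sketched as follows, using the let a, b := f(x) form mentioned above. The contract, function and variable names are hypothetical, and a 0.4.24 or later 0.4.x compiler is assumed.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract MultiReturn {
    function quotientAndRemainder(uint x, uint y) public pure returns (uint q, uint r) {
        assembly {
            // Assembly function with two return values.
            function divMod(a, b) -> quotient, remainder {
                quotient := div(a, b)     // div yields 0 when b is 0
                remainder := mod(a, b)    // mod yields 0 when b is 0
            }
            let q2, r2 := divMod(x, y)    // both results are assigned at once
            q := q2
            r := r2
        }
    }
}
```

For example, quotientAndRemainder(17, 5) returns q = 3 and r = 2.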

 



Solidity Inline Assembly Things to Avoid

Inline assembly may look quite high-level, but it is actually extremely low-level.

Loops, function calls, if and switch statements are converted by simple rewriting rules. After that, the only things the assembler does for you are re-arranging functional-style opcodes, managing jump labels, counting stack height for variable access, and removing stack slots for assembly-local variables when the end of their block is reached. Especially for the last two cases, it is important to know that the assembler only counts stack height from top to bottom and does not necessarily follow the control flow.

Furthermore, operations such as swap only swap the contents of the stack, not the locations of variables.


Solidity Inline Assembly Conventions in Solidity

Unlike Ethereum Virtual Machine assembly, Solidity knows types that are narrower than 256 bits, such as uint24. To make them more efficient, most arithmetic operations simply treat them as 256-bit numbers, and the higher-order bits are only cleaned when necessary, for example right before they are written to memory or before a comparison is performed. This means that when you access such a variable from within inline assembly, you may have to clean the higher-order bits manually first.
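
Cleaning the higher-order bits by hand usually amounts to a bitwise and with a mask of the type's width. The following is a minimal sketch for uint24, with hypothetical contract and function names and assuming a 0.4.24 or later 0.4.x compiler.

```solidity
pragma solidity ^0.4.24;

// Hypothetical contract, for illustration only.
contract CleanBits {
    function widen(uint24 x) public pure returns (uint r) {
        assembly {
            // Inside assembly, x is a full 256-bit stack value;
            // mask it so that only the low 24 bits remain.
            r := and(x, 0xffffff)
        }
    }
}
```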

Solidity manages memory in a very simple way: there is a “free memory pointer” at position 0x40 in memory. To allocate memory, just use the memory from that point on and update the pointer accordingly.

Elements of memory arrays in Solidity always occupy multiples of 32 bytes (this is even true for byte[], but not for bytes and string). Multi-dimensional memory arrays are pointers to memory arrays. The length of a dynamic array is stored in the first slot of the array, and only the array elements follow after it.

Example

pragma solidity ^0.4.0;

library GetCode {
    function at(address _codeAddr) returns (bytes o_codeArray) {
        assembly {
            // retrieve the size of the code; this needs assembly
            let codeSize := extcodesize(_codeAddr)
            // allocate the output byte array; this could also be done without assembly
            // by using o_codeArray = new bytes(codeSize)
            o_codeArray := mload(0x40)
            // new "memory end", including padding
            mstore(0x40, add(o_codeArray, and(add(add(codeSize, 0x20), 0x1f), not(0x1f))))
            // store the length in memory
            mstore(o_codeArray, codeSize)
            // actually retrieve the code; this needs assembly
            extcodecopy(_codeAddr, add(o_codeArray, 0x20), 0, codeSize)
        }
    }
}

 


Warning: Statically-sized memory arrays do not have a length field, but one is planned to be added to allow better convertibility between statically- and dynamically-sized arrays, so do not rely on its absence.
