Solidity Assembly Grammar And Parsing

The parser has the following main tasks:

  • Turn the byte stream into a token stream, discarding C++-style comments (a special comment exists for source references, but it is not covered here).
  • Turn the token stream into an AST according to the grammar described below.
  • Register identifiers with the block they are defined in (as an annotation on the AST node) and note from which point on the variables can be accessed.

The assembly lexer follows the one defined by Solidity itself.

Whitespace is used to delimit tokens and consists of the characters Space, Tab and Linefeed. Comments are regular JavaScript/C++ comments and are treated the same way as whitespace.
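The first parser task and the whitespace rules can be sketched as a small tokenizer. The Python below is an illustration only, not the compiler's lexer: the tokenize function, the token kinds and the reduced set of punctuation are assumptions made for this example.

```python
import re

# Hypothetical token pattern for illustration; the real lexer lives in the Solidity compiler.
TOKEN = re.compile(r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)      # C++-style comments, skipped like whitespace
  | (?P<space>[ \t\n]+)                  # Space, Tab and Linefeed delimit tokens
  | (?P<string>"(?:[^"\r\n\\]|\\.)*")    # string literal
  | (?P<number>0x[0-9a-fA-F]+|[0-9]+)    # hex or decimal number
  | (?P<ident>[a-zA-Z_$][a-zA-Z_0-9]*)   # identifier (keywords are not separated here)
  | (?P<symbol>:=|=:|->|[{}(),:])        # punctuation used by the grammar
''', re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return a list of (kind, text) tokens, dropping whitespace and comments."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup not in ("comment", "space"):
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for kind, text in tokenize("{ let x := add(1, 0x2) // sum\n}"):
    print(kind, text)
```

The comment and the line break vanish from the output, which is what "interpreted like whitespace" means in practice: they separate tokens but never reach the parser.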


Solidity Assembly Grammar And Parsing Elements

Here’s an overview of the grammar of standalone assembly in Solidity and how it is put together.

Blocks

Blocks in the assembly are surrounded by curly braces and contain assembly items.

Assembly blocks are written like this:

'{' AssemblyItem* '}'

Items

An assembly item is one of the following elements, many of which are also found in Solidity:

  1. Identifiers
  2. Blocks
  3. Functional expressions
  4. Local definitions
  5. Functional assignments
  6. Assignments
  7. Label definitions
  8. If statements
  9. Switch statements
  10. Function definitions
  11. For statements
  12. ‘break’ statements
  13. ‘continue’ statements
  14. Sub-Assemblies
  15. ‘dataSize’ ‘(’ Identifier ‘)’ statements
  16. Linker Symbols
  17. ‘errorLabel’ statements
  18. ‘bytecodeSize’ statements
  19. Number Literals
  20. String Literals
  21. Hex Literals

Identifiers

Identifiers in standalone assembly work similarly to Solidity.

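An identifier is a name used to reference an item such as a variable, a function or a label. It starts with a letter, an underscore or a dollar sign, followed by any number of letters, digits or underscores.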

Identifiers are written in this format:

[a-zA-Z_$] [a-zA-Z_0-9]*
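To make the pattern concrete, the following Python sketch checks a few candidate names against it. The helper name is_identifier and the sample names are assumptions chosen for this example.

```python
import re

# The identifier pattern from the grammar, applied to the whole string.
IDENTIFIER = re.compile(r"[a-zA-Z_$][a-zA-Z_0-9]*")


def is_identifier(name):
    """Return True if `name` matches the identifier rule in full."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ["x", "_tmp1", "$ptr", "1abc", "a$b", ""]:
    print(repr(candidate), is_identifier(candidate))
```

As written, the pattern accepts the dollar sign only as the first character, so a name like a$b is rejected by this check.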

Functional Expressions

Functional expressions are assembly expressions written in functional style: the name comes first, followed by its arguments in parentheses, separated by commas.

Functional expressions are written in this format:

Identifier '(' ( Item ( ',' Item )* )? ')'
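Because arguments can themselves be functional expressions, the rule is recursive. The Python below is a minimal recursive-descent sketch of that idea, not the compiler's parser: the function names are hypothetical, and arguments are restricted to identifiers, number literals and nested functional expressions rather than arbitrary items.

```python
import re

# Simplified tokens for this sketch: numbers, identifiers, parentheses and commas.
TOKEN = re.compile(r"\s*(0x[0-9a-fA-F]+|[0-9]+|[a-zA-Z_$][a-zA-Z_0-9]*|[(),])")


def lex(source):
    """Split source into tokens, skipping whitespace."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_item(tokens, i):
    """Parse one item starting at index i; return (node, next_index)."""
    if i >= len(tokens) or tokens[i] in {"(", ")", ","}:
        raise SyntaxError("expected an identifier or a literal")
    name = tokens[i]
    # An identifier followed by '(' starts a functional expression.
    if i + 1 < len(tokens) and tokens[i + 1] == "(":
        if name[0].isdigit():
            raise SyntaxError("a number cannot take arguments")
        args, i = [], i + 2
        if i < len(tokens) and tokens[i] == ")":
            return (name, args), i + 1  # empty argument list
        while True:
            arg, i = parse_item(tokens, i)
            args.append(arg)
            if i < len(tokens) and tokens[i] == ",":
                i += 1
            elif i < len(tokens) and tokens[i] == ")":
                return (name, args), i + 1
            else:
                raise SyntaxError("expected ',' or ')'")
    return name, i + 1


def parse_functional_expression(source):
    """Parse source as exactly one functional expression."""
    tokens = lex(source)
    node, end = parse_item(tokens, 0)
    if end != len(tokens) or not isinstance(node, tuple):
        raise SyntaxError("not a single functional expression")
    return node


print(parse_functional_expression("mstore(0x80, add(x, 3))"))
```

Each functional expression becomes a (name, arguments) pair, so the nesting of the parentheses is mirrored directly in the nesting of the result.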

Local Definitions

A local definition declares one or more assembly-local variables with the ‘let’ keyword and initializes them with the value of a functional expression.

Local definitions are written in this format:

'let' IdentifierOrList ':=' FunctionalExpression

Functional Assignments

A functional-style assignment stores the result of a functional expression in a variable or a list of variables. The targets may be assembly-local or function-local variables.

Functional assignments are written in this format:

IdentifierOrList ':=' FunctionalExpression

Identifier Or Lists

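An identifier-or-list is either a single identifier or an identifier list in parentheses. It names the targets of a local definition or a functional assignment.

Identifier-or-lists are written in this format: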

Identifier | '(' IdentifierList ')'

Identifier Lists

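An identifier list is a comma-separated sequence of identifiers. It is used for the parameters and return variables of a function definition and, in parentheses, for the targets of a definition or an assignment.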

Identifier lists are written in this format:

Identifier ( ',' Identifier)*

Assembly Assignments

Assignments of this kind are written in instruction style: they take the value from the top of the stack and store it in the named variable.

Assembly assignments are written in this format:

'=:' Identifier

Label Definitions

Labels are used for making jumps between addresses more convenient.

Label definitions are written in this format:

Identifier ':'

Conditional Statements

If statements execute a block of code if a condition is met. Keep in mind, though, that there is no “else” in Solidity assembly.

If statements are written in this format:

'if' FunctionalExpression Block

To deal with multiple alternatives, you can use switch statements in Solidity assembly. Unlike in C-like languages, control flow does not fall through from one case to the next: only the block of the matching case is executed.

Switch statements are written in this format:

'switch' FunctionalExpression Case*
    ( 'default' Block )?

Each alternative of a switch statement is introduced by a case, which pairs a value with the block to execute when the switch expression equals that value.

Cases are written in this format:

'case' FunctionalExpression Block
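The way a switch statement selects a single block can be modelled in a few lines of Python. This is a sketch of the described behaviour under simplifying assumptions, not generated or compiler code: run_switch is a hypothetical helper, and blocks are represented as Python callables.

```python
def run_switch(value, cases, default=None):
    """Run the block of the first matching case, or the default block if none matches."""
    for case_value, block in cases:
        if case_value == value:
            # Only this block runs; later cases are not visited.
            return block()
    return default() if default is not None else None


executed = []
run_switch(
    1,
    cases=[
        (0, lambda: executed.append("case 0")),
        (1, lambda: executed.append("case 1")),
        (2, lambda: executed.append("case 2")),
    ],
    default=lambda: executed.append("default"),
)
print(executed)  # ['case 1']
```

Neither the case after the match nor the default block runs, so no ‘break’ is needed at the end of a case.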

Function Definitions

Assembly function definitions define a block of executable code that may be called.

Function definitions are written in this format:

'function' Identifier '(' IdentifierList? ')'
    ( '->' '(' IdentifierList ')' )? Block

For Statements

For statements repeat a block of code while a condition holds. They consist of an initialization part, a condition, a post-iteration part and the loop body, in that order.

For statements are written in this format:

'for' ( Block | FunctionalExpression)
    FunctionalExpression ( Block | FunctionalExpression) Block
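The four parts of the rule are, in order, the initialization, the condition, the post-iteration part and the body. The Python model below shows the order in which they run. It is a simplified sketch: run_for is a hypothetical helper, the loop state is kept in a dictionary, and ‘break’ and ‘continue’ are not modelled.

```python
def run_for(init, condition, post, body):
    """Model of a for statement: init once, then condition, body, post on every iteration."""
    state = init()
    while condition(state):
        body(state)
        post(state)
    return state


# Hypothetical loop that sums the numbers 0 to 4.
result = run_for(
    init=lambda: {"i": 0, "sum": 0},
    condition=lambda s: s["i"] < 5,
    post=lambda s: s.update(i=s["i"] + 1),
    body=lambda s: s.update(sum=s["sum"] + s["i"]),
)
print(result)  # {'i': 5, 'sum': 10}
```

Note that the body is the last element in the grammar rule, even though it runs before the post-iteration part in every iteration.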

Sub-Assemblies

A sub-assembly is an assembly that exists in the context of another assembly.

Sub-assemblies are written in this format:

'assembly' Identifier Block

Literals

Literals are constant values written directly in the code.

Number literals are written in this format:

HexNumber | DecimalNumber

Hex literals are written in this format:

'hex' ('"' ([0-9a-fA-F]{2})* '"' | '\'' ([0-9a-fA-F]{2})* '\'')

String literals are written in this format:

'"' ([^"\r\n\\] | '\\' .)* '"'

Numbers

Hex numbers are written in this format:

'0x' [0-9a-fA-F]+

Decimal numbers are written in this format:

[0-9]+
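
The literal and number formats above translate almost directly into regular expressions. The Python sketch below classifies a piece of text by trying each format in turn. The function name classify_literal and the kind labels are assumptions made for this example.

```python
import re

# Patterns transcribed from the formats above.
HEX_NUMBER = re.compile(r"0x[0-9a-fA-F]+")
DECIMAL_NUMBER = re.compile(r"[0-9]+")
HEX_LITERAL = re.compile(r"""hex("([0-9a-fA-F]{2})*"|'([0-9a-fA-F]{2})*')""")
STRING_LITERAL = re.compile(r'"([^"\r\n\\]|\\.)*"')

PATTERNS = [
    ("hex number", HEX_NUMBER),
    ("decimal number", DECIMAL_NUMBER),
    ("hex literal", HEX_LITERAL),
    ("string literal", STRING_LITERAL),
]


def classify_literal(text):
    """Return the kind of literal that `text` matches in full, or None."""
    for kind, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return kind
    return None


for sample in ["0x1F", "42", 'hex"00ff"', "hex'0ff'", '"hello"', '"unterminated']:
    print(sample, "->", classify_literal(sample))
```

Hex literals must contain an even number of hex digits because each pair encodes one byte, which is why the sample with three digits is rejected.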