A .sbbe file is a single translation unit: a flat list of top-level forms (source directives, globals, externs, functions, and constant-pool data declarations) in declaration order.

This page covers the surface syntax only. For instruction semantics see the Instruction Set page; for type names see the Types page.

Minimal example

file "test.c"

func $add(i32, i32) -> i32 {
entry:
	ldl 0
	ldl 1
	add.s i32
	ret
}

Whitespace and comments

Spaces, tabs, carriage returns, and newlines are insignificant between tokens. Line comments begin with // and extend to end of line. There are no block comments.

ldi 42    // a trailing comment is fine
// a whole-line comment too

Identifiers

Names always appear with a leading $ sigil: $x, $add, $_counter. The sigil is part of the reference syntax, not the stored name. An identifier body is one or more characters drawn from [A-Za-z0-9_.]. The grammar sets no maximum length, but the parser currently truncates names at 63 characters.
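As a rough illustration of the identifier rule, the following Python sketch validates a $-prefixed token and returns the stored name. The helper name parse_name is hypothetical, and the assumption that truncation silently keeps the first 63 characters is a guess at the parser's behavior, not something this page confirms.

```python
import re

# Identifier body: one or more of [A-Za-z0-9_.], written after a leading $ sigil.
IDENT_RE = re.compile(r"\$([A-Za-z0-9_.]+)")
MAX_NAME_LEN = 63  # current parser limit described on this page


def parse_name(token: str) -> str:
    """Return the stored name for a $-prefixed reference (sigil stripped).

    Hypothetical helper; not part of any real assembler API.
    """
    m = IDENT_RE.fullmatch(token)
    if m is None:
        raise ValueError(f"not a valid identifier: {token!r}")
    # Assumption: over-long names are cut to their first 63 characters.
    return m.group(1)[:MAX_NAME_LEN]
```

For example, parse_name("$_counter") yields "_counter", while a bare word without the sigil is rejected.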

Locals, globals, functions, and externs share the same $name syntax; scope resolves at the use site. Inside a function body, ld $x checks locals first, then globals. This means locals can (and will) shadow a global of the same name.

Block labels

Labels are bare words (no $) followed by :, appearing at the start of a line:

entry:
loop_header:
done:

Label bodies use the same character class as identifiers. Labels are function-scoped and resolved after the body is parsed, so forward references are allowed.
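Forward references work because labels are collected before any jump target is checked. A minimal Python sketch of that two-pass idea follows; the function name resolve_labels and the choice to map each label to an instruction index are assumptions made for illustration, not the parser's actual data model.

```python
def resolve_labels(lines):
    """Two-pass sketch: record every label, then resolve jump targets.

    Hypothetical helper. Returns {instruction_index: target_index} for
    each jmp / jmp.if found in the body.
    """
    labels = {}
    jumps = []
    index = 0  # index of the next instruction
    for raw in lines:
        # Naive comment stripping; fine here because no strings are involved.
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = index  # a label names the next instruction
            continue
        parts = line.split()
        if parts[0] in ("jmp", "jmp.if"):
            jumps.append((index, parts[1]))
        index += 1
    resolved = {}
    for at, target in jumps:  # second pass: every label is known by now
        if target not in labels:
            raise KeyError(f"undefined label: {target}")
        resolved[at] = labels[target]
    return resolved
```

A jump to done that appears before the done: line resolves without error, because resolution happens only after the whole body has been scanned.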

Integer literals

Integers are decimal or hexadecimal. Hex uses a 0x or 0X prefix. A leading - negates. There are no digit separators, no binary or octal prefixes, and no unsigned suffix.

ldi 42
ldi -1
ldc i32 0xDEADBEEF
ldc i64 0x7FFFFFFFFFFFFFFF

An integer token followed immediately by ., e, or E is rejected as an integer and re-parsed as a float.
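The integer-then-float fallback can be sketched in Python as below. The name classify_number is hypothetical, and Python's float() stands in for strtod, so edge cases that the two functions treat differently are out of scope for this sketch.

```python
import re

# Optional leading '-', then either 0x/0X hex digits or decimal digits.
INT_RE = re.compile(r"-?(?:0[xX][0-9A-Fa-f]+|[0-9]+)")


def classify_number(text: str):
    """Return ("int", value) or ("float", value) for a numeric token.

    Hypothetical helper illustrating the rule described above.
    """
    if INT_RE.fullmatch(text):
        negative = text.startswith("-")
        body = text[1:] if negative else text
        base = 16 if body[:2] in ("0x", "0X") else 10
        value = int(body, base)
        return ("int", -value if negative else value)
    # An integer prefix followed by '.', 'e', or 'E' ends up here.
    return ("float", float(text))  # float() is a stand-in for strtod
```

With this sketch, "0xDEADBEEF" classifies as an integer while "1e-9" falls through to the float path.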

Float literals

Float literals are anything accepted by strtod: a decimal significand with optional fraction and optional exponent. The parser does not distinguish between f32 and f64 lexically. The instruction’s type operand selects the target width and the value is rounded at encode time.

ldc f64 3.14
ldc f32 -0.5
ldc f64 1e-9
ldc f64 6.022e23
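To make the "rounded at encode time" point concrete, here is a small Python sketch that parses a literal once as a double and narrows it only when the type operand is f32. The function name encode_float and the little-endian byte order are assumptions; this page does not describe the binary encoding.

```python
import struct


def encode_float(text: str, type_name: str) -> bytes:
    """Parse as a double, then round to the width the type operand selects.

    Hypothetical helper; byte order is assumed little-endian.
    """
    value = float(text)  # stand-in for strtod
    if type_name == "f32":
        return struct.pack("<f", value)  # rounds to the nearest single
    if type_name == "f64":
        return struct.pack("<d", value)
    raise ValueError(f"not a float type: {type_name}")
```

A value such as -0.5 survives the f32 round trip exactly, whereas 3.14 does not, which is why the same literal text can yield different stored values under f32 and f64.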

String literals

Strings appear in two places: the file directive and as initializers for ptr constants. They are enclosed in double quotes. The recognized escapes are \n, \t, \r, \\, \", and \0; any other \x passes x through literally. String constants are automatically null-terminated when stored as data.

ldc ptr "hello\n"
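The escape rules above can be modeled with a short Python sketch. The helper name decode_string is hypothetical, and storing text as UTF-8 is an assumption, since this page does not name an encoding.

```python
# The six recognized escapes; anything else passes the character through.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"', "0": "\0"}


def decode_string(literal: str) -> bytes:
    """Decode a double-quoted literal and append the null terminator.

    Hypothetical helper illustrating the rules described above.
    """
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("string literal must be enclosed in double quotes")
    out = []
    chars = iter(literal[1:-1])
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("dangling backslash")
            out.append(ESCAPES.get(nxt, nxt))  # unknown \x becomes plain x
        else:
            out.append(ch)
    # Assumption: UTF-8 storage. The trailing zero byte is the terminator.
    return "".join(out).encode("utf-8") + b"\x00"
```

Under these assumptions the literal "hello\n" is stored as seven bytes: the five letters, a newline, and the terminator.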

Byte arrays

Byte arrays are an alternative ptr initializer: a comma-separated list of integer literals in square brackets. Each element is truncated to 8 bits. No null terminator is appended.

ldc ptr [0x48, 0x69, 0x00]
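A Python sketch of the byte-array rule is shown below. The name decode_byte_array is hypothetical, and masking with 0xFF is one reading of "truncated to 8 bits" (it keeps the low byte, so -1 becomes 0xFF).

```python
def decode_byte_array(literal: str) -> bytes:
    """Decode "[a, b, c]" into bytes, keeping the low 8 bits of each element.

    Hypothetical helper; no null terminator is appended.
    """
    text = literal.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("byte array must be enclosed in square brackets")
    out = bytearray()
    for tok in text[1:-1].split(","):
        tok = tok.strip()
        negative = tok.startswith("-")
        body = tok[1:] if negative else tok
        base = 16 if body[:2] in ("0x", "0X") else 10
        value = int(body, base)
        if negative:
            value = -value
        out.append(value & 0xFF)  # assumption: truncation keeps the low byte
    return bytes(out)
```

Unlike a string literal, the result for [0x48, 0x69, 0x00] is exactly three bytes; the final zero is there only because it was written explicitly.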

Top-level forms

The top level accepts exactly these forms, each introduced by a keyword:

Keyword                   Purpose
file "path"               Sets the source file tag for subsequent declarations
var $name T = v           Mutable global with initializer
const $name T = v         Immutable global with initializer
extern var $name T        Imported mutable global, no initializer
extern const $name T      Imported immutable global, no initializer
extern func ...           Imported function declaration, no body
func $name(...) ...       Function declaration with body
data $idx T = v           Writes a typed literal into constant-pool slot $idx

Order matters only for the file directive (which is sticky for everything following it). Globals and functions may reference each other in any order; the parser resolves cross-references after the full unit is parsed.

The file directive

Multiple source files can be interleaved in a single translation unit. A file directive tags every subsequent declaration with the given path for diagnostics and debugging:

file "src/math.c"
func $add(i32, i32) -> i32 { ... }

file "src/io.c"
func $puts(ptr) -> i32 { ... }

The directive has no effect on linkage or symbol visibility (it’s purely metadata).

Globals

Globals are named, typed storage that lives for the lifetime of the program. A var global is mutable and may be written from any function; a const global is immutable after its initializer runs. Both are declared with a single type and a literal initializer joined by =. The initializer is a literal only: no instructions, no references to other globals. For ptr globals the literal may be a string or byte array, which the assembler lowers into a data segment and replaces with the resulting offset.

var   $counter i32 = 0          // mutable global, zero-initialized
const $max     i32 = 100        // immutable global
const $banner  ptr = "hello\n"  // ptr initialized from a string literal

Functions

A function declaration pairs a typed signature with a body. The signature names the function with $name, lists its parameter types in order, and optionally declares a return type with -> T. The body is delimited by { ... } and contains var declarations and labeled blocks of instructions, which may be freely interleaved. Execution begins at the first block in source order regardless of its label, though entry is the conventional choice.

func $name(param-types) -> return-type {
	var $x i32       // named local
	var i32          // unnamed local, referenced by index

entry:
	ret
}

Parameters are a comma-separated list of types (no names in the signature). The -> return-type clause is omitted when the function returns nothing. local is accepted as a synonym for var inside function bodies.

Locals are indexed after parameters: with two parameters and two declared locals, the indices run 0 (param 0), 1 (param 1), 2 (first local), 3 (second local). Named locals are addressable both by $name and by index; unnamed locals only by index.
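The indexing rule can be checked with a small Python sketch. The function name local_index_table and the (name, type) tuple representation are illustrative assumptions.

```python
def local_index_table(param_types, local_decls):
    """Build the slot list and the name-to-index map for one function.

    Hypothetical helper. local_decls is a list of (name_or_None, type)
    pairs in declaration order; parameters always occupy the first slots.
    """
    slots = list(param_types) + [ty for _, ty in local_decls]
    names = {}
    for offset, (name, _) in enumerate(local_decls):
        if name is not None:
            # Locals are numbered after the last parameter.
            names[name] = len(param_types) + offset
    return slots, names
```

For a function with two i32 parameters, a named local $x, and an unnamed local, this yields four slots with x mapped to index 2; the unnamed local sits at index 3 and has no entry in the name map.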

Only extern functions may omit the body. Non-extern functions must have a body, even if it is a single block containing only a ret instruction.

Externs

extern var  $errno i32        // imported from another unit / runtime
extern func $puts(ptr) -> i32
extern func $exit(i32)

Extern declarations describe symbols defined elsewhere, so they carry no initializer and no body: extern var and extern const give only a name and a type, and extern func gives only a signature.

Source mapping

Any instruction, global, or function signature may carry an @ line:column suffix. The parser and printer round-trip these positions unchanged:

var $x i32 = 0 @ 3:1

func $add(i32, i32) -> i32 @ 10:1 {
entry:
	ldl 0       @ 11:3
	ldl 1       @ 11:12
	add.s i32   @ 11:7
	ret         @ 12:3
}

Both numbers are 1-based. A missing suffix means “no source location” and encodes as zeros. The file directive supplies the file component; the @ suffix supplies the line and column within that file.
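A Python sketch of splitting off the location suffix follows. The name split_location is hypothetical, the sketch assumes any trailing // comment has already been removed, and tolerating flexible whitespace around @ and : is an assumption rather than documented behavior.

```python
import re

# Trailing "@ line:column"; whitespace tolerance here is an assumption.
LOC_RE = re.compile(r"\s*@\s*(\d+)\s*:\s*(\d+)\s*$")


def split_location(line: str):
    """Return (text_without_suffix, (line, column)).

    Hypothetical helper. (0, 0) stands for "no source location",
    matching the zero encoding described above.
    """
    m = LOC_RE.search(line)
    if m is None:
        return line.rstrip(), (0, 0)
    return line[: m.start()].rstrip(), (int(m.group(1)), int(m.group(2)))
```

Because real positions are 1-based, the (0, 0) pair cannot collide with a genuine location.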

Instruction syntax

An instruction occupies one line. It begins with a mnemonic, followed by operands separated by whitespace, optionally ending with an @ line:column suffix. Everything from the first // to the end of the line is a comment and is ignored.

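Operands are drawn from a small vocabulary. The kinds that appear on this page are type names (i32, f64, ptr), integer and float literals, string literals and byte arrays, $name references, numeric local indices, block labels, the align=N flag, and memory-ordering keywords.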

Operand order, count, and types are determined by the mnemonic. The parser is strict: extra or missing operands are a parse error.

Mnemonic conventions

Mnemonics follow a few consistent patterns:

Alignment hint

Memory instructions accept an optional align=N flag, where N is a power of two:

ldm i32
ldm i32 align=4
stm i64 align=8
vldm  align=16

When omitted, the backend may assume natural alignment for the access width.
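The align=N rule might be handled along the lines of the Python sketch below. The name parse_align and the NATURAL_ALIGN table are assumptions: the byte widths listed are the conventional ones for these type names, not values taken from the Types page.

```python
# Assumption: conventional byte widths for a few scalar type names.
NATURAL_ALIGN = {"i32": 4, "i64": 8, "f32": 4, "f64": 8}


def parse_align(operands):
    """Split an optional align=N flag out of a memory instruction's operands.

    Hypothetical helper. Returns (remaining_operands, alignment), where
    alignment is None if it is neither given nor known for the type.
    """
    explicit = None
    rest = []
    for tok in operands:
        if tok.startswith("align="):
            n = int(tok[len("align="):])
            if n <= 0 or (n & (n - 1)) != 0:  # power-of-two check
                raise ValueError(f"alignment must be a power of two: {n}")
            explicit = n
        else:
            rest.append(tok)
    if explicit is not None:
        return rest, explicit
    # No flag: fall back to the natural alignment for the access width.
    return rest, (NATURAL_ALIGN.get(rest[0]) if rest else None)
```

In this sketch ldm i32 and ldm i32 align=4 resolve to the same alignment, which matches the idea that the flag is a hint rather than a required operand.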

Memory orderings

Atomic instructions end with a memory-ordering keyword (bare word, no punctuation):

ald  i32 seq_cst
ast  i32 release
armw.add i32 acq_rel
fence relaxed

The accepted orderings are relaxed, acquire, release, acq_rel, and seq_cst.

Control-flow targets

jmp and jmp.if take a block label (bare word, no $):

jmp      loop_header
jmp.if   done

jmpt takes a branch-table index. Branch-table data is not yet exposed in the text format.

Constant loading

ldc is the general typed-constant form:

ldc i32 1000000
ldc f64 3.14
ldc ptr "hello\n"
ldc ptr [0x00, 0x01, 0x02]

The assembler automatically rewrites ldc to ldi whenever the value is an integer that fits in a signed 24-bit immediate, so ldc i32 42 encodes as ldi 42. Use ldi directly only when the 24-bit range is guaranteed; otherwise prefer ldc and let the assembler choose.
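The range check behind this rewrite is simple enough to sketch in Python. The name select_constant_op and the tuple return shape are hypothetical; the bounds follow from the usual two's-complement range of a signed 24-bit field.

```python
IMM24_MIN = -(1 << 23)      # -8388608
IMM24_MAX = (1 << 23) - 1   #  8388607


def select_constant_op(type_name, value):
    """Pick ldi for integers in signed 24-bit range, otherwise keep ldc.

    Hypothetical helper mirroring the rewrite described above.
    """
    is_int = isinstance(value, int) and not isinstance(value, bool)
    if is_int and IMM24_MIN <= value <= IMM24_MAX:
        return ("ldi", value)
    return ("ldc", type_name, value)
```

Under this rule 1000000 still fits and would be emitted as ldi, while 0xDEADBEEF and any float or ptr constant stay as ldc.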

Variable access

ld   $x      // load local or global by name
str  $x      // store local or global by name
tee  $x      // store-without-pop (locals only)
ldl  0       // load local by numeric index
strl 2       // store local by numeric index

ld and str resolve $name against locals first, then globals. tee is restricted to locals; using it with a global is a parse error.
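The lookup order and the tee restriction can be sketched in Python as follows. The name resolve_name and the dictionary-based symbol tables are assumptions made for illustration.

```python
def resolve_name(op, name, locals_by_name, globals_by_name):
    """Resolve a $name operand: locals first, then globals.

    Hypothetical helper. Returns ("local", index) or ("global", index).
    """
    if name in locals_by_name:
        return ("local", locals_by_name[name])  # a local shadows a global
    if name in globals_by_name:
        if op == "tee":
            raise SyntaxError("tee is restricted to locals")
        return ("global", globals_by_name[name])
    raise NameError(f"undefined name: ${name}")
```

If both tables contain x, every ld, str, and tee on $x targets the local, and the global becomes unreachable by name inside that function.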

Constant pool and data declarations

The data keyword writes directly into a slot of the translation unit’s constant pool:

data $0 i32 = 42
data $1 f64 = 3.14
data $2 ptr = "hello"

The $N in a data declaration is a decimal index, not a symbolic name. Slots referenced but not declared default to zero-initialized constants of the appropriate type. Most hand-written IR never needs data; it exists primarily to let the printer round-trip constant tables produced by other tools.