The assembler - introduction

As already mentioned in the previous sections about programming, an assembler is a utility to translate assembly-language into machine-code.

This section shows how to use the assembler in more detail.

By now, the reader is assumed to be familiar with the concepts of assembly-language, macros, and the semantics of some basic (pseudo-)instructions. (Refer back to previous sections if necessary.)

The assembler is implemented in the Tcl programming language, which is well suited to implementing Domain-Specific Languages (DSLs) such as the assembly-language used in this project.

Address-labels

Labels are placeholders for program-addresses.

Labels already appeared in the previous sections. An example of a label-definition and -reference is shown below:

        ibc1    1000    done    ;# Invert the bit at address 1000. 
                                ;#
                                ;# If the inverted bit is 0, 
                                ;# jump straight to "done",
                                ;# skipping the next instruction.

        set1    2000            ;# If the inverted bit was 1,
                                ;# set the bit at 2000 to 1.

: done 

The label "done" can be referenced by branch-like instructions occurring before or after the label-definition. (In the previous example, the label-reference occurred before the label-definition.)

A macro-body cannot directly reference a label that is defined outside that macro. (A label-name can, however, be passed into a macro as an argument; see "Macro-arguments" below.)

Note the required whitespace between the colon (":") and the label-name ("done").

Variables

Variables are symbolic names for RAM-addresses. They were introduced in the section about extended assembly-language.

Each variable must be declared inside a macro-body, before its first use. Variables can be used as (part of) a data-address in an instruction.

An example of the use of variables is shown here:

macro MyUselessMacro {

    . UselessVar

    set1    $UselessVar     ;# Set variable "UselessVar" to 1. 
                            ;#
                            ;# Since the variable is lost when
                            ;# the macro-body ends, this is
                            ;# a useless operation.
} 

Note the required whitespace between the dot (".") and the variable-name ("UselessVar").

When a variable is declared, the next free RAM-address is allocated to it. When a macro-body ends, the RAM-addresses corresponding to all its local variables are freed.
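
The allocation rule described above behaves like a stack. The following is only a minimal sketch of that idea in plain Tcl, not the assembler's actual implementation: the names "nextFree", "declare" and "withScope", and the start address 0, are assumptions made for illustration.

```tcl
# Hypothetical model of variable allocation (all names are assumptions).

set nextFree 0                  ;# next free RAM-address (assumed to start at 0)

# Stand-in for the dot-notation: give "name" the next free RAM-address.
proc declare {name} {
    global nextFree
    upvar 1 $name var           ;# create the Tcl-variable in the caller's scope
    set var $nextFree
    incr nextFree
}

# Stand-in for a macro-body: everything declared inside is freed at the end.
proc withScope {body} {
    global nextFree
    set saved $nextFree         ;# remember the allocation point on entry
    uplevel 1 $body
    set nextFree $saved         ;# release all addresses allocated by the body
}

withScope {
    declare a
    declare b
    puts "a=$a b=$b"            ;# a=0 b=1
}

withScope {
    declare c
    puts "c=$c"                 ;# c=0 (the address of "a" is reused)
}
```

In this model, two macros that are expanded one after the other reuse the same RAM-addresses for their local variables, which matches the statement that the addresses are freed when a macro-body ends.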

Expressions

The assembly-language used here is in fact Tcl-code. To assemble a program simply means to execute it as Tcl-code. During execution, the resulting program-binary is written.

Because assembly-language is Tcl-code, assembly-statements can be mixed with Tcl-expressions and -statements. Additional helper-variables, loops and conditionals can thus be used:

include "1bit.asm"      ;# The assembler offers "include",
                        ;# which literally inserts another file.
                        ;#
                        ;# The file "1bit.asm" contains macros
                        ;# like "clr1", used in the code below.

main  {

    # This code clears 16 bits starting at RAM-address 8000.
    #
    # (Only "clr1" is an actual assembly-instruction; the rest
    # are generic Tcl-statements.)

    set base 8000

    for  { set offset 0 }  { $offset < 16 }  { incr offset } {

        clr1  [ + $base $offset ]   ;# "+ A B": Polish notation
    }
} 

(For an explanation of what each Tcl-statement does exactly, consult a Tcl language-reference.)

In the above example, a Tcl-variable "base" was created. This variable can be referenced using Tcl's "$"-notation, as shown above.

In the example before this one, another variable named "UselessVar" was declared using the assembler's dot-notation (". UselessVar"). By doing this, a Tcl-variable with the same name came into existence. This variable could also be referenced using the "$"-notation.

Because the assembly-language is Tcl, every expression that is valid in Tcl is also valid in an assembly-program. Furthermore, most binary numeric operators are made available in Polish (prefix) notation, e.g. ">= 3 2", or "+ $base $offset" as shown above.

However, expressions cannot be used with labels.
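
For readers who want to try prefix operators in a plain Tcl shell: standard Tcl (8.5 and later) ships them as commands in the "::tcl::mathop" namespace. Whether the assembler exposes its operators this way is an assumption; the sketch below only shows that the notation itself is ordinary Tcl and can be mixed with "expr".

```tcl
# Assumption: the prefix operators are made visible via Tcl's own
# ::tcl::mathop namespace. The assembler may do this differently.
namespace path ::tcl::mathop

set base 8000

# Collect four consecutive addresses using the prefix "+" operator.
set addrs {}
for { set offset 0 } { $offset < 4 } { incr offset } {
    lappend addrs [ + $base $offset ]
}

puts $addrs                     ;# 8000 8001 8002 8003
puts [ >= 3 2 ]                 ;# 1
puts [ expr { $base + 2 * 8 } ] ;# 8016 (infix notation via "expr")
```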

(This has to do with the fact that it's impossible to know the label-address if it hasn't been defined yet at the time of use. Remember that labels are not simply Tcl-variables.)
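
The forward-reference problem can be illustrated with a small model. This is a hedged sketch, not the assembler's real mechanism: the procedures "emit", "defineLabel" and "resolve", the "@" prefix for placeholders, and the representation of a program as a flat list of words are all assumptions.

```tcl
# Hypothetical model: a label-reference is stored as a placeholder and
# replaced by an address only after all labels have been defined.

set program {}                  ;# emitted words, in order
array set labelAddr {}          ;# label-name -> program-address

proc emit {args} {
    global program
    foreach word $args { lappend program $word }
}

proc defineLabel {name} {
    global program labelAddr
    set labelAddr($name) [llength $program]   ;# address of the next word
}

proc resolve {} {
    global program labelAddr
    set out {}
    foreach word $program {
        if {[string match "@*" $word]} {
            lappend out $labelAddr([string range $word 1 end])
        } else {
            lappend out $word
        }
    }
    return $out
}

emit ibc1 1000 @done            ;# "done" has no address yet, only a name
emit set1 2000
defineLabel done

puts [resolve]                  ;# ibc1 1000 5 set1 2000
```

At the moment the first "emit" runs, there is no number to compute with, so an expression such as "done plus one" could not be evaluated. Only the name can be recorded and filled in later.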

Macro-arguments

A macro can have arguments (positional parameters), if so specified in the macro-definition.

The previous example-macro "MyUselessMacro" had no arguments. The following macro has 2 arguments - "FirstAddr" and "SecondAddr":

macro  SomeOtherMacro  FirstAddr  SecondAddr {

    ...
} 

These arguments "FirstAddr" and "SecondAddr" in the macro-definition are called the macro's formal arguments.

When the macro is inserted, actual values - called actual arguments - must be given in place of these symbolic argument-names:

...

SomeOtherMacro 1000 2000

...  

Within the macro-body, the given actual arguments will then be substituted for the formal arguments:

macro  SomeOtherMacro  FirstAddr  SecondAddr {

    clr1    $FirstAddr    ;# Clear the bit at address 1000...

    set1    $SecondAddr   ;# ...and set the bit at 2000...
}

SomeOtherMacro 1000 2000  ;# ...because those values were passed.
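
Since the assembly-language is Tcl, this substitution resembles ordinary Tcl procedure-arguments. The sketch below is one way such a "macro" command could be modelled. It is an assumption for illustration, not the assembler's source: here "macro" simply forwards to "proc", and "clr1" and "set1" are stand-ins that record their argument in a list named "trace" instead of writing a program-binary.

```tcl
# Hypothetical stand-ins: record each instruction instead of emitting code.
set trace {}
proc clr1 {addr} { lappend ::trace [list clr1 $addr] }
proc set1 {addr} { lappend ::trace [list set1 $addr] }

# Assumed definition of "macro": the last word is the body, the words
# before it are the formal arguments.
proc macro {name args} {
    set formals [lrange $args 0 end-1]
    set body    [lindex $args end]
    proc ::$name $formals $body
}

macro  SomeOtherMacro  FirstAddr  SecondAddr {

    clr1    $FirstAddr
    set1    $SecondAddr
}

SomeOtherMacro 1000 2000

puts $trace                     ;# {clr1 1000} {set1 2000}
```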

Actual arguments can be literals, variables, expressions containing either or both, or label-names. Expressions using labels are not allowed. (This has to do with the fact that labels are not actual Tcl-variables.)

A somewhat more complex example, in which a label-name and a RAM-address are passed from one macro to another as arguments:

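macro  ClearAndBranch  reg  branch {

    ibc1    $reg    $branch     ;# Invert the bit; branch if it is now 0.
    ibc1    $reg    $branch     ;# Otherwise the bit is now 1: inverting
                                ;# again yields 0, so this always branches.
}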


macro  UnconditionalBranch  branch {

    . tmp   ;# (dummy-variable)

    ClearAndBranch  $tmp    $branch
}


    ...

    UnconditionalBranch done

    set1 1000   ;# This instruction is always skipped.

: done

    ...  

(Since macro-definitions must precede their use, it may help to read this example from bottom to top.)