1432 lines
31 KiB
Plaintext
1432 lines
31 KiB
Plaintext
.HTML "A Manual for the Plan 9 assembler
|
|
.ft CW
|
|
.ta 8n +8n +8n +8n +8n +8n +8n
|
|
.ft
|
|
.TL
|
|
A Manual for the Plan 9 assembler
|
|
.AU
|
|
Rob Pike
|
|
rob@plan9.bell-labs.com
|
|
.SH
|
|
Machines
|
|
.PP
|
|
There is an assembler for each of the MIPS, SPARC, Intel 386,
|
|
Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola Power PC,
|
|
AMD64, DEC Alpha, and Acorn ARM.
|
|
The 68020 assembler,
|
|
.CW 2a ,
|
|
is the oldest and in many ways the prototype.
|
|
The assemblers are really just variations of a single program:
|
|
they share many properties such as left-to-right assignment order for
|
|
instruction operands and the synthesis of macro instructions
|
|
such as
|
|
.CW MOVE
|
|
to hide the peculiarities of the load and store structure of the machines.
|
|
To keep things concrete, the first part of this manual is
|
|
specifically about the 68020.
|
|
At the end is a description of the differences among
|
|
the other assemblers.
|
|
.PP
|
|
The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
|
|
is a prerequisite for this manual.
|
|
.SH
|
|
Registers
|
|
.PP
|
|
All pre-defined symbols in the assembler are upper-case.
|
|
Data registers are
|
|
.CW R0
|
|
through
|
|
.CW R7 ;
|
|
address registers are
|
|
.CW A0
|
|
through
|
|
.CW A7 ;
|
|
floating-point registers are
|
|
.CW F0
|
|
through
|
|
.CW F7 .
|
|
.PP
|
|
A pointer in
|
|
.CW A6
|
|
is used by the C compiler to point to data, enabling short addresses to
|
|
be used more often.
|
|
The value of
|
|
.CW A6
|
|
is constant and must be set during C program initialization
|
|
to the address of the externally-defined symbol
|
|
.CW a6base .
|
|
.PP
|
|
The following hardware registers are defined in the assembler; their
|
|
meaning should be obvious given a 68020 manual:
|
|
.CW CAAR ,
|
|
.CW CACR ,
|
|
.CW CCR ,
|
|
.CW DFC ,
|
|
.CW ISP ,
|
|
.CW MSP ,
|
|
.CW SFC ,
|
|
.CW SR ,
|
|
.CW USP ,
|
|
and
|
|
.CW VBR .
|
|
.PP
|
|
The assembler also defines several pseudo-registers that
|
|
manipulate the stack:
|
|
.CW FP ,
|
|
.CW SP ,
|
|
and
|
|
.CW TOS .
|
|
.CW FP
|
|
is the frame pointer, so
|
|
.CW 0(FP)
|
|
is the first argument,
|
|
.CW 4(FP)
|
|
is the second, and so on.
|
|
.CW SP
|
|
is the local stack pointer, where automatic variables are held
|
|
(SP is a pseudo-register only on the 68020);
|
|
.CW 0(SP)
|
|
is the first automatic, and so on as with
|
|
.CW FP .
|
|
Finally,
|
|
.CW TOS
|
|
is the top-of-stack register, used for pushing parameters to procedures,
|
|
saving temporary values, and so on.
|
|
.PP
|
|
The assembler and loader track these pseudo-registers so
|
|
the above statements are true regardless of what has been
|
|
pushed on the hardware stack, pointed to by
|
|
.CW A7 .
|
|
The name
|
|
.CW A7
|
|
refers to the hardware stack pointer, but beware of mixed use of
|
|
.CW A7
|
|
and the above stack-related pseudo-registers, which will cause trouble.
|
|
Note, too, that the
|
|
.CW PEA
|
|
instruction is observed by the loader to
|
|
alter SP and thus will insert a corresponding pop before all returns.
|
|
The assembler accepts a label-like name to be attached to
|
|
.CW FP
|
|
and
|
|
.CW SP
|
|
uses, such as
|
|
.CW p+0(FP) ,
|
|
to help document that
|
|
.CW p
|
|
is the first argument to a routine.
|
|
The name goes in the symbol table but has no significance to the result
|
|
of the program.
|
|
.SH
|
|
Referring to data
|
|
.PP
|
|
All external references must be made relative to some pseudo-register,
|
|
either
|
|
.CW PC
|
|
(the virtual program counter) or
|
|
.CW SB
|
|
(the ``static base'' register).
|
|
.CW PC
|
|
counts instructions, not bytes of data.
|
|
For example, to branch to the second following instruction, that is,
|
|
to skip one instruction, one may write
|
|
.P1
|
|
BRA 2(PC)
|
|
.P2
|
|
Labels are also allowed, as in
|
|
.P1
|
|
BRA return
|
|
NOP
|
|
return:
|
|
RTS
|
|
.P2
|
|
When using labels, there is no
|
|
.CW (PC)
|
|
annotation.
|
|
.PP
|
|
The pseudo-register
|
|
.CW SB
|
|
refers to the beginning of the address space of the program.
|
|
Thus, references to global data and procedures are written as
|
|
offsets to
|
|
.CW SB ,
|
|
as in
|
|
.P1
|
|
MOVL $array(SB), TOS
|
|
.P2
|
|
to push the address of a global array on the stack, or
|
|
.P1
|
|
MOVL array+4(SB), TOS
|
|
.P2
|
|
to push the second (4-byte) element of the array.
|
|
Note the use of an offset; the complete list of addressing modes is given below.
|
|
Similarly, subroutine calls must use
|
|
.CW SB :
|
|
.P1
|
|
BSR exit(SB)
|
|
.P2
|
|
File-static variables have syntax
|
|
.P1
|
|
local<>+4(SB)
|
|
.P2
|
|
The
|
|
.CW <>
|
|
will be filled in at load time by a unique integer.
|
|
.PP
|
|
When a program starts, it must execute
|
|
.P1
|
|
MOVL $a6base(SB), A6
|
|
.P2
|
|
before accessing any global data.
|
|
(On machines such as the MIPS and SPARC that cannot load a register
|
|
in a single instruction, constants are loaded through the static base
|
|
register. The loader recognizes code that initializes the static
|
|
base register and treats it specially. You must be careful, however,
|
|
not to load large constants on such machines when the static base
|
|
register is not set up, such as early in interrupt routines.)
|
|
.SH
|
|
Expressions
|
|
.PP
|
|
Expressions are mostly what one might expect.
|
|
Where an offset or a constant is expected,
|
|
a primary expression with unary operators is allowed.
|
|
A general C constant expression is allowed in parentheses.
|
|
.PP
|
|
Source files are preprocessed exactly as in the C compiler, so
|
|
.CW #define
|
|
and
|
|
.CW #include
|
|
work.
|
|
.SH
|
|
Addressing modes
|
|
.PP
|
|
The simple addressing modes are shared by all the assemblers.
|
|
Here, for completeness, follows a table of all the 68020 addressing modes,
|
|
since that machine has the richest set.
|
|
In the table,
|
|
.CW o
|
|
is an offset, which if zero may be elided, and
|
|
.CW d
|
|
is a displacement, which is a constant between -128 and 127 inclusive.
|
|
Many of the modes listed have the same name;
|
|
scrutiny of the format will show what default is being applied.
|
|
For instance, indexed mode with no address register supplied operates
|
|
as though a zero-valued register were used.
|
|
For "offset" read "displacement."
|
|
For "\f(CW.s\fP" read one of
|
|
.CW .L ,
|
|
or
|
|
.CW .W
|
|
followed by
|
|
.CW *1 ,
|
|
.CW *2 ,
|
|
.CW *4 ,
|
|
or
|
|
.CW *8
|
|
to indicate the size and scaling of the data.
|
|
.IP
|
|
.TS
|
|
l lfCW.
|
|
data register R0
|
|
address register A0
|
|
floating-point register F0
|
|
special names CAAR, CACR, etc.
|
|
constant $con
|
|
floating point constant $fcon
|
|
external symbol name+o(SB)
|
|
local symbol name<>+o(SB)
|
|
automatic symbol name+o(SP)
|
|
argument name+o(FP)
|
|
address of external $name+o(SB)
|
|
address of local $name<>+o(SB)
|
|
indirect post-increment (A0)+
|
|
indirect pre-decrement -(A0)
|
|
indirect with offset o(A0)
|
|
indexed with offset o()(R0.s)
|
|
indexed with offset o(A0)(R0.s)
|
|
external indexed name+o(SB)(R0.s)
|
|
local indexed name<>+o(SB)(R0.s)
|
|
automatic indexed name+o(SP)(R0.s)
|
|
parameter indexed name+o(FP)(R0.s)
|
|
offset indirect post-indexed d(o())(R0.s)
|
|
offset indirect post-indexed d(o(A0))(R0.s)
|
|
external indirect post-indexed d(name+o(SB))(R0.s)
|
|
local indirect post-indexed d(name<>+o(SB))(R0.s)
|
|
automatic indirect post-indexed d(name+o(SP))(R0.s)
|
|
parameter indirect post-indexed d(name+o(FP))(R0.s)
|
|
offset indirect pre-indexed d(o()(R0.s))
|
|
offset indirect pre-indexed d(o(A0))
|
|
offset indirect pre-indexed d(o(A0)(R0.s))
|
|
external indirect pre-indexed d(name+o(SB))
|
|
external indirect pre-indexed d(name+o(SB)(R0.s))
|
|
local indirect pre-indexed d(name<>+o(SB))
|
|
local indirect pre-indexed d(name<>+o(SB)(R0.s))
|
|
automatic indirect pre-indexed d(name+o(SP))
|
|
automatic indirect pre-indexed d(name+o(SP)(R0.s))
|
|
parameter indirect pre-indexed d(name+o(FP))
|
|
parameter indirect pre-indexed d(name+o(FP)(R0.s))
|
|
.TE
|
|
.in
|
|
.SH
|
|
Laying down data
|
|
.PP
|
|
Placing data in the instruction stream, say for interrupt vectors, is easy:
|
|
the pseudo-instructions
|
|
.CW LONG
|
|
and
|
|
.CW WORD
|
|
(but not
|
|
.CW BYTE )
|
|
lay down the value of their single argument, of the appropriate size,
|
|
as if it were an instruction:
|
|
.P1
|
|
LONG $12345
|
|
.P2
|
|
places the long 12345 (base 10)
|
|
in the instruction stream.
|
|
(On most machines,
|
|
the only such operator is
|
|
.CW WORD
|
|
and it lays down 32-bit quantities.
|
|
The 386 has all three:
|
|
.CW LONG ,
|
|
.CW WORD ,
|
|
and
|
|
.CW BYTE .
|
|
The AMD64 adds
|
|
.CW QUAD
|
|
to that for 64-bit values.
|
|
The 960 has only one,
|
|
.CW LONG .)
|
|
.PP
|
|
Placing information in the data section is more painful.
|
|
The pseudo-instruction
|
|
.CW DATA
|
|
does the work, given two arguments: an address at which to place the item,
|
|
including its size,
|
|
and the value to place there. For example, to define a character array
|
|
.CW array
|
|
containing the characters
|
|
.CW abc
|
|
and a terminating null:
|
|
.P1
|
|
DATA array+0(SB)/1, $'a'
|
|
DATA array+1(SB)/1, $'b'
|
|
DATA array+2(SB)/1, $'c'
|
|
GLOBL array(SB), $4
|
|
.P2
|
|
or
|
|
.P1
|
|
DATA array+0(SB)/4, $"abc\ez"
|
|
GLOBL array(SB), $4
|
|
.P2
|
|
The
|
|
.CW /1
|
|
defines the number of bytes to define,
|
|
.CW GLOBL
|
|
makes the symbol global, and the
|
|
.CW $4
|
|
says how many bytes the symbol occupies.
|
|
Uninitialized data is zeroed automatically.
|
|
The character
|
|
.CW \ez
|
|
is equivalent to the C
|
|
.CW \e0.
|
|
The string in a
|
|
.CW DATA
|
|
statement may contain a maximum of eight bytes;
|
|
build larger strings piecewise.
|
|
Two pseudo-instructions,
|
|
.CW DYNT
|
|
and
|
|
.CW INIT ,
|
|
allow the (obsolete) Alef compilers to build dynamic type information during the load
|
|
phase.
|
|
The
|
|
.CW DYNT
|
|
pseudo-instruction has two forms:
|
|
.P1
|
|
DYNT , ALEF_SI_5+0(SB)
|
|
DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
|
|
.P2
|
|
In the first form,
|
|
.CW DYNT
|
|
defines the symbol to be a small unique integer constant, chosen by the loader,
|
|
which is some multiple of the word size. In the second form,
|
|
.CW DYNT
|
|
defines the second symbol in the same way,
|
|
places the address of the most recently
|
|
defined text symbol in the array specified by the first symbol at the
|
|
index defined by the value of the second symbol,
|
|
and then adjusts the size of the array accordingly.
|
|
.PP
|
|
The
|
|
.CW INIT
|
|
pseudo-instruction takes the same parameters as a
|
|
.CW DATA
|
|
statement. Its symbol is used as the base of an array and the
|
|
data item is installed in the array at the offset specified by the most recent
|
|
.CW DYNT
|
|
pseudo-instruction.
|
|
The size of the array is adjusted accordingly.
|
|
The
|
|
.CW DYNT
|
|
and
|
|
.CW INIT
|
|
pseudo-instructions are not implemented on the 68020.
|
|
.SH
|
|
Defining a procedure
|
|
.PP
|
|
Entry points are defined by the pseudo-operation
|
|
.CW TEXT ,
|
|
which takes as arguments the name of the procedure (including the ubiquitous
|
|
.CW (SB) )
|
|
and the number of bytes of automatic storage to pre-allocate on the stack,
|
|
which will usually be zero when writing assembly language programs.
|
|
On machines with a link register, such as the MIPS and SPARC,
|
|
the special value -4 instructs the loader to generate no PC save
|
|
and restore instructions, even if the function is not a leaf.
|
|
Here is a complete procedure that returns the sum
|
|
of its two arguments:
|
|
.P1
|
|
TEXT sum(SB), $0
|
|
MOVL arg1+0(FP), R0
|
|
ADDL arg2+4(FP), R0
|
|
RTS
|
|
.P2
|
|
An optional middle argument
|
|
to the
|
|
.CW TEXT
|
|
pseudo-op is a bit field of options to the loader.
|
|
Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
|
|
the program.
|
|
For example,
|
|
.P1
|
|
TEXT sum(SB), 1, $0
|
|
MOVL arg1+0(FP), R0
|
|
ADDL arg2+4(FP), R0
|
|
RTS
|
|
.P2
|
|
will not be profiled; the first version above would be.
|
|
Subroutines with peculiar state, such as system call routines,
|
|
should not be profiled.
|
|
.PP
|
|
Setting the 2 bit allows multiple definitions of the same
|
|
.CW TEXT
|
|
symbol in a program; the loader will place only one such function in the image.
|
|
It was emitted only by the Alef compilers.
|
|
.PP
|
|
Subroutines to be called from C should place their result in
|
|
.CW R0 ,
|
|
even if it is an address.
|
|
Floating point values are returned in
|
|
.CW F0 .
|
|
Functions that return a structure to a C program
|
|
receive as their first argument the address of the location to
|
|
store the result;
|
|
.CW R0
|
|
is unused in the calling protocol for such procedures.
|
|
A subroutine is responsible for saving its own registers,
|
|
and therefore is free to use any registers without saving them (``caller saves'').
|
|
.CW A6
|
|
and
|
|
.CW A7
|
|
are the exceptions as described above.
|
|
.SH
|
|
When in doubt
|
|
.PP
|
|
If you get confused, try using the
|
|
.CW -S
|
|
option to
|
|
.CW 2c
|
|
and compiling a sample program.
|
|
The standard output is valid input to the assembler.
|
|
.SH
|
|
Instructions
|
|
.PP
|
|
The instruction set of the assembler is not identical to that
|
|
of the machine.
|
|
It is chosen to match what the compiler generates, augmented
|
|
slightly by specific needs of the operating system.
|
|
For example,
|
|
.CW 2a
|
|
does not distinguish between the various forms of
|
|
.CW MOVE
|
|
instruction: move quick, move address, etc. Instead the context
|
|
does the job. For example,
|
|
.P1
|
|
MOVL $1, R1
|
|
MOVL A0, R2
|
|
MOVW SR, R3
|
|
.P2
|
|
generates official
|
|
.CW MOVEQ ,
|
|
.CW MOVEA ,
|
|
and
|
|
.CW MOVESR
|
|
instructions.
|
|
A number of instructions do not have the syntax necessary to specify
|
|
their entire capabilities. Notable examples are the bitfield
|
|
instructions, the
|
|
multiply and divide instructions, etc.
|
|
For a complete set of generated instruction names (in
|
|
.CW 2a
|
|
notation, not Motorola's) see the file
|
|
.CW /sys/src/cmd/2c/2.out.h .
|
|
Despite its name, this file contains an enumeration of the
|
|
instructions that appear in the intermediate files generated
|
|
by the compiler, which correspond exactly to lines of assembly language.
|
|
.PP
|
|
The MC68000 assembler,
|
|
.CW 1a ,
|
|
is essentially the same, honoring the appropriate subset of the instructions
|
|
and addressing modes.
|
|
The definitions of these are, nonetheless, part of
|
|
.CW 2.out.h .
|
|
.SH
|
|
Laying down instructions
|
|
.PP
|
|
The loader modifies the code produced by the assembler and compiler.
|
|
It folds branches,
|
|
copies short sequences of code to eliminate branches,
|
|
and discards unreachable code.
|
|
The first instruction of every function is assumed to be reachable.
|
|
The pseudo-instruction
|
|
.CW NOP ,
|
|
which you may see in compiler output,
|
|
means no instruction at all, rather than an instruction that does nothing.
|
|
The loader discards all
|
|
.CW NOP 's.
|
|
.PP
|
|
To generate a true
|
|
.CW NOP
|
|
instruction, or any other instruction not known to the assembler, use a
|
|
.CW WORD
|
|
pseudo-instruction.
|
|
Such instructions on RISCs are not scheduled by the loader and must have
|
|
their delay slots filled manually.
|
|
.SH
|
|
MIPS
|
|
.PP
|
|
The registers are only addressed by number:
|
|
.CW R0
|
|
through
|
|
.CW R31 .
|
|
.CW R29
|
|
is the stack pointer;
|
|
.CW R30
|
|
is used as the static base pointer, the analogue of
|
|
.CW A6
|
|
on the 68020.
|
|
Its value is the address of the global symbol
|
|
.CW setR30(SB) .
|
|
The register holding returned values from subroutines is
|
|
.CW R1 .
|
|
When a function is called, space for the first argument
|
|
is reserved at
|
|
.CW 0(FP)
|
|
but in C (not Alef) the value is passed in
|
|
.CW R1
|
|
instead.
|
|
.PP
|
|
The loader uses
|
|
.CW R28
|
|
as a temporary. The system uses
|
|
.CW R26
|
|
and
|
|
.CW R27
|
|
as interrupt-time temporaries. Therefore none of these registers
|
|
should be used in user code.
|
|
.PP
|
|
The control registers are not known to the assembler.
|
|
Instead they are numbered registers
|
|
.CW M0 ,
|
|
.CW M1 ,
|
|
etc.
|
|
Use this trick to access, say,
|
|
.CW STATUS :
|
|
.P1
|
|
#define STATUS 12
|
|
MOVW M(STATUS), R1
|
|
.P2
|
|
.PP
|
|
Floating point registers are called
|
|
.CW F0
|
|
through
|
|
.CW F31 .
|
|
By convention,
|
|
.CW F24
|
|
must be initialized to the value 0.0,
|
|
.CW F26
|
|
to 0.5,
|
|
.CW F28
|
|
to 1.0, and
|
|
.CW F30
|
|
to 2.0;
|
|
this is done by the operating system.
|
|
.PP
|
|
The instructions and their syntax are different from those of the manufacturer's
|
|
manual.
|
|
There are no
|
|
.CW lui
|
|
and kin; instead there are
|
|
.CW MOVW
|
|
(move word),
|
|
.CW MOVH
|
|
(move halfword),
|
|
and
|
|
.CW MOVB
|
|
(move byte) pseudo-instructions. If the operand is unsigned, the instructions
|
|
are
|
|
.CW MOVHU
|
|
and
|
|
.CW MOVBU .
|
|
The order of operands is from left to right in dataflow order, just as
|
|
on the 68020 but not as in MIPS documentation.
|
|
This means that the
|
|
.CW Bcond
|
|
instructions are reversed with respect to the book; for example, a
|
|
.CW va
|
|
.CW BGTZ
|
|
generates a MIPS
|
|
.CW bltz
|
|
instruction.
|
|
.PP
|
|
The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
|
|
It understands the 64-bit instructions
|
|
.CW MOVV ,
|
|
.CW MOVVL ,
|
|
.CW ADDV ,
|
|
.CW ADDVU ,
|
|
.CW SUBV ,
|
|
.CW SUBVU ,
|
|
.CW MULV ,
|
|
.CW MULVU ,
|
|
.CW DIVV ,
|
|
.CW DIVVU ,
|
|
.CW SLLV ,
|
|
.CW SRLV ,
|
|
and
|
|
.CW SRAV .
|
|
The assembler does not have any cache, load-linked, or store-conditional instructions.
|
|
.PP
|
|
Some assembler instructions are expanded into multiple instructions by the loader.
|
|
For example the loader may convert the load of a 32 bit constant into an
|
|
.CW lui
|
|
followed by an
|
|
.CW ori .
|
|
.PP
|
|
Assembler instructions should be laid out as if there
|
|
were no load, branch, or floating point compare delay slots;
|
|
the loader will rearrange\(em\f2schedule\f1\(emthe instructions
|
|
to guarantee correctness and improve performance.
|
|
The only exception is that the correct scheduling of instructions
|
|
that use control registers varies from model to model of machine
|
|
(and is often undocumented) so you should schedule such instructions
|
|
by hand to guarantee correct behavior.
|
|
The loader generates
|
|
.P1
|
|
NOR R0, R0, R0
|
|
.P2
|
|
when it needs a true no-op instruction.
|
|
Use exactly this instruction when scheduling code manually;
|
|
the loader recognizes it and schedules the code before it and after it independently. Also,
|
|
.CW WORD
|
|
pseudo-ops are scheduled like no-ops.
|
|
.PP
|
|
The
|
|
.CW NOSCHED
|
|
pseudo-op disables instruction scheduling
|
|
(scheduling is enabled by default);
|
|
.CW SCHED
|
|
re-enables it.
|
|
Branch folding, code copying, and dead code elimination are
|
|
disabled for instructions that are not scheduled.
|
|
.SH
|
|
SPARC
|
|
.PP
|
|
Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
|
|
Registers have numerical names only:
|
|
.CW R0
|
|
through
|
|
.CW R31 .
|
|
Forget about register windows: Plan 9 doesn't use them at all.
|
|
The machine has 32 global registers, period.
|
|
.CW R1
|
|
[sic] is the stack pointer.
|
|
.CW R2
|
|
is the static base register, with value the address of
|
|
.CW setSB(SB) .
|
|
.CW R7
|
|
is the return register and also the register holding the first
|
|
argument to a C (not Alef) function, again with space reserved at
|
|
.CW 0(FP) .
|
|
.CW R14
|
|
is the loader temporary.
|
|
.PP
|
|
Floating-point registers are exactly as on the MIPS.
|
|
.PP
|
|
The control registers are known by names such as
|
|
.CW FSR .
|
|
The instructions to access these registers are
|
|
.CW MOVW
|
|
instructions, for example
|
|
.P1
|
|
MOVW Y, R8
|
|
.P2
|
|
for the SPARC instruction
|
|
.P1
|
|
rdy %r8
|
|
.P2
|
|
.PP
|
|
Move instructions are similar to those on the MIPS: pseudo-operations
|
|
that turn into appropriate sequences of
|
|
.CW sethi
|
|
instructions, adds, etc.
|
|
Instructions read from left to right. Because the arguments are
|
|
flipped to
|
|
.CW SUBCC ,
|
|
the condition codes are not inverted as on the MIPS.
|
|
.PP
|
|
The syntax for the ASI stuff is, for example to move a word from ASI 2:
|
|
.P1
|
|
MOVW (R7, 2), R8
|
|
.P2
|
|
The syntax for double indexing is
|
|
.P1
|
|
MOVW (R7+R8), R9
|
|
.P2
|
|
.PP
|
|
The SPARC's instruction scheduling is similar to the MIPS's.
|
|
The official no-op instruction is:
|
|
.P1
|
|
ORN R0, R0, R0
|
|
.P2
|
|
.SH
|
|
i960
|
|
.PP
|
|
Registers are numbered
|
|
.CW R0
|
|
through
|
|
.CW R31 .
|
|
Stack pointer is
|
|
.CW R29 ;
|
|
return register is
|
|
.CW R4 ;
|
|
static base is
|
|
.CW R28 ;
|
|
it is initialized to the address of
|
|
.CW setSB(SB) .
|
|
.CW R3
|
|
must be zero; this should be done manually early in execution by
|
|
.P1
|
|
SUBO R3, R3
|
|
.P2
|
|
.CW R27
|
|
is the loader temporary.
|
|
.PP
|
|
There is no support for floating point.
|
|
.PP
|
|
The Intel calling convention is not supported and cannot be used; use
|
|
.CW BAL
|
|
instead.
|
|
Instructions are mostly as in the book. The major change is that
|
|
.CW LOAD
|
|
and
|
|
.CW STORE
|
|
are both called
|
|
.CW MOV .
|
|
The extension character for
|
|
.CW MOV
|
|
is as in the manual:
|
|
.CW O
|
|
for ordinal,
|
|
.CW W
|
|
for signed, etc.
|
|
.SH
|
|
i386
|
|
.PP
|
|
The assembler assumes 32-bit protected mode.
|
|
The register names are
|
|
.CW SP ,
|
|
.CW AX ,
|
|
.CW BX ,
|
|
.CW CX ,
|
|
.CW DX ,
|
|
.CW BP ,
|
|
.CW DI ,
|
|
and
|
|
.CW SI .
|
|
The stack pointer (not a pseudo-register) is
|
|
.CW SP
|
|
and the return register is
|
|
.CW AX .
|
|
There is no physical frame pointer but, as for the MIPS,
|
|
.CW FP
|
|
is a pseudo-register that acts as
|
|
a frame pointer.
|
|
.PP
|
|
Opcode names are mostly the same as those listed in the Intel manual
|
|
with an
|
|
.CW L ,
|
|
.CW W ,
|
|
or
|
|
.CW B
|
|
appended to identify 32-bit,
|
|
16-bit, and 8-bit operations.
|
|
The exceptions are loads, stores, and conditionals.
|
|
All load and store opcodes to and from general registers, special registers
|
|
(such as
|
|
.CW CR0,
|
|
.CW CR3,
|
|
.CW GDTR,
|
|
.CW IDTR,
|
|
.CW SS,
|
|
.CW CS,
|
|
.CW DS,
|
|
.CW ES,
|
|
.CW FS,
|
|
and
|
|
.CW GS )
|
|
or memory are written
|
|
as
|
|
.P1
|
|
MOV\f2x\fP src,dst
|
|
.P2
|
|
where
|
|
.I x
|
|
is
|
|
.CW L ,
|
|
.CW W ,
|
|
or
|
|
.CW B .
|
|
Thus to get
|
|
.CW AL
|
|
use a
|
|
.CW MOVB
|
|
instruction. If you need to access
|
|
.CW AH ,
|
|
you must mention it explicitly in a
|
|
.CW MOVB :
|
|
.P1
|
|
MOVB AH, BX
|
|
.P2
|
|
There are many examples of illegal moves, for example,
|
|
.P1
|
|
MOVB BP, DI
|
|
.P2
|
|
that the loader actually implements as pseudo-operations.
|
|
.PP
|
|
The names of conditions in all conditional instructions
|
|
.CW J , (
|
|
.CW SET )
|
|
follow the conventions of the 68020 instead of those of the Intel
|
|
assembler:
|
|
.CW JOS ,
|
|
.CW JOC ,
|
|
.CW JCS ,
|
|
.CW JCC ,
|
|
.CW JEQ ,
|
|
.CW JNE ,
|
|
.CW JLS ,
|
|
.CW JHI ,
|
|
.CW JMI ,
|
|
.CW JPL ,
|
|
.CW JPS ,
|
|
.CW JPC ,
|
|
.CW JLT ,
|
|
.CW JGE ,
|
|
.CW JLE ,
|
|
and
|
|
.CW JGT
|
|
instead of
|
|
.CW JO ,
|
|
.CW JNO ,
|
|
.CW JB ,
|
|
.CW JNB ,
|
|
.CW JZ ,
|
|
.CW JNZ ,
|
|
.CW JBE ,
|
|
.CW JNBE ,
|
|
.CW JS ,
|
|
.CW JNS ,
|
|
.CW JP ,
|
|
.CW JNP ,
|
|
.CW JL ,
|
|
.CW JNL ,
|
|
.CW JLE ,
|
|
and
|
|
.CW JNLE .
|
|
.PP
|
|
The addressing modes have syntax like
|
|
.CW AX ,
|
|
.CW (AX) ,
|
|
.CW (AX)(BX*4) ,
|
|
.CW 10(AX) ,
|
|
and
|
|
.CW 10(AX)(BX*4) .
|
|
The offsets from
|
|
.CW AX
|
|
can be replaced by offsets from
|
|
.CW FP
|
|
or
|
|
.CW SB
|
|
to access names, for example
|
|
.CW extern+5(SB)(AX*2) .
|
|
.PP
|
|
Other notes: Non-relative
|
|
.CW JMP
|
|
and
|
|
.CW CALL
|
|
have a
|
|
.CW *
|
|
added to the syntax.
|
|
Only
|
|
.CW LOOP ,
|
|
.CW LOOPEQ ,
|
|
and
|
|
.CW LOOPNE
|
|
are legal loop instructions. Only
|
|
.CW REP
|
|
and
|
|
.CW REPN
|
|
are recognized repeaters. These are not prefixes, but rather
|
|
stand-alone opcodes that precede the strings, for example
|
|
.P1
|
|
CLD; REP; MOVSL
|
|
.P2
|
|
Segment override prefixes in
|
|
.CW MOD/RM
|
|
fields are not supported.
|
|
.SH
|
|
AMD64
|
|
.PP
|
|
The assembler assumes 64-bit mode unless a
|
|
.CW MODE
|
|
pseudo-operation is given:
|
|
.P1
|
|
MODE $32
|
|
.P2
|
|
to change to 32-bit mode.
|
|
The effect is mainly to diagnose instructions that are illegal in
|
|
the given mode, but the loader will also assume 32-bit operands and addresses,
|
|
and 32-bit PC values for call and return.
|
|
The assembler's conventions are similar to those for the 386, above.
|
|
The architecture provides extra fixed-point registers
|
|
.CW R8
|
|
to
|
|
.CW R15 .
|
|
All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
|
|
as described in the processor handbook.
|
|
For example,
|
|
.CW MOVL
|
|
to
|
|
.CW AX
|
|
puts a value in the low-order 32 bits and clears the top 32 bits to zero.
|
|
Literal operands are limited to signed 32 bit values, which are sign-extended
|
|
to 64 bits in 64 bit operations; the exception is
|
|
.CW MOVQ ,
|
|
which allows 64-bit literals.
|
|
The external registers in Plan 9's C are allocated from
|
|
.CW R15
|
|
down.
|
|
There are many new instructions, including the MMX and XMM media instructions,
|
|
and conditional move instructions.
|
|
MMX registers are
|
|
.CW M0
|
|
to
|
|
.CW M7 ,
|
|
and
|
|
XMM registers are
|
|
.CW X0
|
|
to
|
|
.CW X15 .
|
|
As with the 386 instruction names,
|
|
all new 64-bit integer instructions, and the MMX and XMM instructions
|
|
uniformly use
|
|
.CW L
|
|
for `long word' (32 bits) and
|
|
.CW Q
|
|
for `quad word' (64 bits).
|
|
Some instructions use
|
|
.CW O
|
|
(`octword') for 128-bit values, where the processor handbook
|
|
variously uses
|
|
.CW O
|
|
or
|
|
.CW DQ .
|
|
The assembler also consistently uses
|
|
.CW PL
|
|
for `packed long' in
|
|
XMM instructions, instead of
|
|
.CW Q ,
|
|
.CW DQ
|
|
or
|
|
.CW PI .
|
|
Either
|
|
.CW MOVL
|
|
or
|
|
.CW MOVQ
|
|
can be used to move values to and from control registers, even when
|
|
the registers might be 64 bits.
|
|
The assembler often accepts the handbook's name to ease conversion
|
|
of existing code (but remember that the operand order is uniformly
|
|
source then destination).
|
|
C's
|
|
.CW "long long"
|
|
type is 64 bits, but passed and returned by value, not by reference.
|
|
More notably, C pointer values are 64 bits, and thus
|
|
.CW "long long"
|
|
and
|
|
.CW "unsigned long long"
|
|
are the only integer types wide enough to hold a pointer value.
|
|
The C compiler and library use the XMM floating-point instructions, not
|
|
the old 387 ones, although the latter are implemented by assembler and loader.
|
|
Unlike the 386, the first integer or pointer argument is passed in a register, which is
|
|
.CW BP
|
|
for an integer or pointer (it can be referred to in assembly code by the pseudonym
|
|
.CW RARG ).
|
|
.CW AX
|
|
holds the return value from subroutines as before.
|
|
Floating-point results are returned in
|
|
.CW X0 ,
|
|
although currently the first floating-point parameter is not passed in a register.
|
|
All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
|
|
to preserve alignment and simplify variable-length argument list access,
|
|
including the first parameter when passed in a register,
|
|
even though bytes 4 to 7 are not initialized.
|
|
.SH
|
|
Alpha
|
|
.PP
|
|
On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
|
|
by giving them a canonical format (sign extension in the case of integer registers).
|
|
Registers are numbered
|
|
.CW R0
|
|
through
|
|
.CW R31 .
|
|
.CW R0
|
|
holds the return value from subroutines, and also the first parameter.
|
|
.CW R30
|
|
is the stack pointer,
|
|
.CW R29
|
|
is the static base,
|
|
.CW R26
|
|
is the link register, and
|
|
.CW R27
|
|
and
|
|
.CW R28
|
|
are linker temporaries.
|
|
.PP
|
|
Floating point registers are numbered
|
|
.CW F0
|
|
to
|
|
.CW F31 .
|
|
.CW F28
|
|
contains
|
|
.CW 0.5 ,
|
|
.CW F29
|
|
contains
|
|
.CW 1.0 ,
|
|
and
|
|
.CW F30
|
|
contains
|
|
.CW 2.0 .
|
|
.CW F31
|
|
is always
|
|
.CW 0.0
|
|
on the Alpha.
|
|
.PP
|
|
The extension character for
|
|
.CW MOV
|
|
follows DEC's notation:
|
|
.CW B
|
|
for byte (8 bits),
|
|
.CW W
|
|
for word (16 bits),
|
|
.CW L
|
|
for long (32 bits),
|
|
and
|
|
.CW Q
|
|
for quadword (64 bits).
|
|
Byte and ``word'' loads and stores may be made unsigned
|
|
by appending a
|
|
.CW U .
|
|
.CW S
|
|
and
|
|
.CW T
|
|
refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
|
|
.SH
|
|
Power PC
|
|
.PP
|
|
The Power PC follows the Plan 9 model set by the MIPS and SPARC,
|
|
not the elaborate ABIs.
|
|
The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
|
|
there is no support for the older POWER instructions.
|
|
Registers are
|
|
.CW R0
|
|
through
|
|
.CW R31 .
|
|
.CW R0
|
|
is initialized to zero; this is done by C start up code
|
|
and assumed by the compiler and loader.
|
|
.CW R1
|
|
is the stack pointer.
|
|
.CW R2
|
|
is the static base register, with value the address of
|
|
.CW setSB(SB) .
|
|
.CW R3
|
|
is the return register and also the register holding the first
|
|
argument to a C function, with space reserved at
|
|
.CW 0(FP)
|
|
as on the MIPS.
|
|
.CW R31
|
|
is the loader temporary.
|
|
The external registers in Plan 9's C are allocated from
|
|
.CW R30
|
|
down.
|
|
.PP
|
|
Floating point registers are called
|
|
.CW F0
|
|
through
|
|
.CW F31 .
|
|
By convention, several registers are initialized
|
|
to specific values; this is done by the operating system.
|
|
.CW F27
|
|
must be initialized to the value
|
|
.CW 0x4330000080000000
|
|
(used by float-to-int conversion),
|
|
.CW F28
|
|
to the value 0.0,
|
|
.CW F29
|
|
to 0.5,
|
|
.CW F30
|
|
to 1.0, and
|
|
.CW F31
|
|
to 2.0.
|
|
.PP
|
|
As on the MIPS and SPARC, the assembler accepts arbitrary literals
|
|
as operands to
|
|
.CW MOVW ,
|
|
and also to
|
|
.CW ADD
|
|
and others where `immediate' variants exist,
|
|
and the loader generates sequences
|
|
of
|
|
.CW addi ,
|
|
.CW addis ,
|
|
.CW oris ,
|
|
etc. as required.
|
|
The register indirect addressing modes use the same syntax as the SPARC,
|
|
including double indexing when allowed.
|
|
.PP
|
|
The instruction names are generally derived from the Motorola ones,
|
|
subject to slight transformation:
|
|
the
|
|
.CW . ' `
|
|
marking the setting of condition codes is replaced by
|
|
.CW CC ,
|
|
and when the letter
|
|
.CW o ' `
|
|
represents `OE=1' it is replaced by
|
|
.CW V .
|
|
Thus
|
|
.CW add ,
|
|
.CW addo.
|
|
and
|
|
.CW subfzeo.
|
|
become
|
|
.CW ADD ,
|
|
.CW ADDVCC
|
|
and
|
|
.CW SUBFZEVCC .
|
|
As well as the three-operand conditional branch instruction
|
|
.CW BC ,
|
|
the assembler provides pseudo-instructions for the common cases:
|
|
.CW BEQ ,
|
|
.CW BNE ,
|
|
.CW BGT ,
|
|
.CW BGE ,
|
|
.CW BLT ,
|
|
.CW BLE ,
|
|
.CW BVC ,
|
|
and
|
|
.CW BVS .
|
|
The unconditional branch instruction is
|
|
.CW BR .
|
|
Indirect branches use
|
|
.CW "(CTR)"
|
|
or
|
|
.CW "(LR)"
|
|
as target.
|
|
.PP
|
|
Load or store operations are replaced by
|
|
.CW MOV
|
|
variants in the usual way:
|
|
.CW MOVW
|
|
(move word),
|
|
.CW MOVH
|
|
(move halfword with sign extension), and
|
|
.CW MOVB
|
|
(move byte with sign extension, a pseudo-instruction),
|
|
with unsigned variants
|
|
.CW MOVHZ
|
|
and
|
|
.CW MOVBZ ,
|
|
and byte-reversing
|
|
.CW MOVWBR
|
|
and
|
|
.CW MOVHBR .
|
|
`Load or store with update' versions are
|
|
.CW MOVWU ,
|
|
.CW MOVHU ,
|
|
and
|
|
.CW MOVBZU .
|
|
Load or store multiple is
|
|
.CW MOVMW .
|
|
The exceptions are the string instructions, which are
|
|
.CW LSW
|
|
and
|
|
.CW STSW ,
|
|
and the reservation instructions
|
|
.CW lwarx
|
|
and
|
|
.CW stwcx. ,
|
|
which are
|
|
.CW LWAR
|
|
and
|
|
.CW STWCCC ,
|
|
all with operands in the usual data-flow order.
|
|
Floating-point load or store instructions are
|
|
.CW FMOVD ,
|
|
.CW FMOVDU ,
|
|
.CW FMOVS ,
|
|
and
|
|
.CW FMOVSU .
|
|
The register to register move instructions
|
|
.CW fmr
|
|
and
|
|
.CW fmr.
|
|
are written
|
|
.CW FMOVD
|
|
and
|
|
.CW FMOVDCC .
|
|
.PP
|
|
The assembler knows the commonly used special purpose registers:
|
|
.CW CR ,
|
|
.CW CTR ,
|
|
.CW DEC ,
|
|
.CW LR ,
|
|
.CW MSR ,
|
|
and
|
|
.CW XER .
|
|
The rest, which are often architecture-dependent, are referenced as
|
|
.CW SPR(n) .
|
|
The segment registers of the 60x series are similarly
|
|
.CW SEG(n) ,
|
|
but
|
|
.I n
|
|
can also be a register name, as in
|
|
.CW SEG(R3) .
|
|
Moves between special purpose registers and general purpose ones,
|
|
when allowed by the architecture,
|
|
are written as
|
|
.CW MOVW ,
|
|
replacing
|
|
.CW mfcr ,
|
|
.CW mtcr ,
|
|
.CW mfmsr ,
|
|
.CW mtmsr ,
|
|
.CW mtspr ,
|
|
.CW mfspr ,
|
|
.CW mftb ,
|
|
and many others.
|
|
.PP
|
|
The fields of the condition register
|
|
.CW CR
|
|
are referenced as
|
|
.CW CR(0)
|
|
through
|
|
.CW CR(7) .
|
|
They are used by the
|
|
.CW MOVFL
|
|
(move field) pseudo-instruction,
|
|
which produces
|
|
.CW mcrf
|
|
or
|
|
.CW mtcrf .
|
|
For example:
|
|
.P1
|
|
MOVFL CR(3), CR(0)
|
|
MOVFL R3, CR(1)
|
|
MOVFL R3, $7, CR
|
|
.P2
|
|
They are also accepted in
|
|
the conditional branch instruction, for example
|
|
.P1
|
|
BEQ CR(7), label
|
|
.P2
|
|
Fields of the
|
|
.CW FPSCR
|
|
are accessed using
|
|
.CW MOVFL
|
|
in a similar way:
|
|
.P1
|
|
MOVFL FPSCR, F0
|
|
MOVFL F0, FPSCR
|
|
MOVFL F0, $7, FPSCR
|
|
MOVFL $0, FPSCR(3)
|
|
.P2
|
|
producing
|
|
.CW mffs ,
|
|
.CW mtfsf
|
|
or
|
|
.CW mtfsfi ,
|
|
as appropriate.
|
|
.SH
|
|
ARM
|
|
.PP
|
|
The assembler provides access to
|
|
.CW R0
|
|
through
|
|
.CW R14
|
|
and the
|
|
.CW PC .
|
|
The stack pointer is
|
|
.CW R13 ,
|
|
the link register is
|
|
.CW R14 ,
|
|
and the static base register is
|
|
.CW R12 .
|
|
.CW R0
|
|
is the return register and also the register holding
|
|
the first argument to a subroutine.
|
|
The assembler supports the
|
|
.CW CPSR
|
|
and
|
|
.CW SPSR
|
|
registers.
|
|
It also knows about coprocessor registers
|
|
.CW C0
|
|
through
|
|
.CW C15 .
|
|
Floating registers are
|
|
.CW F0
|
|
through
|
|
.CW F7 ,
|
|
.CW FPSR
|
|
and
|
|
.CW FPCR .
|
|
.PP
|
|
As with the other architectures, loads and stores are called
|
|
.CW MOV ,
|
|
e.g.
|
|
.CW MOVW
|
|
for load word or store word, and
|
|
.CW MOVM
|
|
for
|
|
load or store multiple,
|
|
depending on the operands.
|
|
.PP
|
|
Addressing modes are supported by suffixes to the instructions:
|
|
.CW .IA
|
|
(increment after),
|
|
.CW .IB
|
|
(increment before),
|
|
.CW .DA
|
|
(decrement after), and
|
|
.CW .DB
|
|
(decrement before).
|
|
These can only be used with the
|
|
.CW MOV
|
|
instructions.
|
|
The move multiple instruction,
|
|
.CW MOVM ,
|
|
defines a range of registers using brackets, e.g.
|
|
.CW [R0-R12] .
|
|
The special
|
|
.CW MOVM
|
|
addressing mode bits
|
|
.CW W ,
|
|
.CW U ,
|
|
and
|
|
.CW P
|
|
are written in the same manner, for example,
|
|
.CW MOVM.DB.W .
|
|
A
|
|
.CW .S
|
|
suffix allows a
|
|
.CW MOVM
|
|
instruction to access user
|
|
.CW R13
|
|
and
|
|
.CW R14
|
|
when in another processor mode.
|
|
Shifts and rotates in addressing modes are supported by binary operators
|
|
.CW <<
|
|
(logical left shift),
|
|
.CW >>
|
|
(logical right shift),
|
|
.CW ->
|
|
(arithmetic right shift), and
|
|
.CW @>
|
|
(rotate right); for example
|
|
.CW "R7>>R2" or
|
|
.CW "R2@>2" .
|
|
The assembler does not support indexing by a shifted expression;
|
|
only names can be doubly indexed.
|
|
.PP
|
|
Any instruction can be followed by a suffix that makes the instruction conditional:
|
|
.CW .EQ ,
|
|
.CW .NE ,
|
|
and so on, as in the ARM manual, with synonyms
|
|
.CW .HS
|
|
(for
|
|
.CW .CS )
|
|
and
|
|
.CW .LO
|
|
(for
|
|
.CW .CC ),
|
|
for example
|
|
.CW ADD.NE .
|
|
Arithmetic
|
|
and logical instructions
|
|
can have a
|
|
.CW .S
|
|
suffix, as ARM allows, to set condition codes.
|
|
.PP
|
|
The syntax of the
|
|
.CW MCR
|
|
and
|
|
.CW MRC
|
|
coprocessor instructions is largely as in the manual, with the usual adjustments.
|
|
The assembler directly supports only the ARM floating-point coprocessor
|
|
operations used by the compiler:
|
|
.CW CMP ,
|
|
.CW ADD ,
|
|
.CW SUB ,
|
|
.CW MUL ,
|
|
and
|
|
.CW DIV ,
|
|
all with
|
|
.CW F
|
|
or
|
|
.CW D
|
|
suffix selecting single or double precision.
|
|
Floating-point load or store become
|
|
.CW MOVF
|
|
and
|
|
.CW MOVD .
|
|
Conversion instructions are also specified by moves:
|
|
.CW MOVWD ,
|
|
.CW MOVWF ,
|
|
.CW MOVDW ,
|
|
.CW MOVWD ,
|
|
.CW MOVFD ,
|
|
and
|
|
.CW MOVDF .
|
|
.SH
|
|
AMD 29000
|
|
.PP
|
|
For details about this assembly language, which was built for the AMD 29240,
|
|
look at the sources or examine compiler output.
|