# Sketch of Next Gen Stack CPU ISA
A. Instruction Reference
The instructions documented below assume a 64-bit cell width. However, the ISA
is designed to be extensible, with cell widths ranging from 16 bits (the
minimum supported) up to 1024 bits. Note that some CPUs may re-allocate
opcodes intended for unsupported widths for other purposes. Such opcodes are
machine-specific, however, and are not guaranteed to be upward compatible with
future revisions of the ISA.
A.1. Group 0 Instructions
Group 0 collects the instructions that do not fit naturally into any other
group.
A.1.1. BRK ( -- )
IF COND THEN
Trap(BREAKPOINT)
END
Performs a breakpoint trap.
A.1.2. SC ( ... -- ... )
IF COND THEN
Trap(SYSCALL)
END
Performs a system call trap. Generally speaking, a service number is placed
onto the top of the data stack, indicating which service the operating system
is to perform. The input and output stack effects are, much like any
subroutine, defined by the service performed.
A.1.3. POP ( -- x ) (R: x -- )
Move a cell from the return stack to the data stack.
A.1.4. PUSH ( x -- ) (R: -- x )
Move a cell from the data stack to the return stack.
A.1.5. JMPDI ( a -- )
address := POP(D)
IF COND=TRUE THEN
PC := address
END
Jump to the absolute address on the data stack. This instruction may jump
conditionally if prefixed with the COND prefix.
A.1.6. CALLDI ( a -- ) (R: -- pc+1 )
IF COND=TRUE THEN
address := POP(D)
PUSH(R, PC+1)
PC := address
END
Call the subroutine whose address is on the data stack. This instruction may
jump conditionally if prefixed with the COND prefix.
A.1.7. RET (aka JMPRI) ( -- ) (R: a -- )
address := POP(R)
IF COND=TRUE THEN
PC := address
END
Return from the current subroutine by jumping to the address at the top of the
return stack. This instruction may jump conditionally if prefixed with the
COND prefix.
A.1.8. SWITCH (aka CALLRI) ( -- ) (R: a -- pc+1 )
IF COND=TRUE THEN
address := POP(R)
PUSH(R, PC+1)
PC := address
END
Switch co-routines by swapping the next instruction's address and the address
at the top of the return stack. This instruction may jump conditionally if
prefixed with the COND prefix.
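The pseudocode for the four indirect control flow instructions can be
exercised with a small software model. The following Python sketch is only an
illustration: the function name step_jumpi, the use of Python lists as stacks,
and the fall-through to pc + 1 when the condition is false are assumptions, not
part of the ISA.

```python
def step_jumpi(name, pc, d, r, cond=True):
    """Return the next PC after one indirect control flow instruction.

    d and r are lists used as the data and return stacks (top at the end).
    cond is the outcome of a preceding COND prefix (True when there is none).
    Falling through to pc + 1 when cond is false is an assumption.
    """
    if name == "JMPDI":
        address = d.pop()          # popped even when the jump is not taken
        return address if cond else pc + 1
    if name == "CALLDI":
        if not cond:
            return pc + 1
        address = d.pop()
        r.append(pc + 1)           # return address
        return address
    if name == "RET":
        address = r.pop()          # popped even when the jump is not taken
        return address if cond else pc + 1
    if name == "SWITCH":
        if not cond:
            return pc + 1
        address = r.pop()
        r.append(pc + 1)           # resume point for the other co-routine
        return address
    raise ValueError("unknown instruction: " + name)


d, r = [0x200], []
pc = step_jumpi("CALLDI", 0x100, d, r)   # call the subroutine at $200
pc = step_jumpi("RET", pc, d, r)         # return to $101
```

Note that, following the pseudocode literally, JMPDI and RET consume their
address even when the condition is false, while CALLDI and SWITCH do not.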
A.1.9. CR! ( x1 x2 -- )
Stores x1 into control register x2. The side effects of the store depend on
the control register.
NOTE: On some processor variants and/or control registers, this instruction may
trap for emulation in software.
A.1.10. CR@ ( x -- x )
Reads a control register's current value and places it onto the data stack.
This might have side-effects; refer to the control register's documentation for
more details.
NOTE: On some processor variants and/or control registers, this instruction may
trap for emulation in software.
A.1.11. DI ( -- )
Disable interrupts. This is typically a faster and more atomic shortcut for
regNum CR@ mask BIC regNum CR! .
NOTE: On some processor variants, this instruction may trap for emulation in
software. Some processor variants may not support interrupts.
A.1.12. EI ( -- )
Enable interrupts. This is typically a faster and more atomic shortcut for
regNum CR@ mask OR regNum CR! .
NOTE: On some processor variants, this instruction may trap for emulation in
software. Some processor variants may not support interrupts.
A.1.13. SEC ( -- )
The COND prefix normally performs its comparison against the top of the data
stack (T). The SEC prefix alters COND so that it tests the second cell on the
stack (S) instead. It has no other effects.
A.1.14. R@ ( -- x ) (R: x -- x )
Fetches the current top of the return stack and places it onto the data stack.
It does NOT pop the return stack. This is a faster and more atomic equivalent
of POP DUP PUSH.
A.2. Group 1 Instructions
Instructions in group 1 push a signed or unsigned literal onto the data stack.
.-----------+---+---------------.
|    siz    | S | 0   0   0   1 |
`-----------+---+---------------'
The siz bits indicate the size of the datum to push onto the stack. The S bit
is true if the value is to be sign-extended; false for zero-extended.
At minimum, 8- and 16-bit literals must be supported.
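As an illustration of how a simulator might decode this group, the following
Python sketch splits a literal opcode into its size and signedness and then
extends the literal to the cell width. The function names are hypothetical,
and the rule size = 8 << siz is an assumption inferred from the opcode map
(ULIT8 at $01, ULIT16 at $21, ULIT32 at $41, ULIT64 at $61).

```python
def decode_lit(opcode):
    """Split a group-1 opcode (sssS0001) into (size_in_bits, signed)."""
    if opcode & 0x0F != 0x01:
        raise ValueError("not a group-1 (LIT) opcode")
    sss = (opcode >> 5) & 0x7
    signed = bool((opcode >> 4) & 1)
    return 8 << sss, signed          # assumed encoding: 8, 16, 32, 64, ...


def extend(raw, size, signed, cell_bits=64):
    """Zero- or sign-extend a size-bit literal to the cell width."""
    raw &= (1 << size) - 1
    if signed and raw >> (size - 1):
        raw -= 1 << size             # reinterpret as a negative number
    return raw & ((1 << cell_bits) - 1)


size, signed = decode_lit(0x11)      # SLIT8
cell = extend(0x88, size, signed)    # $88 sign-extended to 64 bits
```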
A.3. BOOL2 ( x1 x2 -- x ) and BOOL2 ( x1 x2 -- x1 x2 x )
Instructions in groups 2 and 3 comprise the BOOL2 instruction. These two
groups differ in whether the input parameters are first popped off the stack
(group 3) or not (group 2).
The BOOL2 instruction computes a boolean function given two parameters from the
data stack. The upper four bits of the opcode form a look-up table which
determines the operation to perform.
.---------------+-----------+---.
| a   b   c   d | 0   0   1 | D |
`---------------+-----------+---'
Given two bits (one each from x1 and x2), use abcd above to calculate
the result according to this table:
x1 x2 || r   AND  NAND  OR   NOR  XOR  XNOR  BIC
------------------------------------------------
 0  0 || a    0    1    0     1    0    1     0
 0  1 || b    0    1    1     0    1    0     0
 1  0 || c    0    1    1     0    1    0     1
 1  1 || d    1    0    1     0    0    1     0
You can also use BOOL2 to push zero and negative-1 constants onto the stack
more quickly and compactly than you can with any of the LIT instructions.
Setting abcd=0000 pushes zero, while setting abcd=1111 pushes negative one.
Many stack manipulation operations are implemented using BOOL2 as well.
     DROP  NIP   OVER  DUP
     D=1   D=1   D=0   D=0
a    0     0     0     0
b    0     1     0     1
c    1     0     1     0
d    1     1     1     1
Some common operations are encoded as follows:
00000010  0                      00000011  2DROP 0
00010010  2DUP AND               00010011  AND
00100010  2DUP BIC               00100011  BIC
00110010  OVER                   00110011  DROP
01000010  2DUP SWAP BIC          01000011  SWAP BIC
01010010  DUP                    01010011  NIP
01100010  2DUP XOR               01100011  XOR
01110010  2DUP OR                01110011  OR
10000010  2DUP NOR               10000011  NOR
10010010  2DUP XNOR              10010011  XNOR
10100010  DUP INVERT             10100011  INVERT NIP
10110010  2DUP INVERT OR         10110011  INVERT OR
11000010  OVER INVERT            11000011  DROP INVERT
11010010  2DUP SWAP INVERT OR    11010011  SWAP INVERT OR
11100010  2DUP NAND              11100011  NAND
11110010  -1                     11110011  2DROP -1
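The look-up table behavior can be modeled in a few lines. The Python sketch
below is an illustrative model rather than a hardware description: the
function name bool2 and the list-based stack are assumptions, and for
simplicity it requires at least two cells on the stack even for the constant
encodings.

```python
def bool2(opcode, stack, cell_bits=64):
    """Apply a BOOL2 opcode (abcd001D) to a list-based data stack."""
    if (opcode >> 1) & 0x7 != 0b001:
        raise ValueError("not a BOOL2 opcode")
    mask = (1 << cell_bits) - 1
    x1, x2 = stack[-2], stack[-1]
    # Expand each table bit to a full-width mask: all ones if set, else zero.
    a, b, c, d = (mask if (opcode >> bit) & 1 else 0 for bit in (7, 6, 5, 4))
    n1, n2 = ~x1 & mask, ~x2 & mask
    # Each term selects the bit positions that hit one row of the table.
    r = (a & n1 & n2) | (b & n1 & x2) | (c & x1 & n2) | (d & x1 & x2)
    if opcode & 1:               # D=1: pop both inputs before pushing
        del stack[-2:]
    stack.append(r)


s = [0b1100, 0b1010]
bool2(0b00010011, s)             # AND: s is now [0b1000]
t = [0b1100, 0b1010]
bool2(0b01100010, t)             # 2DUP XOR: t is now [0b1100, 0b1010, 0b0110]
```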
A.4. BOOL1 ( x1 -- x ) and BOOL1 ( x1 -- x1 x )
Instructions in groups 4 and 5 comprise the BOOL1 instruction. These two
groups differ in whether the input parameter is first popped off the stack
(group 5) or not (group 4).
The BOOL1 instruction computes a boolean function given one parameter from the
data stack. The c and d bits in the upper half of the opcode form a look-up
table which determines the operation to perform.
.---------------+-----------+---.
| 0 . 0 . c . d | 0 . 1 . 0 | D |
`---------------+-----------+---'
For each bit in x1, use cd above to calculate the result according to
this table:
x1 || r   ZERO  NEG1  INVERT  NOP
---------------------------------
 0 || c    0     1      1      0
 1 || d    0     1      0      1
You can also use BOOL1 to push zero and negative-1 constants onto the stack
more quickly and compactly than you can with any of the LIT instructions.
Setting cd=00 pushes zero, while setting cd=11 pushes negative one. While this
overlaps with BOOL2 instruction encodings, it's useful to have these
instructions for those cases where you only want to encode DROP 0 or DROP -1
instead of 2DROP 0 or 2DROP -1.
Some common operations are encoded as follows:
00000100  0             00000101  DROP 0
00010100  DUP           00010101  NOP
00100100  DUP INVERT    00100101  INVERT
00110100  -1            00110101  DROP -1
A.4. ADDSUB2 ( x1 x2 -- x ) and ADDSUB2 ( x1 x2 -- x1 x2 x )
Instructions in groups 6 and 7 comprise the ADDSUB2 instruction. These two
groups differ in whether the input parameters are first popped off the stack
(group 7) or not (group 6).
The ADDSUB2 instruction computes a two's-complement sum given two parameters
from the data stack. The upper bits of the opcode control the precise data
path through the ALU to calculate this sum. The result can be used for
addition or subtraction, depending on configuration.
.---+---+-------+-----------+---.
| 0 | b |  cc   | 0   1   1 | D |
`---+---+-------+-----------+---'
The b bit inverts the x2 operand if set; otherwise, it leaves the value
unchanged. The cc field offers input carry control:
00 Ignore carry flag; assume carry is clear.
01 Ignore carry flag; assume carry is set.
10 Use carry flag as-is.
11 Use inverted carry flag.
Note that this instruction always updates carry.
Some common operations are encoded as follows:
00000110  2DUP ADD       00000111  ADD
00010110  2DUP ADD 1+    00010111  ADD 1+
00100110  2DUP ADC       00100111  ADC
00110110  2DUP ADC 1-    00110111  ADC 1-
01000110  2DUP SUB 1-    01000111  SUB 1-
01010110  2DUP SUB       01010111  SUB
01100110  2DUP SBC       01100111  SBC
01110110  2DUP SBC 1-    01110111  SBC 1-
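To see how the b and cc fields combine into additions and subtractions, the
data path can be modeled as a single adder. The Python sketch below is an
illustration only: the function name addsub2 is hypothetical, and it assumes
that the carry flag is simply the carry out of the adder (so after a
subtraction, carry set means "no borrow").

```python
def addsub2(opcode, stack, carry, cell_bits=64):
    """Apply an ADDSUB2 opcode (0bcc011D); return the new carry flag."""
    if opcode & 0x8E != 0x06:        # bit 7 must be 0, bits 3..1 must be 011
        raise ValueError("not an ADDSUB2 opcode")
    mask = (1 << cell_bits) - 1
    x1, x2 = stack[-2], stack[-1]
    if (opcode >> 6) & 1:            # b: invert the x2 operand
        x2 = ~x2 & mask
    cc = (opcode >> 4) & 0x3         # carry-in selection
    carry_in = (0, 1, carry, carry ^ 1)[cc]
    total = x1 + x2 + carry_in
    if opcode & 1:                   # D=1: pop both inputs before pushing
        del stack[-2:]
    stack.append(total & mask)
    return total >> cell_bits        # assumed: carry out of the adder


s = [5, 3]
c = addsub2(0b01010111, s, 0)        # SUB: 5 + ~3 + 1, so s is now [2]
```

Subtraction falls out of the identity x1 - x2 = x1 + ~x2 + 1, which is why SUB
sets b together with cc=01.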
A.5. COND ( x2 -- ) and SEC COND ( x1 x2 -- x2 )
IF not prefixed with SEC THEN
value := POP(D)
ELSE
value := POP(S)
END
COND := (((value < 0) & a) | ((value = 0) & b) | (carry & c)) = y
The COND prefix alters the behavior of a subsequent control flow instruction,
such as the instructions in the JUMPI or BRANCHES groups. The COND prefix has
no effect on instructions which do not transfer control.
.---+---+---+---+-----------+---.
| 0 | a | b | c | 1   0   0 | y |
`---+---+---+---+-----------+---'
Assuming the c bit is set to 0, the encoding of a, b, and y can give the
following checks:
          Signed        Unsigned
a b y     Check         Check
---------------------------------
0 0 0     always        always
0 0 1     never         never
0 1 0     value != 0    value > 0
0 1 1     value = 0     value = 0
1 0 0     value >= 0
1 0 1     value < 0
1 1 0     value > 0
1 1 1     value <= 0
With the c bit set, there is an additional check on the carry flag.
SEC is a prefix that modifies the COND prefix to work with the second top of
stack instead of the direct top of stack. This is most often used for
emulating S16X4A control flow instructions.
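The condition formula can be checked against the table with a short model.
The Python sketch below is illustrative only; the function name cond_taken is
an assumption, and the caller is expected to have already fetched the tested
value from T (or from S when prefixed with SEC).

```python
def cond_taken(opcode, value, carry, cell_bits=64):
    """Evaluate a COND prefix (0abc100y) against the tested cell value."""
    if opcode & 0x8E != 0x08:        # bit 7 must be 0, bits 3..1 must be 100
        raise ValueError("not a COND opcode")
    a, b, c, y = ((opcode >> bit) & 1 for bit in (6, 5, 4, 0))
    mask = (1 << cell_bits) - 1
    negative = (value >> (cell_bits - 1)) & 1    # sign bit of the cell
    zero = int(value & mask == 0)
    test = (negative & a) | (zero & b) | (carry & c)
    return test == y


minus_one = (1 << 64) - 1
is_neg = cond_taken(0b01001001, minus_one, 0)    # a=1, y=1: value < 0
is_pos = cond_taken(0b01101000, 5, 0)            # a=1, b=1, y=0: value > 0
```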
A.6. Direct Jumps and Calls
Instructions in group 10 are responsible for direct transfer of program
control. Without the COND prefix, the control flow transfers are
unconditional; with the COND prefix, they are conditional.
.---+---+---+---+---------------.
|    sss    | C | 1   0   1   0 |
`---+---+---+---+---------------'
The size (sss) field indicates how big the displacement to the program counter
is. The C bit is set for a subroutine call, clear for a simple jump.
A.7. Shifts and Rotations
The instructions in group 11 perform bitwise rotations and shifts.
.---+-----------+---------------.
| 0 |    fff    | 1   0   1   1 |
`---+-----------+---------------'
The function (fff) field selects the precise operation to perform, according to
the opcode encodings below:
00001011 LSL ( n cnt -- n' )
00011011 LSR ( n cnt -- n' )
00101011 ASR ( n cnt -- n' )
00111011 PERMUTE ( n idx -- n' )
01001011 RL ( n cnt -- n' )
01011011 RLC ( n cnt -- n' )
01101011 RR ( n cnt -- n' )
01111011 RRC ( n cnt -- n' )
The LSL and LSR operations perform logical shifts (left and right,
respectively). ASR also performs a right shift, but does so arithmetically.
RL and RR perform rotations left and right without cycling through the carry
flag. RLC and RRC do so including the carry flag as an additional bit. For
example, if we execute the instructions SLIT8 $88 ULIT8 $01, then the following
instructions will produce the following results:
LSL
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |   | 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                           ^
  |                                           |
  `-------------------------------------------'
LSR
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
ASR
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
RL
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |   | 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                   ^       ^
  |                                   |       |
  `-----------------------------------'-------'
RLC
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 1 |...| 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |<--| 1 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  |                                           ^
  |                                           |
  `-------------------------------------------'
RR
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  ^                                   |
  |                                   |
  `-----------------------------------'
RRC
 N-1      7   6   5   4   3   2   1   0       C
.---+-/-+---+---+---+---+---+---+---+---.   .---.
| 0 |...| 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |-->| 0 |
`---+-/-+---+---+---+---+---+---+---+---'   `---'
  ^                                           |
  |                                           |
  `-------------------------------------------'
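The single-bit cases shown in the diagrams can be reproduced with a small
model. The Python sketch below covers only a count of one and only the
operations that do not read the carry flag; the function name shift1 is
hypothetical, and the carry-out behavior is an assumption read off the arrows
in the diagrams.

```python
def shift1(op, n, cell_bits=64):
    """Shift or rotate n by one bit; return (result, carry_out)."""
    mask = (1 << cell_bits) - 1
    top = (n >> (cell_bits - 1)) & 1     # bit N-1
    low = n & 1                          # bit 0
    if op == "LSL":
        return (n << 1) & mask, top
    if op == "LSR":
        return n >> 1, low
    if op == "ASR":                      # bit N-1 is replicated
        return (n >> 1) | (top << (cell_bits - 1)), low
    if op == "RL":                       # bit N-1 wraps around to bit 0
        return ((n << 1) & mask) | top, top
    if op == "RR":                       # bit 0 wraps around to bit N-1
        return (n >> 1) | (low << (cell_bits - 1)), low
    raise ValueError("unsupported operation: " + op)


n = 0xFFFFFFFFFFFFFF88                   # SLIT8 $88 on a 64-bit cell
lsl = shift1("LSL", n)
lsr = shift1("LSR", n)
asr = shift1("ASR", n)
```

RLC and RRC are left out because their results depend on the carry flag before
the instruction executes.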
TODO: Look into ways of unifying rotations and shifts using option bits. It
might not be possible due to having too few bits available; but, if it can be
done, we should use that approach instead of function decodes.
The PERMUTE instruction is useful for re-arranging the bytes within a
multi-byte cell. You have direct control of which nybbles in an input cell
land in an output cell. If multiple source nybbles are routed to the same
destination nybble, they are logically-ORed.
For example, on a 32-bit processor, a do-nothing permutation would look like
this: ULIT32 $12345678 ULIT32 $76543210 PERMUTE. To reverse the bytes: ULIT32
$12345678 ULIT32 $10325476 PERMUTE.
This instruction is quite useful for implementing conversions to/from
big-endian representation.
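The two examples above can be reproduced with the following Python sketch. It
rests on an assumption about how idx is interpreted: nybble i of idx names the
destination position of source nybble i, which is the reading under which
several sources can land on (and be ORed into) one destination. Dropping
destinations beyond the cell width is also an assumption, and the function
name permute is hypothetical.

```python
def permute(n, idx, cell_bits=32):
    """Route each nybble of n to the position named by the matching idx nybble."""
    nybbles = cell_bits // 4
    out = 0
    for src in range(nybbles):
        value = (n >> (4 * src)) & 0xF
        dst = (idx >> (4 * src)) & 0xF
        if dst < nybbles:                # assumed: out-of-range routes are dropped
            out |= value << (4 * dst)    # colliding routes are ORed together
    return out


same = permute(0x12345678, 0x76543210)       # do-nothing permutation
swapped = permute(0x12345678, 0x10325476)    # byte reversal: 0x78563412
```

Both examples in the text are their own inverses, so they do not by themselves
distinguish this reading from one where idx nybbles name the source instead.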
A.8. PC-Relative Effective Addresses
Group 12 instructions push PC-relative addresses onto a stack for subsequent
access by stores and loads (group 13 and 14 instructions).
.---+---+---+---+---------------.
|    sss    | S | 1   1   0   0 |
`---+---+---+---+---------------'
The size (sss) field indicates how big the PC-relative displacement is (8 bits,
16 bits, etc.). The S bit selects the stack that receives the effective
address: the return stack if clear, the data stack if set.
A.9. Stores and Loads
The group 13 instructions allow data to be stored into memory. Group 14
instructions retrieve this data via a set of signed and unsigned loads.
Note that loads and stores may cause a trap. For example, some CPUs offer
memory protection, and others require loads and stores to access only naturally
aligned fields in memory.
A.9.1. Stores
.---+---+---+-------------------.
|    sss    | 0   1   1   0   1 |
`---+---+---+-------------------'
The size field (sss) indicates the size of the data to store into memory. As
indicated below, the data stored into memory comes from the low-order bits of
the cell.
 6               3       1
 3               1       5   7 0
.---------------------------+---.
|///////////////////////////|   | sss=000 Byte
+-----------------------+---+---+
|///////////////////////|       | sss=001 Half-word
+---------------+-------+-------+
|///////////////|               | sss=010 Word
+---------------+---------------+
|                               | sss=011 Double-word
`-------------------------------'
A.9.2. Loads
.---+---+---+---+---------------.
|    sss    | S | 1   1   1   0 |
`---+---+---+---+---------------'
The size field (sss) indicates the size of the data to load from memory. The S
bit indicates if the data retrieved is interpreted as an unsigned (0;
zero-extended) or signed (1; sign-extended) quantity. A load always affects
the full cell width in the data stack.
 6               3       1
 3               1       5   7 0
.---------------------------+---.
|///////////////////////////|   | sss=000 Byte
+-----------------------+---+---+
|///////////////////////|       | sss=001 Half-word
+---------------+-------+-------+
|///////////////|               | sss=010 Word
+---------------+---------------+
|                               | sss=011 Double-word
`-------------------------------'
.---.
|///| Sign-extension or zero-extension, depending on S bit.
`---'
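The effect of the size field on the value seen by a store and produced by a
load can be sketched as follows. This Python model is illustrative only: the
function names are hypothetical, the field width is taken to be 8 << sss bits
as the diagrams suggest, and memory byte order is deliberately left out because
this section does not specify it.

```python
def store_bits(cell, sss):
    """Return the low-order bits of a cell that a store of size sss writes."""
    size = 8 << sss
    return cell & ((1 << size) - 1)


def load_cell(raw, sss, signed, cell_bits=64):
    """Extend a loaded field of size sss to the full cell width."""
    size = 8 << sss
    raw &= (1 << size) - 1
    if signed and raw >> (size - 1):
        # Fill the upper (shaded) bits with copies of the sign bit.
        raw |= ((1 << cell_bits) - 1) & ~((1 << size) - 1)
    return raw


byte = store_bits(0x1122334455667788, 0)     # only $88 is stored
uld8 = load_cell(byte, 0, signed=False)      # zero-extended
sld8 = load_cell(byte, 0, signed=True)       # sign-extended
```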
B. S16X4(A) to ISA/NG Migration
Just as the 8086 was intended to be source-code compatible with the 8008 and
8080, so too is ISA/NG intended to be as source-code compatible with the
S16X4(A) MISC processors as possible. However, they are not binary compatible.
Here's the complete mapping of instruction sequences from the S16X4 to the
ISA/NG.
S16X4A      ISA/NG
======      ======
NOP         NOP
LIT         ULIT16 or PEAD16 depending on context
FWM         ULD16
SWM         ST16
ADD         ADD
AND         AND
XOR         XOR
LIT/ZGO     COND(T=0)/JMP or ULIT16/SEC/COND/JMPDI
ZGO         SEC/COND/JMPDI
FBM         ULD8
SBM         ST8
LCALL       CALL
ICALL       CALLDI or SWITCH (depending on context)
LIT/NZGO    COND(T<>0)/JMP or ULIT16/SEC/COND/JMPDI
NZGO        SEC/COND/JMPDI
LIT/GO      JMP or ULIT16/JMPDI
GO          JMPDI or RET (depending on context)
C. Instruction Mapping
C.1. Binary Encoding
00000000 BRK EEPROM Patch Breakpoint
00010000 SC System Call
001f0000 PUSHPOP PUSH and POP instructions
01sc0000 JUMPI Indirect Control Flow
100f0000 CRLDST Control Register accessors
101e0000 EIDI Enable/Disable interrupts
11000000 SEC Prefix for COND
11010000 R@ Return stack accessor
sssS0001 ... LIT Literal load group
ffff001d BOOL2
00ff010d BOOL1
0bcc011d ADDSUB2
0abc100y COND
sssc1010 ... BRANCHES Direct control flow
0fff1011 MULSHF2
sssy1100 ... PEA Support for PC-relative programs
sss01101 STORES
sssS1110 LOADS
Illegal Opcode Encodings (These will trap)
111x0000
01xx010x
1xxx01xx
1xxx1011
xxx11101
xxxx1111
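The bit patterns in this section lend themselves to a table-driven decoder.
The Python sketch below is one possible way to classify an opcode byte; the
names ILLEGAL, GROUPS, matches and classify are hypothetical. It treats every
character other than 0 or 1 as a "don't care" bit and checks the illegal
encodings first, so that 1xxx1011 is rejected before the shift group is
considered.

```python
# Patterns transcribed from the tables above.
ILLEGAL = ["111x0000", "01xx010x", "1xxx01xx", "1xxx1011", "xxx11101",
           "xxxx1111"]
GROUPS = [
    ("00000000", "BRK"), ("00010000", "SC"), ("001f0000", "PUSHPOP"),
    ("01sc0000", "JUMPI"), ("100f0000", "CRLDST"), ("101e0000", "EIDI"),
    ("11000000", "SEC"), ("11010000", "R@"), ("sssS0001", "LIT"),
    ("ffff001d", "BOOL2"), ("00ff010d", "BOOL1"), ("0bcc011d", "ADDSUB2"),
    ("0abc100y", "COND"), ("sssc1010", "BRANCHES"), ("0fff1011", "MULSHF2"),
    ("sssy1100", "PEA"), ("sss01101", "STORES"), ("sssS1110", "LOADS"),
]


def matches(pattern, opcode):
    """True if every fixed (0/1) bit of pattern agrees with opcode."""
    bits = format(opcode, "08b")
    return all(p not in "01" or p == b for p, b in zip(pattern, bits))


def classify(opcode):
    """Return the group name for an opcode byte."""
    if any(matches(p, opcode) for p in ILLEGAL):
        return "illegal"
    for pattern, name in GROUPS:
        if matches(pattern, opcode):
            return name
    return "unassigned"          # not covered by any pattern listed here


group = classify(0x57)           # 01010111: ADDSUB2 (SUB)
```

Opcodes such as $88 match neither list; the sketch reports them as
"unassigned" rather than guessing at their behavior.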
JUMPI Group
01000000 JMPDI PC=T
01010000 CALLDI R=PC+1; PC=T
01100000 RET PC=R
01110000 SWITCH R=PC+1; PC=R
BRANCHES
sss01010 JMP PC+ea
sss11010 CALL PC+ea
BOOL2
00000010 0
00010010 2DUP AND
00100010 2DUP BIC
00110010 OVER
01000010 2DUP SWAP BIC
01010010 DUP
01100010 2DUP XOR
01110010 2DUP OR
10000010 2DUP NOR
10010010 2DUP XNOR
10100010 DUP INVERT
10110010 2DUP INVERT OR
11000010 OVER INVERT
11010010 2DUP SWAP INVERT OR
11100010 2DUP NAND
11110010 -1
00000011 2DROP 0
00010011 AND
00100011 BIC
00110011 DROP
01000011 SWAP BIC
01010011 NIP
01100011 XOR
01110011 OR
10000011 NOR
10010011 XNOR
10100011 INVERT NIP
10110011 INVERT OR
11000011 DROP INVERT
11010011 SWAP INVERT OR
11100011 NAND
11110011 2DROP -1
BOOL1
00000100 0
00010100 DUP
00100100 DUP INVERT
00110100 -1
00000101 DROP 0
00010101 NOP
00100101 INVERT
00110101 DROP -1
ADDSUB2
00000110 2DUP ADD
00010110 2DUP ADD 1+
00100110 2DUP ADC
00110110 2DUP ADC 1-
01000110 2DUP SUB 1-
01010110 2DUP SUB
01100110 2DUP SBC
01110110 2DUP SBC 1-
00000111 ADD
00010111 ADD 1+
00100111 ADC
00110111 ADC 1-
01000111 SUB 1-
01010111 SUB
01100111 SBC
01110111 SBC 1-
MULSHF2
00001011 LSL
00011011 LSR
00101011 ASR
00111011 PERMUTE
01001011 RL
01011011 RLC
01101011 RR
01111011 RRC
1xxx1011 illegal
C.2. Opcode Map
    0              1       2      3      4      5      6        7
0   BRK            ULIT8   BOOL2  BOOL2  BOOL1  BOOL1  ADDSUB2  ADDSUB2
1   SC             SLIT8   BOOL2  BOOL2  BOOL1  BOOL1  ADDSUB2  ADDSUB2
2   POP            ULIT16  BOOL2  BOOL2  BOOL1  BOOL1  ADDSUB2  ADDSUB2
3   PUSH           SLIT16  BOOL2  BOOL2  BOOL1  BOOL1  ADDSUB2  ADDSUB2
4   JMPDI (JUMPI)  ULIT32  BOOL2  BOOL2  ---    ---    ADDSUB2  ADDSUB2
5   CALLDI(JUMPI)  SLIT32  BOOL2  BOOL2  ---    ---    ADDSUB2  ADDSUB2
6   RET   (JUMPI)  ULIT64  BOOL2  BOOL2  ---    ---    ADDSUB2  ADDSUB2
7   SWITCH(JUMPI)  SLIT64  BOOL2  BOOL2  ---    ---    ADDSUB2  ADDSUB2
8   CR!            ---     BOOL2  BOOL2  ---    ---    ---      ---
9   CR@            ---     BOOL2  BOOL2  ---    ---    ---      ---
A   DI             ---     BOOL2  BOOL2  ---    ---    ---      ---
B   EI             ---     BOOL2  BOOL2  ---    ---    ---      ---
C   SEC[1]         ---     BOOL2  BOOL2  ---    ---    ---      ---
D   R@             ---     BOOL2  BOOL2  ---    ---    ---      ---
E   ---            ---     BOOL2  BOOL2  ---    ---    ---      ---
F   ---            ---     BOOL2  BOOL2  ---    ---    ---      ---
    8        9        A       B        C       D     E      F
0   COND[2]  COND[2]  JMP8    LSL      PEAR8   ST8   ULD8   ---
1   COND[2]  COND[2]  CALL8   LSR      PEAD8   ---   SLD8   ---
2   COND[2]  COND[2]  JMP16   ASR      PEAR16  ST16  ULD16  ---
3   COND[2]  COND[2]  CALL16  PERMUTE  PEAD16  ---   SLD16  ---
4   COND[2]  COND[2]  JMP32   RL       PEAR32  ST32  ULD32  ---
5   COND[2]  COND[2]  CALL32  RLC      PEAD32  ---   SLD32  ---
6   COND[2]  COND[2]  JMP64   RR       PEAR64  ST64  ULD64  ---
7   COND[2]  COND[2]  CALL64  RRC      PEAD64  ---   SLD64  ---
8   ---      ---      ---     ---      ---     ---   ---    ---
9   ---      ---      ---     ---      ---     ---   ---    ---
A   ---      ---      ---     ---      ---     ---   ---    ---
B   ---      ---      ---     ---      ---     ---   ---    ---
C   ---      ---      ---     ---      ---     ---   ---    ---
D   ---      ---      ---     ---      ---     ---   ---    ---
E   ---      ---      ---     ---      ---     ---   ---    ---
F   ---      ---      ---     ---      ---     ---   ---    ---
[1] - Instruction Prefix. Modifies behavior of COND.
[2] - Instruction Prefix. Modifies behavior of control flow instructions.