A TMS9900 Emulator Framework
Version 0.8

Bill Seymour
2011-01-15

Copyright Bill Seymour 2011.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

Overview

This paper describes an open-source C++ framework for emulating the Texas Instruments TMS9900 microprocessor. It provides the three CPU registers and 64 KB of memory, and it executes all the TMS9900 instructions.

The current version of the library also provides an abstract base class for Communications Register Unit (CRU) devices but doesn’t provide any other I/O.

Coming Real Soon Now is support for the TMS9901 parallel port, the TMS9902 serial port, a serial TTY (which defaults to std::cin and std::cout), and a parallel printer (which may be a serial device behind the scenes, and indeed defaults to std::cerr).

No support for disk or tape drives is planned; but the author has a design in mind for “file system” and “file set” (collection of open files) pseudo-devices which occupy 32 bits of CRU address space each.

Also planned for the future, and occupying 32 bits of CRU address space, is a real-time clock that provides C’s time(), mktime(), localtime(), and gmtime() functionality along with an 11-byte time breakdown inspired by C’s struct tm.

Users may provide additional I/O functionality (the default in the current version of the library is no output and reading all binary 1s); and they may also supply behavior for the five user-defined instructions (the default is to change the IDLE instruction into a proper halt) and trapping of unrecognized instructions (the default is to halt).

The library is usable on any C++ implementation that provides 2’s-complement and unsigned integer types of exactly 8 and 16 bits. The 8-bit types must be signed char and unsigned char. If the standard short and unsigned short types aren’t exactly 16 bits, the library will try to #include <cstdint>, which might be a problem if the shorts won’t do and the implementation doesn’t provide the C99 headers.

Appendix A provides a brief overview of the TMS9900 architecture and instruction set; Appendix B describes library-supplied locking alternatives for those who wish to use the library for multi-threaded emulators that support interrupts.

All identifiers are declared in the tms9900 namespace directly or in other namespaces inside tms9900.

This open-source library is distributed under the Boost Software License (which isn’t viral like the GPL and others are believed to be). The distribution includes the files:

tms9900.html - this document
LICENSE_1_0.txt - the Boost license
namespace tms9900
- tms9900.hpp - the library’s main header
- tms9900_types.hpp - typedefs for the library’s integer types
- tms9900.cpp - the library’s implementation
- namespace user - two files that provide default implementations for optional user-supplied functions
  - tms9900_default_userdef.cpp
  - tms9900_default_badop.cpp
- namespace lock_detail - four files that provide locking alternatives for interrupts (see Appendix B)
  - tms9900_locks.hpp
  - tms9900_locks_posix.cpp
  - tms9900_locks_win32_spin.cpp
  - tms9900_locks_win32_cs.cpp
- namespace peripherals (coming Real Soon Now)

Endianness:

The TMS9900 is byte-addressable; and 16-bit words are stored in memory in big-endian fashion and must be aligned on word boundaries.

If you know that the emulator will run on a big-endian architecture, you can get the most efficient execution by compiling tms9900.cpp with the macro TMS9900_EMULATOR_BIG_ENDIAN defined. Similarly, if you know that the emulator will run on a little-endian architecture, you can get the least inefficient execution by compiling tms9900.cpp with the macro TMS9900_EMULATOR_LITTLE_ENDIAN defined. If neither macro is defined, the library will still work correctly; but it will use a type-punning hack to determine the endianness at run time every time a 16-bit word is read from or written to the TMS9900 memory.

Values passed to, and returned from, library functions will always have the correct endianness for the emulator’s target platform; so aside from defining the macros above, the only interesting problem is the way 16-bit words are stored in memory when the emulator is running on a little-endian box. If you access bytes through ubyte*s, you’ll always see more significant bytes of 16-bit words at lower-numbered, even-numbered addresses regardless of the endianness of the target box; but if you use the library-supplied word_iterator to access 16-bit words in the TMS9900 memory, you’ll see the correct endianness for the emulator’s target platform.
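The byte order described above can be illustrated without the library. The following is a minimal sketch, assuming a plain byte array stands in for the emulated memory; read_word() and write_word() are hypothetical helpers, not library functions. They use shifts rather than type punning, so they give the same result on a host of either endianness.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of the library: read and write a 16-bit
// word stored big-endian (TMS9900 order) in a byte array. Shifting instead
// of type punning gives the same result on any host endianness.
inline std::uint16_t read_word(const std::uint8_t* mem, unsigned addr)
{
    assert((addr & 1u) == 0);   // words must be aligned on word boundaries
    return static_cast<std::uint16_t>((mem[addr] << 8) | mem[addr + 1]);
}

inline void write_word(std::uint8_t* mem, unsigned addr, std::uint16_t value)
{
    assert((addr & 1u) == 0);
    mem[addr]     = static_cast<std::uint8_t>(value >> 8);    // MSB at the even address
    mem[addr + 1] = static_cast<std::uint8_t>(value & 0xFF);  // LSB at the odd address
}
```

A byte-wise view of the array then shows the more significant byte at the lower, even-numbered address, which is what a ubyte* into the emulated memory is described as showing.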

Thread Safety:

The CPU registers and the TMS9900 memory are deliberately not thread-safe; and running a TMS9900 program in one thread while accessing the internals in another yields a data race.

The optional user-supplied functions run in the same thread, and so TMS9900 instruction execution blocks until the functions return; thus it’s OK to touch the registers or memory from within those functions. For example, the user might want to do a DMA transfer within some I/O instruction. This is OK. What would not be OK would be to spawn another thread to do the DMA. (The author is thinking about how to implement HOLD/HOLDA; but don’t look for a solution coming soon. Instruction execution would have to block while the DMA is being performed anyway; and the author can’t think of a use case where it would be better to block later than sooner.)

An interrupt request would almost certainly come from a different thread, so we need a lock for requesting and servicing interrupts. The library provides three implementations described in Appendix B.

A Word about the Halting Problem:

The TMS9900 doesn’t halt. 8-)

The closest it comes is the IDLE instruction which puts 010₂ on its three high address pins and repeatedly pulses its CRUCLK pin waiting for an interrupt.

This framework is capable of mimicking that behavior; but since most users would probably want their emulator to halt some day, the library-supplied default behavior is for IDLE, and any of the several undefined opcodes as well, to just halt. The library also supplies a force_halt() function that multi-threaded emulators can call to halt the TMS9900 at the end of the current instruction.

Users who change the default behavior by supplying their own user_defined_operation() and/or invalid_opcode() functions will want to be aware of this issue.

Synopsis:

#define TMS9900_HPP_INCLUDED

namespace tms9900 {

typedef /*  8-bit 2’s-comp */ sbyte;
typedef /*  8-bit unsigned */ ubyte;
typedef /* 16-bit 2’s-comp */ sword;
typedef /* 16-bit unsigned */ uword;

uword& workspace_pointer();
uword& program_counter();
uword& status_register();

ubyte* memory_begin();
ubyte* memory_end();

class word_iterator;
word_iterator memory_word_begin();
word_iterator memory_word_end();

void reset();
void request_interrupt(unsigned level);

void start();
void load();
void force_halt();
bool exit_idle();
void reboot();

class cru_device
{
private:
    cru_device(const cru_device&);
    cru_device& operator=(const cru_device&);

protected:
    cru_device(unsigned base_addr, unsigned size);

public:
    virtual ~cru_device();

    unsigned begin() const;
    unsigned end() const;
    unsigned size() const;

    virtual bool tb(uword bit_nbr) = 0;
    virtual void sb(uword bit_nbr, bool value) = 0;

    virtual uword stcr(uword relative_addr, unsigned bit_count) = 0;
    virtual void  ldcr(uword relative_addr, unsigned bit_count, uword value) = 0;
};

namespace user
{
    enum high_addr_bits { IDLE = 2, RSET = 3, CKON = 5, CKOF = 6, LREX = 7 };
    void user_defined_operation(high_addr_bits);

    void invalid_opcode(uword offending_instruction);
}

} // namespace tms9900

Library-Supplied Features:

The following are defined in the tms9900 namespace.

Exact-width types:

typedef /*  8-bit 2’s-comp */ sbyte;
typedef /*  8-bit unsigned */ ubyte;
typedef /* 16-bit 2’s-comp */ sword;
typedef /* 16-bit unsigned */ uword;

In order to model the TMS9900 memory both easily and correctly, the emulator requires integer types of exactly 8 and 16 bits; and the signed versions must have two’s-complement representations. The author doesn’t believe that that’s too Draconian a limitation since it’s exactly what seems to have won out in the architecture marketplace.

The CPU registers:

uword& workspace_pointer();
uword& program_counter();
uword& status_register();

These three functions return non-const references to the CPU registers. The registers are stored internally with the correct endianness for the emulator’s target platform, so no byte swapping is necessary.

These functions are not thread-safe. If you call any of them while a TMS9900 program is running in a separate thread, you’ll get a data race.

The TMS9900 memory:

ubyte* memory_begin();
ubyte* memory_end();

class word_iterator;
word_iterator memory_word_begin();
word_iterator memory_word_end();

These functions return non-const iterators to the beginning, and one past the end, of the TMS9900 memory. They can be passed to Standard-Library algorithms for initializing or otherwise directly accessing the emulated memory.

A word_iterator is a random-access iterator. Objects of its reference type adjust the endianness on a little-endian box when dereferenced.

A word_iterator is explicitly constructible from a ubyte*; and a ubyte* may be assigned to it. Such a ubyte* is required to represent an even-numbered address; and a debug build will assert if it doesn’t. A release build will quietly mask off the LSB.

The emulated memory is not thread-safe. If you try to access it while a TMS9900 program is running in a separate thread, you’ll get a data race.

Vectored interrupts:

void request_interrupt(unsigned level);
inline void reset() { request_interrupt(0U); }

request_interrupt(unsigned) queues up a request for one of the vectored interrupts, 0 – 15. A debug build will assert if the argument is greater than 15; a release build will quietly use just the four LSBs of the argument.

reset() requests a level-0 interrupt, which is equivalent to bringing the TMS9900 chip’s RESET pin low.

Running programs:

Two functions are provided for starting TMS9900 program execution.

void start();
void load();

The start() function begins program execution using whatever values are currently in the workspace pointer, program counter and status register, and blocks until instruction execution halts, at which time the function returns.

The load() function first loads the workspace pointer and program counter from the load signal vectors at addresses 0xFFFC and 0xFFFE, clears the status register, resets any pending interrupts, and finally calls start(). Calling load() is thus equivalent to bringing the TMS9900 chip’s LOAD pin low (plus resetting interrupts which peripherals presumably would do on a power-up).

The emulated memory and CPU registers are not thread-safe. If you call one of these functions in one thread and then try to access the internals in another, you’ll get a data race.

One additional function:

void force_halt();

which may be called from another thread, will cause instruction execution to halt if the emulator is stuck in an endless IDLE loop because there’s no interrupt of sufficiently high priority pending.

I/O:

class cru_device
{
protected:
    cru_device(unsigned base_addr, unsigned size);

public:
    virtual ~cru_device();

    unsigned begin() const;
    unsigned end() const;
    unsigned size() const;

    virtual bool tb(uword bit_nbr) = 0;
    virtual void sb(uword bit_nbr, bool value) = 0;

    virtual uword stcr(uword relative_addr, unsigned bit_count) = 0;
    virtual void  ldcr(uword relative_addr, unsigned bit_count, uword value) = 0;
};

This abstract base class is provided to support the Communications Register Unit (CRU).

There is no default constructor, and the copy constructor and copy-assignment operator are private and undefined, thus instances of derived classes are expected to be singletons.

The protected constructor takes two arguments: the CRU address that corresponds to the device’s bit 0, and the number of CRU bits that the device uses. The constructor will install *this at the appropriate place in the CRU address space; and the destructor will correctly remove the device.

Three non-virtual member functions are provided for determining what CRU bits the device occupies. They’re intended as helpers for the CRU instructions themselves; but calling them in user code can do no harm.

begin() returns the constructor’s base_addr argument.
end() returns one past the end of the device’s CRU address space (base_addr + size).
size() returns the constructor’s size argument.

The four pure-virtual member functions are the ones that do the deed. In all cases, the bit_nbr and relative_addr arguments are relative to the device’s base CRU address, and so will always have values in the half-open interval, [0, size()).

The TB instruction calls tb() and uses the returned value to set or clear the “equal” bit in the status register.

The SBO and SBZ instructions call sb() passing true (SBO) or false (SBZ) as the second argument.

As expected, the STCR instruction calls stcr() and the LDCR instruction calls ldcr(). The value returned from stcr() should have, and the value argument passed to ldcr() will have, the correct endianness for the emulator’s target platform. The library takes care of masking unused bits in the value. For example, if bit_count is 10, the library will apply the mask, 0x03FF to both the value returned from stcr() and the value passed to ldcr(). In other words, the user needn’t worry about unused upper bits in the value returned from stcr(), and the value passed to ldcr() is guaranteed to have the unused upper bits all 0.
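The masking rule can be written down in a few lines. This is only a sketch: cru_value_mask() is a hypothetical name, and it assumes that bit_count arrives in the range 1 to 16.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the mask that keeps the low bit_count bits of a
// CRU transfer value. bit_count is assumed to be in the range 1..16.
inline std::uint16_t cru_value_mask(unsigned bit_count)
{
    assert(bit_count >= 1 && bit_count <= 16);
    return static_cast<std::uint16_t>((1ul << bit_count) - 1ul);
}
```

With bit_count equal to 10 this yields 0x03FF, the mask quoted in the text.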

Instruction execution blocks until the four pure-virtual functions return; so accessing the CPU registers or the TMS9900 memory from within these functions cannot, by itself, yield a data race.

It’s not an error to execute a CRU instruction for a CRU address at which no device is installed. The instructions will set and clear status register bits as appropriate but will perform no other operation. In such cases, the TB and STCR instructions will behave as if tb() returned true and stcr() returned all binary 1s to simulate the absence of a peripheral device driving the TMS9900 chip’s input pins.
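To show how the four pure-virtual functions fit together, here is a sketch of a derived device. So that it compiles on its own, it declares a stand-in for cru_device that mirrors the synopsis but does not install itself in any CRU address space; with the real library you would include tms9900.hpp instead. The loopback_latch class and its behavior are invented for illustration, and the LSB-first bit order in stcr() and ldcr() is an assumption about how a device would map value bits to CRU bits.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint16_t uword;   // stand-in for tms9900::uword

// Stand-in for tms9900::cru_device so that this sketch compiles on its own.
// It mirrors the interface in the synopsis but does NOT register itself in
// any CRU address space as the real base class is described as doing.
class cru_device
{
    unsigned base_, size_;
    cru_device(const cru_device&);
    cru_device& operator=(const cru_device&);
protected:
    cru_device(unsigned base_addr, unsigned size) : base_(base_addr), size_(size) { }
public:
    virtual ~cru_device() { }
    unsigned begin() const { return base_; }
    unsigned end()   const { return base_ + size_; }
    unsigned size()  const { return size_; }
    virtual bool  tb(uword bit_nbr) = 0;
    virtual void  sb(uword bit_nbr, bool value) = 0;
    virtual uword stcr(uword relative_addr, unsigned bit_count) = 0;
    virtual void  ldcr(uword relative_addr, unsigned bit_count, uword value) = 0;
};

// Hypothetical device: a 16-bit latch whose inputs read back its outputs.
class loopback_latch : public cru_device
{
    uword bits_;   // CRU bit n of the device is bit n (counting from the LSB)
public:
    explicit loopback_latch(unsigned base_addr)
      : cru_device(base_addr, 16), bits_(0) { }

    bool tb(uword bit_nbr) { return ((bits_ >> bit_nbr) & 1u) != 0; }

    void sb(uword bit_nbr, bool value)
    {
        const uword mask = static_cast<uword>(1u << bit_nbr);
        bits_ = static_cast<uword>(value ? (bits_ | mask) : (bits_ & ~mask));
    }

    // Assumed bit order: bit i of the value maps to CRU bit relative_addr + i.
    // Bits that would fall past the end of the device are ignored.
    uword stcr(uword relative_addr, unsigned bit_count)
    {
        uword value = 0;
        for (unsigned i = 0; i < bit_count && relative_addr + i < size(); ++i)
            if (tb(static_cast<uword>(relative_addr + i)))
                value = static_cast<uword>(value | (1u << i));
        return value;
    }

    void ldcr(uword relative_addr, unsigned bit_count, uword value)
    {
        for (unsigned i = 0; i < bit_count && relative_addr + i < size(); ++i)
            sb(static_cast<uword>(relative_addr + i), ((value >> i) & 1u) != 0);
    }
};
```

Because the copy operations are private, a device like this would normally be created once, for example as a static object, and left in place for the life of the emulator.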

User-Supplied Functions:

The following are declared in the user namespace inside the tms9900 namespace.

Users may provide their own definitions, although default definitions are included in separate source files distributed with the library.

User-defined instructions:

enum high_addr_bits { IDLE = 2, RSET = 3, CKON = 5, CKOF = 6, LREX = 7 };
void user_defined_operation(high_addr_bits);

This function is called when any of the IDLE, RSET, CKON, CKOF and LREX instructions is executed. The argument is the 3-bit value that would be placed on the TMS9900 chip’s A0–A2 address pins.

Instruction execution blocks until this function returns; so accessing the CPU registers or the TMS9900 memory from within this function cannot, by itself, yield a data race.

The RSET instruction clears the interrupt mask after calling this function, so the function can’t change that behavior; but it can test the status register for the value of the interrupt mask when the function was called.

The file, tms9900_default_userdef.cpp, provides a default implementation for users who don’t need to recognize these instructions. If the argument is user::IDLE, it does a proper halt; otherwise, it performs no operation.

If the user provides a definition of this function, in order to correctly implement the IDLE instruction, the function should sit in a loop waiting for tms9900::exit_idle() to return true, which it will if there’s an unmasked interrupt pending or if force_halt() has been called. If the user wishes to implement one or more of the other user-defined instructions while sticking with the library’s default behavior for IDLE, just explicitly call force_halt() if the argument is user::IDLE.
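The shape of such a function might look like the sketch below. The exit_idle() and force_halt() bodies here are stand-ins that exist only so the example compiles without the library; polls_until_interrupt, halted and clock_enabled are invented names. Only the enum and the signature of user_defined_operation() are taken from the synopsis.

```cpp
#include <cassert>

namespace tms9900 {

// Stand-ins so the sketch compiles without the library. In the real
// library, exit_idle() and force_halt() are declared in tms9900.hpp.
static int  polls_until_interrupt = 3;   // hypothetical: fakes a pending interrupt
static bool halted = false;
static bool exit_idle()  { return halted || --polls_until_interrupt <= 0; }
static void force_halt() { halted = true; }

namespace user {

enum high_addr_bits { IDLE = 2, RSET = 3, CKON = 5, CKOF = 6, LREX = 7 };

static unsigned clock_enabled = 0;       // hypothetical state for CKON/CKOF

void user_defined_operation(high_addr_bits op)
{
    switch (op)
    {
    case IDLE:
        // Sit in a loop until an unmasked interrupt is pending or
        // force_halt() has been called. A real emulator might yield here.
        while (!exit_idle()) { }
        break;
    case CKON: clock_enabled = 1; break;
    case CKOF: clock_enabled = 0; break;
    default:   break;                    // RSET and LREX: nothing extra
    }
}

} // namespace user
} // namespace tms9900
```

To keep the library’s default halting behavior for IDLE instead, the IDLE case would call force_halt() rather than loop.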

The library also provides tms9900::reboot() that can be called from user::user_defined_operation() if the user wants LREX to set WP and PC to the LOAD vectors at 0xFFFC, clear ST, and reset any pending interrupts. This is what the author believes that the TI-990’s LREX does.

Invalid opcodes:

void invalid_opcode(uword offending_instruction);

This function is called when an unrecognized instruction appears in the instruction stream. The argument is the 16-bit instruction word. The argument’s endianness is correct for the emulator’s target platform, so no byte swapping is needed.

Instruction execution blocks until this function returns; so accessing the CPU registers or the TMS9900 memory from within this function cannot, by itself, yield a data race.

The file, tms9900_default_badop.cpp, provides a default implementation that just quietly halts instruction execution.

Testing:

A unit test is coming Real Soon Now.

Appendix A, The TMS9900 Registers and Instruction Set:

This is just a brief overview. For more detailed information, see the TMS 9900 Microprocessor Data Manual.

Registers

The TMS9900 has 16 general-purpose “registers” which, in a fit of premature optimization that leaves one absolutely speechless, are implemented in memory so that different parts of a program can use different sets of registers. In principle, this allows really fast context switches; in practice, it yields really slow registers. (Wikipedia’s TMS9900 article claims that this was a rational decision because, at the time, RAM was as fast as, or faster than, the CPU. That’s not the way I remember it; but maybe I remember incorrectly.)

The three actual CPU registers are:

WP Workspace Pointer The address of the current general-purpose register 0

PC Program Counter The address of the next instruction

ST Status Register

Bit	0	1	2	3	4	5	6	7–11	12–15
Field	L>	A>	=	C	O	P	X	(unused)	Interrupt Mask

Status register bits (a subset of the status bits on the TI-990 minicomputer):

Bit(s)	Use
0	“Logical” (unsigned) greater than
1	“Arithmetic” (signed) greater than
2	Equal
3	Carry
4	Overflow
5	Odd parity (byte-oriented instructions only)
6	An `XOP` instruction just got executed
7–11	Don’t care when setting, always read as 0
12–15	The lowest-priority (highest-numbered) interrupt that may interrupt the processor

(Note that a level-0 interrupt is unmaskable because the interrupt mask can’t be less than 0.)
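Because TI numbers bits from the most significant end, testing a status bit takes a little care on the host. The helpers below are a sketch with invented names; they assume the status register value is held in a host-order 16-bit integer, as the reference returned by status_register() is described as being.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers. TI numbers bits from the most significant end,
// so ST bit 0 is 0x8000 and ST bit 15 is 0x0001.
inline std::uint16_t st_bit(unsigned n)
{
    assert(n <= 15);
    return static_cast<std::uint16_t>(0x8000u >> n);
}

inline bool st_test(std::uint16_t st, unsigned n)
{
    return (st & st_bit(n)) != 0;
}

// Bits 12-15 hold the interrupt mask: the four least significant bits.
inline unsigned interrupt_mask(std::uint16_t st) { return st & 0x000Fu; }

// A request at the given level can interrupt when level <= mask,
// which is why level 0 can never be masked.
inline bool interrupt_enabled(std::uint16_t st, unsigned level)
{
    return level <= interrupt_mask(st);
}
```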

Some general-purpose registers have special properties or uses:

Reg.	Use
0	Cannot be used as index register; bit count for shift instructions
11	Return address after `BL` instruction; address passed by `XOP` instruction
12	CRU address
13	Previous `WP` after “context switch” (interrupt, `BLWP`, `XOP`)
14	Previous `PC`
15	Previous `ST`

Memory Organization

Memory is byte-addressable. Sixteen-bit words are stored in big-endian fashion and must be aligned on word boundaries. Signed integers have two’s-complement representation.

Addresses	Reserved for
`0000 - 003F`	Interrupt vectors
`0040 - 007F`	`XOP` instruction vectors
`FFFC - FFFF`	`LOAD` vector

Interrupts

There are 16 vectored interrupts with the new WP and PC values stored, WP first, beginning at absolute address 0. Interrupt n is effectively a BLWP 4×n with the additional behavior of loading the interrupt mask with n−1 (or 0 for level 0), so that only higher-priority interrupts can preempt the handler, and disabling interrupts for one instruction. Pulsing the TMS9900 chip’s RESET pin causes a level-0 interrupt.

There is also a WP/PC pair stored at absolute address FFFC₁₆ which, presumably, is part of the bootstrap ROM. When the TMS9900 chip’s LOAD pin transitions from low to high, we get what can be thought of as “interrupt −1”. This is presumably what happens on a power-up.
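The vector locations follow directly from the memory map. As a sketch (the function and constant names are invented), the address of the WP word of each vector can be computed as below; the matching PC word is always two bytes higher.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: the address of the WP word of each vector.
// The matching PC word always follows at the returned address + 2.
inline std::uint16_t interrupt_vector(unsigned level)
{
    assert(level <= 15);
    return static_cast<std::uint16_t>(4u * level);            // 0000 - 003F
}

inline std::uint16_t xop_vector(unsigned n)
{
    assert(n <= 15);
    return static_cast<std::uint16_t>(0x0040u + 4u * n);      // 0040 - 007F
}

const std::uint16_t load_vector = 0xFFFC;                     // FFFC - FFFF
```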

Communications Register Unit (CRU)

The TMS9900 implements I/O as 4096 individually addressable bits; and all multiple-bit I/O is serial (unless a peripheral does a DMA). The 12-bit “CRU address” is a base address in register 12 with an optional signed offset.

The way the CRU works is to set the TMS9900’s address pins A0–A2 to zero, A3–A14 to the base address in bits 3–14 of R12 (maybe with an offset), and pulse the CRUCLK pin. An input bit will be read from the CRUIN pin; an output bit will be present on the CRUOUT pin.

Single-bit CRU instructions add a signed offset to the base address in R12; multiple-bit CRU instructions increment A3–A14 for each bit.
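In host terms, bits 3–14 of R12 are the twelve bits just above the least significant bit. A minimal sketch of the address arithmetic follows; the function names are invented, and wrapping the sum to 12 bits is an assumption based on the 4096-bit address space.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers. Bits 3-14 of R12 (TI numbering, bit 0 = MSB) are
// host bits 12..1, so the base address is R12 shifted right by one.
inline unsigned cru_base(std::uint16_t r12)
{
    return (r12 >> 1) & 0x0FFFu;
}

// Single-bit instructions add a signed displacement to the base.
// Assumption: the result wraps around within the 12-bit CRU address space.
inline unsigned cru_bit_address(std::uint16_t r12, std::int8_t displacement)
{
    const unsigned offset = static_cast<unsigned>(static_cast<int>(displacement));
    return (cru_base(r12) + offset) & 0x0FFFu;
}
```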

Dual Operand Instructions with Multiple Addressing Modes for Source and Destination

Instruction Format
Bits	0–3	4–9	10–15
Field	Opcode	Destination	Source
Immediate Address?
Immediate Address?

Instructions
Opcode	Mnemonic	Operation	Notes
1010	`A`	Add
1011	`AB`	Add byte
1000	`C`	Compare
1001	`CB`	Compare byte
0110	`S`	Subtract
0111	`SB`	Subtract byte
1110	`SOC`	Set ones corresponding	bitwise OR
1111	`SOCB`	Set ones corresponding byte
0100	`SZC`	Set zeros corresponding	bitwise AND with complement of source
0101	`SZCB`	Set zeros corresponding byte
1100	`MOV`	Move
1101	`MOVB`	Move byte

6-Bit Multi-Mode Addresses
2-Bit Mode	4-Bit Register	Effect	Notes
0	any	register
1	any	register indirect
2	0	absolute	The word following the instruction contains the base address. If both operands use absolute or indexed addressing, the source address comes first.
2	1–15	indexed	Same as for absolute.
3	any	register indirect with post-increment	Increments by one for byte instructions, by two for word instructions.
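The field layout above can be picked apart with a few shifts. The following sketch uses invented type and function names, and assumes the instruction word is already in host byte order; it splits a dual-operand instruction into its opcode and two 6-bit addresses, and counts how many immediate address words follow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for a decoded dual-operand instruction.
struct operand      { unsigned mode; unsigned reg; };
struct dual_operand { unsigned opcode; operand dest; operand source; };

// Assumes the layout: 4-bit opcode, 6-bit destination, 6-bit source,
// each 6-bit address being a 2-bit mode followed by a 4-bit register.
inline dual_operand decode_dual(std::uint16_t instr)
{
    dual_operand d;
    d.opcode      = (instr >> 12) & 0xFu;
    d.dest.mode   = (instr >> 10) & 0x3u;
    d.dest.reg    = (instr >>  6) & 0xFu;
    d.source.mode = (instr >>  4) & 0x3u;
    d.source.reg  =  instr        & 0xFu;
    return d;
}

// Mode 2 (absolute or indexed) takes one extra word per operand.
inline unsigned extra_words(const dual_operand& d)
{
    return (d.dest.mode == 2u ? 1u : 0u) + (d.source.mode == 2u ? 1u : 0u);
}
```

For example, 0xC0C1 decodes as opcode 1100 (MOV) with source register 1 and destination register 3, both in plain register mode, so no extra words follow.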

Dual Operand Instructions with Multiple Addressing Modes for Source and Register Addressing for Destination
The `XOP` Instruction
Multiple Bit CRU Instructions

Instruction Format
Bits	0–2	3–5	6–9	10–15
Field	001	Opcode	Destination	Source
Immediate Address?

Instructions
Opcode	Mnemonic	Operation	Notes
000	`COC`	Compare ones corresponding	For each 1 bit in the source, set `ST` bit 2 if the corresponding bit in the destination is 1 (`COC`) or 0 (`CZC`).
001	`CZC`	Compare zeros corresponding
010	`XOR`	Bitwise exclusive OR
110	`MPY`	Multiply	The 32-bit product or dividend is in Rdest and Rdest+1, big-endian. If Rdest is 15, Rdest+1 is the next word in memory, not R0. For `DIV`, quotient`->`Rdest, remainder`->`Rdest+1.
111	`DIV`	Divide
011	`XOP`	Extended operation
100	`LDCR`	Load CRU	Write
101	`STCR`	Store CRU	Read

The source addressing modes are the same as above. With the possible exception of LDCR and STCR, all are word-oriented instructions, so the source register is incremented by two when using register indirect with auto-increment (addressing mode 3).

XOP is a mechanism for calling one of up to 16 particular subroutines passing a memory address in R11. These subroutines, presumably part of the operating system, would probably do things like read and write characters and strings to/from the TTY or provide other operating-system services.

The new WP and PC for XOP-called routines are stored in pairs of words, WP first, beginning at absolute address 40₁₆; and the XOP’s “destination” (the value 0 to 15, not the contents of a register) times 4 plus 40₁₆, points to the new WP. The computed source address (the address itself, not the contents of that location) is saved in the new R11; and the old WP, PC and ST are saved in the new R13, R14 and R15, respectively. Finally, ST bit 6 is set in case the routine needs to know whether it was called by an XOP or by some other means. The routine is expected to return with the RTWP instruction.

For LDCR, which does output, and STCR, which does input, the instruction’s “destination” field holds a bit count. If the value is 0, the bit count is 16. If the bit count is 1 through 8, this is a byte-oriented instruction; otherwise, it’s a word-oriented instruction. The base CRU address is in bits 3–14 of R12. The external CRU address will be incremented after each bit is transferred, but the contents of R12 are not affected.
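The bit-count rule is easy to get wrong at the edges, so here is a sketch of it. The function names are invented, and the instruction word is assumed to be in host byte order with the 4-bit count in the “destination” field.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the number of bits an LDCR or STCR transfers.
// The 4-bit "destination" field holds the count; 0 means 16.
inline unsigned cru_transfer_count(std::uint16_t instr)
{
    const unsigned field = (instr >> 6) & 0xFu;
    return field == 0 ? 16u : field;
}

// Counts of 1 through 8 make it a byte-oriented instruction.
inline bool cru_transfer_is_byte(unsigned count)
{
    return count >= 1 && count <= 8;
}
```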

Single Operand Instructions with Multiple Addressing Modes

Instruction Format
Bits	0–5	6–9	10–15
Field	0000 01	Opcode	Operand
Immediate Address?
Immediate Address or Operand for `X` Instruction?

Instructions
Opcode	Mnemonic	Operation	Notes
0001	`B`	Branch	Load the operand address into `PC`
1010	`BL`	Branch and link	Save `PC` in R11, then load the operand address into `PC`
0000	`BLWP`	Branch and load workspace pointer	Call subroutine with “context switch” (new `WP`)
0011	`CLR`	Clear	0 `->` operand
1100	`SETO`	Set to ones	`FFFF`₁₆ `->` operand
0101	`INV`	Invert	One’s complement
0100	`NEG`	Negate	Two’s complement
1101	`ABS`	Absolute value
1011	`SWPB`	Swap bytes
0110	`INC`	Increment
0111	`INCT`	Increment by two
1000	`DEC`	Decrement
1001	`DECT`	Decrement by two
0010	`X`	Execute	Execute the instruction at the operand address. If the instruction requires an immediate address or other immediate operand, it will be taken from the word following this `X` instruction, not the word following the executed instruction.

The addressing modes are the same as above. All are word-oriented instructions, so the register is incremented by two when using register indirect with auto-increment (addressing mode 3).

BL and BLWP are the two ways of calling user-defined subroutines.

BL just saves PC in R11. Return with B *R11 (000001 0001 01 1011).

BLWP is used when you want a “context switch,” that is, a whole new set of general-purpose registers. It gets the new WP and PC from the two words beginning at the computed operand address, WP first, and then saves the old WP, PC and ST in the new registers 13, 14 and 15, respectively. As with XOP, return with the RTWP instruction.

Jump Instructions
Single Bit CRU Instructions

Instruction Format
Bits	0–3	4–7	8–15
Field	0001	Opcode	Signed Displacement

Instructions
Opcode	Mnemonic	Operation
0011	`JEQ`	Jump equal
0101	`JGT`	Jump greater than
1011	`JH`	Jump high
0100	`JHE`	Jump high or equal
1010	`JL`	Jump low
0010	`JLE`	Jump low or equal
0001	`JLT`	Jump less than
0000	`JMP`	Jump unconditionally
0111	`JNC`	Jump no carry
0110	`JNE`	Jump not equal
1001	`JNO`	Jump no overflow
1000	`JOC`	Jump on carry
1100	`JOP`	Jump odd parity
1101	`SBO`	Set bit to one
1110	`SBZ`	Set bit to zero
1111	`TB`	Test bit

The jump instructions can branch within a range of +127 to –128 words (not bytes) from the current PC (which points to the word following the instruction while the instruction is executing). Instructions whose names contain “high” and “low” test ST0 (unsigned greater than); instructions whose names contain “greater” and “less” test ST1 (signed greater than).

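A JMP with a displacement of zero (0001 0000 0000 0000), which simply falls through to the next instruction, is the canonical encoding for a no-op. In TI assembler syntax, where $ denotes the address of the jump instruction itself, this is written JMP $+2.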
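The target of a jump can be computed as in the sketch below. jump_target() is an invented name; it takes the address of the jump instruction and the instruction word, both in host byte order, and relies on the PC already pointing at the following word when the displacement is applied.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the address a jump instruction transfers to.
// The displacement is a signed count of words relative to the PC, which
// already points at the word after the jump when the jump executes.
inline std::uint16_t jump_target(std::uint16_t instr_addr, std::uint16_t instr)
{
    int disp = instr & 0xFF;          // low 8 bits of the instruction
    if (disp >= 0x80)
        disp -= 0x100;                // sign-extend to the range -128..+127
    return static_cast<std::uint16_t>((instr_addr + 2 + 2 * disp) & 0xFFFF);
}
```

A displacement of zero therefore lands on the next instruction, and a displacement of −1 (0xFF) jumps back to the jump itself.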

SBO, SBZ and TB are the single-bit CRU instructions. The CRU address is the base address in bits 3–14 of R12 plus the signed displacement in the instruction. TB’s input is into ST2, the “equal” status bit.

Shift Instructions

Instruction Format
Bits	0–4	5–7	8–11	12–15
Field	0000 1	Opcode	Bit Count	Register

Instructions
Opcode	Mnemonic	Operation	Notes
010	`SLA`	Shift left arithmetic	LSB `<-` 0
000	`SRA`	Shift right arithmetic	old MSB `->` MSB
011	`SRC`	Shift right circular	old LSB `->` MSB
001	`SRL`	Shift right logical	0 `->` MSB

SLA is the only shift instruction that can affect ST4 (overflow). All can affect ST0–ST3.

If the bit count field of the instruction is 0, the bit count is the four LSBs of R0. If the four LSBs of R0 are also 0, the bit count is 16. (The joy of shifting a 16-bit word by 16 bits this author has never understood.)
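The two-step fallback for the shift count can be sketched as follows. shift_count() is an invented name; it takes the instruction word and the current contents of R0, both in host byte order.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the effective bit count of a shift instruction.
inline unsigned shift_count(std::uint16_t instr, std::uint16_t r0)
{
    unsigned count = (instr >> 4) & 0xFu;   // the instruction's bit count field
    if (count == 0)
        count = r0 & 0xFu;                  // fall back to the four LSBs of R0
    if (count == 0)
        count = 16;                         // both zero: shift by 16
    return count;
}
```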

Other Instructions

Instruction Format

Bits	0–6	7–10	11	12–15
Field	0000 001	Opcode	?	Register?

Immediate Operand?

Bit 11 is always don’t-care.
Bits 12–15 are don’t-care if the register field isn’t used.

Instructions
Opcode	Mnemonic	Operation	Uses Register Field	Uses Immediate Operand
0001	`AI`	Add immediate	Yes	Yes
0010	`ANDI`	AND immediate	Yes	Yes
0100	`CI`	Compare immediate	Yes	Yes
0000	`LI`	Load immediate	Yes	Yes
0011	`ORI`	OR immediate	Yes	Yes
0111	`LWPI`	Load workspace pointer immediate	No	Yes
1000	`LIMI`	Load interrupt mask immediate	No	Yes
0110	`STST`	Store status	Yes	No
0101	`STWP`	Store workspace pointer	Yes	No
1100	`RTWP`	Return workspace pointer	No	No
1010	`IDLE`	Idle	No	No
1011	`RSET`	Reset	No	No
1110	`CKOF`	[externally defined]	No	No
1101	`CKON`	[externally defined]	No	No
1111	`LREX`	[externally defined]	No	No

RTWP returns from an interrupt or a subroutine called by BLWP or XOP. It restores WP, PC and ST from registers 13, 14 and 15, respectively.

IDLE, RSET, CKON, CKOF and LREX all put bits 8 to 10 of the instruction (the three LSBs of the opcode) onto the chip’s three high address pins and pulse the CRUCLK pin. Presumably, there exists some hardware that will behave appropriately in response. (Note that, despite pulsing CRUCLK, this is not actually a CRU operation because A0–A2 are not all zero.)

RSET has the additional behavior of clearing the interrupt mask.
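Bits 8 to 10 in TI numbering are host bits 7 to 5, so the value placed on the high address pins can be extracted with one shift and a mask. This is a sketch; external_op() is an invented name, and the enum is repeated from the synopsis only so that the example stands alone.

```cpp
#include <cassert>
#include <cstdint>

// Repeated from the synopsis so that the sketch is self-contained.
enum high_addr_bits { IDLE = 2, RSET = 3, CKON = 5, CKOF = 6, LREX = 7 };

// Hypothetical helper: bits 8-10 of the instruction word (TI numbering),
// which are the three LSBs of the 4-bit opcode field.
inline unsigned external_op(std::uint16_t instr)
{
    return (instr >> 5) & 0x7u;
}
```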

IDLE has the additional behavior of repeatedly pulsing CRUCLK while waiting for an interrupt, which is as close as we get to a halt instruction. This lets us optimize the operating system’s idle loop 8-):

loop  0300  LIMI 15
      000F
      0340  IDLE
      10FC  JMP loop

I’ve heard that “CKON”, “CKOF” and “LREX” were mnemonic for things that happened on the TI-990.

Undefined Instructions

The TMS9900 microcomputer had a subset of the TI-990 minicomputer instruction set. On the TMS9900:

Instructions of the form 0000 11xx xxxx xxxx are undefined. These come between the shift instructions (0000 10xx xxxx xxxx) and the jump/1-bit-CRU instructions (0001 xxxx xxxx xxxx).
Instructions of the form 0000 0111 1xxx xxxx are undefined. These come between ABS (0000 0111 01xx xxxx) and the shift instructions (0000 10xx xxxx xxxx).
Instructions of the form 0000 0011 001x xxxx are undefined. These come between LIMI (0000 0011 000x xxxx) and IDLE (0000 0011 010x xxxx).
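The three bit patterns above translate into three mask-and-compare tests. The sketch below uses an invented function name and covers only the patterns listed here; it makes no claim about any other opcode ranges.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate: true if the instruction word matches one of the
// three undefined patterns listed in the text.
inline bool is_listed_undefined(std::uint16_t instr)
{
    return (instr & 0xFC00u) == 0x0C00u    // 0000 11xx xxxx xxxx
        || (instr & 0xFF80u) == 0x0780u    // 0000 0111 1xxx xxxx
        || (instr & 0xFFE0u) == 0x0320u;   // 0000 0011 001x xxxx
}
```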

Appendix B, Notes on Interrupt Concurrency:

Requesting and servicing of interrupts will likely happen in different threads, so we’ll need some kind of mutex to control access to any data structures that those two operations have in common. The library provides three kinds of locks; and users may choose one of those or write their own.

tms9900_locks.hpp declares lock() and unlock() functions in the lock_detail namespace inside the tms9900 namespace. These are the two functions that get implemented in the ways described below.

The library doesn’t provide any kind of try-lock because it’s not clear what to do when the lock fails.

POSIX

For use in a POSIX environment, tms9900_lock_posix.cpp provides a lock that uses a pthread_mutex_t.

Microsoft Windows

Two locks are provided for users of Wintel boxes.

tms9900_lock_win32_spin.cpp provides a spinlock that uses Windows’ InterlockedExchange(long*,long). It might be useful in a small program that runs as relatively few threads all at the same priority; but it’ll be a bad choice in larger programs with many threads running on a single processor, or with threads that have different priorities. Indeed, you could even have a deadlock if a high-priority thread is waiting for a lock to be released by a low-priority thread that never gets scheduled.

tms9900_lock_win32_cs.cpp provides a lock that uses a Windows CRITICAL_SECTION for use where spinlocks won’t do.

All corrections and suggestions will be welcome, all flames will be amusing.
Mail to was at pobox dot com.
