Computer Architecture

© Tim Margush 2006

Instruction Set Architecture (ISA)

Historically, this is probably the first level of thinking about computer design. This is where the fetch-execute cycle is logically based. The instruction set architecture is the interface between hardware and software. High-level language compilers target a particular ISA when translating programs. Hardware designers build circuits to carry out the instructions found at the ISA level. The abstract ISA specification allows multiple vendors to design underlying hardware with different strengths. It also allows vendors to develop software to run on different manufacturers' chips.

A good ISA design will include instructions that are efficient to implement in current and future technologies and at the same time provide a regular and complete set of operations to support the needs of compilers and other software developers. Failing to meet the first criterion means increased expense in the development and production of the hardware. Failing to meet the second results in more complex or inefficient software. Another compelling factor is backward compatibility. A new processor that can immediately run all or most existing software, but includes new features that will enhance future software, is generally going to be chosen over one with no software base.

The ISA of a processor consists of its memory model, registers, instructions and data types, and additional operational capabilities such as interrupts and timers. These are characteristics that must be understood by the system programmers (assembler, compiler, operating system developers). Although the underlying concepts such as cache memory, pipelining, out-of-order execution, multiple ALU's, etc, may impact design decisions made by system programmers, these topics are usually independent of the ISA.

Memory Models

Memory is easily understood as an array of cells; each cell is typically an 8-bit byte (although a variety of cell sizes from 1 to 60 bits have been used). The 8-bit cell size corresponds nicely to the ASCII character set (with an extra bit for parity), creating a convenient container for character data. A word is 2, 4, or 8 bytes, treated as a single unit. The word size is part of the ISA specification of the memory model and often corresponds to register sizes, data path size, and arithmetic capabilities.

ISA level addresses typically range from 0 through 2^k - 1 for some k. This implies that addresses are k bits in size. Most memories are byte addressable, but some are accessed by word addresses. The storage of words in memory is sometimes restricted to what are called word boundaries. If words are stored consecutively starting at address 0, all will be on word boundaries.

Word size (bits)   Word boundary    Examples
16                 Divisible by 2   0x7342, 0x553C (last bit is 0)
32                 Divisible by 4   0x675C, 0x0004 (last 2 bits are 0)
64                 Divisible by 8   0xC878, 0x45C0 (last 3 bits are 0)

Early processors generally did not require word alignment (the 8088 had an 8-bit bus so word alignment had no bearing on performance). The Pentium 4 processor fetches 64-bit words from memory. The address bus does not even include the lower 3 bits of the address. If running a program targeting an earlier processor that did not require word alignment, the fetch of a 16-bit integer word might require reading 16 bytes (2 64-bit words) if the two bytes needed are at addresses 8n+7 and 8(n+1). The specific bytes must be isolated from the two words and reassembled. The process is complicated even more on this architecture in that words are stored in little-endian order.
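The straddling fetch described above can be sketched in a few lines. This is only an illustration, assuming a 64-bit (8-byte) bus and a flat byte array standing in for memory; the function names are hypothetical.

```python
def is_aligned(addr, word_bytes):
    """True when addr falls on a boundary for words of word_bytes bytes."""
    return addr % word_bytes == 0

def words_touched(addr, size, bus_bytes=8):
    """Addresses of the aligned bus-width words holding bytes addr..addr+size-1."""
    first = addr - addr % bus_bytes
    last_byte = addr + size - 1
    last = last_byte - last_byte % bus_bytes
    return list(range(first, last + 1, bus_bytes))

def fetch_u16(memory, addr, bus_bytes=8):
    """Read a little-endian 16-bit value using only aligned bus-width reads."""
    words = words_touched(addr, 2, bus_bytes)
    gathered = b"".join(memory[w:w + bus_bytes] for w in words)
    offset = addr % bus_bytes            # position of the first wanted byte
    return int.from_bytes(gathered[offset:offset + 2], "little")

memory = bytes(range(32))                # byte at address n holds the value n
print(is_aligned(0x675C, 4))             # True
print(words_touched(15, 2))              # [8, 16]
print(hex(fetch_u16(memory, 15)))        # 0x100f
```

The 16-bit value at address 15 (that is, 8n+7 with n = 1) needs the words at 8 and 16, so 16 bytes are read to recover 2.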

Some ISA's separate program memory (instructions) from data storage. That is, each address can have two different meanings. The advantage of this is that more memory can be addressed (twice as much) without increasing the size of the address field. It also prevents a program from storing data into the instruction area unless the processor is operating in a privileged mode.

Memory semantics refers to how load and store operations are processed. In a complex architecture, multiple memory requests may be active at the same time. This might be due to pipelining, multiple processors (shared memory), cache management, or out-of-order execution. Serialized access simply means each memory access is completed before the next begins, in program order. More complicated schemes support shared memory and provide synchronization capabilities to ensure operations are completed in the appropriate order. Many of these details are creeping into the ISA level when they really belong to the microarchitecture level. This is necessary, however, to allow for the ISA level to utilize the underlying complexities of hardware to its greatest advantage.

Registers

Not all registers discussed at the microarchitecture level (such as MAR and H) are always visible or available at the ISA level. ISA registers are defined to support ISA level operations. Registers may be designated as special-purpose or general-purpose. Special-purpose registers include the PC, stack pointer, indirect addressing pointer, or the ACCumulator (the only register that can receive the result of an ALU operation). CISC architectures are more likely to have many special-purpose registers; RISC architectures usually have more general-purpose registers. The Intel 8088 family has progressed from a few more or less special purpose registers to many more or less general purpose registers. That is, although most registers can be used generally, some operations require the use of a specific register. Even when general registers are completely interchangeable, software practices often designate conventional usage of some registers (such as where a procedure places a return value).

Some architectures specify two operational modes: user and kernel. Some registers are only available when operating in kernel mode. This arrangement provides ISA level support for security features required by modern operating systems. One common ISA level register is the Program Status Word (PSW). Typically this contains the flags (negative, carry, etc), accessible to user mode programs, and other protected information such as interrupt flags, mode, and sometimes even the PC.

Instructions

The core of any ISA description is the instruction set. Many of the hardware details (at least the ISA level view of them) can be guessed by a careful study of the instructions. Instructions have an opcode and possibly operands; they may be of fixed or variable length. Variable length instructions conveniently accommodate 0 or more operands, varying the length depending on the number of options required. Fixed length instructions generally vary the internal format to efficiently utilize the bits allocated to each instruction.

Pentium 4 ISA (Intel IA-32)

Operating modes: real mode, virtual 8086 mode, protected mode, which includes four privilege levels: 0 (kernel) through 3 (user mode) providing different levels of access.

Memory model: 2^46 bytes (about 70 terabytes, almost large enough to hold all of the content on the Internet), organized in 2^14 segments, each 2^32 bytes (4 GB) in size. Word size is 32 bits and words are stored in little-endian order (in memory).
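As a quick check of the arithmetic and of what little-endian storage means, here is a small sketch. The helper name store_word is hypothetical; it simply lists the bytes of a 32-bit word in the order they would appear in memory, lowest address first.

```python
def store_word(value, order):
    """Return the four bytes of a 32-bit word in memory order (lowest address first)."""
    return (value & 0xFFFFFFFF).to_bytes(4, order)

print(store_word(0x12345678, "little").hex(" "))    # 78 56 34 12
print(store_word(0x12345678, "big").hex(" "))       # 12 34 56 78

segments, segment_bytes = 2**14, 2**32
print(segments * segment_bytes == 2**46)            # True
print(round(segments * segment_bytes / 10**12, 1))  # 70.4
```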

Registers: 8 general purpose registers with some restrictions on usage. The E prefix indicates the extension added when the architecture changed from 16 to 32-bit words (80386). EAX's low half is accessed as AX, and each byte as AH and AL. The top four registers are the most general, but convention assigns specific uses: A for arithmetic, B for memory pointers, C for counters, and D for 64-bit data in multiplication and division. The next 4 registers can also serve as general purpose registers, however they have special functions as well. SI and DI are the source and destination index registers used in string operations. BP (base pointer) is used for array access or to indicate the base of the stack frame, and SP is the stack pointer. The xS registers are segment registers, historically used to point to different 64K memory segments (code, stack, data, extra...) in earlier versions of this architecture. These now are generally not needed as long as the processor is running in protected mode and using a single memory segment of 4 GB. The program counter (Instruction Pointer) and flags round out the register set.
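The relationship between EAX, AX, AH, and AL is just a matter of which bits are selected. A minimal sketch, using plain integers to stand in for the register (the function names are hypothetical):

```python
def sub_registers(eax):
    """Split a 32-bit EAX value into the named pieces described above."""
    ax = eax & 0xFFFF                    # low half of EAX
    return {"EAX": eax & 0xFFFFFFFF, "AX": ax, "AH": ax >> 8, "AL": ax & 0xFF}

def write_al(eax, value):
    """Writing AL changes only the lowest byte of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

print({name: hex(v) for name, v in sub_registers(0x12345678).items()})
print(hex(write_al(0x12345678, 0xFF)))   # 0x123456ff
```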

Instructions: The instruction set is very much CISC. There are specialized instructions for string operations, and extensions for audio and video processing. There is a mixture of 8, 16, and 32-bit instructions and a variety of addressing modes. Calculations can directly involve memory locations (not just registers).

UltraSPARC ISA (Version 9 SPARC)

The SPARC architecture does not carry the legacy of the Pentium. It was introduced as a RISC processor from the start. As a result, the ISA level is much cleaner and more consistent.

Operation modes: There are 2: Privileged and Non-Privileged.

Memory model: A straightforward array of 2^64 bytes, far more than can be provided now. Words (64-bit) are stored in big-endian order (but this is customizable). The memory model of this architecture will serve well into the future.

Registers: Backward compatibility has hampered register expansion somewhat. There are 32 general purpose 64-bit registers (R0-R31), although R0 is really quite special and all of them have alternate names indicating conventional usage. In addition, there are 32 floating point registers. From the table, you can see that many of the registers are used to support function calls (parameters, stack frame, return value, etc.). FP is the Frame Pointer and SP the Stack Pointer. Note that the return address is placed in R31 when a procedure call occurs. In fact, there are more than 32 registers. A Register Window allows the upper registers (8 and above) to be relocated. This provides efficient support for procedure calls. The Current Window Pointer (CWP) points to the start of the current register window. This organization is efficient only for procedures with limited numbers of local variables and parameters, and for fairly shallow nesting.

Instructions: The SPARC is a load/store architecture. No arithmetic/logical instructions may involve a memory address. Memory operations are cleanly separated from operations utilizing the internal data path.

8051 ISA

Operating modes: Only 1. This is designed to execute a single application continuously, from power on to power off.

Memory model: This processor uses a simple Harvard architecture with 64K data and 64K instruction spaces. Most commonly, the program is ROM, the data is RAM. In smaller designs, all of the memory is on chip. To support larger memory needs, off chip program and data memories (unified or split) can be attached. Some data addresses are special. For example, the 16 bytes at addresses 32-47 are bit addressable. That is, each of the bit numbers 0-127 corresponds to a bit in one of these bytes. This allows convenient access to bit-fields often needed for real-time control applications.
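The bit-addressable region can be pictured as a mapping from a bit number to a byte address and a position within that byte. The sketch below assumes the natural ordering (bit numbers 0-7 in byte 32, 8-15 in byte 33, and so on); the function names are hypothetical.

```python
def bit_location(bit_number):
    """Map a bit number 0-127 to (byte address, bit position within the byte)."""
    if not 0 <= bit_number <= 127:
        raise ValueError("bit number must be in 0-127")
    return 32 + bit_number // 8, bit_number % 8

def set_bit(ram, bit_number):
    """Set one bit in a bytearray that stands in for internal data memory."""
    address, position = bit_location(bit_number)
    ram[address] |= 1 << position

ram = bytearray(256)
set_bit(ram, 127)
print(bit_location(0), bit_location(127))   # (32, 0) (47, 7)
print(hex(ram[47]))                         # 0x80
```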

Registers: The 8051 contains four sets of 8 single byte registers. One set is active at any given time. This feature supports interrupt processing and multi-tasking as there is no need to save and restore registers. The PSW contains bits selecting the active register set. The registers are also addressable as data storage. This allows applications to use a variable to select a register. The registers appear at addresses 0-7, 8-15, 16-23, and 24-31. There are also special purpose registers (SFR's - Special Function Registers) located in memory at addresses 128-255. These include the ACCumulator and B registers, PSW, stack pointer, timers, address register for data, and IO control registers. The IE (Interrupt Enable) register allows enabling of 6 different interrupts and the IP (Interrupt Priority) register selects between low and high priority for each interrupt.
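Because the register sets live at data addresses 0-31, selecting a set amounts to choosing a base address. In the sketch below, the position of the two select bits within the PSW (bits 4 and 3) is an assumption taken from common 8051 documentation, not from the description above; the function names are hypothetical.

```python
def active_bank(psw):
    """Extract the 2-bit register-set selector (assumed to be PSW bits 4 and 3)."""
    return (psw >> 3) & 0b11

def register_address(psw, n):
    """Data address of register Rn (n in 0-7) in the currently active set."""
    if not 0 <= n <= 7:
        raise ValueError("register number must be in 0-7")
    return active_bank(psw) * 8 + n

print(register_address(0x00, 3))   # 3   (set 0, addresses 0-7)
print(register_address(0x10, 3))   # 19  (set 2, addresses 16-23)
print(register_address(0x18, 7))   # 31  (set 3, addresses 24-31)
```

Switching sets on entry to an interrupt handler changes only the selector bits, which is why no registers need to be saved and restored.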

Instructions: This processor has a fairly compact instruction set, tailored to the 8-bit architecture, and designed for efficient implementation of real-time, embedded applications.

Data Types

Any type of data can be processed on any processor. The key issue is whether the processor supports a particular type. If it does, then operations will typically be carried out quickly at the hardware level. If not, software emulation is required.

Numeric data divides into several categories: integer, floating point, and decimal. Numbers can be signed or unsigned, fixed length, or of varying length. Common integer formats are unsigned, one's complement, two's complement, and biased. The most common floating point representation is specified in IEEE Standard 754. There are 3 subtypes: single precision (32 bits), double precision (64 bits), and extended precision (80 bits). Decimal formats include ASCII, BCD, and packed BCD.
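To make the single precision layout concrete, the following sketch pulls apart the three fields of a 32-bit IEEE 754 value: 1 sign bit, an 8-bit exponent stored in biased form (bias 127), and a 23-bit fraction. The function name is hypothetical.

```python
import struct

def decode_single(x):
    """Return (sign, biased exponent, fraction) of x as an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF       # biased by 127
    fraction = bits & 0x7FFFFF           # 23 bits after the implied leading 1
    return sign, exponent, fraction

print(decode_single(1.0))    # (0, 127, 0)
print(decode_single(-6.5))   # (1, 129, 5242880)
```

Here -6.5 is -1.625 x 2^2, so the stored exponent is 127 + 2 = 129 and the fraction is 0.625 x 2^23.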

Non-numeric data includes Boolean and character data. Boolean is most compactly represented by a single bit, but most hardware does not directly support individual bit manipulation without working with an entire byte. Character data is mostly ASCII, EBCDIC, or UNICODE. ASCII is a 7-bit code and EBCDIC is an 8-bit code. UNICODE is not really a code, but a mapping of characters to numbers. There are several encoding schemes used to represent the UNICODE values as either fixed or variable length values. One common encoding is UTF-16, which represents the most common characters (those with values below 65536) in a single 16-bit unit and all others as a pair of 16-bit units. Processor support for characters usually relies on the underlying support for integers; however, some ISA's include special string processing instructions.
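The distinction between a character's number and its encoding can be seen with Python's built-in codecs. This sketch lists the 16-bit units that UTF-16 produces; characters whose values do not fit in 16 bits take two units. The function name is hypothetical.

```python
def utf16_units(ch):
    """Return the 16-bit code units that encode ch in UTF-16."""
    data = ch.encode("utf-16-be")        # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

print(hex(ord("A")), [hex(u) for u in utf16_units("A")])
# 0x41 ['0x41']
print(hex(ord("\U0001F600")), [hex(u) for u in utf16_units("\U0001F600")])
# 0x1f600 ['0xd83d', '0xde00']
```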

Address (pointer) data is the last common data type. These are unsigned integer values, however ISA's provide a variety of ways to use pointers to access memory. This topic is usually covered under the heading: addressing modes.

The following table indicates support for these common data types for each of the processors we have encountered. All of these processors assume two's complement encoding for signed integers. The UltraSPARC requires word alignment of operands.

Data Type          Pentium 4    UltraSPARC III   8051
bit                no           no               yes
signed int         8,16,32      8,16,32,64       8
unsigned int       8,16,32      8,16,32,64       no
BCD                yes          no               no
Floating Point     32,64,80     32,64,128        no
Character/String   yes          no               no

Instruction Formats

Instructions are encoded in binary. Usually they are designed to fit a byte or a whole number of bytes. Instructions have an opcode and sometimes one or more operands. In some cases, instructions have additional bit fields to select different behaviors. Early instruction designs attempted to pack lots of content into small areas, not wasting any bits. This was due to limited memory and slow access times. Now, longer instructions can be easily accommodated, and wasting bits in favor of easy (fast) decoding is probably a good tradeoff. However, longer instructions require more bandwidth, so as architectures increase fetch speeds, shorter instructions again see an advantage.

Instruction length depends on several factors. Of primary importance is the need to represent all operations. An n-bit opcode can represent 2^n distinct operations, and no more. This determines the minimum size for the opcode field. To allow future expansion of the ISA instruction set, it is important to be generous in this area.

A second factor is addressing. Most instruction sets provide some instructions whose operand is a memory address. Sufficient bits must be available to access any byte or word of storage. Most modern instruction formats support byte-addressable memory with at least 32-bit addresses. Of course, this may imply that an instruction will occupy more than one 32-bit word. In practice, a variety of addressing modes are usually supported with different operand lengths.

The separation of opcode from operands is another interesting design issue. Some ISA's specify a fixed size opcode; others allow the opcode size to vary. Short opcodes provide more bits for operands whereas a longer opcode might use all of the available bits to specify an operation with no or implied operands. As an example, consider a machine that will use a fixed, 16-bit instruction format. Assume instructions need up to three addresses per instruction (registers are sometimes coded by their number which might be called an address). If there are 16 registers, 4-bits per register are needed. An instruction that names 3 registers will leave only 4 bits for the opcode. If there are less than 16 such instructions, at least one of the opcodes is unused. By treating this unused opcode as part of a longer opcode, we can code many more instructions. Two register instructions have another 4 bits that can be devoted to the opcode. Again, if all of these are not needed, some can imply an even longer opcode, perhaps to represent single register instructions. Finally, some opcodes could occupy all of the bits. It would also be probable that some instructions would use a different operand format, perhaps treating the last 8 bits as a byte value, or 9 bits as an address. Possibilities for these variations are shown below. The operands are destination and source (a or b) registers, a constant, or a pointer.

Initial bits                                   Operand 1    Operand 2   Operand 3
0000-1110 (0-E)                                dddd         aaaa        bbbb
11110000-11110101 (F0-F5)                      dddd         aaaa
11110110-11110111 (F6-F7)                      cccccccc
1111100-1111110 (F8-FD)                        ppppppppp
111111110000-111111110011 (FF0-FF3)            aaaa
1111111101000000-1111111111111111 (FF40-FFFF)
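A decoder for the expanding-opcode scheme in the table above might look like the following sketch. It tests the shortest opcode first and falls through to longer ones. The function name and the way results are returned are hypothetical; encodings the table leaves unassigned raise an error.

```python
def decode(word):
    """Split a 16-bit instruction into (opcode, operand fields) per the table above."""
    top4 = word >> 12
    if top4 <= 0xE:                                  # 4-bit opcode, three registers
        return top4, {"d": (word >> 8) & 0xF, "a": (word >> 4) & 0xF, "b": word & 0xF}
    top8 = word >> 8
    if 0xF0 <= top8 <= 0xF5:                         # 8-bit opcode, two registers
        return top8, {"d": (word >> 4) & 0xF, "a": word & 0xF}
    if top8 in (0xF6, 0xF7):                         # 8-bit opcode, byte constant
        return top8, {"c": word & 0xFF}
    top7 = word >> 9
    if 0b1111100 <= top7 <= 0b1111110:               # 7-bit opcode, 9-bit pointer
        return top7, {"p": word & 0x1FF}
    top12 = word >> 4
    if 0xFF0 <= top12 <= 0xFF3:                      # 12-bit opcode, one register
        return top12, {"a": word & 0xF}
    if word >= 0xFF40:                               # 16-bit opcode, no operands
        return word, {}
    raise ValueError(f"unassigned encoding {word:#06x}")

print(decode(0x1234))   # (1, {'d': 2, 'a': 3, 'b': 4})
print(decode(0xF9FF))   # (124, {'p': 511})
print(decode(0xFF25))   # (4082, {'a': 5})
```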

Of course, this idea might be implemented at the byte level, where some instructions are single bytes, others multiple bytes; the opcode in the first byte determines the number of bytes in the instruction. At the opposite extreme, instructions could have varying sizes of any number of bits, even overlapping instructions in bytes.

The Pentium architecture uses variable length instructions with up to 6 variable length fields. The 8088 instructions all had 1 byte opcodes plus the option of a prefix byte. As the instruction set expanded, the opcodes were exhausted. The 0x0F opcode was then used as an escape code to indicate the use of a second opcode byte.

Due to the evolutionary nature of this instruction format, decoding is extremely complex. Several addressing modes are available and nearly all instructions can involve up to one memory location in addition to a register. The ModR/M byte has several fields. The middle (reg) field names a register or is sometimes used as part of the opcode. The Mod and R/M parts together specify one of 24 addressing modes and 8 registers, in various combinations. Some modes require the extra SIB (Scale, Index, Base) byte to name two registers and a scale factor. The rest of the instruction provides an address and/or constant needed for the operation.

The UltraSPARC uses 32-bit, fixed length instructions with expanding opcodes. Most instructions specify an operation and three registers, one of which is the destination for the result. Originally, there were 5 formats. Over time, additional formats were introduced so now there are 31. One feature of this format is that the first 2 bits determine (for the most part) the instruction format. Format 1 handles instructions with two source registers, or a source register and a constant (13 bits). The second format is used to create a 32-bit constant in a register using a sequence of two instructions. The SETHI instruction sets the upper 22 bits. A later instruction (presumably using format 1b) can set the rest. Format 3 provides branching capabilities using PC-relative addressing. The 22-bit displacement (signed) is shortened to 19 bits to accommodate the predictive branch instructions (stealing bits from the displacement to implement new instructions). The CALL instruction has its own format. The opcode is implied by the first 2 bits. The 30-bit displacement is in words, not bytes.
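The two-instruction sequence for building a 32-bit constant can be sketched as follows: SETHI supplies the upper 22 bits (clearing the low 10), and a following instruction with an immediate operand ORs in the remaining 10 bits. The Python functions below are hypothetical stand-ins for the instructions, operating on plain integers.

```python
def split_constant(value):
    """Split a 32-bit constant into its upper 22 bits and lower 10 bits."""
    value &= 0xFFFFFFFF
    return value >> 10, value & 0x3FF

def sethi(imm22):
    """Model of SETHI: place a 22-bit field in the upper bits, zero the low 10."""
    return (imm22 & 0x3FFFFF) << 10

def or_immediate(register, immediate):
    """Model of a follow-up OR with a small immediate constant."""
    return register | immediate

high, low = split_constant(0xDEADBEEF)
register = or_immediate(sethi(high), low)
print(hex(register))   # 0xdeadbeef
```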

The 8051 has 6 formats of 1, 2, or 3 bytes. The single byte instructions imply the operands (the ACCumulator for example) or use 3 bits to name a register. Usually the ACC is involved as a second operand and serves as the destination. Formats 3 and 4 are 2 bytes. In format 3, the second byte is a constant, offset, or bit number. Format 4 combines bits from both bytes to specify an 11-bit address in program memory (branch or call instructions). The 3 byte formats allow a full 16-bit address (format 5) or two separate byte operands which might be a constant and a memory address (format 6).

Addressing Modes

Instructions need to specify operands by location. Some instructions imply their operands and have no visible addressing data. Others use various pieces of information and methods to specify location.

Immediate: The operand is encoded directly in the instruction. That is, the data is immediately available to the processor. No actual address is specified.
Direct: The full address is coded in the instruction. Each use of the instruction will always access the same memory location. Although suitable for global data, this is not useful for local variables or parameters.
Register: The data is in a register. This is not really an addressing mode, except that in many cases the instruction contains a numeric field corresponding to a register number, so it can be considered an address.
Register Indirect: The instruction names or implies a register, but the register contains the address of the data, not the data itself. The register holds a pointer. The advantage of this mode is that the same instruction can easily refer to different memory addresses at different times. This is a common mode for array access and is the foundation for accessing data in stack frames.
Indexed: The instruction contains a constant that is added to a pointer (usually in another register) to determine the memory location referenced by the instruction. This is sometimes called indexed-indirect. Note that the index is a constant (signed or unsigned) that is coded into the instruction, thus the offset is fixed. This is commonly used in two ways.

  1. To access fields in a structure: Each field is at a fixed offset from the start of the structure; the register points to the structure.
  2. To index into an array: The offset is actually the address of the array (which remains constant). The register then contains the offset into the array (in bytes) and is typically modified in a loop.

Based-Indexed: This mode uses two registers. The base register is a pointer. The index register contains an offset. The address accessed is calculated by adding the two registers. In some cases, this addressing mode allows an additional constant offset to be added into the mixture.
Stack: This is a special case of implied addressing. Instructions operating on the stack need not specify an address at all. The address of the data is implied to be located at the address in the SP register.
PC-Relative: This mode generally applies only to branch instructions as it specifies a location in memory relative to the current instruction. If data and code are mixed, it might be used to name data locations as well. Typically, this format includes a displacement (usually signed) that is added to the PC to determine an address. This is really a special case of Indexed addressing. The PC acts as the pointer.
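The modes that refer to memory differ only in how the effective address is computed. The sketch below summarizes those calculations for a hypothetical machine whose registers are held in a dictionary; the mode names, register names, and parameter names are illustrative only. Immediate and Register modes are omitted because they produce no memory address.

```python
def effective_address(mode, regs, reg=None, base=None, index=None, const=0):
    """Compute the memory address an operand refers to under each mode."""
    if mode == "direct":
        return const                              # full address in the instruction
    if mode == "register_indirect":
        return regs[reg]                          # register holds the pointer
    if mode == "indexed":
        return regs[reg] + const                  # pointer plus fixed offset
    if mode == "based_indexed":
        return regs[base] + regs[index] + const   # two registers, optional offset
    if mode == "stack":
        return regs["SP"]                         # implied by the stack pointer
    if mode == "pc_relative":
        return regs["PC"] + const                 # displacement from the PC
    raise ValueError(f"no memory address for mode {mode!r}")

regs = {"R1": 200, "R2": 8, "SP": 300, "PC": 40}
print(effective_address("indexed", regs, reg="R1", const=4))                   # 204
print(effective_address("based_indexed", regs, base="R1", index="R2", const=4))  # 212
print(effective_address("pc_relative", regs, const=-8))                        # 32
```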

Ideally, every instruction should provide all combinations of addressing modes and allow unrestricted use of any register. Such consistent (orthogonal and regular) instruction sets are rarely found in real ISA's. Backward compatibility issues and changing technologies coupled with the desire to provide more functionality generally result in instructions that only work with certain registers or allow only certain types of addressing.

The Pentium 4's addressing mode is generally determined by the ModR/M byte. This table summarizes the possibilities for 32-bit addressing only (there is also 16-bit addressing). This does not include the reg bits that specify another register that participates in the instruction. The first group indicates Indirect addressing, with two special cases. One uses the SIB byte (the next byte) to specify registers and a scale factor (indicating *1, *2, *4, or *8). The other is direct addressing in that there will be a 32-bit displacement following the ModR/M byte.

The second group is indexed addressing with an 8-bit offset. This is sign-extended before adding to the pointer. The third group uses a full 32-bit displacement. The displacement byte(s) follow the ModR/M byte.

The last group is used for register addressing. That is, both operands are in registers. The R/M bits simply determine which register. The actual opcode determines which of the options is used (such as selecting between EAX, AX, AL, or MM0).

When the SIB byte is used, it names two registers, a base and index, resulting in Based-Indexed addressing, possibly with a displacement.
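Splitting the ModR/M and SIB bytes into their fields is a matter of shifts and masks. The sketch below assumes the usual layout from Intel's documentation: 2 bits of mod, 3 bits of reg, and 3 bits of R/M in the ModR/M byte, and 2 bits of scale, 3 bits of index, and 3 bits of base in the SIB byte. The function names are hypothetical.

```python
def split_modrm(byte):
    """Return (mod, reg, rm) from a ModR/M byte."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def split_sib(byte):
    """Return (scale factor, index register number, base register number)."""
    return 1 << ((byte >> 6) & 0b11), (byte >> 3) & 0b111, byte & 0b111

print(split_modrm(0x44))   # (1, 0, 4)
print(split_sib(0x88))     # (4, 1, 0)
```

In this example the mod value 1 selects the group with an 8-bit displacement, and the address would be computed as base + index * 4 + displacement.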

When operating in real or virtual 8086 mode, 16-bit addressing is used. The meaning of the ModR/M byte is similar, but the details are quite different.

The UltraSPARC instructions neatly divide into memory access (LOAD and STORE) and the others that use register or immediate addressing modes. The LOAD and STORE instructions have 2 modes: Indexed which adds a 13-bit signed offset to a pointer register, and Register-Indirect-Indexed (Based-Indexed) which adds two registers to get a pointer. The branch and call instructions all use PC-Relative addressing. This ISA does not support direct addressing.
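The Indexed mode for LOAD and STORE depends on sign-extending the 13-bit offset before it is added to the pointer register. A minimal sketch, with hypothetical function names and 64-bit wraparound assumed for the result:

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):        # sign bit set: value is negative
        return value - (1 << bits)
    return value

def load_address(pointer, simm13):
    """Pointer register plus sign-extended 13-bit offset, wrapped to 64 bits."""
    return (pointer + sign_extend(simm13, 13)) & 0xFFFFFFFFFFFFFFFF

print(sign_extend(0x1FFF, 13))             # -1
print(sign_extend(0x0FFF, 13))             # 4095
print(hex(load_address(0x1000, 0x1FFC)))   # 0xffc
```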

The 8051 has 5 different addressing modes.

  1. Implied or implicit (operand is the ACC)
  2. Register (operand is in the register - the ACC is the other operand if applicable)
  3. Direct (instruction contains the internal RAM address 00-7F or SFR number 80-FF)
  4. Register Indirect (a register holds a pointer - restricted to 256 bytes of memory)
  5. Immediate (operand is part of the instruction)

In addition, to handle potentially large program addresses (16-bits), there are special instructions. LJMP and LCALL use 16-bit addresses. External data memory is also handled differently. A 16-bit pointer register (DPTR) is set as a pointer to external memory and then a read/write operation transfers the byte.

Instruction Types

Instructions are generally classified in groups based on similar actions. The common groups, a brief description, and a few comments follow.

Instruction Type Description Comments
Data Movement Copying data between registers, registers and memory, or memory to memory. This includes stack instructions. These instructions are commonly named MOVE, LOAD, STORE or some variation. Support to move a word is expected. Byte movement or larger block movement may be supported by some ISA's.
Dyadic Operations Also called binary operations since they require two operands and produce a result. These are usually sub-grouped as arithmetic (integer and floating point) or logical (Boolean). Direct support for bit operations is usually not provided at the ISA level. The logical instructions provide a way to set, check, or test individual bits within a byte or word.
Monadic Operations Also called unary operations since they take one operand and produce a result. Bit shifting and rotating as well as the usual NOT, CLR, INC, and NEG. Bit shifts are useful for divisions and multiplications by powers of two.
Comparisons and Branches Branch instructions alter the PC, usually depending on some condition. Conditional branches generally follow an instruction that sets one or more flags, although some ISA's provide instructions that test and branch in one operation. Most dyadic and monadic operations set flags that can be used to control branches. In addition, special compare or test instructions set flags without producing a result. Branch instructions can use a variety of addressing modes, however PC-relative is the most common.
Procedure Calls and Returns The difference between a branch and a call is the ability to return. Procedure calls place the PC into a register or memory (usually on a stack). Return instructions load the PC with the stored return address. In some ISA's, the return address is stored in a slot of program memory, usually at the beginning of the procedure. It is the responsibility of the programmer to store this elsewhere if the procedure might be called again before returning. The same responsibility exists if a register is used. Procedure calls and returns may also include provision for allocating and releasing arguments, and supplying return values. Usually this is the software's responsibility.
Loop Control Specialized loop instructions include support for counters that are automatically updated and tested as part of the conditional branch used to control the loop. Loop control instructions are generally placed at the end of a loop, allowing the branch (if taken) to restart the loop, or to sequence past the loop (exit). Special care must be taken to allow loops to repeat 0 times with this organization.
Programmed I/O with busy waiting A simple IN or OUT instruction allows data to be transferred between a device and register. The processor polls the device to determine when it is free to do the transfer. This often requires additional programming around the transfer to control the device. This type of I/O is most CPU intensive and generally is used only on low-end devices. To support many devices, the transfer instructions generally have an operand that specifies the device (or port).
Interrupt Driven I/O Devices have the capability to notify the processor via an interrupt signal that they are ready or finished. Interrupts only consume a CPU's time when the request occurs. The processor does not need to continually check on a device. Most processors have interrupt enable/disable capabilities and some priority scheme to handle multiple interrupts. This approach still requires frequent attention by the processor when transferring large blocks of data.
DMA I/O Direct Memory Access allows a dedicated processor (DMA controller) to directly access memory to transfer large blocks of data without intervention by the CPU. The CPU initiates a DMA operation and then allows the device to communicate with the DMA controller to transfer data directly to/from memory. The CPU is notified via one interrupt when the transfer is complete. DMA operations may slow the CPU because clock cycles might be stolen from it to fulfil I/O needs.
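The Loop Control entry in the table above warns that a loop tested at the bottom needs special care to repeat 0 times. The sketch below simulates a decrement-and-branch instruction at the end of a loop body. The 8-bit counter width and the function name are assumptions made for illustration.

```python
def bottom_tested_loop(count, guard):
    """Count body executions for a loop whose 8-bit counter is tested at the bottom."""
    iterations = 0
    if guard and count == 0:             # extra test before entering the loop
        return iterations
    while True:
        iterations += 1                  # the loop body runs here
        count = (count - 1) & 0xFF       # decrement wraps like an 8-bit register
        if count == 0:                   # branch back unless the counter hit zero
            break
    return iterations

print(bottom_tested_loop(3, guard=True))    # 3
print(bottom_tested_loop(0, guard=True))    # 0
print(bottom_tested_loop(0, guard=False))   # 256
```

Without the guard, a count of 0 wraps around and the body runs 256 times instead of not at all.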

Comparisons (Pentium, UltraSPARC, 8051)

The ISA's we have been studying are quite different. The Pentium is a classic 2-address, 32-bit CISC architecture with too many addressing modes to remember. The UltraSPARC is a 3-address, 64-bit RISC Load/Store architecture with just 2 addressing modes. The 8051 falls in between, providing a very simple 2-address, 8-bit RISC architecture with 5 addressing modes. The Pentium ISA is complicated by the need for backward compatibility. The UltraSPARC is a great example of an ISA based on a design looking toward the future. The 8051 is a versatile RISC architecture tailored to the needs of embedded applications.

Flow of Control

The GOTO statement is the high-level language equivalent of a jump or branch. Historically, this statement was the basis of implementing every control structure that did not fit into one of those provided by the language. In 1968, Edsger W. Dijkstra published the article, Go To Statement Considered Harmful, that ignited a controversy (war) on the topic of structured programming. Even though modern programming practices discourage the use of GOTO, the ISA level code will by necessity contain many GOTO's (BRA, JMP) that implement the higher level structures (IF, WHILE, etc). When programming at the ISA level, it is a good idea to use branch instructions judiciously to limit the complexity of control structures that are created. Blocks of code with multiple entries and multiple exits are extremely difficult to debug and maintain.

Procedure calls are a variation of the GOTO, allowing the code receiving the thread of control to return to the statement following the call. In a parody of the structured programming war, R. L. Clark proposed to introduce the COME FROM statement in an article published in 1973 in Datamation: A Linguistic Contribution to GOTO-less programming. The call statement is simply a branch that stores the current PC value before overwriting it with the destination address. To support nesting of calls, a stack structure is used to store return addresses (either directly as part of the call instruction, or via software at the start of each procedure that intends to call another procedure).
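The way a stack of return addresses supports nested calls can be seen in a toy interpreter. The instruction names and the program below are hypothetical; the point is only that CALL pushes the already-advanced PC and RET pops it.

```python
def run(program, start=0):
    """Execute a tiny program of (operation, argument) pairs; return what it printed."""
    pc, stack, trace = start, [], []
    while True:
        operation, argument = program[pc]
        pc += 1                          # PC now names the following instruction
        if operation == "CALL":
            stack.append(pc)             # save the return address
            pc = argument                # branch to the procedure
        elif operation == "RET":
            pc = stack.pop()             # resume after the matching CALL
        elif operation == "PRINT":
            trace.append(argument)
        elif operation == "HALT":
            return trace

program = [
    ("PRINT", "main-start"), ("CALL", 4), ("PRINT", "main-end"), ("HALT", None),
    ("PRINT", "outer"), ("CALL", 7), ("RET", None),      # procedure at address 4
    ("PRINT", "inner"), ("RET", None),                   # procedure at address 7
]
print(run(program))   # ['main-start', 'outer', 'inner', 'main-end']
```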

Two special control structures associated with procedure calls are recursive procedures and coroutines. A procedure that calls itself is called recursive. Coroutines are a pair of procedures that can alternate the flow of control between them. A RESUME or YIELD statement might be used to switch control to the opposite routine. Each routine executes sequentially, but upon the resume/yield, execution continues in the coroutine from the point where it was interrupted. Coroutine execution is similar to parallel execution, except that it is accomplished sequentially. Each switch between routines requires saving the resume address and local variables, restoration of the other routine's local variables, and a branch to the resume point in that routine. C# is one of the major modern languages that support coroutines through a YIELD command.
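Python generators give a rough feel for coroutine behavior: each yield suspends a routine and the next request resumes it at the point where it stopped, with its local variables intact. This is only an approximation, since generators always return control to whoever resumed them rather than naming the other routine. The names below are hypothetical.

```python
def routine(name, steps, log):
    """A routine that records each step and then gives up control."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield                            # suspend; resume here on the next request

def alternate(first, second):
    """Switch back and forth between two routines until both finish."""
    waiting = [first, second]
    while waiting:
        current = waiting.pop(0)
        try:
            next(current)                # resume from where it was interrupted
            waiting.append(current)
        except StopIteration:
            pass                         # this routine has finished

log = []
alternate(routine("A", 2, log), routine("B", 2, log))
print(log)   # ['A0', 'B0', 'A1', 'B1']
```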

A trap is similar to an interrupt, but is triggered by an instruction, not an external event. Traps related to arithmetic overflow/underflow or an illegal memory operation are implemented in hardware by simulating a procedure call. The procedure is called a trap handler and may be located at a fixed location of memory, or its location may be specified in a table. Traps handle exceptional situations that are detected by hardware (or microprogram).

Interrupts are triggered by an event independent of the executing program. Whereas a trap can occur only during the execution of certain instructions, an interrupt can occur at any time. The interrupt handler processes the interrupt request and then returns control to the interrupted program. Interrupts are asserted by some logic signal readable by the control unit. When the CU is ready to respond to the interrupt, it initiates an interrupt call which is like a procedure call, but also usually temporarily disables interrupts, saves the processor status (PSW), and acknowledges the interrupt to the device that asserted it. The address of the interrupt handler is usually obtained from a table called the Interrupt Vector table. A special RETI (Return from Interrupt) instruction restores the PSW before returning to the original task. The interrupt service routine must save and restore all registers or risk corrupting the state of the executing program.
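The sequence described above (look up the handler in the vector table, save the PSW, disable interrupts, run the handler, restore the PSW) can be sketched with a small simulation. The class, its fields, and the handler are hypothetical, and the PSW is reduced to two flags.

```python
class CPU:
    def __init__(self):
        self.psw = {"interrupts_enabled": True, "carry": False}
        self.vector_table = {}           # interrupt number -> handler
        self.log = []

    def interrupt(self, number):
        """Respond to an interrupt request; return False if interrupts are disabled."""
        if not self.psw["interrupts_enabled"]:
            return False                 # the request would remain pending
        saved_psw = dict(self.psw)       # save processor status
        self.psw["interrupts_enabled"] = False
        self.vector_table[number](self)  # call the handler via the vector table
        self.psw = saved_psw             # RETI: restore status, re-enabling interrupts
        return True

def timer_handler(cpu):
    cpu.log.append(("timer", cpu.psw["interrupts_enabled"]))
    cpu.psw["carry"] = True              # the handler disturbs a flag

cpu = CPU()
cpu.vector_table[1] = timer_handler
print(cpu.interrupt(1))   # True
print(cpu.log)            # [('timer', False)]
print(cpu.psw)            # {'interrupts_enabled': True, 'carry': False}
```

Restoring the saved PSW hides the handler's change to the carry flag from the interrupted program; general registers would need the same treatment.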

Interrupt routines must be transparent (leave the state of the program unchanged). This is technically not feasible, as interrupts typically alter memory locations that are used by programs. Programs must be designed carefully to avoid problems associated with shared memory updates. When reading two words, it is possible that their contents might be modified by an interrupt after the first is read, but before the second. If this would cause a problem, the program may need to disable interrupts during that section of the program. Interrupt handlers can also be interrupted (if allowed). Most systems provide some interrupt priority assignments to allow time-sensitive interrupts to interrupt less-important interrupt handlers.