7. EECS 31L / 07 Instruction Set Architecture (ISA)

EECS 31L • Study Notes • Instruction Set Architecture
Mahmoud Elfar • Spring 2026 • v0.1

v0.1: Initial version



7.1. What is an ISA?

An Instruction Set Architecture (ISA) is the contract between software and hardware. It defines exactly four things:

  1. Instructions : the set of operations the processor can perform.
  2. Encoding : how each instruction is represented as a binary number.
  3. Program-visible state : the registers, memory locations, and program counter that software can observe and modify.
  4. Semantics : how each instruction changes the program-visible state when executed.

Notice what the ISA does not define: how the hardware is built, how fast it runs, how many pipeline stages it has, or how it is implemented in silicon. Those are microarchitecture decisions. The ISA is the what; the microarchitecture is the how.

Two processors with completely different internal designs (different transistor counts, different clock speeds, different die sizes) can run the same program without modification if they implement the same ISA. This separation is what makes software portable.

An even better example: Final Fantasy X (FF-X) is a video game, that is, a program made of a long list of instructions. It runs on the PlayStation 2 because it is written for the ISA of the PS2’s CPU (called the Emotion Engine, or EE). The actual console hardware implements the EE ISA, so it can run the game. At the same time, you can write software that decodes EE instructions and then carries out each decoded operation (e.g., an addition) on your personal computer or laptop. In this case the EE ISA is implemented in software (not hardware) that emulates the hardware, hence the name “emulator”. PCSX2 is an example of a PS2 emulator that we will use in one of the examples in class.

In this course, the ISA we use is RISC-V: a modern, open, and deliberately simple ISA designed for education and industry alike.

7.2. Instructions

An instruction is a single, indivisible command to the processor. It tells the processor to perform one specific operation — add two numbers, load a value from memory, store a value to memory, and so on.

Every instruction exists at three levels of abstraction simultaneously:

Semantics : what the instruction means, described in plain language or mathematical notation.

Bitwise-AND the values in registers x19 and x22, and store the result in register x5.

Assembly : a human-readable symbolic representation of the instruction, using mnemonics and register names.

and x5, x19, x22

Machine code : the binary encoding of the instruction, exactly as the processor reads it from memory.

0000000  10110  10011  111  00101  0110011

These three representations describe the same instruction. The compiler produces assembly from source code. The assembler produces machine code from assembly. The processor reads and executes machine code. You need to be fluent in moving between all three.
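
As a rough illustration, the three levels can be placed side by side in Python. This is only a sketch: the `regs` list and the sample register contents are made up for the example, and Python is used here purely as a calculator, not as part of the course toolchain.

```python
# Three views of the same instruction (illustrative sketch only).

# Assembly: the human-readable form.
assembly = "and x5, x19, x22"

# Machine code: the six fields shown above, joined into one 32-bit word.
fields = "0000000 10110 10011 111 00101 0110011"
word = int(fields.replace(" ", ""), 2)

# Semantics: what the instruction does to program-visible state.
# `regs` is a hypothetical register file with made-up contents.
regs = [0] * 32
regs[19] = 0b1100
regs[22] = 0b1010
regs[5] = regs[19] & regs[22]   # x5 = x19 AND x22

print(assembly)
print(f"0x{word:08X}")   # 0x0169F2B3
print(bin(regs[5]))      # 0b1000
```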

7.3. Instruction Types

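Not all instructions need the same information. Consider three instructions used later in these notes:

  1. and x5, x19, x22 needs two source registers and one destination register.
  2. addi x9, x14, -7 needs one source register, one destination register, and a constant.
  3. sw x17, 48(x6) needs two source registers and a constant, but no destination register.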

If we tried to encode all instructions in a single fixed layout, we would waste bits for instructions that need fewer fields, or run out of space for instructions that need more. Instruction types are the solution: each type defines a specific field layout tailored to the class of instructions that share the same operand structure.

RISC-V uses several types. In this course we focus on three: R-type, I-type, and S-type.

There is a hardware motivation as well. In the datapath, the register file must read rs1 and rs2 before the instruction is decoded — there is not enough time to first determine the type and then fetch the registers. RISC-V solves this by keeping rs1 and rs2 at the same bit positions (bits [19:15] and [24:20]) across all types that use them. The hardware can always extract register indices from the same positions, regardless of type. This is not an accident — it is a deliberate constraint in the ISA design that simplifies the microarchitecture.
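
A small sketch of what "same bit positions" buys: the function below pulls rs1 and rs2 out of a word without knowing its type. The function name is made up for illustration; the two words are the R-type and S-type results from the worked examples in Section 7.6.3.

```python
def read_register_indices(word):
    """Return (rs1, rs2) taken from the fixed positions [19:15] and [24:20]."""
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    return rs1, rs2

# The same two shifts work for an R-type word and an S-type word.
print(read_register_indices(0x0169F2B3))  # and x5, x19, x22 -> (19, 22)
print(read_register_indices(0x03132823))  # sw x17, 48(x6)   -> (6, 17)
```

For an I-type word the bits at [24:20] are part of the immediate, so the hardware would simply ignore the second value it read.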

7.4. Terminology

Term              Meaning
Encoding          The process of converting an instruction from assembly (or semantics) into its binary machine code representation.
Decoding          The process of converting a binary machine code word back into its fields, identifying the instruction, and determining the values of its operands.
Field             A named group of bits within an instruction word that carries one piece of information.
                  Examples: opcode, rd, rs1, funct3. Each field occupies a fixed bit range.
Source            An address – provided by the instruction – of the value to be read.
                  Examples: rs1 means the source register of the first operand.
Destination       An address – provided by the instruction – of the value to be written.
                  Examples: rd means the destination register of the result.
Register operand  A register used as a source or destination of an instruction. Referred to by name in assembly (x5, x19) and by index in machine code (the 5-bit binary value of the register number).
Address           A general term for any value used to index into a memory or register file. Context determines whether it refers to a register address or a memory address.
Register address  The index of a register in the register file. 5 bits wide in RISC-V (since there are 32 registers: log₂32 = 5). Not to be confused with a memory address.
Memory address    The byte address of a location in data memory. 32 bits wide in RV32I. Computed by the ALU during execution (typically rs1 + immediate).
Immediate         A constant value embedded directly in the instruction encoding. Abbreviated as imm. Unlike a register operand, it is not read from the register file; it is extracted from the instruction word itself and sign-extended before use.

About Addresses:
If you are still confused about what an address is, think about a 16-to-1 multiplexer.

The select line values 0000, 0001, 0010, etc. are the addresses. The output of the multiplexer is the data that corresponds to the applied address.
Register and memory addresses work the same way. In fact, both hardware components (register file and data memory) include multiplexers in their construction that use the provided addresses to select which register or memory value to output.
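
The multiplexer analogy can be mimicked in a few lines of Python. This is a loose software model, not a hardware description; `mux16`, `data`, and the stored values are made up for illustration.

```python
def mux16(inputs, select):
    """16-to-1 multiplexer: `select` (4 bits) is the address of the input to output."""
    assert len(inputs) == 16 and 0 <= select < 16
    return inputs[select]

data = [10 * i for i in range(16)]   # made-up input values
print(mux16(data, 0b0010))           # 20

# A register file read works the same way, with a 5-bit address (32 registers).
regs = [0] * 32
regs[0b10011] = 42                   # made-up value stored in x19
print(regs[0b10011])                 # 42
```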

About Register Names:

RISC-V registers are named x0 through x31, or r0 through r31.

Also, r1 and rs1 refer to two different things. r1 means the register with address 0x01. rs1 means the source register field (could be any register, depending on the specific instruction).

7.5. RISC-V Instruction Formats

Before we start: You do not need to memorize the instruction formats. You only need to understand how to read them and use them to encode and decode instructions. In this course, we are using RISC-V as an example. In the exam, you will be given excerpts from the ISA document that include the instruction formats. You should, however, memorize the meaning of each field.

Every instruction in the base RV32I ISA is exactly 32 bits wide. Those 32 bits are divided into named fields. The following fields appear in the R/I/S formats we use in this course:

Field    Bits      Width  Meaning
opcode   [6:0]     7      Identifies the broad instruction category
rd       [11:7]    5      Destination register index
funct3   [14:12]   3      Secondary opcode; distinguishes instructions within a category
rs1      [19:15]   5      First source register index
rs2      [24:20]   5      Second source register index
funct7   [31:25]   7      Tertiary opcode; used in R-type to distinguish e.g. ADD from SUB

Not every type uses all fields. The immediate field replaces some fields where a register index is not needed.
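
The field table translates almost directly into a lookup structure. The sketch below is one possible way to express it in Python; the names `FIELDS` and `extract` are made up for illustration, and the sample word is the R-type word decoded in Section 7.6.3.

```python
# Field positions from the table above: name -> (high bit, low bit).
FIELDS = {
    "opcode": (6, 0),
    "rd": (11, 7),
    "funct3": (14, 12),
    "rs1": (19, 15),
    "rs2": (24, 20),
    "funct7": (31, 25),
}

def extract(word, name):
    """Return the value of one named field of a 32-bit instruction word."""
    hi, lo = FIELDS[name]
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)

word = 0x40F18533
for name, (hi, lo) in FIELDS.items():
    # Print each field in binary, padded to its own width.
    print(f"{name:7s} = {extract(word, name):0{hi - lo + 1}b}")
```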

RISC-V Instruction Formats

In this course, we will focus on three types only: R, I, and S.

7.5.1. R-type

RISC-V R-type

Used for register-to-register arithmetic and logic operations. Both source operands come from registers; the result goes to a register.

31      25 24    20 19    15 14  12 11     7 6      0
+--------+--------+--------+------+---------+-------+
| funct7 |  rs2   |  rs1   |funct3|   rd    |opcode |
| 7 bits | 5 bits | 5 bits |3 bits| 5 bits  | 7 bits|
+--------+--------+--------+------+---------+-------+
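
Read as code, the R-type diagram is a sequence of shifts and ORs. A minimal sketch, assuming a helper named `encode_r` (not part of any course tool) that takes the fields in the same left-to-right order as the diagram:

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the six R-type fields into one 32-bit word."""
    return (
        (funct7 << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | opcode
    )

# and x5, x19, x22
word = encode_r(0b0000000, 22, 19, 0b111, 5, 0b0110011)
print(f"0x{word:08X}")  # 0x0169F2B3
```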

7.5.2. I-type

RISC-V I-type

In these notes, I-type is split into two subtypes: Arithmetic and Load. The difference is in semantics and opcode, not format: both share the same field layout.

I-type (Arithmetic)

Used for arithmetic and logic operations where one operand is a constant (immediate) embedded in the instruction.

31          20 19    15 14  12 11     7 6      0
+------------+--------+------+---------+-------+
|  imm[11:0] |  rs1   |funct3|   rd    |opcode |
|  12 bits   | 5 bits |3 bits| 5 bits  | 7 bits|
+------------+--------+------+---------+-------+

I-type (Load)

Load instructions share the I-type field layout but use a different opcode and semantics. Instead of computing a result to store in a register, they compute a memory address and load a value from that address.

31          20 19    15 14  12 11     7 6      0
+------------+--------+------+---------+-------+
|  imm[11:0] |  rs1   |funct3|   rd    |opcode |
|  12 bits   | 5 bits |3 bits| 5 bits  | 7 bits|
+------------+--------+------+---------+-------+
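
Because both subtypes share one layout, a single packing routine can serve both. The sketch below assumes a hypothetical helper `encode_i`; only the opcode and funct3 values differ between the two calls.

```python
def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack I-type fields; imm is a signed value in the range -2048..2047."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    imm12 = imm & 0xFFF            # keep the low 12 bits (two's complement)
    return (imm12 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

addi = encode_i(-7, 14, 0b000, 9, 0b0010011)   # addi x9, x14, -7
lw = encode_i(20, 8, 0b010, 3, 0b0000011)      # lw x3, 20(x8)
print(f"0x{addi:08X}")  # 0xFF970493
print(f"0x{lw:08X}")    # 0x01442183
```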

7.5.3. S-type (Store)

RISC-V S-type

Store instructions write a register value to memory. They need two source registers (base address and data to store) but no destination register — the result goes to memory. The 12-bit immediate is split into two separate fields to keep rs1 and rs2 at the same bit positions as in R-type.

31       25 24    20 19    15 14  12 11     7 6      0
+---------+--------+--------+------+---------+-------+
|imm[11:5]|  rs2   |  rs1   |funct3|imm[4:0] |opcode |
| 7 bits  | 5 bits | 5 bits |3 bits| 5 bits  | 7 bits|
+---------+--------+--------+------+---------+-------+

The split immediate is the price paid for keeping rs1 and rs2 at fixed positions. If the immediate were contiguous, one of the register fields would have to move, which would complicate the register file read logic in the datapath.
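
The split is easy to see in code. The following sketch assumes a hypothetical helper `encode_s`; it cuts the 12-bit immediate into its upper 7 and lower 5 bits and places them on either side of the register fields.

```python
def encode_s(imm, rs2, rs1, funct3, opcode):
    """Pack S-type fields; imm is a signed value in the range -2048..2047."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    imm12 = imm & 0xFFF
    imm_hi = imm12 >> 5            # imm[11:5], goes to bits [31:25]
    imm_lo = imm12 & 0x1F          # imm[4:0],  goes to bits [11:7]
    return (
        (imm_hi << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (imm_lo << 7)
        | opcode
    )

sw = encode_s(48, 17, 6, 0b010, 0b0100011)     # sw x17, 48(x6)
print(f"0x{sw:08X}")  # 0x03132823
```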

7.6. Encoding and Decoding

🕹️ Check out the Datapath-v2 Simulator. Use it to test your understanding of ISA encoding and decoding.

7.6.1. How to Encode an Instruction

Given an instruction in assembly, produce its 32-bit machine-code word.

Checklist:

  1. Identify the instruction mnemonic (e.g., and, addi, lw, sw).
  2. Look up the instruction in the ISA table.
    Record its type, opcode, funct3, and funct7 if applicable.
  3. Use the instruction type to determine the field layout.
  4. Convert each register name to its 5-bit index.
    Example: x19 → 10011.
  5. If there is an immediate, convert it to a 12-bit binary value (two’s complement if it is negative).
  6. Place each field into its correct bit range.
  7. Concatenate the fields from bit 31 down to bit 0.
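
The checklist can be sketched as a tiny table-driven assembler. This is an illustration under stated assumptions, not a real assembler: the `ISA` dictionary holds only the six instructions that appear in these notes, the type tag "L" is a made-up label for I-type instructions written with load syntax, and the names `reg`, `imm12`, and `assemble` are hypothetical.

```python
import re

# Steps 1-3: mnemonic -> (type, opcode, funct3, funct7).
ISA = {
    "and":  ("R", 0b0110011, 0b111, 0b0000000),
    "sub":  ("R", 0b0110011, 0b000, 0b0100000),
    "addi": ("I", 0b0010011, 0b000, None),
    "ori":  ("I", 0b0010011, 0b110, None),
    "lw":   ("L", 0b0000011, 0b010, None),   # I-type layout, written as rd, imm(rs1)
    "sw":   ("S", 0b0100011, 0b010, None),
}

def reg(name):
    """Step 4: convert a register name such as 'x19' to its index."""
    if not re.fullmatch(r"x\d+", name) or int(name[1:]) > 31:
        raise ValueError(f"not a register name: {name!r}")
    return int(name[1:])

def imm12(text):
    """Step 5: signed decimal text -> 12-bit two's-complement field."""
    value = int(text)
    if not -2048 <= value <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    return value & 0xFFF

def assemble(line):
    """Steps 6-7: place every field and return the 32-bit word."""
    mnemonic, rest = line.split(maxsplit=1)
    kind, opcode, funct3, funct7 = ISA[mnemonic]
    ops = [tok for tok in re.split(r"[,\s()]+", rest) if tok]
    if kind == "R":                                   # rd, rs1, rs2
        rd, rs1, rs2 = (reg(tok) for tok in ops)
        return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode
    if kind == "S":                                   # rs2, imm(rs1)
        rs2, imm, rs1 = reg(ops[0]), imm12(ops[1]), reg(ops[2])
        return ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode
    if kind == "L":                                   # rd, imm(rs1)
        rd, imm, rs1 = reg(ops[0]), imm12(ops[1]), reg(ops[2])
    else:                                             # rd, rs1, imm
        rd, rs1, imm = reg(ops[0]), reg(ops[1]), imm12(ops[2])
    return (imm << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

print(f"0x{assemble('and x5, x19, x22'):08X}")   # 0x0169F2B3
print(f"0x{assemble('sw x17, 48(x6)'):08X}")     # 0x03132823
```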

7.6.2. How to Decode an Instruction

Given a 32-bit machine code word in hexadecimal, identify the instruction and recover all operand values.

Checklist:

  1. Convert the hex word to a 32-bit binary string (4 bits per hex digit).
  2. Extract bits [6:0] → opcode.
  3. Look up the opcode in the ISA table to determine the instruction type (R / I / S).
  4. Extract funct3 from bits [14:12].
  5. If R-type: extract funct7 from bits [31:25].
  6. Use opcode + funct3 (+ funct7 if R-type) to identify the specific instruction.
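  7. Extract register fields based on type:
    R-type: rd [11:7], rs1 [19:15], rs2 [24:20].
    I-type: rd [11:7], rs1 [19:15].
    S-type: rs1 [19:15], rs2 [24:20].
  8. If there is an immediate, reassemble it and read it as a 12-bit two’s-complement number:
    I-type: imm[11:0] = bits [31:20].
    S-type: imm[11:5] = bits [31:25], imm[4:0] = bits [11:7].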
  9. Convert register indices to names (e.g., 00011 → x3).
  10. Write the assembly representation.
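
The decoding checklist can be sketched the same way, in reverse. Again this is an illustration only: `TABLE` covers just the six instructions used in these notes, the type tag "L" is a made-up label for loads, and `bits`, `signed12`, and `decode` are hypothetical helper names.

```python
# (opcode, funct3, funct7) -> (mnemonic, type); funct7 is None when unused.
TABLE = {
    (0b0110011, 0b111, 0b0000000): ("and", "R"),
    (0b0110011, 0b000, 0b0100000): ("sub", "R"),
    (0b0010011, 0b000, None): ("addi", "I"),
    (0b0010011, 0b110, None): ("ori", "I"),
    (0b0000011, 0b010, None): ("lw", "L"),
    (0b0100011, 0b010, None): ("sw", "S"),
}
R_OPCODE = 0b0110011

def bits(word, hi, lo):
    """Return bits [hi:lo] of word as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def signed12(value):
    """Interpret a 12-bit field as a two's-complement number."""
    return value - 4096 if value & 0x800 else value

def decode(word):
    opcode = bits(word, 6, 0)                                  # step 2
    funct3 = bits(word, 14, 12)                                # step 4
    funct7 = bits(word, 31, 25) if opcode == R_OPCODE else None  # step 5
    mnemonic, kind = TABLE[(opcode, funct3, funct7)]           # steps 3 and 6
    rd, rs1, rs2 = bits(word, 11, 7), bits(word, 19, 15), bits(word, 24, 20)
    if kind == "R":
        return f"{mnemonic} x{rd}, x{rs1}, x{rs2}"
    if kind == "S":
        imm = signed12((bits(word, 31, 25) << 5) | bits(word, 11, 7))
        return f"{mnemonic} x{rs2}, {imm}(x{rs1})"
    imm = signed12(bits(word, 31, 20))
    if kind == "L":
        return f"{mnemonic} x{rd}, {imm}(x{rs1})"
    return f"{mnemonic} x{rd}, x{rs1}, {imm}"

print(decode(0x40F18533))   # sub x10, x3, x15
print(decode(0x0055A623))   # sw x5, 12(x11)
```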

7.6.3. Worked Examples

Encoding R-type: and x5, x19, x22

Step 1–2. Look up and in the ISA table: type = R, opcode = 0110011, funct3 = 111, funct7 = 0000000.

Step 3. R-type layout: funct7 | rs2 | rs1 | funct3 | rd | opcode

Step 4. Convert registers: rd = x5 → 00101, rs1 = x19 → 10011, rs2 = x22 → 10110.

Step 5. No immediate.

Step 6–7. Assemble:

funct7    rs2     rs1    funct3   rd     opcode
0000000  10110   10011    111   00101  0110011

Machine code: 00000001011010011111001010110011 = 0x0169F2B3

Encoding I-type (Arithmetic): addi x9, x14, -7

Step 1–2. Look up addi: type = I, opcode = 0010011, funct3 = 000.

Step 3. I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode

Step 4. Convert registers: rd = x9 → 01001, rs1 = x14 → 01110.

Step 5. Immediate = −7. Two’s complement (12 bits): +7 = 000000000111; invert every bit to get 111111111000; add 1 to get 111111111001.

Step 6–7. Assemble:

imm[11:0]       rs1    funct3   rd     opcode
111111111001   01110    000   01001  0010011

Machine code: 11111111100101110000010010010011 = 0xFF970493
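
The two’s-complement step in this example can be double-checked with a short Python sketch. The function name is made up for illustration; it follows the invert-and-add-one recipe and is then compared against Python's own masking of a negative integer.

```python
def negate_12(magnitude):
    """Two's-complement negation in 12 bits: invert all bits, then add 1."""
    inverted = magnitude ^ 0xFFF
    return (inverted + 1) & 0xFFF

field = negate_12(7)
print(format(field, "012b"))       # 111111111001

# Masking a negative Python int to 12 bits gives the same bit pattern.
print(field == (-7 & 0xFFF))       # True
```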

Encoding I-type (Load): lw x3, 20(x8)

Step 1–2. Look up lw: type = I, opcode = 0000011, funct3 = 010.

Step 3. I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode

Step 4. Convert registers: rd = x3 → 00011, rs1 = x8 → 01000.

Step 5. Immediate = 20 = 000000010100 (12 bits, positive, no two’s complement needed).

Step 6–7. Assemble:

imm[11:0]       rs1    funct3   rd     opcode
000000010100   01000    010   00011  0000011

Machine code: 00000001010001000010000110000011 = 0x01442183

Encoding S-type: sw x17, 48(x6)

Step 1–2. Look up sw: type = S, opcode = 0100011, funct3 = 010.

Step 3. S-type layout: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode

Step 4. Convert registers: rs2 = x17 → 10001 (data to store), rs1 = x6 → 00110 (base address).

Step 5. Immediate = 48. Binary (12 bits): 000000110000. Split it into imm[11:5] = 0000001 and imm[4:0] = 10000.

Step 6–7. Assemble:

imm[11:5]  rs2     rs1    funct3  imm[4:0]  opcode
0000001   10001   00110    010    10000    0100011

Machine code: 00000011000100110010100000100011 = 0x03132823

Decoding R-type: 0x40F18533

Step 1. Convert hex to binary:

4        0        F        1        8        5        3        3
0100     0000     1111     0001     1000     0101     0011     0011

Full: 01000000111100011000010100110011

Step 2. opcode = bits [6:0] = 0110011.

Step 3. Look up 0110011 → R-type.

Step 4. funct3 = bits [14:12] = 000. funct7 = bits [31:25] = 0100000.

Step 5. Look up opcode=0110011, funct3=000, funct7=0100000 → sub.

Step 6. Extract registers: rd = bits [11:7] = 01010 → x10, rs1 = bits [19:15] = 00011 → x3, rs2 = bits [24:20] = 01111 → x15.

Step 7. No immediate (R-type).

Assembly: sub x10, x3, x15

Decoding I-type (Arithmetic): 0x00F0E613

Step 1. Convert hex to binary:

0        0        F        0        E        6        1        3
0000     0000     1111     0000     1110     0110     0001     0011

Full: 00000000111100001110011000010011

Step 2. opcode = bits [6:0] = 0010011.

Step 3. Look up 0010011 → I-type arithmetic.

Step 4. funct3 = bits [14:12] = 110.

Step 5. Look up opcode=0010011, funct3=110 → ori.

Step 6. Extract registers: rd = bits [11:7] = 01100 → x12, rs1 = bits [19:15] = 00001 → x1.

Step 7. imm[11:0] = bits [31:20] = 000000001111 = 15. Bit 11 is 0, so the value is positive and sign extension leaves it unchanged.

Assembly: ori x12, x1, 15
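
To see what sign extension does when bit 11 is set, compare the immediate from this example with the one from the addi encoding example. The helpers below are a hypothetical sketch of extending a 12-bit field to a 32-bit value; they are not taken from any course tool.

```python
def sign_extend_12_to_32(imm12):
    """Copy bit 11 of a 12-bit field into bits 31..12 of a 32-bit value."""
    imm12 &= 0xFFF
    if imm12 & 0x800:              # bit 11 set: the value is negative
        return imm12 | 0xFFFFF000
    return imm12

def as_signed_32(value):
    """Interpret a 32-bit pattern as a signed Python int."""
    return value - (1 << 32) if value & 0x80000000 else value

for field in (0b000000001111, 0b111111111001):
    extended = sign_extend_12_to_32(field)
    print(f"{field:012b} -> 0x{extended:08X} = {as_signed_32(extended)}")
# 000000001111 -> 0x0000000F = 15
# 111111111001 -> 0xFFFFFFF9 = -7
```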

Decoding I-type (Load): 0x02412383

Step 1. Convert hex to binary:

0        2        4        1        2        3        8        3
0000     0010     0100     0001     0010     0011     1000     0011

Full: 00000010010000010010001110000011

Step 2. opcode = bits [6:0] = 0000011.

Step 3. Look up 0000011 → I-type load.

Step 4. funct3 = bits [14:12] = 010 → lw (load word).

Step 5. Extract registers: rd = bits [11:7] = 00111 → x7, rs1 = bits [19:15] = 00010 → x2.

Step 6. imm[11:0] = bits [31:20] = 000000100100 = 36. Positive.

Assembly: lw x7, 36(x2)

Decoding S-type: 0x0055A623

Step 1. Convert hex to binary:

0        0        5        5        A        6        2        3
0000     0000     0101     0101     1010     0110     0010     0011

Full: 00000000010101011010011000100011

Step 2. opcode = bits [6:0] = 0100011.

Step 3. Look up 0100011 → S-type store.

Step 4. funct3 = bits [14:12] = 010 → sw (store word).

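Step 5. Extract registers: rs1 = bits [19:15] = 01011 → x11, rs2 = bits [24:20] = 00101 → x5.

Step 6. Reassemble immediate: imm[11:5] = bits [31:25] = 0000000 and imm[4:0] = bits [11:7] = 01100, so imm[11:0] = 000000001100 = 12.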

Assembly: sw x5, 12(x11)
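
The reassembly of a split S-type immediate comes down to one shift and one OR. A minimal sketch, assuming a hypothetical helper `s_immediate` and handling only the unsigned 12-bit pattern (sign extension would follow as a separate step):

```python
def s_immediate(word):
    """Rebuild imm[11:0] of an S-type word from its two pieces."""
    imm_hi = (word >> 25) & 0x7F   # imm[11:5] from bits [31:25]
    imm_lo = (word >> 7) & 0x1F    # imm[4:0]  from bits [11:7]
    return (imm_hi << 5) | imm_lo

print(s_immediate(0x0055A623))   # 12, as in sw x5, 12(x11)
print(s_immediate(0x03132823))   # 48, as in sw x17, 48(x6)
```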

7.7. Reading an ISA Document

The reference document for RISC-V in this course is Harris Appendix B, page 1, specifically Figure B.1 (instruction format diagrams) and Table B.1 (RV32I integer instruction summary). Here is what to look for.

Figure B.1: Format diagrams. Shows the bit-field layout for each instruction type. Read it left to right from bit 31 to bit 0. Each row is one type. This is your first stop when you need to know where a field lives in a given instruction type.

Table B.1: Instruction table. Each row is one instruction. The columns mean:

Column       What it tells you
op           The 7-bit opcode in binary, with its decimal value in parentheses
funct3       The 3-bit secondary opcode
funct7       The 7-bit tertiary opcode (R-type only; means not used)
Type         The instruction format type (R / I / S / …)
Instruction  The assembly syntax
Description  Plain-language meaning
Operation    Precise mathematical definition of what the instruction does to program-visible state

How to use the table for encoding:

  1. Find the instruction by its mnemonic in the Instruction column.
  2. Read off opcode, funct3, funct7, and type from that row.
  3. Use Figure B.1 to determine field positions for that type.

How to use the table for decoding:

  1. Extract the opcode from the binary word.
  2. Find all rows with that opcode. There will typically be several — narrow down using funct3, then funct7 if needed.
  3. The matching row gives you the mnemonic, type, and operation.

What the ISA document does not tell you:
