6. EECS 31L / 06 Processor Design (Single-Cycle)

EECS 31L • Study Notes • Processor Design
Mahmoud Elfar • Spring 2026 • v0.1

v0.1: Initial version


6.1. General-Purpose Processors

Loosely speaking, a processor is a piece of hardware that executes instructions. An instruction is a command that tells the hardware to perform a specific operation (ex: add two numbers). But we want our hardware to do more than just addition. As such, the idea of a general-purpose processor emerges: Give me a piece of hardware that can:

Perform a variety of operations (ex: add, subtract, load from memory, etc.)
Perform those operations in different orders (sequence control)
Operate on different data (data control)
Easily change both data and sequence without needing to redesign the hardware (i.e., be programmable)

Let’s rewrite these requirements in more precise terms:

The program must be stored somewhere. A processor does not know in advance what it will compute. The sequence of operations it should perform must be loaded into memory before execution begins, and the processor must be able to read those operations one at a time.
Execution must be step-by-step and repeatable. The processor reads one instruction, executes it, then moves to the next. State from one step must persist to the next. This requires memory elements — not just combinational logic.
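The two requirements above can be sketched as a toy fetch-execute loop. This is only an illustration, not RISC-V: the tuple-based instruction format, the `run` function, and the register names are all made up for this sketch.

```python
# Toy stored-program machine (hypothetical instruction format, not RISC-V).
# The program lives in memory; the processor reads it one instruction at a time.

def run(program, registers):
    """Execute `program` (a list of tuples) and return the final register state."""
    pc = 0  # state that persists from one step to the next
    while pc < len(program):
        op, dst, src1, src2 = program[pc]      # fetch: read one instruction
        if op == "add":                        # execute: perform its operation
            registers[dst] = registers[src1] + registers[src2]
        elif op == "sub":
            registers[dst] = registers[src1] - registers[src2]
        else:
            raise ValueError(f"unknown operation: {op}")
        pc += 1                                # move to the next instruction
    return registers


program = [
    ("add", "r2", "r0", "r1"),  # r2 = r0 + r1
    ("sub", "r3", "r2", "r0"),  # r3 = r2 - r0
]
state = run(program, {"r0": 5, "r1": 7, "r2": 0, "r3": 0})
print(state)
```

Changing the `program` list changes what is computed without touching `run`, which is the sense in which the machine is programmable.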

Everything in this note is a direct consequence of these requirements. Some of the historical details were covered in lecture, but they are not part of your exam material, so we won’t cover them here.


6.2. ISA vs. Microarchitecture

Before building a processor, it is important to separate two distinct concepts that are often conflated.

Instruction Set Architecture (ISA) is the contract between software and hardware. It defines:

What instructions exist and what they mean
How instructions are encoded in binary
What program-visible state exists (registers, memory, program counter)
How each instruction changes that state

The ISA says what the processor must do. It says nothing about how.

Microarchitecture is the hardware implementation of an ISA. Two processors can implement the exact same ISA in completely different ways — different numbers of pipeline stages, different cache organizations, different ALU designs — and both will correctly execute the same programs because they both honor the ISA contract.

In this course, the ISA we use in our examples is RISC-V: a modern, open ISA designed for simplicity. The microarchitecture we will implement is the single-cycle processor: every instruction completes in exactly one clock cycle. Later in your career, you will encounter more complex and sophisticated ISAs and microarchitectures, but the basic principles will be the same.

At some point, we will also give an example of how to trace the execution of a program on a processor that uses a different ISA and microarchitecture than what you implement in the lab. For this, we will use the Emotion Engine, the processor used in the PlayStation 2 console.

In class, we jumped back and forth between ISA and microarchitecture details. In this note, we will follow a more disciplined approach and start with the microarchitecture first.


6.3. Microarchitecture: Datapath Building Blocks

A microarchitecture describes the hardware components that implement an ISA. We can divide the components into two categories:

Datapath: components that hold and manipulate data
Controller: components that generate the control signals that configure the datapath (think select lines in multiplexers) to perform different operations

The components within each:

Datapath:
- Combinational circuits: ALU, adders, multiplexers, immediate generator.
- Memory elements: register file, program counter (PC), instruction memory (IMEM), data memory (DMEM)
Controller:
- Combinational circuits: control logic, ALU control logic
- Memory elements: finite state machine (FSM) state registers
(Bonus) Wiring: I just made this category up but I think it may help you understand the datapath better.
- Data buses: wires that carry data between components, typically wide (ex: 32 bits)
- Address buses: wires that carry addresses, typically narrower (ex: 5 bits for register file read address)
- Control buses: wires that carry control signals, typically 1 bit (ex: RegWrite) or a few bits (ex: ALU control)
- Clock and reset lines: special wires that synchronize and initialize the processor

Our focus here is on the datapath building blocks. We will cover how those blocks are interconnected to form the datapath and the controller in later notes.

6.3.1. The ALU

The Arithmetic Logic Unit (ALU) is the combinational component that performs computations: addition, subtraction, AND, OR, comparison, and so on.

ALU Model:

Input ports: A (first operand), B (second operand), ALU control (selects the operation)
Output ports: Result (output of the selected operation), flags (Zero, Sign, Overflow, Carry_Out)
Behavior: Check the ALU control signal. If it indicates addition, compute Result = A + B, etc. Afterwards, set the flags based on the Result.

Flags are 1-bit signals that describe properties of the result. The flags used in this course are:

Flag	Name	Set to 1 when…
`zero`	Zero	The result is exactly zero (all bits are 0)
`sign`	Sign	The result is negative (the most significant bit is 1)
`overflow`	Overflow	The result overflows (whether signed or unsigned)
`carry_out`	Carry out	The addition or subtraction produced a carry out of the most significant bit (relevant for unsigned arithmetic)

Note: Remember that not all carry outs are overflows. In exams, I won’t ask you to compute or model either flag. You have suffered enough through EECS 31.

There can be more flags. In this class, we will focus only on zero and sign – you must know what they mean.
zero is particularly important for branch instructions. For example, beq subtracts two registers and branches if zero is asserted. What is beq, you ask? “Branch if equal”, but that’s a topic for another day.
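As a rough sketch of what zero and sign mean for a 32-bit result, and of how beq could use zero, consider the following. The function names `flags` and `beq_taken` are hypothetical helpers for this note, not lab code.

```python
MASK32 = 0xFFFFFFFF  # keep values to 32 bits, as the hardware would


def flags(result):
    """Return the zero and sign flags for a 32-bit result."""
    result &= MASK32
    zero = int(result == 0)       # 1 when all 32 bits are 0
    sign = (result >> 31) & 1     # copy of the most significant bit
    return {"zero": zero, "sign": sign}


def beq_taken(a, b):
    """Branch-if-equal decision: subtract, then look at the zero flag."""
    return flags(a - b)["zero"] == 1


print(flags(0))            # {'zero': 1, 'sign': 0}
print(flags(0x80000000))   # {'zero': 0, 'sign': 1}
print(beq_taken(42, 42))   # True
print(beq_taken(42, 41))   # False
```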

Here is an example of the logic diagram (simplified for clarity) of a 32-bit ALU.

Simplified Logic Diagram of a 32-bit ALU

From how the multiplexer is wired, we can deduce the ALU control encoding for each operation:

Operation	ALU control value
ADD	000 (0x0)
SUB	001 (0x1)
XOR	100 (0x4)
OR	101 (0x5)
AND	110 (0x6)
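The encoding table above can be read as a lookup from control value to operation. Below is a minimal behavioral sketch in Python; the names `ALU_OPS` and `alu` are assumptions for illustration. Unlike the hardware, which computes every operation in parallel and lets the multiplexer pick one, this sketch only evaluates the selected operation.

```python
MASK32 = 0xFFFFFFFF  # truncate results to 32 bits

# ALU control encodings taken from the table above.
ALU_OPS = {
    0b000: lambda a, b: a + b,   # ADD
    0b001: lambda a, b: a - b,   # SUB
    0b100: lambda a, b: a ^ b,   # XOR
    0b101: lambda a, b: a | b,   # OR
    0b110: lambda a, b: a & b,   # AND
}


def alu(a, b, alu_control):
    """Behavioral 32-bit ALU: the control value selects one operation."""
    if alu_control not in ALU_OPS:
        raise ValueError(f"unused ALU control value: {alu_control:03b}")
    return ALU_OPS[alu_control](a, b) & MASK32


print(alu(5, 3, 0b000))            # 8
print(alu(5, 3, 0b001))            # 2
print(bin(alu(0b1100, 0b1010, 0b100)))  # 0b110
```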

Some notes:

The way the operations are encoded in the ALU control signal is completely independent of the instruction encoding in the ISA. We will see how one maps to the other in the controller design section.
The multiplexer is one way to implement the ALU control logic. Another way is to use tri-state buffers. As you may have already guessed, the ALU design can be optimized by sharing logic between operations. But the basic principle is the same: an operation is selected by a control signal.
The flags are computed by a separate combinational logic circuit (the “FLAG” block in the diagram). You encountered similar circuits before in the EECS 31 final exam, where the implementation can be as simple as wiring the sign flag to the MSB of the result.

This is a favorite exam question as it tests your understanding of how the ALU works, as well as your ability to read and interpret logic diagrams and multiplexer control signals.


6.3.2. The Register File

Purpose:

A small, fast memory that holds the processor’s working values.
Example: RISC-V has 32 registers, each 32 bits wide.
All arithmetic and logic instructions read (some of) their operands from the register file and write their result back to it.

Construction: You can build a homebrew register file by doing the following:

Bring 32 registers
Wire all their clock inputs together. Do the same for their reset inputs.
Wire all their data inputs together
Wire each register’s write enable to a unique output of a decoder (one with an enable signal)
Wire all data outputs to 32 multiplexers (one multiplexer per bit of the output data), each 32-to-1 (one input per register).

This is getting out of hand. Just check the logic diagram below. You should be able to build something like this – you studied each component before in EECS 31. It is important to understand how the register file works internally, but for 31L exams you only need to know the input and output ports, the behavior, and how to interpret and select the correct values for W and N.

Simplified Logic Diagram of a 32-bit Register File

N×W notation. A register file is described as N×W, or an N W-bit register file, where:

N is the number of registers.
- N determines the width of the address bus: you need ⌈log₂N⌉ bits to index into N registers.
- For 32 registers: log₂32 = 5, so the address bus is 5 bits wide.
W is the width of each register in bits.
- W determines the width of all data buses.
- For RISC-V: W = 32.

A 32×32 register file, or a 32 32-bit register file, has 32 registers, each 32 bits wide, with a 5-bit address bus and 32-bit data buses. I guess it is a bad example of naming, but RISC-V’s register file is 32×32.

A better example:
A 64×16 register file has 64 registers, each 16 bits (2 bytes) wide. Expect the address bus to be 6 bits (log₂64 = 6) and the data buses to be 16 bits wide.
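The N×W arithmetic can be checked with a few lines of Python. The helper names `address_bits` and `describe` are made up for this sketch; `(n - 1).bit_length()` is one way to compute ⌈log₂N⌉ with integers only.

```python
def address_bits(n):
    """Number of address bits needed to index n registers: ceil(log2(n))."""
    if n < 1:
        raise ValueError("a register file needs at least one register")
    return (n - 1).bit_length()


def describe(n, w):
    """Summarize the bus widths of an n x w register file."""
    return f"{n}x{w}: {address_bits(n)}-bit address bus, {w}-bit data buses"


print(describe(32, 32))  # 32x32: 5-bit address bus, 32-bit data buses
print(describe(64, 16))  # 64x16: 6-bit address bus, 16-bit data buses
```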

Reading is done with a multiplexer. The address input selects which register’s output to route to the read data port. Because the MUX is combinational, reads are asynchronous — the output follows the address with no clock required.
Writing is done with a decoder. The write address is decoded into a one-hot enable signal that activates exactly one register’s write input. The write is synchronous — it only takes effect on the rising clock edge, and only when the write enable signal is asserted.
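The two selection circuits can be sketched behaviorally as follows. The function names and the list-based representation of the register outputs are assumptions for illustration, and the clock edge that actually latches the write is not modeled here.

```python
def decoder(write_addr, write_en, n=32):
    """One-hot decoder with enable: at most one of the n outputs is 1."""
    return [int(write_en and i == write_addr) for i in range(n)]


def mux(register_outputs, read_addr):
    """N-to-1 multiplexer: route the selected register's output to the read port."""
    return register_outputs[read_addr]


print(decoder(3, 1, n=8))        # [0, 0, 0, 1, 0, 0, 0, 0]
print(decoder(3, 0, n=8))        # [0, 0, 0, 0, 0, 0, 0, 0]
print(mux([10, 20, 30, 40], 2))  # 30
```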

Why two read ports, one write port. Most instructions read two source registers and write one destination register. Having two independent read ports means both operands are available simultaneously in the same cycle — no need to sequence them. One write port is sufficient because an instruction produces at most one result. Adding a second write port would roughly double the decoder and wiring complexity with no benefit for the instruction set we support.

One example of a port naming convention is given in the diagram and table below. There are many variations; you should get used to encountering a new one, even within the same course.

                +-------------+
rg_rd_addr1 ──→ |             | ──→ rg_rd_data1
rg_rd_addr2 ──→ |    Reg      | ──→ rg_rd_data2
rg_wrt_addr ──→ |    File     |
rg_wrt_data ──→ |    32×32    |
rg_wrt_en   ──→ |             |
clk         ──→ |             |
                +-------------+

Port	Direction	Width (bits)	Name in lab
Read address 1	Input	5	`rg_rd_addr1`
Read address 2	Input	5	`rg_rd_addr2`
Read data 1	Output	32	`rg_rd_data1`
Read data 2	Output	32	`rg_rd_data2`
Write address	Input	5	`rg_wrt_addr`
Write data	Input	32	`rg_wrt_data`
Write enable	Input	1	`rg_wrt_en`

Register File Model:

Input ports: rg_rd_addr1 (read address 1), rg_rd_addr2 (read address 2), rg_wrt_addr (write address), rg_wrt_data (write data), rg_wrt_en (write enable), clk
Output ports: rg_rd_data1 (read data 1), rg_rd_data2 (read data 2)
Behavior:
- rg_rd_data1 always outputs the value of the register at index rg_rd_addr1 (asynchronous read).
- rg_rd_data2 always outputs the value of the register at index rg_rd_addr2 (asynchronous read).
- On the rising edge of clk, if rg_wrt_en is asserted, write rg_wrt_data to the register at index rg_wrt_addr (synchronous write).
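The model above translates fairly directly into a behavioral sketch. The class below reuses the port names from the table; the method names `read` and `rising_edge` are assumptions of this sketch, with one call to `rising_edge` standing in for one rising edge of clk. It deliberately ignores details the note does not cover, such as RISC-V's hardwired-zero register x0.

```python
class RegisterFile:
    """Behavioral N x W register file: two asynchronous reads, one synchronous write."""

    def __init__(self, n=32, w=32):
        self.mask = (1 << w) - 1   # keeps stored values W bits wide
        self.regs = [0] * n

    def read(self, rg_rd_addr1, rg_rd_addr2):
        """Combinational read: outputs follow the addresses, no clock involved."""
        return self.regs[rg_rd_addr1], self.regs[rg_rd_addr2]

    def rising_edge(self, rg_wrt_addr, rg_wrt_data, rg_wrt_en):
        """Call once per rising clock edge; writes only if rg_wrt_en is asserted."""
        if rg_wrt_en:
            self.regs[rg_wrt_addr] = rg_wrt_data & self.mask


rf = RegisterFile()
rf.rising_edge(rg_wrt_addr=5, rg_wrt_data=123, rg_wrt_en=1)
print(rf.read(5, 0))   # (123, 0)
rf.rising_edge(rg_wrt_addr=5, rg_wrt_data=999, rg_wrt_en=0)  # enable low: no write
print(rf.read(5, 5))   # (123, 123)
```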

Memory Elements as Register File Variants:
All other memory elements in the processor are variations of the same concept. What varies is the number of registers, the number of read/output buses (1 or 2), and the number of write buses (1 or none).

6.3.3. Program Counter (PC)

Purpose:

Holds the address of the current instruction.

Construction:

A 1×W register file: one register, W bits wide.
One read port (outputs the current address to instruction memory)
One write port (accepts the next address at the clock edge).
There is no address input because there is only one register; there is nothing to index into.

next_pc ──→ [ PC ] ──→ pc_out
clk     ──→ [    ]
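A minimal behavioral sketch of the PC follows. The class and method names are hypothetical, and the `+ 4` increment in the usage lines is an assumption (32-bit instructions in a byte-addressed memory); this note has not yet said where next_pc comes from.

```python
class ProgramCounter:
    """A 1 x W register: no address input, updated on every rising clock edge."""

    def __init__(self, w=32, reset_value=0):
        self.mask = (1 << w) - 1
        self.pc_out = reset_value & self.mask

    def rising_edge(self, next_pc):
        """Latch next_pc; pc_out holds this value until the next edge."""
        self.pc_out = next_pc & self.mask


pc = ProgramCounter()
for _ in range(3):
    pc.rising_edge(pc.pc_out + 4)  # assumed: next instruction is 4 bytes ahead
print(pc.pc_out)  # 12
```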

6.3.4. Instruction Memory (IMEM)

Purpose:

Stores the program (a program is a sequence of 32-bit instructions the processor will execute).

Construction:

An N×32 register file: N registers, each 32 bits wide.
One read port (accepts the PC value as address; outputs the 32-bit instruction at that address).
No write port; contents are loaded before execution begins and never change during a run.
Because it is read-only, it is implemented as ROM.

            +---------+
pc_out ───→ |  IMEM   | ──→ instruction (32 bits)
            |   N×32  |
            |  ROM    |
            +---------+
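A read-only memory can be sketched as an immutable sequence of words. The class name is hypothetical, the program words are arbitrary placeholder values, and dividing the PC by 4 assumes byte addressing with one 32-bit instruction per word, which this note does not spell out.

```python
class InstructionMemory:
    """N x 32 ROM: one read port, no write port."""

    def __init__(self, program_words):
        # A tuple cannot be modified after creation, mirroring read-only contents.
        self.words = tuple(word & 0xFFFFFFFF for word in program_words)

    def read(self, pc_out):
        """Combinational read: return the instruction stored at address pc_out."""
        return self.words[pc_out >> 2]  # assumed: byte address -> word index


imem = InstructionMemory([0x11111111, 0x22222222, 0x33333333])  # placeholder words
print(hex(imem.read(0)))  # 0x11111111
print(hex(imem.read(4)))  # 0x22222222
```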

6.3.5. Data Memory (DMEM)

Purpose:

Stores the program’s data (ex: arrays, variables, stack).

Construction:

An N×32 register file: N registers, each 32 bits wide.
One read port (accepts an address from the ALU; outputs the 32-bit data at that address).
One write port (accepts an address from the ALU, data from the register file, and a write enable signal; writes the data to that address on the clock edge if enabled).
Because it is read-write, it is implemented as RAM.

               +-------+
addr       ──→ | DMEM  | ──→ read_data
write_data ──→ |  N×32 |
MemWrite   ──→ |  RAM  |
MemRead    ──→ |       |
clk        ──→ |       |
               +-------+
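The diagram can be sketched behaviorally as below. Several choices here are assumptions rather than anything stated in this note: the class and method names, the default depth of 256 words, word-aligned byte addresses, and returning 0 on the read port when MemRead is deasserted.

```python
class DataMemory:
    """N x 32 RAM: one combinational read port, one synchronous write port."""

    def __init__(self, n=256):
        self.words = [0] * n

    def read(self, addr, mem_read):
        """Read port, gated by MemRead (assumed to output 0 when not reading)."""
        return self.words[addr >> 2] if mem_read else 0

    def rising_edge(self, addr, write_data, mem_write):
        """Call once per rising clock edge; writes only if MemWrite is asserted."""
        if mem_write:
            self.words[addr >> 2] = write_data & 0xFFFFFFFF


dmem = DataMemory()
dmem.rising_edge(addr=8, write_data=0xCAFE, mem_write=1)
print(hex(dmem.read(8, mem_read=1)))  # 0xcafe
print(dmem.read(8, mem_read=0))       # 0
```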

6.4. It is all Memory (an insight, not another component)

Every storage element in the processor can be seen as an instance of the same abstraction: a bank of registers, with combinational read and synchronous write. The differences are in scale and access restrictions, not in kind.

Element	Size	Read ports	Write ports	Type
PC	1×W	1	1	Register
IMEM	N×32	1	0	ROM
Register File	32×32	2	1	RAM
DMEM	N×32	1	1	RAM
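One way to see the table as "the same abstraction with different parameters" is to describe each element with the same record and derive its type from the parameters. The `StorageElement` class, its field names, and the depth `N = 1024` are hypothetical choices for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageElement:
    name: str
    n: int            # number of registers (words)
    w: int            # width of each register in bits
    read_ports: int
    write_ports: int

    @property
    def kind(self):
        """Derive the 'Type' column of the table from the parameters."""
        if self.n == 1:
            return "Register"
        return "ROM" if self.write_ports == 0 else "RAM"

    @property
    def address_bits(self):
        """ceil(log2(n)); 0 for a single register, which needs no address."""
        return (self.n - 1).bit_length()


N = 1024  # hypothetical memory depth
ELEMENTS = [
    StorageElement("PC", 1, 32, 1, 1),
    StorageElement("IMEM", N, 32, 1, 0),
    StorageElement("Register File", 32, 32, 2, 1),
    StorageElement("DMEM", N, 32, 1, 1),
]
for e in ELEMENTS:
    print(f"{e.name:14} {e.n}x{e.w}  {e.kind:8} {e.address_bits} address bits")
```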


In the next two study notes, we will see what instructions look like and trace their execution through the datapath.