1.1.1 Structure and function of the processor
-
The central processing unit (CPU) has many components that allow it to perform its task of executing instructions, including:
- arithmetic logic unit (ALU)
- control unit (CU)
- registers
- buses
-
The arithmetic logic unit (ALU) performs arithmetic and logical operations on data. It can:
- perform mathematical operations (add, subtract, multiply, divide) on numbers
- perform shift operations (moving bits right/left in a register)
- carry out Boolean logic operations, comparing 2 values and using operators like AND, OR, NOT, XOR
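The three categories of ALU operation map directly onto operators available in most programming languages. A minimal Python sketch, using ordinary integers rather than fixed-width hardware registers (the values are illustrative):

```python
a, b = 12, 10   # 0b1100 and 0b1010

# arithmetic operations
print(a + b, a - b, a * b, a // b)   # 22 2 120 1

# shift operations (moving bits left/right)
print(a << 1)   # 24: shifting left by one place doubles the value
print(a >> 2)   # 3:  shifting right by two places divides by four

# Boolean logic operations, comparing the bit patterns of the two values
print(a & b)    # AND -> 8  (0b1100 & 0b1010 = 0b1000)
print(a | b)    # OR  -> 14 (0b1100 | 0b1010 = 0b1110)
print(a ^ b)    # XOR -> 6  (0b1100 ^ 0b1010 = 0b0110)
print(~a)       # NOT -> -13 (Python integers are not fixed-width)
```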
-
The control unit is responsible for:
- controlling the fetch-decode-execute cycle
- managing the execution of instructions by sending control signals to other parts of the computer
- synchronising the computer's actions, using the inbuilt clock
-
Registers are special memory cells which operate at high speed, and are where all arithmetic, logical and shift operations occur. There are several special-purpose registers:
- program counter (PC) - holds the address of the next instruction to be executed
- accumulator (ACC) - stores the results of calculations (performed by ALU)
- memory address register (MAR) - holds the address of a location that is to be read from or written to
- memory data register (MDR) - temporarily stores data read from/written to memory
- current instruction register (CIR) - holds the current instruction being executed
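These registers can be pictured as a small set of named, very fast storage locations inside the CPU. A minimal Python sketch (the class and field names are illustrative, not a real hardware description):

```python
from dataclasses import dataclass

@dataclass
class Registers:
    pc: int = 0    # program counter: address of the next instruction
    acc: int = 0   # accumulator: results of ALU calculations
    mar: int = 0   # memory address register: address to read from / write to
    mdr: int = 0   # memory data register: data just read or about to be written
    cir: int = 0   # current instruction register: instruction being executed

regs = Registers()
regs.pc = 4          # next instruction lives at address 4
regs.mar = regs.pc   # address copied into MAR before a memory read
print(regs)
```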
-
A bus is a set of parallel wires connecting 2 or more components within the computer, allowing the transmission of data between them. The three important ones are:
- data bus - transports data and instructions between components
- address bus - carries the address of the memory location being read from or written to, from the CPU to the memory
- control bus - transmits control signals between components and ensures operations are synchronised; it coordinates the use of the address and data buses by different components and provides status information between system components
The data and control buses are bi-directional (signals can be carried in both directions), but the address bus is unidirectional (the CPU sends addresses to the memory, but never receives any).
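A rough Python sketch of a single memory read, showing which bus carries what. The function name, the pretend memory contents and the "READ" signal are all invented for illustration; real buses are electrical signal lines, not function arguments:

```python
memory = {0: "LDA 7", 1: "ADD 8", 7: 25, 8: 17}   # pretend main memory

def memory_read(address):
    """Simulate one read transaction across the three buses."""
    control_bus = "READ"      # control bus: CPU signals the kind of operation
    address_bus = address     # address bus: CPU -> memory only (unidirectional)
    if control_bus == "READ":
        data_bus = memory[address_bus]   # data bus: memory returns the contents (bi-directional)
        return data_bus

print(memory_read(7))   # 25
```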
Assembly language uses mnemonics to represent instructions, as a simplified way of representing machine code. Each instruction is divided into an opcode (the type of instruction that needs to be executed) and an operand (the data, or the memory address of the data, upon which the operation is to be performed).
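As a sketch of that split, a short Python snippet that separates made-up mnemonics of the form "ADD 14" into opcode and operand (the instruction set here is invented for illustration):

```python
def split_instruction(instruction):
    """Divide an assembly-style instruction into its opcode and operand."""
    opcode, operand = instruction.split()
    return opcode, int(operand)

print(split_instruction("ADD 14"))   # ('ADD', 14)
print(split_instruction("LDA 7"))    # ('LDA', 7)
```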
-
The fetch-decode-execute (FDE) cycle is the sequence of operations completed to execute an instruction. The cycle is repeated for each instruction that is executed.
Fetch Phase:
- memory address of next instruction copied from PC to MAR
- read signal sent across control bus to memory; address from MAR sent across address bus
- instruction at this address returned to CPU along data bus and copied into MDR
- PC incremented (so it holds address of next instruction)
- contents of MDR copied into CIR
Decode Phase:
- contents of CIR sent to the CU to be decoded
- instruction is split into opcode and operand
- opcode is used to determine the type of instruction, and what hardware to use to execute it
- operand holds either:
  - the address of data to be used with the operation (then copied to MAR), or
  - the actual data to be operated on (then copied to MDR)
Execute Phase:
- decoded instruction is executed (appropriate operation indicated by the opcode is carried out on the operand)
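The whole cycle can be tied together in a toy Python simulation. The three-instruction program, the mnemonics and the flat variable names are made up for illustration; the point is the order in which PC, MAR, MDR and CIR are used:

```python
# pretend main memory: a tiny program followed by its data
memory = {0: "LDA 10", 1: "ADD 11", 2: "HLT 0", 10: 25, 11: 17}
pc, acc, mar, mdr, cir = 0, 0, 0, 0, ""

running = True
while running:
    # Fetch
    mar = pc             # address of next instruction copied to MAR
    mdr = memory[mar]    # instruction travels back along the data bus into MDR
    pc += 1              # PC now holds the address of the next instruction
    cir = mdr            # contents of MDR copied into CIR

    # Decode
    opcode, operand = cir.split()
    operand = int(operand)

    # Execute
    if opcode == "LDA":      # load the data at the operand's address into ACC
        acc = memory[operand]
    elif opcode == "ADD":    # add the data at the operand's address to ACC
        acc += memory[operand]
    elif opcode == "HLT":    # stop the simulation
        running = False

print(acc)   # 42
```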
-
There are a number of factors that affect the performance of the CPU, including clock speed, number of cores and the amount of cache memory:
- The clock speed of a processor (measured in hertz, Hz) is the number of fetch-decode-execute (FDE) cycles which it can perform each second. This speed is determined by the system clock, which generates signals alternating between 0 and 1 and synchronises CPU operations, since each CPU operation begins when the clock changes from 0 to 1
- The higher the clock speed, the more FDE cycles that can occur each second, so the better the performance of the CPU
- A core is an independent processor which can run its own FDE cycle. A computer with multiple cores can complete multiple FDE cycles at a time
- The more cores, the more FDE cycles can be executed at once, so the better the performance of the CPU. Theoretically, a dual-core processor should have twice the power of a single-core processor, but in practice this isn't always the case, as the software may not be able to take full advantage of both cores (see the multiprocessing sketch after this list)
- Cache is very fast memory located within the CPU which holds data and instructions copied from main memory in case they are needed again soon. This saves the time that would be taken fetching them again from main memory. As the cache fills up, instructions and data that have not been used recently are replaced. There are different levels of cache, with level 1 being the smallest and closest to the CPU; subsequent levels are larger and further from the CPU
- The more cache a computer has, the more data and instructions can be held in fast memory, so performance is improved (see the cache sketch after this list)
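On the point about cores, the sketch below uses Python's multiprocessing module to spread independent pieces of work across the cores the machine reports. It also hints at why software must be written to exploit the cores: the work has to be divisible into independent chunks. The workload itself is invented for illustration:

```python
import multiprocessing as mp

def busy_work(n):
    """An independent chunk of work that one core can run on its own."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    print("cores reported:", mp.cpu_count())
    chunks = [200_000] * 8          # eight independent chunks of work
    with mp.Pool() as pool:         # one worker process per core by default
        results = pool.map(busy_work, chunks)
    print(sum(results))
```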
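On the point about cache, a very rough Python model of the idea: a small, fast store sits in front of a slow one, and a "hit" avoids the slow fetch. The cache size and the eviction rule (discard the oldest entry) are invented for illustration:

```python
from collections import OrderedDict

main_memory = {addr: addr * 10 for addr in range(100)}   # pretend slow main memory
cache = OrderedDict()                                     # pretend fast cache
CACHE_SIZE = 4
hits = misses = 0

def read(addr):
    global hits, misses
    if addr in cache:            # cache hit: no slow fetch needed
        hits += 1
        return cache[addr]
    misses += 1                  # cache miss: fetch from main memory...
    value = main_memory[addr]
    cache[addr] = value          # ...and keep a copy in case it is needed again soon
    if len(cache) > CACHE_SIZE:  # cache full: discard the oldest entry
        cache.popitem(last=False)
    return value

for addr in [1, 2, 1, 1, 3, 2, 9, 1]:
    read(addr)
print(f"hits={hits} misses={misses}")   # hits=4 misses=4
```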
Pipelining is a process which can be used to improve the performance of a processor. Normally, all the steps in the FDE cycle would take place one after another, leaving some parts of the CPU idle. Pipelining prevents this - while one instruction is being executed, another is being decoded and another is being fetched from memory. Appropriate data is kept in a buffer in close proximity to the CPU until it is required. This doesn't work for code that branches (instructions that cause the execution of a different instruction sequence).
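A small Python sketch of the timing idea behind pipelining: with the three stages overlapping, a new instruction can finish on nearly every clock cycle instead of every third cycle. The printed table is illustrative only and ignores branching:

```python
instructions = ["I1", "I2", "I3", "I4", "I5"]
stages = ["fetch", "decode", "execute"]

# On clock cycle c, the instruction occupying stage s is instruction number c - s.
cycles = len(instructions) + len(stages) - 1
for c in range(cycles):
    row = []
    for s, stage in enumerate(stages):
        i = c - s    # which instruction occupies this stage on this cycle
        cell = instructions[i] if 0 <= i < len(instructions) else "--"
        row.append(f"{stage}:{cell}")
    print(f"cycle {c + 1}:", "  ".join(row))
# Without pipelining, 5 instructions x 3 stages would need 15 cycles; here they finish in 7.
```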
-
A computer's architecture is the approach that is taken to its design.
Von Neumann architecture is based on the stored program concept, where machine code instructions are fetched and executed serially by a processor that performs arithmetic and logical operations.
Von Neumann architecture includes:
- A single control unit
- A single ALU
- A single address, data and control bus
- A single memory store for both instructions and data
- A shared data bus for both instructions and data (prevents reading and writing at the same time)
Von Neumann architecture can be compared with Harvard architecture, which keeps instructions and data in separate memories:
| Von Neumann Architecture | Harvard Architecture |
|---|---|
| Data and programs share memory | Data and programs held in separate memory |
| One bus used to transfer data and instructions | Multiple parallel data and instruction buses used |
| Cheaper to develop as the CU is easier to design | More expensive to develop due to more complex design |
| Slower execution (due to shared bus) | Quicker execution (data & instructions fetched at same time) |
| Only a single, shared memory store (inefficient use of space) | Memories can vary in size, making efficient use of space |
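A loose Python sketch of the difference in memory layout. The class names, and the idea of "fetching" an instruction and a data item in one step, are simplifications for illustration only:

```python
class VonNeumannMachine:
    """One shared memory and one shared bus: an instruction OR data per transfer."""
    def __init__(self, program, data):
        self.memory = list(program) + list(data)   # everything in one store

    def fetch(self, address):
        return self.memory[address]                # same path for both kinds of access


class HarvardMachine:
    """Separate memories and buses: an instruction AND data can be fetched together."""
    def __init__(self, program, data):
        self.instruction_memory = list(program)
        self.data_memory = list(data)

    def fetch(self, instr_address, data_address):
        return self.instruction_memory[instr_address], self.data_memory[data_address]


vn = VonNeumannMachine(["LDA 4", "ADD 5"], [25, 17])
hv = HarvardMachine(["LDA 0", "ADD 1"], [25, 17])
print(vn.fetch(0))       # one item per access
print(hv.fetch(0, 0))    # instruction and data in the same step
```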
Contemporary processor architectures use a combination of the above approaches, modified to deliver optimum performance and efficiency.