### Lecture 2: Computer Abstractions & Technology

- Last Time
  - Course Overview
  - Introduction to Computer Architecture
- Today
  - Recap: Design and computer architecture
  - Computer elements
    - Transistors, wires, pins
  - Introduction to performance
  - Handout: HW #1, due Feb 6

UTCS Lecture 2

Design and computer architecture

## How to design something:

- List goals
- List constraints
- Generate ideas for possible designs
- Evaluate the different designs
- Pick the best design
- Refine it

In reality, this process is iterative.

As constraints change, the best design will change too.

[Use classroom remodel as example of design process]


### Intel 4004 - 1971



- The first microprocessor
- 2,300 transistors
- 108 kHz
- 10 $\mu$m process

#### Intel Core 2 Duo - 2006



- "State of the art"
- 291 million transistors
- 3 GHz
- 0.065 $\mu$m (65 nm) process
- Could fit ~100,000 4004s on this chip!
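
As a rough check on the last bullet, the ratio of the two transistor counts quoted above can be computed directly. This is a sketch by transistor count only; it ignores die area and layout, and the names used are illustrative rather than taken from the lecture.

```python
# Transistor counts as quoted on the two slides above.
TRANSISTORS_4004 = 2_300
TRANSISTORS_CORE2_DUO = 291_000_000


def count_ratio(big: int, small: int) -> float:
    """Number of copies of the smaller design that match the larger one by transistor count."""
    return big / small


ratio = count_ratio(TRANSISTORS_CORE2_DUO, TRANSISTORS_4004)
print(f"about {ratio:,.0f} 4004-sized designs by transistor count")
```

The result is on the order of 100,000, which agrees with the slide's estimate.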


## Changing Technology leads to Changing Architecture

- 1970s
  - multi-chip CPUs
  - semiconductor memory very expensive
  - microcoded control
  - complex instruction sets (good code density)
- 1980s
  - single-chip CPUs, on-chip RAM feasible
  - simple, hard-wired control
  - simple instruction sets
  - small on-chip caches

- 1990s
  - lots of transistors
  - complex control to exploit instruction-level parallelism
- 2000s
  - even more transistors
  - slow wires
  - multi-core chips

# Don't forget the simple view

### All a computer does is

- Store and move data
- Communicate with the external world
- Do these two things conditionally
- According to a recipe specified by a programmer

### It's complex because

- We want it to be fast
- We want it to be reliable and secure
- We want it to be simple to use
- It must obey the laws of physics


# Computer Abstractions & Technology

# Computer Elements

- Transistors (computing)
  - How can they be connected to do something useful?
  - How do we evaluate how fast a logic block is?
- Wires (transporting)
  - What and where are they?
  - How can they be modeled?
- Memories (storing)
  - SRAM vs. DRAM







### Abstractions in Logic Design

- In the physical world
  - Voltages, currents
  - Electron flow
- In the logical-world abstraction
  - V < V<sub>lo</sub> ⇒ "0" = FALSE
  - V > V<sub>hi</sub> ⇒ "1" = TRUE
  - Voltages in between are forbidden
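
The voltage-to-logic mapping above can be sketched as a small function. The threshold values below are assumptions chosen only for illustration (real values depend on the logic family), and `logic_value` is a hypothetical name.

```python
from typing import Optional

# Assumed thresholds for illustration only; not taken from the lecture.
V_LO = 0.8  # volts: below this, the signal reads as logical 0
V_HI = 1.7  # volts: above this, the signal reads as logical 1


def logic_value(v: float, v_lo: float = V_LO, v_hi: float = V_HI) -> Optional[bool]:
    """Map a voltage to FALSE, TRUE, or None for the forbidden region."""
    if v < v_lo:
        return False
    if v > v_hi:
        return True
    return None  # between the thresholds: no valid logical value


for volts in (0.2, 1.2, 2.4):
    print(volts, logic_value(volts))
```

Returning `None` for the forbidden band makes it explicit that a voltage between the thresholds is neither 0 nor 1.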






## Basic Technology: CMOS

- CMOS: Complementary Metal Oxide Semiconductor
  - NMOS (N-Type Metal Oxide Semiconductor) transistors
  - PMOS (P-Type Metal Oxide Semiconductor) transistors
- NMOS Transistor
  - Applying a HIGH (Vdd) to its gate turns the transistor into a "conductor"
  - Applying a LOW (GND) to its gate shuts off the conduction path
- PMOS Transistor
  - Applying a HIGH (Vdd) to its gate shuts off the conduction path
  - Applying a LOW (GND) to its gate turns the transistor into a "conductor"
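
The switch behavior described above can be sketched by treating each transistor as an ideal on/off switch. The inverter and NAND structures below follow the standard textbook CMOS topology (PMOS pull-up network, NMOS pull-down network), which is an assumption here and not necessarily the figure on the slide; all function names are illustrative.

```python
def nmos_conducts(gate: bool) -> bool:
    """NMOS conducts when its gate is HIGH."""
    return gate


def pmos_conducts(gate: bool) -> bool:
    """PMOS conducts when its gate is LOW."""
    return not gate


def inverter(a: bool) -> bool:
    pull_up = pmos_conducts(a)    # PMOS connects the output to Vdd
    pull_down = nmos_conducts(a)  # NMOS connects the output to GND
    assert pull_up != pull_down   # exactly one network conducts
    return pull_up


def nand(a: bool, b: bool) -> bool:
    pull_up = pmos_conducts(a) or pmos_conducts(b)     # two PMOS in parallel
    pull_down = nmos_conducts(a) and nmos_conducts(b)  # two NMOS in series
    assert pull_up != pull_down
    return pull_up


for a in (False, True):
    for b in (False, True):
        print(a, b, nand(a, b))
```

The internal assertions check the "complementary" property: for every input, exactly one of the two networks conducts, so the output is always driven and never shorted.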

Slide courtesy of D. Patterson



Vdd = 2.5 V




# What can you build with transistors?

- Logic Gates
  - Inverters, AND, OR, arbitrary



- Buffers (drive large capacitances, long wires, etc.)
- Memory elements
  - Latches, registers, SRAM, DRAM





## The Ugly Truth

- Transistors are not ideal switches!
  - Gate capacitance ($C_g$)
  - Source-to-drain resistance ($R$)
  - Drain capacitance
- Issues
  - Delay: it takes real time to turn transistors on and off
  - Power/Energy
  - Noise (from transistors, power rails)
- But we can change transistor size
  - A larger transistor increases $C_g$ but decreases $R$
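
The sizing trade-off can be sketched with a first-order model. The assumptions here are that widening a transistor by a factor k divides its resistance by k and multiplies its gate capacitance by k, and that the delay of a driver into a load is roughly R times C. The component values and names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transistor:
    r_ohms: float         # source-to-drain resistance when on
    c_gate_farads: float  # gate capacitance seen by whatever drives this transistor

    def scaled(self, k: float) -> "Transistor":
        """First-order sizing model: k times wider gives R/k and k*C_g."""
        return Transistor(self.r_ohms / k, self.c_gate_farads * k)


def rc_delay(driver: Transistor, c_load_farads: float) -> float:
    """Approximate delay (seconds) of the driver charging a load as R * C."""
    return driver.r_ohms * c_load_farads


# Hypothetical values, for illustration only.
unit = Transistor(r_ohms=10_000.0, c_gate_farads=2e-15)
wide = unit.scaled(4.0)
c_wire = 100e-15  # a fixed load, e.g. a long wire

print("unit driver delay:", rc_delay(unit, c_wire))
print("4x driver delay:  ", rc_delay(wide, c_wire))
print("4x driver input capacitance:", wide.c_gate_farads)
```

In this model the wider transistor drives the fixed load faster, but it presents a larger capacitance to the previous stage, which is the cost of upsizing.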


### Ideal (CS) versus Reality (EE)

- When the input goes 0 -> 1, the output goes 1 -> 0, but NOT instantly
  - Output goes 1 -> 0: output voltage goes from Vdd (2.5 V) to 0 V
- When the input goes 1 -> 0, the output goes 0 -> 1, but NOT instantly
  - Output goes 0 -> 1: output voltage goes from 0 V to Vdd (2.5 V)
- Voltage does not like to change instantaneously
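
One common way to picture the non-instant transition is a first-order RC model, in which a falling output decays exponentially from Vdd toward 0 V. This model, the component values, and the function names are assumptions for illustration; the lecture's own waveform may differ.

```python
import math

VDD = 2.5  # volts, as on the slide


def falling_output(t: float, r: float, c: float, vdd: float = VDD) -> float:
    """Output voltage at time t (seconds) while discharging through R into C."""
    return vdd * math.exp(-t / (r * c))


def time_to_reach(v_target: float, r: float, c: float, vdd: float = VDD) -> float:
    """Time for the falling output to decay from vdd to v_target (0 < v_target <= vdd)."""
    return r * c * math.log(vdd / v_target)


# Hypothetical values: 10 kOhm on-resistance, 100 fF load.
R, C = 10_000.0, 100e-15
t_half = time_to_reach(VDD / 2, R, C)
print("time to reach Vdd/2 (s):", t_half)
print("voltage at that time (V):", falling_output(t_half, R, C))
```

In this model the time to reach Vdd/2 is RC times ln 2, so the delay scales directly with both the resistance and the capacitance.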











## Storage Element's Timing Model



- Setup Time: Input must be stable BEFORE the trigger clock edge
- Hold Time: Input must REMAIN stable after the trigger clock edge
- Clock-to-Q time:
  - Output cannot change instantaneously at the trigger clock edge
  - Similar to delay in logic gates, two components:
    - Internal Clock-to-Q
    - Load dependent Clock-to-Q
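
The three timing parameters can be expressed as simple checks around one clock edge. This is a minimal sketch with times in integer picoseconds; the function names and the numbers in the example are hypothetical.

```python
def meets_setup(t_edge_ps: int, t_last_input_change_ps: int, t_setup_ps: int) -> bool:
    """The input must have stopped changing at least t_setup before the edge."""
    return t_edge_ps - t_last_input_change_ps >= t_setup_ps


def meets_hold(t_edge_ps: int, t_next_input_change_ps: int, t_hold_ps: int) -> bool:
    """The input must not change again until at least t_hold after the edge."""
    return t_next_input_change_ps - t_edge_ps >= t_hold_ps


def output_valid_time_ps(t_edge_ps: int, t_clk_to_q_ps: int) -> int:
    """The output updates one clock-to-Q delay after the edge, not at the edge itself."""
    return t_edge_ps + t_clk_to_q_ps


# Hypothetical example: clock edge at 1000 ps, setup 50 ps, hold 30 ps, clock-to-Q 100 ps.
print(meets_setup(1000, 920, 50))       # input settled 80 ps before the edge
print(meets_hold(1000, 1020, 30))       # input changes again 20 ps after the edge
print(output_valid_time_ps(1000, 100))
```

In this example the setup requirement is met but the hold requirement is violated, because the input changes again too soon after the edge.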

Slide courtesy of D. Patterson


# Clocking Methodology



- All storage elements are clocked by the same clock edge
- The combinational logic block's:
  - Inputs are updated at each clock tick
  - All outputs MUST be stable before the next clock tick
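
The requirement that outputs be stable before the next tick leads to the usual cycle-time bound: the clock period must cover clock-to-Q, the slowest logic path, and the setup time. The sketch below assumes zero clock skew and uses hypothetical delay values and function names.

```python
def min_clock_period_ps(t_clk_to_q_ps: int, t_logic_max_ps: int, t_setup_ps: int) -> int:
    """Shortest period for which the slowest path still meets setup (no clock skew assumed)."""
    return t_clk_to_q_ps + t_logic_max_ps + t_setup_ps


def hold_is_safe(t_clk_to_q_ps: int, t_logic_min_ps: int, t_hold_ps: int) -> bool:
    """The fastest path must not change the next register's input before its hold time ends."""
    return t_clk_to_q_ps + t_logic_min_ps >= t_hold_ps


def max_frequency_hz(period_ps: int) -> float:
    """Convert a period in picoseconds to a frequency in hertz."""
    return 1e12 / period_ps


# Hypothetical delays: clock-to-Q 100 ps, logic 20 to 700 ps, setup 50 ps, hold 30 ps.
period = min_clock_period_ps(100, 700, 50)
print("minimum period (ps):", period)
print("maximum frequency (Hz):", max_frequency_hz(period))
print("hold constraint met:", hold_is_safe(100, 20, 30))
```

Note that the setup bound limits how fast the clock can run, while the hold check does not depend on the clock period at all.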

Slide courtesy of D. Patterson






### Wires

- Limiting Factor
  - Density
  - Speed
  - Power
- 3 models for wires (model to use depends on switching frequency)


# Wire Density

- Communication constraints
  - Must be able to move bits to/from storage and computation elements
- Example: 9-ported register file
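
A rough way to see why a many-ported register file is wire-limited is to count wire tracks per storage cell. The model below is an assumption, not something stated on the slide: each port adds one wordline track in one dimension and one bitline track in the other, on top of a fixed base cell. The base dimensions and the function name are hypothetical.

```python
def cell_area_tracks(ports: int, base_width: int = 4, base_height: int = 4) -> int:
    """Cell area in wire-track units, assuming one bitline and one wordline per port."""
    width = base_width + ports    # one bitline track per port
    height = base_height + ports  # one wordline track per port
    return width * height


for p in (1, 2, 9):
    print(p, "ports ->", cell_area_tracks(p), "track^2")
```

Under this assumption the cell area grows roughly with the square of the port count, so the wires, not the storage transistors, dominate a 9-ported cell.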








### Rack Level





DOE ASCI White

MIT J-Machine


# Memory

- Moves information in time (wires move it in space)
- Provides state
- Requires energy to change state
  - Feedback circuit (SRAM)
  - Capacitors (DRAM)
  - Magnetic media (disk)
- Required for memories
  - Storage medium
  - Write mechanism
  - Read mechanism



4Gb DRAM Die

### Technology Scaling Trends

- CPU transistor density: +60% per year
- CPU transistor speed: +15% per year
- DRAM density: +60% per year
- DRAM speed: +3% per year
- On-chip wire speed decreasing relative to transistors (witness the Pentium 4 pipeline)
- Off-chip pin bandwidth increasing, but slowly
- Power is approaching cost limits
  - $P = CV^2f + I_{leak}V$
- All of these factors affect the end system architecture
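
To get a feel for what these annual rates imply, they can be compounded over several years. The sketch below assumes each rate stays constant and compounds yearly; the dictionary keys and function names are illustrative.

```python
import math

# Annual improvement rates quoted in the list above.
ANNUAL_RATES = {
    "cpu_density": 0.60,
    "cpu_speed": 0.15,
    "dram_density": 0.60,
    "dram_speed": 0.03,
}


def growth_factor(rate: float, years: float) -> float:
    """Total improvement after compounding a constant annual rate."""
    return (1.0 + rate) ** years


def doubling_time_years(rate: float) -> float:
    """Years needed to double at a constant annual rate."""
    return math.log(2) / math.log(1.0 + rate)


def speed_gap(years: float) -> float:
    """How much faster CPU transistors get relative to DRAM over the given span."""
    cpu = growth_factor(ANNUAL_RATES["cpu_speed"], years)
    dram = growth_factor(ANNUAL_RATES["dram_speed"], years)
    return cpu / dram


print("density doubling time (years):", doubling_time_years(ANNUAL_RATES["cpu_density"]))
print("CPU vs DRAM speed gap after 10 years:", speed_gap(10))
```

At these rates density doubles about every year and a half, while the gap between CPU and DRAM speed widens by roughly a factor of three per decade.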


### Summary

- Logic Transistors + Wires + Storage = Computer!
- Transistors
  - Composable switches
  - Electrical considerations
    - Delay from parasitic capacitors and resistors
    - Power ($P = CV^2f$)
- Wires
  - Becoming more important from delay and BW perspective
- Memories
  - Density, Access time, Persistence, BW
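
The power relation $P = CV^2f$, together with the leakage term $I_{leak}V$ from the scaling-trends slide, can be evaluated numerically to show how strongly supply voltage matters. The values and function names below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def dynamic_power(c_farads: float, v_volts: float, f_hz: float) -> float:
    """Switching power: P = C * V^2 * f."""
    return c_farads * v_volts * v_volts * f_hz


def total_power(c_farads: float, v_volts: float, f_hz: float, i_leak_amps: float) -> float:
    """Switching power plus leakage power: P = C * V^2 * f + I_leak * V."""
    return dynamic_power(c_farads, v_volts, f_hz) + i_leak_amps * v_volts


# Hypothetical chip: 1 nF switched capacitance, 1.0 V supply, 1 GHz, 0.5 A leakage.
C, V, F, I_LEAK = 1e-9, 1.0, 1e9, 0.5

print("total power (W):", total_power(C, V, F, I_LEAK))
print("dynamic power at 0.8 V relative to 1.0 V:",
      dynamic_power(C, 0.8 * V, F) / dynamic_power(C, V, F))
```

Because voltage enters the dynamic term squared, a 20% reduction in supply voltage cuts switching power to about 64% of its original value at the same frequency.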





## Next Time

- Evaluation of Systems
  - Performance
    - Amdahl's Law, CPI
  - Cost
  - Benchmark Examples
- Reading assignment
  - P&H Chapter 4: Performance measurement