18.1 Introduction
Virtual memory represents one of the most elegant abstractions in computer architecture, creating a layer between physical memory hardware and the memory view presented to programs. This lecture explores how virtual memory enables programs to use more memory than physically available by treating main memory as a cache for disk storage, supports safe execution of multiple concurrent programs through address space isolation, and provides memory protection mechanisms preventing programs from corrupting each other's data. We examine page tables, translation lookaside buffers (TLBs), page faults, and the critical design decisions that make virtual memory both practical and performant despite the enormous speed gap between RAM and disk storage.
18.2 Introduction to Virtual Memory
Virtual memory allows programs to use more memory than physically available by using main memory as a cache for secondary storage.
18.2.1 Key Purposes of Virtual Memory
- Allow programs to use more memory than actually available
- Support multiple programs running simultaneously on a CPU
- Enable safe and efficient memory sharing between programs
- Ensure programs only access their allocated memory
18.3 CPU Word Size and Address Space
The relationship between CPU word size and addressable memory determines the maximum amount of memory that can be addressed.
18.3.1 Address Space by CPU Word Size
8-bit CPU
- Maximum addressable memory: 256 bytes (2^8)
16-bit CPU
- Maximum addressable memory: 64 kilobytes (2^16)
32-bit CPU
- Maximum addressable memory: 4 gigabytes (2^32)
- Became mainstream in the early 1980s
- Was replaced when systems began reaching the 4 GB memory limit
64-bit CPU
- Maximum addressable memory: 16 exabytes (2^64)
- About 16 billion gigabytes (16 million terabytes)
- Current mainstream word size
- Became mainstream around 2002-2003
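As a quick sketch, these limits fall directly out of the word size: an n-bit address can name 2^n distinct byte locations.
```python
# Maximum addressable memory for a given address width, in bytes:
# an n-bit address can name 2**n distinct byte locations.
def max_addressable_bytes(address_bits: int) -> int:
    return 2 ** address_bits

for bits in (8, 16, 32, 64):
    print(f"{bits}-bit: {max_addressable_bytes(bits):,} bytes")
# 8-bit: 256 bytes
# 16-bit: 65,536 bytes (64 KB)
# 32-bit: 4,294,967,296 bytes (4 GB)
# 64-bit: 18,446,744,073,709,551,616 bytes (16 EB)
```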
18.3.2 Historical Pattern
- Maximum address space sizes were always much larger than commonly used RAM sizes
- Architectures were replaced when high-end systems started reaching the address space limits
- Personal computers typically had much less memory than the theoretical maximum
18.4 Virtual vs Physical Addresses
18.4.1 Virtual Address
- Address generated by CPU
- Refers to entire theoretical address space
- CPU thinks it has access to full address space
- In 64-bit CPU: can address up to 16 exabytes
18.4.2 Physical Address
- Actual address in real memory (RAM)
- Much smaller range than virtual addresses
- Typical modern RAM: 8-16 GB (much less than 16 exabytes)
18.4.3 Address Translation
- Virtual addresses must be translated to physical addresses
- Translation required every time memory is accessed
- Main mechanism for making virtual memory work
18.5 Memory Hierarchy with Virtual Memory
Complete hierarchy from top to bottom:
- CPU (generates virtual addresses, thinks memory is large and fast)
- Cache (virtually or physically addressed)
- Main Memory (acts as cache for secondary storage)
- Secondary Storage/Disk (contains all pages)
The CPU accesses the cache directly. Main memory acts as a cache for the disk, not just another cache level, which requires additional mechanisms beyond ordinary caching.
18.6 Terminology
18.6.1 CPU Level
- Accesses: Words (1, 4, or 8 bytes)
- Hit/Miss terminology used
18.6.2 Cache Level
- Transfers: Blocks (16-256 bytes typically)
- Hit/Miss terminology used
18.6.3 Memory Level
- Transfers: Pages (1 KB to 64 KB typically)
- Page Hit: Page is present in memory
- Page Fault: Page is not present in memory (not "miss")
18.7 Access Latencies
Understanding the latency differences is crucial for virtual memory design:
- Cache Hit: Under 1 cycle
- Cache Miss (accessing main memory): 10-100 cycles
- Page Fault (accessing disk): ~1 million cycles
- Extremely large penalty
- Influences design decisions significantly
- Page faults handled in software by OS due to large penalty
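To see why the page fault penalty dominates design, here is a back-of-the-envelope average access time; the hit and fault rates below are assumptions for illustration, not figures from the lecture.
```python
# Illustrative average-access-time estimate. The latencies come from the
# list above; the hit/fault rates below are assumed for illustration only.
cache_hit_cycles = 1        # "under 1 cycle", rounded up
memory_cycles = 100         # cache miss -> main memory (upper end of 10-100)
fault_cycles = 1_000_000    # page fault -> disk

cache_hit_rate = 0.95       # assumption
fault_rate = 0.0001         # assumption: 1 fault per 10,000 accesses

avg = (cache_hit_rate * cache_hit_cycles
       + (1 - cache_hit_rate) * memory_cycles
       + fault_rate * fault_cycles)
print(f"average cycles per access ~ {avg:.1f}")
# Even a 0.01% fault rate adds ~100 cycles per access on average,
# which is why page faults must be extremely rare.
```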
18.8 Virtual and Physical Address Structure
18.8.1 Example with 32-bit Addresses
Virtual Address (32 bits)
- Virtual Page Number: 22 bits (most significant)
- Page Offset: 10 bits (least significant)
- Virtual address space: 4 GB
- Number of virtual pages: 2^22 pages
- Page size: 2^10 = 1 KB
Physical Address (28 bits)
- Physical Page Number (Frame Number): 18 bits (most significant)
- Page Offset: 10 bits (least significant)
- Physical address space: 256 MB
- Number of frames: 2^18 frames
- Page size: 1 KB (same as virtual)
18.8.2 Key Points
- Page offset has same number of bits in virtual and physical addresses
- Physical address space is smaller than virtual address space
- Memory contains "frames" where pages can be placed
- Frame = slot in memory that can hold a page
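A minimal sketch of this split in code, using the 10-bit offset from the example above:
```python
# Split a 32-bit virtual address into virtual page number and page offset,
# for the 1 KB pages (10 offset bits) used in this example.
OFFSET_BITS = 10
PAGE_SIZE = 1 << OFFSET_BITS          # 1024 bytes

def split_virtual_address(va: int) -> tuple[int, int]:
    vpn = va >> OFFSET_BITS           # upper 22 bits
    offset = va & (PAGE_SIZE - 1)     # lower 10 bits
    return vpn, offset

vpn, offset = split_virtual_address(0x0040_12AB)
print(f"VPN = {vpn:#x}, offset = {offset:#x}")
# The physical address reuses the same offset:
# phys = (frame_number << OFFSET_BITS) | offset
```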
18.9 Supporting Multiple Programs
Multiple programs can run simultaneously by sharing physical memory:
18.9.1 Each Program
- Has its own virtual address space
- Thinks it has entire memory to itself
- CPU switches between programs quickly
- Creates impression of simultaneous execution
18.9.2 Memory Sharing
- Physical memory contains active pages from all running programs
- Each program's virtual pages map to different physical frames
- Operating system ensures programs only access their own memory
18.9.3 Example
- Program 1 virtual address space: 8 virtual pages
- Program 2 virtual address space: 8 virtual pages
- Physical memory: Only 4 frames available
- Active pages from both programs share the 4 frames
- Same virtual page number from different programs can map to different physical frames
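A minimal illustration of this example in code; the frame assignments below are invented:
```python
# Each program has its own page table, so the same virtual page number
# can map to different physical frames. Mappings here are invented.
page_table = {
    "program1": {0: 2, 1: 0},   # virtual page -> physical frame
    "program2": {0: 3, 1: 1},
}

for prog in ("program1", "program2"):
    frame = page_table[prog][0]
    print(f"{prog}: virtual page 0 -> frame {frame}")
# program1: virtual page 0 -> frame 2
# program2: virtual page 0 -> frame 3
```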
18.10 Page Table
The page table is a data structure stored in memory that contains address translations.
18.10.1 Purpose
- Stores virtual-to-physical address translations
- One page table per program
- Contains entries for ALL virtual pages (not just active ones)
18.10.2 Page Table Entry Contents
- Physical Page Number (main component)
- Valid Bit: Is the page currently in memory?
- 1 = Page is in memory (translation valid)
- 0 = Page not in memory (page fault)
- Dirty Bit: Has page been modified?
- 1 = Page modified, inconsistent with disk
- 0 = Page not modified, consistent with disk
- Additional bits: Access permissions, memory protection status
18.10.3 Finding Page Table
- Page tables stored at fixed locations in memory
- Page Table Base Register (PTBR): Special CPU register storing starting address of active page table
- When CPU switches programs, OS updates PTBR to point to correct page table
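A sketch of that switch, with hypothetical process names and page table base addresses:
```python
# Sketch: the OS keeps one page table per program and loads the PTBR
# with the base address of the active one on each context switch.
# Addresses and program names are invented for illustration.
page_tables = {
    "prog1": 0x0010_0000,   # base address of prog1's page table
    "prog2": 0x0020_0000,   # base address of prog2's page table
}

ptbr = page_tables["prog1"]           # prog1 is currently running

def context_switch(next_prog: str):
    global ptbr
    ptbr = page_tables[next_prog]     # translations now use the new table

context_switch("prog2")
print(hex(ptbr))                      # 0x200000
```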
18.11 Address Translation Process
Steps to access memory:
- CPU generates virtual address (virtual page number + page offset)
- Index the page table: entry address = PTBR + (virtual page number × entry size)
- Read page table entry:
- If valid bit = 0: Page fault (handled by OS)
- If valid bit = 1: Read physical page number
- Construct physical address: Physical page number + page offset
- Access physical memory with physical address
- Return data to CPU
18.11.1 Memory Accesses Required
- One access for page table
- One access for actual data
- Total: Two memory accesses per data access
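A sketch of the whole translation in code, using the 1 KB pages from earlier; `PageFault` stands in for the OS handler described later:
```python
# Sketch of address translation via a page table. Entries are
# (valid, frame_number) pairs; a real entry would also carry the
# dirty bit and protection bits.
OFFSET_BITS = 10

class PageFault(Exception):
    """Raised when the valid bit is 0; the OS would handle this."""

def translate(page_table: list, va: int) -> int:
    vpn = va >> OFFSET_BITS
    offset = va & ((1 << OFFSET_BITS) - 1)
    valid, frame = page_table[vpn]          # memory access #1: page table
    if not valid:
        raise PageFault(f"virtual page {vpn} not in memory")
    return (frame << OFFSET_BITS) | offset  # access #2 then fetches the data

# Tiny 4-entry table: pages 0 and 2 are resident, 1 and 3 are on disk.
table = [(1, 5), (0, 0), (1, 9), (0, 0)]
print(hex(translate(table, 0x0801)))        # page 2, offset 1 -> 0x2401
```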
18.12 Page Table Size Calculation
18.12.1 Example: 4 GB Virtual, 1 GB Physical, 1 KB Pages
Number of Entries
- Virtual address: 32 bits
- Page offset: 10 bits (for 1 KB pages)
- Virtual page number: 22 bits
- Number of entries: 2^22 = ~4 million entries
Entry Size
- Physical address: 30 bits (for 1 GB)
- Page offset: 10 bits
- Physical page number: 20 bits
- Valid bit: 1 bit
- Dirty bit: 1 bit
- Total needed: 22 bits
- Actual storage: 32 bits (word-aligned)
- Size per entry: 4 bytes
Total Page Table Size
- 4 bytes × 2^22 entries = 16 MB
- Significant memory overhead for each program
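The same arithmetic as code, using this example's parameters:
```python
# Page table size for 4 GB virtual, 1 GB physical, 1 KB pages.
virtual_bits = 32           # 4 GB virtual address space
physical_bits = 30          # 1 GB physical memory
offset_bits = 10            # 1 KB pages

vpn_bits = virtual_bits - offset_bits    # 22
entries = 2 ** vpn_bits                  # ~4 million

ppn_bits = physical_bits - offset_bits   # 20
entry_bits = ppn_bits + 1 + 1            # + valid + dirty = 22 bits
entry_bytes = 4                          # rounded up to a 32-bit word

print(f"{entries:,} entries x {entry_bytes} B = "
      f"{entries * entry_bytes / 2**20:.0f} MB per program")
# 4,194,304 entries x 4 B = 16 MB per program
```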
18.13 Write Policy for Virtual Memory
18.13.1 Write-Through: NOT USED
- Would require writing to disk on every write
- 1 million cycle penalty unacceptable
- Not a good design decision
18.13.2 Write-Back: USED (Standard Policy)
- Writes only update memory
- Dirty bit tracks modified pages
- Only write to disk when:
- Page is evicted from memory
- Page's dirty bit is 1
- Minimizes disk accesses
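A minimal sketch of the dirty-bit bookkeeping, assuming a simple entry object:
```python
# Write-back bookkeeping sketch: stores set the dirty bit, and the
# page is written to disk only if it is dirty when evicted.
class PageTableEntry:
    def __init__(self, frame: int):
        self.valid = True
        self.dirty = False
        self.frame = frame

def store(entry: PageTableEntry):
    # ... write the word into the frame in main memory ...
    entry.dirty = True            # memory now differs from disk

def evict(entry: PageTableEntry):
    if entry.dirty:
        print(f"writing frame {entry.frame} back to disk")  # ~1M cycles
    entry.valid = False           # translation no longer usable

e = PageTableEntry(frame=7)
store(e)
evict(e)                          # -> writes back, since the page is dirty
```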
18.14 Placement Policy
18.14.1 Fully Associative Placement
- Any page from disk can go to any frame in memory
- Memory treated as one large set containing all frames
- No direct mapping or set restrictions
- Maximizes flexibility in page placement
- Reduces page faults
18.14.2 Why Fully Associative?
- Minimizes page faults (primary goal)
- Large page fault penalty (1 million cycles) justifies complexity
- Unlike a fully associative cache, no tag comparators are needed: the page table lookup itself locates the frame
- Address translation through page table provides necessary mechanism
18.15 Page Fault Handling
18.15.1 What Operating System Must Do
1. Fetch Missing Page
- Access disk to retrieve page
- OS must know disk location of page
- OS maintains data structures tracking page locations
2. Find Unused Frame
- OS tracks which frames are currently used
- Can determine this through page tables
- If unused frame exists: Place page in unused frame
3. If Memory Full (No Unused Frames)
- Select active page to replace using replacement policy
- Common replacement policies:
- Least Recently Used (LRU)
- Pseudo-LRU (PLRU)
- First-In-First-Out (FIFO)
- Least Frequently Used (LFU)
- Goal: Exploit temporal locality (keep recently/frequently used pages)
4. Check Dirty Bit of Page to be Replaced
- If dirty bit = 1: Write page back to disk before replacement
- If dirty bit = 0: Can directly overwrite (data consistent with disk)
- Prevents data loss
5. Update Data Structures
- Update page table entry for new page
- Update page table entry for replaced page (set valid = 0)
- Place fetched page in frame
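A compact sketch of steps 1 through 5 with LRU replacement; the `disk` dictionary is a stand-in for the OS's disk-location data structures:
```python
from collections import OrderedDict

# Sketch of the five steps above, with LRU replacement. `disk` is a
# stand-in dict mapping virtual page -> page contents.
class Memory:
    def __init__(self, num_frames: int, disk: dict):
        self.frames = OrderedDict()   # vpn -> contents, in LRU order
        self.dirty = set()
        self.num_frames = num_frames
        self.disk = disk

    def handle_page_fault(self, vpn: int):
        contents = self.disk[vpn]                 # 1. fetch missing page
        if len(self.frames) >= self.num_frames:   # 3. memory full?
            victim, data = self.frames.popitem(last=False)  # LRU victim
            if victim in self.dirty:              # 4. dirty? write back
                self.disk[victim] = data
                self.dirty.discard(victim)
        self.frames[vpn] = contents               # 2./5. place page, update

    def access(self, vpn: int):
        if vpn not in self.frames:
            self.handle_page_fault(vpn)
        self.frames.move_to_end(vpn)              # mark most recently used
        return self.frames[vpn]

mem = Memory(num_frames=2, disk={0: "A", 1: "B", 2: "C"})
mem.access(0); mem.access(1); mem.access(2)       # evicts page 0 (LRU)
print(list(mem.frames))                           # [1, 2]
```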
18.15.2 Optimization
- Many operations can occur in parallel during disk fetch
- While fetching data, OS can determine placement and handle replacement
- Use buffers for write-back operations
18.15.3 Why Software Handling?
- 1 million cycle penalty is so large that software overhead is negligible
- Complex replacement policies better suited to software
- Hardware optimization doesn't provide significant benefit
18.16 Translation Lookaside Buffer (TLB)
18.16.1 Purpose
- Avoid accessing memory twice for every data access
- Act as cache for page table entries
- Reduce address translation overhead
18.16.2 What is TLB?
- Hardware cache specifically for page table entries
- Stores recently used address translations
- Based on locality of page table entry accesses
- Exploits temporal and spatial locality of page accesses
18.16.3 TLB Entry Structure
- Tag: Derived from the virtual page number (the TLB is looked up by virtual address)
- Physical Page Number: The address translation
- Valid Bit: Is this TLB entry valid?
- Different from page table valid bit
- Indicates if TLB entry contains valid translation
- Dirty Bit: Same meaning as in page table
18.16.4 TLB Parameters
Size
- 16-512 page table entries (typical range)
Block Size
- 1-2 address translations
- Small blocks because spatial locality across page translations is weak
- Adjacent pages are not as closely related as adjacent words within a cache block
Placement Policy
- Fully associative or set associative
- Fully associative for smaller TLBs (~16 entries)
- Set associative for larger TLBs
- Goal: Keep miss rate below 1%
Hit Latency
- Much less than 1 cycle
Miss Penalty
- 10-100 cycles (memory access required)
18.16.5 TLB Operation
Hit
- Address translation available in TLB
- Use translation directly without accessing memory
- Only one memory access needed (for data)
Miss
- Translation not in TLB
- Must access page table in memory
- Total: Two memory accesses (page table + data)
18.16.6 Why Low Miss Rate Essential?
- TLB misses double memory access time
- Must access page table (10-100 cycles) then data
- Miss rate typically kept below 1%
- Over 99% of translations are served by the TLB
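A sketch of a fully associative TLB as a small dictionary backed by the page table; random replacement is used only to keep the example short:
```python
import random

# Fully associative TLB sketch: a small dict of vpn -> frame, backed by
# the full page table. Random replacement keeps the example simple.
class TLB:
    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.entries = {}           # vpn -> frame
        self.page_table = page_table
        self.hits = self.misses = 0

    def lookup(self, vpn: int) -> int:
        if vpn in self.entries:                   # TLB hit: no memory access
            self.hits += 1
            return self.entries[vpn]
        self.misses += 1                          # TLB miss: read page table
        frame = self.page_table[vpn]              # extra memory access
        if len(self.entries) >= self.capacity:
            self.entries.pop(random.choice(list(self.entries)))
        self.entries[vpn] = frame
        return frame

tlb = TLB(capacity=16, page_table={v: v + 100 for v in range(64)})
for vpn in [1, 2, 1, 1, 2, 3]:
    tlb.lookup(vpn)
print(f"hits={tlb.hits}, misses={tlb.misses}")    # hits=3, misses=3
```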
18.17 Complete Memory Access with TLB
Two different approaches for handling memory access with TLB:
18.18 Approach 1: Virtually Addressed Cache
18.18.1 Process
- CPU generates virtual address
- Access cache with virtual address (parallel with TLB)
- Cache Hit: Return data to CPU immediately
- Cache Miss:
- Check TLB for address translation
- TLB Hit:
- Get physical address
- Access memory with physical address
- Fetch missing block
- Update cache
- Send word to CPU
- TLB Miss:
- Access page table in memory
- Page Hit:
- Get translation
- Access memory for data
- Update TLB
- Update cache
- Send word to CPU
- Page Fault:
- OS accesses disk
- Fetch missing page
- Find unused frame or replace page
- If replaced page dirty: write back
- Update page table
- Update TLB
- Update cache
- Send word to CPU
18.18.2 Advantage
- TLB access is overlapped with the cache access: both start in parallel, so the translation is ready if the cache misses
- No additional latency for address translation on a cache hit
18.19 Approach 2: Physically Addressed Cache
18.19.1 Process
- CPU generates virtual address
- Access TLB for translation first
- TLB Hit:
- Get physical address
- Access cache with physical address
- Cache Hit: Return data to CPU
- Cache Miss:
- Access memory with physical address
- Fetch missing block
- Update cache
- Send word to CPU
- TLB Miss:
- Access page table in memory
- Page Hit:
- Get translation
- Update TLB
- Access cache with physical address
- If cache hit: return data
- If cache miss: fetch from memory, update cache, return data
- Page Fault:
- OS handles as described above
- Update page table, TLB, cache
- Return data to CPU
18.19.2 Advantage
- Cache physically indexed and tagged
- Simpler cache design
- No aliasing issues
18.19.3 Key Difference
- Approach 1: Cache uses virtual addresses, TLB access parallel
- Approach 2: Cache uses physical addresses, TLB access sequential
Both approaches are valid, and the choice depends on cache indexing method (virtual vs physical).
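A rough latency comparison of the two hit paths, using illustrative 1-cycle TLB and cache latencies; this is a simplified model, not real pipeline timing:
```python
# Rough hit-path latency for the two approaches, using illustrative
# 1-cycle TLB and cache latencies. A simplified model, not real timing.
TLB_CYCLES = 1
CACHE_CYCLES = 1

# Approach 1: cache is indexed with the virtual address, so the TLB
# lookup can run in parallel with the cache probe.
parallel_hit = max(TLB_CYCLES, CACHE_CYCLES)

# Approach 2: translation must complete before the cache is indexed.
sequential_hit = TLB_CYCLES + CACHE_CYCLES

print(f"virtually addressed: {parallel_hit} cycle(s) on a hit")
print(f"physically addressed: {sequential_hit} cycle(s) on a hit")
```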
Key Takeaways
- Virtual memory provides memory abstraction and protection
- Address translation is fundamental to virtual memory operation
- Page tables map virtual addresses to physical addresses
- TLB caches translations to avoid double memory access
- Page faults are extremely expensive (~1 million cycles)
- Write-back policy is essential for virtual memory
- Fully associative placement minimizes page faults
- Multiple programs can safely share physical memory
- OS handles page faults in software
- Virtual memory enables modern multitasking operating systems
Summary
Virtual memory represents a crucial abstraction in modern computing, enabling efficient and safe memory management across multiple concurrent programs while providing each program with the illusion of abundant, dedicated memory resources.