
Lecture 15: Direct Mapped Cache Control

Lectures on Computer Architecture


By Dr. Isuru Nawinne

15.1 Introduction

This lecture provides a comprehensive, step-by-step examination of how a direct-mapped cache services read and write requests, differentiates hits from misses, and preserves data correctness. We complete the full read path (including the stall and block-fetch sequence on a miss), analyze write hits and misses, and introduce the write-through policy as the simplest mechanism for keeping cache and main memory consistent. The performance consequences of writing to memory on every store, the need for high hit rates, and the motivation for the more advanced write-back policy (next lecture) are emphasized. By the end you will understand exactly what the cache controller must do (state transitions, signals, data/tag/valid updates) for every access type, and why the choice of write policy is a central architectural tradeoff.

15.2 Lecture Introduction and Recap

15.2.1 Previous Lecture Review

Memory Systems Foundation

Locality Principles

Direct-Mapped Cache Introduction

Cache Structure (Recap)

Address Breakdown (Recap)

[Tag][Index][Offset]
  ^      ^      ^
  |      |      └── Identifies word/byte within block
  |      └── Identifies cache entry (direct mapping)
  └── Remaining bits for block identification
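
As a concrete illustration of this decomposition, the sketch below splits a 32-bit byte address into the three fields in C. The block size (16 bytes, so 4 offset bits) and number of entries (256, so 8 index bits) are assumed for the example, not taken from the lecture.

#include <stdint.h>
#include <stdio.h>

/* Assumed parameters: 16-byte blocks (4 offset bits) and 256
   direct-mapped entries (8 index bits); the remaining upper bits
   of the 32-bit address form the tag. */
#define OFFSET_BITS 4
#define INDEX_BITS  8

static void split_address(uint32_t addr,
                          uint32_t *tag, uint32_t *index, uint32_t *offset)
{
    *offset = addr & ((1u << OFFSET_BITS) - 1);                 /* low bits    */
    *index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1); /* middle bits */
    *tag    = addr >> (OFFSET_BITS + INDEX_BITS);               /* upper bits  */
}

int main(void)
{
    uint32_t tag, index, offset;
    split_address(0x12345678u, &tag, &index, &offset);
    printf("tag=0x%X index=0x%X offset=0x%X\n", tag, index, offset);
    return 0;
}

For the address 0x12345678 this prints tag=0x12345, index=0x67, offset=0x8.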

15.2.2 Today's Focus

15.3 Cache Read Access - Complete Process

15.3.1 Read Access Input Signals

From CPU to Cache Controller:

  1. Address (word or byte address)
  2. Read Control Signal (from CPU control unit)
    • Indicates this is a read operation (not write)
    • Part of memory control signals

15.3.2 Cache Read Steps (Detailed)

Step 1: Address Decomposition

Step 2: Cache Entry Selection (Indexing)

Step 3: Tag Comparison

Step 4: Valid Bit Check

Step 5: Hit/Miss Determination

Step 6: Data Extraction (Parallel Operation)

Step 7: Word Selection (Using Offset)
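
A minimal software model of Steps 2 through 7 is sketched below, continuing the field widths assumed above. It illustrates the logic rather than the lecture's hardware design: the entry is indexed, the stored tag and valid bit decide hit or miss, and the requested word is selected in parallel regardless of the outcome.

#include <stdbool.h>
#include <stdint.h>

#define WORDS_PER_BLOCK 4        /* assumed 4-word (16-byte) blocks */

struct cache_entry {
    bool     valid;                      /* valid bit               */
    uint32_t tag;                        /* stored tag              */
    uint32_t data[WORDS_PER_BLOCK];      /* one block of data words */
};

/* Steps 2-7 of the read path: index into the cache, compare tags,
   check the valid bit, and select the requested word (done in
   parallel in hardware).  Returns true on a hit; the word written
   to *word_out is only meaningful when the result is a hit. */
static bool cache_read(const struct cache_entry *cache,
                       uint32_t tag, uint32_t index, uint32_t word_offset,
                       uint32_t *word_out)
{
    const struct cache_entry *e = &cache[index];   /* Step 2: select entry   */
    bool hit = e->valid && (e->tag == tag);        /* Steps 3-5: tag + valid */
    *word_out = e->data[word_offset];              /* Steps 6-7: word select */
    return hit;
}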

15.3.3 Timing Optimization

Parallel Operations:

15.3.4 Read Hit Outcome

15.3.5 Pipeline Integration

15.4 Cache Read Miss Handling

15.4.1 Read Miss Scenario

Miss Conditions

  1. Tag mismatch (most common)
    • Requested block not in cache
    • Different block occupies that cache location
  2. Invalid entry
    • Valid bit = 0
    • Entry contains no valid data (e.g., after initialization)
  3. Both conditions
    • Tag mismatch AND invalid entry

15.4.2 Read Miss Response Required Actions

Action 1: STALL THE CPU

Process:

CPU's Perspective:

Action 2: MAKE READ REQUEST TO MAIN MEMORY

Request Details:

Reason for Block Transfer:

Memory Access Time:

Action 3: WAIT FOR MEMORY RESPONSE

Action 4: UPDATE CACHE ENTRY

Three components to update:

a) Update Data Block:

b) Update Tag:

c) Set Valid Bit:

Action 5: SEND DATA TO CPU

Action 6: CLEAR STALL SIGNAL
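
The six actions can be sketched as the miss path below, continuing the cache_entry model above. The cpu_stall and memory_read_block interfaces are hypothetical placeholders for the stall signal and the main-memory block transfer, not real APIs.

/* Hypothetical interfaces (placeholders, declared only): */
void cpu_stall(bool assert_stall);                          /* stall / release CPU */
void memory_read_block(uint32_t block_addr, uint32_t *dst); /* fetch whole block   */

/* Read-miss handling, Actions 1-6: stall the CPU, fetch the whole
   block from main memory, refill the entry (data, tag, valid bit),
   hand the requested word to the CPU, and release the stall. */
static uint32_t handle_read_miss(struct cache_entry *cache,
                                 uint32_t tag, uint32_t index,
                                 uint32_t word_offset)
{
    struct cache_entry *e = &cache[index];
    cpu_stall(true);                                 /* Action 1: stall       */
    uint32_t block_addr = (tag << INDEX_BITS) | index;
    memory_read_block(block_addr, e->data);          /* Actions 2-4a: block   */
    e->tag   = tag;                                  /* Action 4b: tag        */
    e->valid = true;                                 /* Action 4c: valid bit  */
    uint32_t word = e->data[word_offset];            /* Action 5: data to CPU */
    cpu_stall(false);                                /* Action 6: clear stall */
    return word;
}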

15.4.3 Total Read Miss Time

Formula:

Read Miss Time = Hit Latency + Miss Penalty

Where:

Example Calculation:
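
As an illustrative figure (consistent with the cycle counts used in Section 15.10.2, not quoted from the lecture): with a 1-cycle hit check and a 50-cycle miss penalty, Read Miss Time = 1 + 50 = 51 cycles.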

15.4.4 Performance Impact

15.4.5 Question: What About the Old Block?

The Deferred Question:

Initial Answer: "We'll discuss after introducing write policies"

15.5 Cache Write Access - Introduction

15.5.1 Write Access Input Signals

From CPU to Cache Controller:

  1. Address (where to write)
  2. Data Word (what to write)
  3. Write Control Signal (indicates write operation)

Three inputs vs. two for read (no data input needed for read).

15.5.2 Write Access Process

Step 1: Address Decomposition

Step 2: Cache Entry Selection

Step 3: Tag Comparison

Step 4: Valid Bit Check

Step 5: Hit/Miss Determination

Step 6: Data Writing (The Difference)

This is where write differs from read:

15.5.3 Writing Mechanism

Input:

Demultiplexer Selection:

Example:

Write Operation Control:

15.5.4 Critical Question: Can Write and Tag Compare Happen in Parallel?

For Read (Previous Discussion)

For Write (Current Question)

More problematic!

Scenario:

Problem:

Initial Conclusion:

15.6 Write Policies - Introduction

15.6.1 The Data Consistency Problem

Scenario:

The Inconsistency:

15.6.2 Why This Matters

15.6.3 Two Fundamental Write Policies

  1. Write-Through (discussed this lecture)
  2. Write-Back (mentioned, detailed in next lecture)

15.7 Write-Through Policy

15.7.1 Write-Through Definition

Policy Statement:

> "Always write to BOTH cache AND memory"

Mechanism:

15.7.2 Write-Through Process

Write Hit with Write-Through

  1. Determine it's a write hit (tag match + valid)
  2. Write data word to cache block (using offset)
  3. Also send write request to main memory
  4. Update same address in memory
  5. Wait for memory write to complete
  6. Both cache and memory now have same value

Write Miss with Write-Through

  1. Determine it's a write miss
  2. Stall CPU
  3. Fetch missing block from memory (read operation)
  4. Update cache entry with fetched block
  5. Write the word to correct position in block
  6. Also write to memory
  7. Clear stall signal
  8. Both levels updated
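
A sketch of both sequences, continuing the earlier code model, is shown below. memory_write_word is another hypothetical placeholder; the point is simply that the data word always goes to the cache block and to main memory.

/* Hypothetical single-word memory write (placeholder, declared only). */
void memory_write_word(uint32_t addr, uint32_t data);

/* Write access under write-through: on a miss the missing block is
   fetched first (as in the steps above), then in every case the word
   is written into the cache block AND sent to main memory. */
static void cache_write_through(struct cache_entry *cache,
                                uint32_t addr, uint32_t data)
{
    uint32_t tag, index, offset;
    split_address(addr, &tag, &index, &offset);
    uint32_t word_offset = offset / sizeof(uint32_t);
    struct cache_entry *e = &cache[index];

    cpu_stall(true);                                     /* CPU waits for memory    */
    if (!(e->valid && e->tag == tag)) {                  /* write miss              */
        memory_read_block(addr >> OFFSET_BITS, e->data); /* fetch missing block     */
        e->tag   = tag;
        e->valid = true;
    }
    e->data[word_offset] = data;                         /* write word to cache     */
    memory_write_word(addr, data);                       /* write-through to memory */
    cpu_stall(false);                                    /* both levels updated     */
}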

15.7.3 Advantages of Write-Through

Advantage 1: SIMPLICITY

Advantage 2: CONSISTENCY GUARANTEED

Advantage 3: ANSWERS THE OLD BLOCK QUESTION

With write-through policy:

Comparison:

Advantage 4: PARALLEL WRITE AND TAG COMPARE NOW POSSIBLE!

Critical Insight:

The write and the tag comparison can now be overlapped. Why? Consider two scenarios:

Scenario A: Write Hit

Scenario B: Write Miss

Result:

Timing Optimization:

Tag comparison time: T_comp
Write time: T_write
Without overlap: Total = T_comp + T_write
With overlap: Total = max(T_comp, T_write)
Typically similar delays → Nearly 2× speedup

15.7.4 Disadvantages of Write-Through

Disadvantage 1: EXCESSIVE WRITE TRAFFIC

Disadvantage 2: CPU STALLS ON EVERY WRITE

Critical Problem:

Stall Duration:

Example:

Impact on Programs with Many Writes:

Performance Comparison:

Pipeline Impact:

Real-World Issue:

Disadvantage 3: POWER CONSUMPTION

Disadvantage 4: MEMORY WEAR

15.8 Resolving the Old Block Question

15.8.1 The Question Revisited

Original Question:

> "What happens to the old block when we fetch a new block from memory on a miss?"

Context:

15.8.2 Answer with Write-Through Policy

YES, Safe to Discard

Reason 1: Memory Has Updated Version

Reason 2: Can Re-fetch If Needed

15.8.3 Example Scenario

  1. Block A in cache at index 3
  2. Block A modified several times
  3. Each modification written to cache AND memory
  4. Block B (also maps to index 3) is requested
  5. Miss occurs for Block B
  6. Fetch Block B from memory
  7. Replace Block A with Block B at index 3
  8. Block A discarded from cache
  9. Block A's data safe in memory
  10. Later access to Block A: Miss, fetch from memory again

15.8.4 Comparison with Invalid Entry

15.8.5 Contrast with Future Policy (Teaser)

Conclusion:

15.9 Parallelism in Write Access with Write-Through

15.9.1 The Parallel Write Problem Solved

Original Concern:

15.9.2 With Write-Through Policy

Case 1: Write Hit

Case 2: Write Miss

15.9.3 Key Insight

15.9.4 Timeline for Write Miss

Cycle 1: Write to cache (possibly wrong block) + Tag compare

Cycle 1: Also initiate memory write (correct address)

Cycle 2-50: Fetch correct block from memory

Cycle 51: Overwrite cache entry with correct block

Result: Cache correct, memory correct
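
In code form, the overlap looks roughly like the sketch below (same assumed model as before): the word is written into the indexed entry while the tag is compared; if the compare reports a miss, the entry is refilled from memory anyway, and the block that was speculatively overwritten is not lost because write-through already keeps its up-to-date copy in main memory.

/* Overlapped write, enabled by write-through (illustrative only). */
static void cache_write_overlapped(struct cache_entry *cache,
                                   uint32_t addr, uint32_t data)
{
    uint32_t tag, index, offset;
    split_address(addr, &tag, &index, &offset);
    uint32_t word_offset = offset / sizeof(uint32_t);
    struct cache_entry *e = &cache[index];

    bool hit = e->valid && (e->tag == tag);       /* tag compare ...           */
    e->data[word_offset] = data;                  /* ... overlapped with write */
    memory_write_word(addr, data);                /* memory gets correct value */

    if (!hit) {                                   /* miss: refill the entry;   */
        memory_read_block(addr >> OFFSET_BITS,    /* the fetched block already */
                          e->data);               /* contains the new word     */
        e->tag   = tag;
        e->valid = true;
    }
}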

15.9.5 Safety Guarantee

15.9.6 Performance Benefit

15.9.7 Enabled by Write-Through

15.10 Summary of Cache Operations

15.10.1 Complete Cache Operation Overview

READ HIT

READ MISS

WRITE HIT (with Write-Through)

WRITE MISS (with Write-Through)

15.10.2 Performance Characteristics

| Case | Time | Comment |
| --- | --- | --- |
| Best Case (Read Hit) | < 1 cycle | Optimal performance. Want this to be the most common case |
| Moderate Case (Read Miss) | 50+ cycles | Acceptable if infrequent. Reason for the high hit rate requirement |
| Poor Case (Write Hit with Write-Through) | 50+ cycles | Every write hits this case. Unacceptable for write-heavy programs |
| Worst Case (Write Miss with Write-Through) | 100+ cycles | Rare but extremely slow. Catastrophic when it occurs |
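
A rough, illustrative weighting (assumed hit rate, not a figure from the lecture): with a 95% read hit rate and the cycle counts above, an average read costs about 0.95 × 1 + 0.05 × 50 ≈ 3.5 cycles, but under write-through every write still costs 50+ cycles no matter how good the hit rate is.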

Performance Goal:

15.11 Write-Through Policy Evaluation

15.11.1 Summary of Write-Through

Mechanism:

Implementation Complexity:

15.11.2 Advantages

| Advantage | Description |
| --- | --- |
| 1. Simplicity | Easy to understand, simple to implement, minimal controller complexity, aligns with design principle (simple cache) |
| 2. Consistency | Cache and memory always consistent, no special synchronization needed, can discard blocks anytime, memory always reliable |
| 3. Data Safety | No data loss on block replacement, memory has all updates, crash recovery simpler, I/O devices see correct data |
| 4. Enables Optimizations | Can overlap write and tag compare, reduces hit latency, safe due to memory backup |

15.11.3 Disadvantages

| Disadvantage | Description |
| --- | --- |
| 1. Performance Penalty | Every write stalls the CPU, 10-100+ cycle stalls per write, unacceptable for write-intensive programs, contradicts pipeline optimization goals |
| 2. Memory Traffic | Excessive write traffic to memory, memory bus congestion, reduces available bandwidth for read misses, slows down the entire system |
| 3. Power Consumption | Every write powers up memory, unnecessary power usage, battery drain in mobile devices, heat generation |
| 4. Memory Wear | Flash/SSD: limited write cycles, accelerated wear-out, reduced memory lifespan, particularly bad for SSDs |

15.11.4 When Write-Through Used

Suitable Applications

Real-World Usage

Modern Systems

15.12 The Need for Alternative Write Policies

15.12.1 The Performance Problem

Write-Heavy Programs

Many programming patterns involve frequent writes:

Example Code:

int array[1000];                 // destination buffer
int sum = 0;                     // running total
for (int i = 0; i < 1000; i++) {
    array[i] = compute(i);       // Store in every iteration
    sum += array[i];             // Read, accumulate, store
}

With Write-Through
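
A back-of-the-envelope estimate (assumed figures, not from the lecture): the loop performs roughly 1000 stores to array[i], and with write-through each store stalls the CPU for a ~50-cycle memory write, so the loop spends on the order of 50,000 cycles waiting on memory against only a few thousand cycles of useful computation.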

15.12.2 Pipeline Impact

15.12.3 Comparison with Read Operations

| Operation | Time | Frequency | Acceptability |
| --- | --- | --- | --- |
| Read hit | < 1 cycle | Common | Fast |
| Read miss | 50 cycles | Rare | Acceptable |
| Write hit | 50 cycles | Frequent | Unacceptable |
| Write miss | 100+ cycles | Rare | Terrible |

15.12.4 The Contradiction

15.12.5 Question Raised

"What can we do to avoid this situation?"

Student Insight:

"We can write to memory only when we want to replace that cache block with different data"

Instructor Response:

"Exactly! That becomes a different write policy."

15.12.6 Teaser for Next Lecture

15.13 Lecture Conclusion

15.13.1 Topics Covered

1. Complete Read Access Process

2. Read Miss Handling

Six-step process:

  1. Stall CPU
  2. Request block from memory
  3. Wait for response
  4. Update cache entry (data, tag, valid)
  5. Send data to CPU
  6. Clear stall

3. Write Access Process

4. Data Consistency Problem

5. Write-Through Policy

6. Old Block Question Resolved

7. Parallel Write Optimization

8. Performance Issues

15.13.2 Next Lecture Preview

Topics to Cover:

Implementation Details:

Advanced Topics (if time):

The Goal:

Key Insight:

Write-through sacrifices performance for simplicity. In modern systems performance is critical, so more sophisticated write policies are worth their added complexity.

Key Takeaways

  1. Cache read hit completes in single cycle—tag match and valid bit set indicate data available immediately from cache.
  2. Cache read miss requires multiple cycles—must fetch entire block from main memory, update cache entry, set valid bit, then retry access.
  3. Cache controller implements state machine—managing transitions between idle, compare tags, fetch block, and write cache states.
  4. Tag comparison determines hit/miss—stored tag must match address tag AND valid bit must be set for successful hit.
  5. Block fetch retrieves entire block from memory—exploiting spatial locality by bringing multiple words that will likely be accessed soon.
  6. Valid bit initialization crucial at startup—all valid bits cleared to zero, preventing false hits on random cache data.
  7. Write operations complicate cache design—must maintain consistency between cache and main memory through careful policy choices.
  8. Write-through policy updates both cache and memory on every write—simple consistency but severe performance penalty.
  9. Write-through advantages: Simple implementation, main memory always current, no dirty bit needed, straightforward crash recovery.
  10. Write-through disadvantages: Every write causes slow memory access (~100 ns), dramatically reduces performance, wastes memory bandwidth.
  11. Write buffers partially mitigate write-through penalty—CPU writes to buffer and continues, buffer writes to memory asynchronously.
  12. Write buffer depth typically 4-8 entries—balances performance improvement against hardware cost and complexity.
  13. Write buffer full forces CPU stall—occurs during write-intensive code sections, limiting write-through effectiveness.
  14. Write miss policies determine cache behavior—write-allocate (fetch block first) versus no-write-allocate (write directly to memory).
  15. Write-allocate exploits temporal locality—if just written location likely accessed again soon, fetching to cache improves future performance.
  16. No-write-allocate avoids fetch overhead—appropriate when written locations unlikely to be accessed soon.
  17. Policy combinations affect overall performance—write-through typically paired with no-write-allocate for consistency.
  18. Cache consistency means cache and memory agree on data values—critical correctness requirement across all cache operations.
  19. Performance impact of write policies substantial—write-through can increase memory traffic by 15-20% in typical programs.
  20. Write-back policy introduced as superior alternative—defers memory writes until block eviction, dramatically reducing memory traffic.

Summary

Detailed examination of cache memory operations reveals the sophisticated control logic required to manage read and write accesses while maintaining data consistency between cache and main memory. Read operations follow straightforward paths: hits deliver data in a single cycle via a tag comparison confirming both a tag match and a set valid bit, while misses trigger multi-cycle sequences that fetch entire blocks from main memory, update cache entries, set valid bits, and retry the access. The cache controller implements these sequences through state-machine logic managing transitions between idle, tag-comparison, block-fetching, and cache-writing states.

Write operations introduce significant complexity and performance implications through policy choices that determine how cache and memory stay synchronized. The write-through policy, which updates both cache and memory on every write, offers simplicity and guaranteed consistency: main memory always reflects the current data state, enabling straightforward crash recovery and multi-processor coherence. However, its performance penalty is severe: every write operation incurs a memory access delay on the order of 100 nanoseconds, effectively eliminating the cache benefit for write-heavy code sections and wasting substantial memory bandwidth on updates.

Write buffers provide partial mitigation by decoupling the CPU from memory write delays, allowing the processor to write into a small hardware queue and continue execution while buffer contents asynchronously propagate to main memory. Typical write buffers holding 4-8 entries balance performance improvement against hardware cost, though write-intensive code can still fill the buffer and force CPU stalls. Write miss policies represent additional design choices whose benefit depends on program access patterns: write-allocate (fetch the block before writing) exploits temporal locality and benefits code that writes and then soon reads the same locations, while no-write-allocate (write directly to memory) avoids the fetch overhead for write-once scenarios. Write-through typically pairs with no-write-allocate for policy consistency.

The fundamental limitation, that write-through forces a memory access on every write regardless of whether the data will be accessed again, motivates the write-back policy introduced in subsequent lectures, which defers memory writes until block eviction and thereby dramatically reduces memory traffic. Understanding these operational details and policy tradeoffs is essential for appreciating how real cache implementations balance performance, complexity, consistency, and correctness in practical computer systems.