CSCI 343 May 6: Cache Misses, Associativity, Replacement Policies, and Snooping Protocols
Types of Cache Misses
Cache misses occur when the CPU requests a memory block that is not currently in the cache. These are categorized as follows:
- Compulsory Misses:
  - Also known as cold-start misses.
  - Occur when a block is accessed for the first time and is not yet in the cache.
  - Unavoidable on the first access.
- Capacity Misses:
  - Occur because the cache cannot hold all the blocks needed during execution.
  - Even with an optimal replacement policy, some blocks are evicted and then needed again.
- Conflict Misses:
  - Also known as collision misses.
  - Occur when multiple blocks compete for the same cache line or set.
  - For example, in a four-line direct-mapped cache, blocks 0 and 4 both map to line 0 and repeatedly evict each other (see the sketch after this list).
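The mapping that causes this conflict is just a modulo on the block number. Here is a minimal C sketch, assuming a hypothetical four-line direct-mapped cache; the line count and block numbers are illustrative only.

```c
#include <stdio.h>

#define NUM_LINES 4  /* assumed four-line direct-mapped cache */

/* In a direct-mapped cache a block's line is fixed: block number mod lines. */
static unsigned line_for_block(unsigned block) {
    return block % NUM_LINES;
}

int main(void) {
    unsigned blocks[] = {0, 4, 8, 1};
    for (size_t i = 0; i < sizeof blocks / sizeof blocks[0]; i++)
        printf("block %u -> line %u\n", blocks[i], line_for_block(blocks[i]));
    /* Blocks 0, 4, and 8 all map to line 0, so they evict one another. */
    return 0;
}
```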
Cache Mapping and Associativity
Cache mapping strategies influence the occurrence of conflict misses:
- Direct-Mapped Cache:
  - Each block maps to exactly one line.
  - Simplest and fastest lookup.
  - Highest conflict miss rate, because the mapping is fixed.
- Fully Associative Cache:
  - Any block can go into any line.
  - Minimizes conflict misses.
  - Hardware is complex because every line's tag must be searched on each access.
- Set-Associative Cache:
  - A compromise between direct-mapped and fully associative.
  - The cache is divided into sets, and each set contains multiple lines.
  - An n-way set-associative cache allows a block to occupy any of the n lines within its set (see the address-breakdown sketch below).
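To see how these parameters shape an address lookup, here is a sketch assuming a hypothetical 32 KB, 4-way set-associative cache with 64-byte blocks; the sizes are illustrative. They give 32768 / (64 × 4) = 128 sets, hence 6 offset bits and 7 index bits.

```c
#include <stdio.h>
#include <stdint.h>

/* Assumed parameters: 32 KB cache, 64-byte blocks, 4-way set-associative. */
#define CACHE_BYTES 32768
#define BLOCK_BYTES 64
#define WAYS        4
#define NUM_SETS    (CACHE_BYTES / (BLOCK_BYTES * WAYS))  /* 128 sets */
#define OFFSET_BITS 6   /* log2(64)  */
#define INDEX_BITS  7   /* log2(128) */

int main(void) {
    uint32_t addr   = 0x12345678;                     /* example address */
    uint32_t offset = addr & (BLOCK_BYTES - 1);       /* byte within block */
    uint32_t set    = (addr >> OFFSET_BITS) & (NUM_SETS - 1);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);
    printf("addr 0x%08x -> tag 0x%x, set %u, offset %u\n",
           addr, tag, set, offset);
    return 0;
}
```

Doubling the associativity at the same total size halves the number of sets, so one index bit moves into the tag.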
Replacement Policies
When a cache set is full, a block must be replaced. Common strategies:
- Random Replacement:
  - Replace a randomly selected block in the set.
  - Simple but potentially inefficient.
- Least Recently Used (LRU):
  - Replaces the block that has not been used for the longest time.
  - Effective but requires tracking access order (see the sketch after this list).
- First-In-First-Out (FIFO):
  - Removes the block that has been in the cache the longest.
  - Simpler than LRU, but may evict frequently used blocks.
- Least Frequently Used (LFU):
  - Replaces the block used least often.
  - Can become expensive due to frequency tracking.
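As a concrete illustration of LRU, here is a minimal sketch of the bookkeeping for a single set, assuming a hypothetical 4-way set and a per-line timestamp; real hardware usually approximates true LRU with fewer state bits.

```c
#include <stdio.h>
#include <stdbool.h>

#define WAYS 4  /* assumed 4-way set */

struct line { bool valid; unsigned tag; unsigned last_used; };

/* Access one set: return the way that now holds `tag`, evicting the
 * least recently used line on a miss. `now` is a global access counter. */
static int access_set(struct line set[WAYS], unsigned tag, unsigned now) {
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (set[w].valid && set[w].tag == tag) {   /* hit: refresh timestamp */
            set[w].last_used = now;
            return w;
        }
        if (!set[w].valid)
            victim = w;                            /* prefer an empty way */
        else if (set[victim].valid &&
                 set[w].last_used < set[victim].last_used)
            victim = w;                            /* older than current victim */
    }
    set[victim] = (struct line){ true, tag, now }; /* miss: fill or evict */
    return victim;
}

int main(void) {
    struct line set[WAYS] = {0};
    unsigned tags[] = {1, 2, 3, 4, 1, 5};  /* 5 evicts tag 2, the LRU line */
    for (unsigned t = 0; t < 6; t++)
        printf("tag %u -> way %d\n", tags[t], access_set(set, tags[t], t));
    return 0;
}
```

FIFO would differ only in never refreshing last_used on a hit, so insertion order alone would pick the victim.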
Snooping Protocols and Cache Coherence
In multiprocessor systems, cache coherence ensures that all processors see a consistent view of memory. Snooping protocols are one method of maintaining coherence.
Overview of Snooping Protocol
- Each cache controller monitors (or “snoops on”) a shared communication bus.
- When a processor writes to a memory block, all other caches check if they have a copy of that block.
- If a copy exists, they take action based on the protocol (e.g., invalidate, update).
Common Snooping-Based Protocols
- Write-Invalidate:
  - When a processor writes to a block, all other caches invalidate their copies (contrasted with write-update in the sketch after this list).
  - Simple and efficient when writes are infrequent.
- Write-Update (or Write-Broadcast):
  - When a processor writes to a block, it also broadcasts the new data to the other caches.
  - Keeps caches up to date but can generate heavy bus traffic.
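Here is a sketch of how a snooping controller might react to a bus write under either policy. The names (cache_line, snoop_bus_write) are hypothetical, and a real controller would also arbitrate for the bus and handle writebacks.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_LINES 64  /* assumed per-cache line count */

struct cache_line { bool valid; unsigned tag; unsigned char data[64]; };

enum policy { WRITE_INVALIDATE, WRITE_UPDATE };

/* Called for every bus write this cache did not issue itself ("snooping").
 * If we hold a copy of the written block, either drop it (invalidate)
 * or overwrite it with the broadcast data (update). */
static void snoop_bus_write(struct cache_line lines[NUM_LINES],
                            unsigned line_index, unsigned tag,
                            const unsigned char new_data[64],
                            enum policy p) {
    struct cache_line *l = &lines[line_index];
    if (!l->valid || l->tag != tag)
        return;                      /* no copy here: nothing to do */
    if (p == WRITE_INVALIDATE)
        l->valid = false;            /* next local access will miss */
    else
        memcpy(l->data, new_data, sizeof l->data);  /* stay valid, refreshed */
}

int main(void) {
    static struct cache_line lines[NUM_LINES];
    unsigned char payload[64] = {42};
    lines[3] = (struct cache_line){ .valid = true, .tag = 7 };
    snoop_bus_write(lines, 3, 7, payload, WRITE_INVALIDATE);
    /* lines[3].valid is now false: the local copy was dropped. */
    return lines[3].valid;  /* 0 on success */
}
```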
Key Points
- Snooping works best with a broadcast medium (e.g., a shared bus).
- Protocols must ensure that only one processor writes to a block at a time; coherence state machines such as MESI enforce this exclusive ownership (see the state sketch after this list).
- In contrast, directory-based protocols are more scalable and are used in systems without a broadcast bus.
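For concreteness, here are the four MESI states and one simplified transition: when a cache snoops another processor's intent to write a block it holds, it invalidates its own copy, which is exactly the single-writer guarantee above. This is a simplification of the full MESI transition table.

```c
#include <stdio.h>

/* The four MESI states. A line may be written locally only in M or E. */
enum mesi { MODIFIED, EXCLUSIVE, SHARED, INVALID };

/* Simplified reaction to a snooped "read for ownership" (another
 * processor wants to write this block): whatever state we hold,
 * our copy must become INVALID so only one writer exists. */
static enum mesi on_snooped_write(enum mesi s) {
    (void)s;   /* a MODIFIED line would also write its dirty data back first */
    return INVALID;
}

int main(void) {
    enum mesi s = SHARED;
    s = on_snooped_write(s);
    printf("state after snooped write: %s\n",
           s == INVALID ? "INVALID" : "other");
    return 0;
}
```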
Summary of Cache Line Assignment Effects
- A fully associative cache allows total flexibility in block placement but has more complex hardware.
- Set-associative caches provide a good balance and are common in practice.
- The number of sets and the degree of associativity determine how blocks compete for lines, and thus directly affect the conflict miss rate.
Diagram Hint
To visualize conflict misses in direct-mapped caches, you can draw a diagram showing:
- A few cache lines (e.g., 4 lines).
- Memory blocks that map to the same line.
- A sequence of accesses showing how earlier blocks are evicted when a new block with the same mapping arrives.
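If a quick program is easier than a drawing, this sketch simulates that exact picture: a hypothetical four-line direct-mapped cache fed a sequence in which blocks 0 and 4 keep evicting each other.

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_LINES 4  /* assumed four-line direct-mapped cache */

int main(void) {
    bool valid[NUM_LINES] = {false};
    unsigned stored[NUM_LINES];
    unsigned seq[] = {0, 4, 0, 4, 1};  /* blocks 0 and 4 share line 0 */

    for (unsigned i = 0; i < sizeof seq / sizeof seq[0]; i++) {
        unsigned b = seq[i], line = b % NUM_LINES;
        if (valid[line] && stored[line] == b) {
            printf("block %u -> line %u: hit\n", b, line);
        } else {
            if (valid[line])
                printf("block %u -> line %u: miss, evicts block %u\n",
                       b, line, stored[line]);
            else
                printf("block %u -> line %u: compulsory miss\n", b, line);
            valid[line] = true;
            stored[line] = b;
        }
    }
    return 0;
}
```

The same sequence on a 2-way set-associative cache of the same size would hit after the first two misses, since blocks 0 and 4 could then reside in their shared set simultaneously.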
This lecture solidifies the foundational understanding of cache behavior, emphasizing the importance of associativity, replacement, and coherence in multiprocessor systems.