cache
Summary
Locality
- temporal locality
- data tends to be referenced again soon
- e.g. a loop counter
- spatial locality
- nearby data tends to be referenced soon
- e.g. consecutive elements of an array
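A minimal sketch of why spatial locality matters, assuming a hypothetical 4x4 array of 4-byte integers stored row by row from address 0 and a 16-byte block (these sizes and the helper `block_switches` are illustrative, not from the notes). It counts how often consecutive accesses move to a different block:

```python
def block_switches(addresses, block_size=16):
    """Count how often consecutive accesses land in different blocks."""
    blocks = [addr // block_size for addr in addresses]
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if prev != cur)


# Hypothetical 4x4 array of 4-byte integers stored row by row from address 0,
# so one 16-byte block holds exactly one row.
ROWS, COLS, ELEM_SIZE = 4, 4, 4
row_major = [(r * COLS + c) * ELEM_SIZE for r in range(ROWS) for c in range(COLS)]
col_major = [(r * COLS + c) * ELEM_SIZE for c in range(COLS) for r in range(ROWS)]

print(block_switches(row_major))  # 3
print(block_switches(col_major))  # 15
```

Walking the array in storage order stays inside one block for four accesses in a row, while the column-first walk changes block on every access.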
Average access time
- hit -> when data is found in cache
- miss -> when data is not found in cache and must be fetched from memory
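The notes list only the hit and miss cases. As a sketch, the usual textbook way to combine them is: average access time = hit time + miss rate × miss penalty, where the miss penalty is the extra time spent going to memory. The function name and the numbers below are assumptions for illustration:

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time = hit time + miss rate * miss penalty."""
    if not 0.0 <= miss_rate <= 1.0:
        raise ValueError("miss_rate must be between 0 and 1")
    return hit_time + miss_rate * miss_penalty


# Illustrative numbers only: 1 ns for a cache hit, 60 ns extra on a miss.
for miss_rate in (0.01, 0.05, 0.20):
    print(miss_rate, average_access_time(1.0, miss_rate, 60.0))
```

Even a small miss rate dominates the average when the miss penalty is far larger than the hit time.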
Cache block
- unit of transfer between memory and cache
- size is usually given in words
- larger blocks exploit spatial locality better, at the cost of a longer transfer on each miss (higher miss penalty)
- the low-order address bits select a byte within the block: with 4-byte blocks (one word), the last 2 address bits can be ignored unless a particular byte within that word is wanted
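The point about ignoring the last address bits can be sketched as a split of a byte address into a block number and a byte offset. This assumes the block size is a power of two; `split_address` is a hypothetical helper name:

```python
def split_address(addr, block_size=4):
    """Split a byte address into (block number, byte offset within the block)."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    return addr >> offset_bits, addr & (block_size - 1)


block, offset = split_address(0x1007)
print(hex(block), offset)  # 0x401 3
```

With 4-byte blocks the offset is the last 2 bits. The cache only needs the block number to find the word, and the offset only when a single byte is requested.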
Memory sizes
| Size | Number of bytes |
|---|---|
| 1KB | 2¹⁰ |
| 1MB | 2²⁰ |
| 1GB | 2³⁰ |
| 1TB | 2⁴⁰ |
Cache types
| | Direct mapped | Set associative | Fully associative |
|---|---|---|---|
| Block placement | one block per index | n blocks per index | any cache block |
| Block search | check tag at corresponding block | search tag within set | search tag in whole cache |
| Block replacement | overwrite | choose by replacement policy | choose by replacement policy |
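The three cache types can be seen as one design with a different number of blocks per index. The sketch below is a toy model under stated assumptions: the class `Cache`, its parameters, and the choice of LRU as the replacement policy are illustrative, since the notes do not name a policy. One block per index gives a direct-mapped cache, and a single set gives a fully associative one.

```python
from collections import OrderedDict


class Cache:
    """Toy cache with num_blocks blocks, grouped into sets of `ways` blocks."""

    def __init__(self, num_blocks, ways, block_size=4):
        if num_blocks % ways != 0:
            raise ValueError("num_blocks must be a multiple of ways")
        self.block_size = block_size
        self.ways = ways
        self.num_sets = num_blocks // ways
        # One ordered dict per set: keys are tags, ordered oldest-used first.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def access(self, addr):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        block_number = addr // self.block_size
        index = block_number % self.num_sets   # block placement
        tag = block_number // self.num_sets
        lines = self.sets[index]
        if tag in lines:                        # block search within the set
            lines.move_to_end(tag)              # mark as most recently used
            return True
        if len(lines) == self.ways:             # block replacement (LRU here)
            lines.popitem(last=False)
        lines[tag] = True
        return False


def count_hits(cache, trace):
    return sum(cache.access(addr) for addr in trace)


trace = [0, 16, 0, 16]  # two addresses that share index 0 when direct mapped
print(count_hits(Cache(num_blocks=4, ways=1), trace))  # direct mapped: 0
print(count_hits(Cache(num_blocks=4, ways=2), trace))  # 2-way set associative: 2
print(count_hits(Cache(num_blocks=4, ways=4), trace))  # fully associative: 2
```

In the direct-mapped case the two blocks keep overwriting each other, while any associativity of 2 or more lets both stay resident.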
Concept
Types of memory
| Dynamic RAM | Static RAM |
|---|---|
| High density - 1 transistor per cell | Low density - 6 transistors per cell |
| Slow access - 50-70ns | Fast access - 0.5-5ns |
| Cheaper | More expensive |
Principle of locality
- a program accesses only a small portion of the memory address space within any short time interval
- keep frequently/recently used data in a smaller but faster memory (cache)
Working set
- set of memory locations accessed during a time period
- the goal of the cache is to hold the working set, so that most accesses hit and latency is reduced
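As a small illustration, the working set of a time window can be computed directly from an access trace. The trace and the helper `working_set` below are made up for the example, with "time" measured as a position in the trace:

```python
def working_set(trace, start, length):
    """Distinct memory locations accessed in trace[start : start + length]."""
    return set(trace[start:start + length])


# Hypothetical trace: the program works on one group of locations, then another.
trace = [100, 104, 100, 108, 104, 100, 200, 204, 200]

print(sorted(working_set(trace, 0, 6)))  # [100, 104, 108]
print(sorted(working_set(trace, 6, 3)))  # [200, 204]
```

Each phase touches at most three distinct locations, so in this toy trace a cache holding three locations would miss only on the first access to each one.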
Write policy
- write-through
- write to both cache and memory
- every write is delayed by main memory; a write buffer reduces this latency
- write-back
- write to cache only; write to memory when the cache block is replaced
- an additional (dirty) bit per block tracks whether it has been written; on replacement, write the block to memory only if the bit is 1
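The difference between the two policies shows up in how many writes reach main memory. The sketch below is a simplified model, not a full cache: it assumes a direct-mapped cache, considers only writes, and allocates a block in the cache on every write. The function name and its parameters are hypothetical.

```python
def count_memory_writes(written_blocks, num_lines, policy):
    """Count writes to main memory for a sequence of written block numbers."""
    if policy not in ("write-through", "write-back"):
        raise ValueError("unknown policy: " + policy)
    lines = [None] * num_lines    # block number currently held by each line
    dirty = [False] * num_lines   # dirty bit per line (used by write-back)
    memory_writes = 0
    for block in written_blocks:
        index = block % num_lines
        if policy == "write-through":
            lines[index] = block
            memory_writes += 1            # every write also goes to memory
        else:
            if lines[index] != block:     # a different block gets replaced
                if dirty[index]:
                    memory_writes += 1    # flush the replaced dirty block
                lines[index] = block
            dirty[index] = True           # modified in cache only
    return memory_writes


writes = [0, 0, 0, 2, 0]  # blocks 0 and 2 map to the same line when num_lines=2
print(count_memory_writes(writes, 2, "write-through"))  # 5
print(count_memory_writes(writes, 2, "write-back"))     # 2
```

Repeated writes to the same block cost one memory write each under write-through, but under write-back they are absorbed by the cache until the block is replaced. The last dirty block is still in the cache at the end and has not been counted.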