







































# Terminology

### Clean line:

- Contents of cache line and main memory are identical (memory is up to date)
- Can be evicted without write-back

### Dirty line:

- Contents of cache line and main memory differ (memory is stale)
- Needs to be written back eventually
  - Timing depends on protocol details

### Bus transaction:

- A signal on the bus that can be observed by all caches
- Usually blocking (only one signal at a time)

### Local (private) read/write:

- A load/store operation originating at a core connected to the cache
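The difference between clean and dirty lines can be illustrated with a minimal sketch of a write-back cache. The names `CacheLine`, `local_write`, and `evict` are hypothetical and chosen only for illustration; real hardware tracks this with a per-line dirty bit, and the exact write-back time depends on the protocol.

```python
from dataclasses import dataclass


@dataclass
class CacheLine:
    """One cache line holding a single value (hypothetical, simplified)."""
    addr: int
    data: int
    dirty: bool = False  # False = clean: contents identical to main memory


def local_write(line: CacheLine, value: int) -> None:
    """A local store updates only the cache, so memory becomes stale."""
    line.data = value
    line.dirty = True


def evict(line: CacheLine, memory: dict) -> bool:
    """Evict a line; return True if a write-back was required."""
    if line.dirty:
        memory[line.addr] = line.data  # write the newer contents back
        return True
    return False  # clean line: it can simply be dropped


memory = {0x40: 7}
line = CacheLine(addr=0x40, data=memory[0x40])
clean_evict = evict(line, memory)   # clean: no write-back needed
local_write(line, 99)               # line is now dirty, memory still holds 7
stale = memory[0x40]
dirty_evict = evict(line, memory)   # dirty: write-back updates memory
```

In this sketch the write-back happens at eviction time, which is only one of several possible policies.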
















Initially: all in I state

| Action      | P1 state | P2 state | P3 state | Bus action | Data from                              |
|-------------|----------|----------|----------|------------|----------------------------------------|
| P1 reads x  | E        | I        | I        | BusRd      | Memory                                 |
| P2 reads x  | S        | S        | I        | BusRd      | Memory or<br>Cache of P1<br>(FlushOpt) |
| P1 writes x | M        | I        | I        | BusRdX*    | Cache                                  |
| P1 reads x  | M        | I        | I        | -          | Cache                                  |
| P3 writes x | I        | I        | M        | BusRdX     | Memory or<br>Cache of P1<br>(FlushOpt) |
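The state and bus-action columns of the table can be reproduced with a small simulation. This is a sketch under assumptions, not a full protocol model: it tracks a single line `x`, ignores where the data comes from, and treats `BusRdX*` as the exclusive request issued by a cache that already holds the line in S. The function name `mesi_step` is hypothetical.

```python
def mesi_step(states, proc, op):
    """Apply one local read ('r') or write ('w') by processor index `proc`.

    `states` holds one MESI state per cache for a single line and is
    updated in place. Returns the bus action ('-' means no transaction).
    """
    mine = states[proc]
    others = [i for i in range(len(states)) if i != proc]
    if op == "r":
        if mine != "I":
            return "-"                   # read hit: no state change
        shared = any(states[i] != "I" for i in others)
        for i in others:
            if states[i] in ("M", "E"):
                states[i] = "S"          # snooping caches downgrade on BusRd
        states[proc] = "S" if shared else "E"
        return "BusRd"
    # op == "w"
    if mine == "M":
        return "-"                       # write hit on a modified line
    if mine == "E":
        states[proc] = "M"               # silent upgrade, no bus traffic
        return "-"
    action = "BusRdX*" if mine == "S" else "BusRdX"
    for i in others:
        states[i] = "I"                  # all other copies are invalidated
    states[proc] = "M"
    return action


states = ["I", "I", "I"]                 # P1, P2, P3
trace = []
for proc, op in [(0, "r"), (1, "r"), (0, "w"), (0, "r"), (2, "w")]:
    bus = mesi_step(states, proc, op)
    trace.append(("".join(states), bus))
```

Each entry of `trace` pairs the three cache states after an action with the bus action it caused, in the same row order as the table.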







- Most systems have multi-level caches (assume two levels here)
  - Problem: only the "last level cache" is connected to the bus or network
  - Yet, snoop requests are relevant for the inner cache levels (L1)
  - Modifications of L1 data may not be visible at L2 (and thus on the bus)

# L1/L2 modifications

- On BusRd, check whether the line is in M state in L1
  - It may be in E or S in L2!
- On BusRdX(\*) send invalidations to L1
- Everything else can be handled in L2
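The two rules above can be sketched as follows. This is a simplified model under stated assumptions: the hierarchy is inclusive (every L1 line is also in L2), each line is stored as a `[state, data]` pair, and the class and method names (`TwoLevelCache`, `snoop_bus_rd`, `snoop_bus_rdx`) are hypothetical.

```python
class TwoLevelCache:
    """Private L1 plus a bus-facing L2 for one core (hypothetical model)."""

    def __init__(self):
        self.l1 = {}  # addr -> [state, data]
        self.l2 = {}  # addr -> [state, data]

    def fill_exclusive(self, addr, data):
        """Install a line in E state in both levels (inclusive hierarchy)."""
        self.l1[addr] = ["E", data]
        self.l2[addr] = ["E", data]

    def local_write(self, addr, value):
        """A store hits in L1 only; L2 still records the line as E."""
        self.l1[addr] = ["M", value]

    def snoop_bus_rd(self, addr):
        """BusRd observed at L2: consult L1 before answering."""
        if addr not in self.l2:
            return None
        l1_line = self.l1.get(addr)
        if l1_line is not None:
            if l1_line[0] == "M":
                self.l2[addr][1] = l1_line[1]  # pull the newest data into L2
            l1_line[0] = "S"
        self.l2[addr][0] = "S"
        return self.l2[addr][1]

    def snoop_bus_rdx(self, addr):
        """BusRdX observed at L2: invalidate both levels, return newest data."""
        l1_line = self.l1.pop(addr, None)      # invalidation forwarded to L1
        l2_line = self.l2.pop(addr, None)
        newest = l1_line if l1_line is not None else l2_line
        return None if newest is None else newest[1]


cache = TwoLevelCache()
cache.fill_exclusive(0x80, 1)
cache.local_write(0x80, 42)
l2_view_before = cache.l2[0x80][:]     # L2 has not seen the L1 modification
supplied = cache.snoop_bus_rd(0x80)    # snoop must return the L1 value
```

Without the L1 check in `snoop_bus_rd`, the snoop would answer with the stale value still held in L2.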

# Directory-based cache coherence

- Snooping does not scale
  - Bus transactions must be globally visible
  - Implies broadcast
- Typical solution: tree-based (hierarchical) snooping
  - Root becomes a bottleneck

### Directory-based schemes are more scalable

- Directory (one entry for each cache line) keeps track of all owning caches
- Point-to-point updates to the involved processors
  - No broadcast

Can use a specialized (high-bandwidth) network, e.g., HyperTransport (HT), QuickPath Interconnect (QPI) ...
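A minimal sketch of the directory idea, assuming one sharer set per cache line and a message log in place of a real network. The `Directory` class and its methods are hypothetical, and the sketch omits ownership states and data forwarding.

```python
class Directory:
    """Tracks, per cache line, which caches hold a copy (hypothetical)."""

    def __init__(self):
        self.sharers = {}   # addr -> set of cache ids holding the line
        self.messages = []  # log of (destination, message, addr) tuples

    def read(self, cache_id, addr):
        """Record that `cache_id` now holds a copy of the line."""
        self.sharers.setdefault(addr, set()).add(cache_id)

    def write(self, cache_id, addr):
        """Invalidate only the caches listed in the directory entry."""
        holders = self.sharers.get(addr, set())
        for other in sorted(holders - {cache_id}):
            self.messages.append((other, "invalidate", addr))
        self.sharers[addr] = {cache_id}  # the writer is the only holder left


directory = Directory()
directory.read(0, 0x100)
directory.read(2, 0x100)
directory.write(2, 0x100)  # only cache 0 is contacted, not every cache
```

In a system with many caches, the write above sends a single point-to-point invalidation to cache 0, whereas a snooping bus would have to broadcast it to all caches.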




















