Multiprocessors II
==================

Lecture 25, 4/26/2005

Example of parallel program
---------------------------

Sum of an array of numbers, pg. 9-11 from CD
  . show a naive implementation (lots of false sharing)
  . show better implementation
show graphically how it works

Some synchronization models
---------------------------

* Barrier -- everyone waits until all have arrived, then everyone continues
* Lock -- exclusive access (mutual exclusion)
* Producer/consumer -- pairwise ordering

Bus-based cache coherence
-------------------------

* Figure of bus-based multiprocessor
* Talk about what happens if writes happen in multiple caches -- bad!
* Use bus to coordinate writes
* Two options:
  - write-update -- all writes go to bus
  - write-invalidate -- writing processor must acquire exclusive copy
    prior to writing
    Conserves bus bandwidth, just like write-allocate cache
* MESI protocol
    M = modified
    E = exclusive
    S = shared
    I = invalid
* Go through example of a cache line, and its state for each cache
  - first it's written by one processor
  - then read by lots of processors
  - then written by a different processor
* Multiprocessors also need support for atomic memory operations:
  e.g. test if 0, if so, set equal to 1.

Synchronization
---------------

Atomic pair, pg. 9-20 from CD:
  . Load locked
  . Store conditional
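
Sketch: a lock built from the atomic operation "test if 0, if so, set equal to 1". This is a minimal illustration, not the code from the CD. It assumes C11 atomics and POSIX threads; the C11 compare-exchange stands in for the load locked / store conditional pair (on machines that provide that pair, compilers typically implement compare-exchange as a load-locked/store-conditional retry loop, which is why the "weak" form is allowed to fail spuriously). The names spin_lock, spin_unlock, run_counter_demo, and MAX_THREADS are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 16              /* illustrative limit, not from the notes */

static atomic_int lock_word = 0;    /* 0 = free, 1 = held */
static long shared_counter = 0;     /* protected by lock_word */

/* Acquire: atomically "test if 0, if so, set equal to 1"; retry until it works. */
static void spin_lock(atomic_int *l) {
    int expected = 0;
    while (!atomic_compare_exchange_weak_explicit(
               l, &expected, 1,
               memory_order_acquire, memory_order_relaxed)) {
        /* A failed exchange stores the value it saw into expected; reset it. */
        expected = 0;
    }
}

/* Release: store 0 so another processor's test can succeed. */
static void spin_unlock(atomic_int *l) {
    atomic_store_explicit(l, 0, memory_order_release);
}

/* Each thread increments the shared counter iters times under the lock. */
static void *worker(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; i++) {
        spin_lock(&lock_word);
        shared_counter++;           /* critical section: one thread at a time */
        spin_unlock(&lock_word);
    }
    return NULL;
}

/* Start nthreads workers, wait for all of them, return the final count. */
long run_counter_demo(int nthreads, int iters) {
    pthread_t tid[MAX_THREADS];
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    shared_counter = 0;
    for (int t = 0; t < nthreads; t++)
        pthread_create(&tid[t], NULL, worker, &iters);
    for (int t = 0; t < nthreads; t++)
        pthread_join(tid[t], NULL);
    return shared_counter;
}
```

With the lock in place, run_counter_demo(n, k) should return n * k; without it, increments from different threads can be lost. In MESI terms, every successful acquire needs the line holding lock_word in an exclusive state, so a heavily contended lock bounces that line between caches.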