Computer Architecture: Parallelism and Locality

Instructor: Mattan Erez

Description: Two major challenges facing computer architects today are dealing with tight power budgets and achieving high performance as off-chip bandwidth diminishes in comparison with available on-chip compute resources. In this course we will explore how the fundamental properties of locality and parallelism can be utilized in both hardware and software to overcome these challenges of power and bandwidth constraints. We will develop hardware cost models and hardware and software techniques through a combination of structured lectures, paper reading, discussions, homework assignments, programming labs, and a collaborative project. Examples of architectures and methods that will be covered include traditional general-purpose processors, massively parallel processors, parallel memory systems, parallel programming and execution models, shared memory systems, distributed shared memory systems, domain decomposition techniques, and cache-aware and cache-oblivious algorithms (tentative syllabus below).

Text:
There is no required textbook for this class; however, you may find the following useful:
  • Timothy G. Mattson, Beverly A. Sanders, Berna L. Massingill, “Patterns for Parallel Programming”, 2005, Addison-Wesley, Boston.
  • David B. Kirk and Wen-Mei Hwu, “Programming Massively Parallel Processors: A Hands-on Approach”, 2010, Morgan Kaufmann.
Download Slides:
Lecture  Topic (notes)
1        Introduction
2        Locality in CPUs
3        Locality + Cache aware
4        Locality + Cache oblivious
5        Wires/Interconnect
6        Wires II
7        Wire Alternatives + HW Parallelism
8        HW Parallelism (I) (pptx/pdf)
9        HW Parallelism (II) (pptx/pdf)
10       SW Parallelism (I) (pptx/pdf)
11       SW Parallelism (II) (pptx/pdf)
12       SW Parallelism (III) (pptx/pdf)
13       SW Parallelism (IV) (pptx/pdf)
14       SW Parallelism (V) (pptx/pdf)
15       GPU and Graphics Intro
16       NVIDIA GPUs (I) (pptx/pdf)
17       NVIDIA GPUs (II) (pptx/pdf)
18       NVIDIA GPUs (III) (pptx/pdf)
19       NVIDIA GPUs (IV) (pptx/pdf)
20       NVIDIA GPUs (V) (pptx/pdf)
21       NVIDIA GPUs (VI) + CUDA (I) (pptx/pdf)
22       CUDA (pptx/pdf)
23       Memory Systems (pptx/pdf)
24       Quiz
25       Memory (II) + Heterogeneous (I)
26       Heterogeneous (II)
27       Conclusions (and reliability)