Program Optimization for Multicore Architectures
Course Contents
The course will cover the following:
- What multi-core architectures are
- Issues involved in writing code for multi-core architectures
- How to develop programs for these architectures
- Which program optimization techniques are available
- How to build some of these techniques into compilers
- OpenMP, message-passing libraries, threads, mutexes, etc.
Details:
Introduction to parallel computers: Instruction Level Parallelism (ILP) vs. Thread Level Parallelism (TLP); performance issues: brief introduction to cache hierarchy and communication latency; shared memory multiprocessors: general architecture and the problem of cache coherence; synchronization primitives: atomic primitives; locks: TTS, ticket, array; barriers: central and tree; performance implications in shared memory programs; chip multiprocessors: why CMP (Moore's law, wire delay); shared L2 vs. tiled CMP; core complexity; power/performance; snoopy coherence: invalidate vs. update, MSI, MESI, MOESI, MOSI; performance trade-offs; pipelined snoopy bus design; memory consistency models: SC, PC, TSO, PSO, WO/WC, RC; chip multiprocessor case studies: Intel Montecito and dual-core Pentium 4, IBM POWER4, Sun Niagara
Introduction to optimization: overview of parallelism, shared memory programming; introduction to OpenMP; data flow analysis, pointer analysis, alias analysis, data dependence analysis, solving data dependence equations (integer linear programming problem); loop optimizations; memory hierarchy issues in code optimization
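As a small illustration of how data dependence analysis connects to OpenMP, the sketch below applies the classic GCD test to a pair of affine array subscripts and then parallelizes a loop that the test shows to be free of loop-carried dependences. The helper names (`gcd_test_may_depend`, `copy_odd_to_even`) are hypothetical, and the GCD test is only one simple, conservative technique; it ignores loop bounds, whereas the integer linear programming formulation mentioned above is more precise.

```c
#include <assert.h>
#include <stdlib.h>

/* Greatest common divisor of two non-negative integers (Euclid). */
static int gcd(int a, int b) {
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* GCD test for a write to A[a*i + b] and a read of A[c*j + d].
 * The dependence equation a*i - c*j = d - b has an integer solution
 * only if gcd(a, c) divides (d - b).
 * Returns 1 if a dependence MAY exist, 0 if it is ruled out.
 * Loop bounds are ignored, so a result of 1 is conservative. */
static int gcd_test_may_depend(int a, int b, int c, int d) {
    int g = gcd(abs(a), abs(c));
    if (g == 0) {
        return b == d;  /* both subscripts are constants */
    }
    return (d - b) % g == 0;
}

/* Writes A[2*i] and reads A[2*i + 1] for i in [0, n).
 * A must hold at least 2*n elements. The GCD test rules out any
 * dependence between iterations, so the loop may run in parallel.
 * Without OpenMP enabled, the pragma is ignored and the loop runs
 * serially with the same result. */
static void copy_odd_to_even(double *A, int n) {
    assert(!gcd_test_may_depend(2, 0, 2, 1));
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        A[2 * i] = A[2 * i + 1] + 1.0;
    }
}
```

By contrast, for a loop such as `A[i] = A[i - 1] + 1.0`, the call `gcd_test_may_depend(1, 0, 1, -1)` reports a possible dependence, so adding the same pragma there would not be safe.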
Operating system issues for multiprocessing: need for pre-emptive OS; scheduling techniques: usual OS scheduling techniques, threads, distributed scheduler, multiprocessor scheduling, gang scheduling; communication between processes, message boxes, shared memory; sharing issues and synchronization, sharing memory and other structures, sharing I/O devices, distributed semaphores, monitors, spin locks, implementation techniques for multicores; case studies from applications: digital signal processing, image processing, speech processing
Lectures
- Introduction to the course and logistics
- Introduction to multi-core architectures
- OpenMP Tutorials
- Intel Tools
- Virtual memory and caches, Parallel programming, Coherence and consistency, Synchronization, Case studies of CMP
- Shared Memory Multiprocessors
- Introduction to Optimization, Control flow Analysis, Dataflow Analysis, Compilers for High Performance Architectures, Data Dependence Analysis
- Loop Optimizations
- CPU Scheduling, Synchronization, Multi-processor Scheduling, Security issues