Lectures are designed for synchronous delivery. The recorded versions are not expected to be an adequate substitute for attending.

Recorded lectures are available on Panopto through Blackboard. Log in to Blackboard to view them. Lectures are named LecXX.[Topic].[Date].mp4. Although Panopto is inconvenient, it is the only way to restrict access to the lectures to enrolled students.

The Jupyter notebooks for lectures should be synced from Gigantum following the instructions in gigantum.html. Notebooks with filenames of the form 0X_lecture_name.ipynb are found in this project.



  • midterm released as of 8:00 am October 19, 2020.


  • final released as of 1:00 pm December 14, 2020.

Late Hours

A total of 48 late hours are permitted per semester, to be used as needed for projects. Late hours are rounded up to the next whole hour, e.g. a project submitted 2.5 hours late counts as using 3 late hours.
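
The rounding rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic stated above, not an official grading tool; the function names `late_hours_charged` and `remaining_budget` are made up for this example.

```python
import math

SEMESTER_BUDGET_HOURS = 48  # total late hours permitted per semester


def late_hours_charged(hours_late: float) -> int:
    """Round a lateness up to the next whole hour; on-time work costs 0."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late)


def remaining_budget(latenesses) -> int:
    """Late hours left after charging each submission in `latenesses`."""
    return SEMESTER_BUDGET_HOURS - sum(late_hours_charged(h) for h in latenesses)


if __name__ == "__main__":
    print(late_hours_charged(2.5))        # the example from the policy
    print(remaining_budget([2.5, 10.0]))  # two late projects
```

Under these assumptions, a project that is 2.5 hours late and another that is exactly 10 hours late would together use 13 of the 48 hours.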

Course Schedule

(31 August) Introduction to Parallel Programming and Gigantum

Parallelism in modern computer architectures. Performance beyond computational complexity. An introduction to the Gigantum environment for reproducibility and sharability.

  • Reading:
    • Chapter 1, Patterns for Parallel Programming
  • Lectures:
    • Introduction to Parallel Programming: 01_intro.ipynb
    • Gigantum
  • sli.do event: https://www.slido.com event code #64704

(2 September) Amdahl’s Law, Strong Scaling, and Parallel Efficiency

Amdahl’s law is the fundamental principle behind strong scaling in parallel computing. Strong scaling is the process of solving a problem of fixed size faster by using more parallel resources.
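
As a quick illustration, the standard statement of Amdahl’s law gives the speedup on n workers as 1 / ((1 - p) + p / n), where p is the fraction of the runtime that can be parallelized. The sketch below is not taken from the lecture notebooks; the function names and the sample value p = 0.9 are assumptions chosen for the example.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size problem."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_efficiency(parallel_fraction: float, workers: int) -> float:
    """Speedup divided by the number of workers (1.0 is ideal)."""
    return amdahl_speedup(parallel_fraction, workers) / workers


if __name__ == "__main__":
    p = 0.9  # hypothetical: 90% of the runtime parallelizes
    for n in (1, 2, 4, 8, 16, 1024):
        s = amdahl_speedup(p, n)
        e = parallel_efficiency(p, n)
        print(f"workers={n:5d}  speedup={s:6.2f}  efficiency={e:5.2f}")
```

With p = 0.9 the speedup can never exceed 1 / (1 - p) = 10, no matter how many workers are added, which is why efficiency falls as the worker count grows.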

(9 September) OpenMP

Lecture 3: An introduction to parallelizing serial programs with compiler directives. Also covers serial equivalence and loop-parallel constructs.

(14 September) Cache Hierarchy

Lecture 4: Memory hierarchy and latency. Caching concepts: size, lines, associativity, and inclusion/exclusion. Caching microbenchmarks.

(16 September) Loop Optimization

Lecture 5: Loop Optimizations

(21 September) Moore’s Law and Factors Against Parallelism

Lecture 6: The impact of multi-core and pipelines on processing power. Moore’s law reloaded.

(23 September) Factors Against Parallelism

Lecture 7: Startup, interference, and skew.

  • Reading:
    • Chapter 2, Patterns for Parallel Programming

(28 September) Loop Dependencies, Parallel Architectures, and OS Constructs

Lectures 8a and 8b: Operating system process abstraction: contexts and context switching, virtual memory, program organization in memory. Operating system threads, thread context, thread memory model.

(30 September) Java Threads and Java Synchronize

Lecture 9a: Java Threads

Lecture 9b: Asynchrony, waiting on threads, volatile variables, and synchronized functions.

  • Reading:
    • Appendix C: Patterns for Parallel Programming

(5 October) Mutual Exclusion

Lecture 10: Critical sections and fast mutual exclusion.

  • Reading:
    • Chapter 1 and 2-2.6: Herlihy and Shavit

NO MATERIAL past this point is on the midterm

(7 October) Dask

Lecture 11: Dask Arrays. Data parallel and declarative programming. Execution graphs and lazy evaluation.

(12 October) Dask Dataframes

Lecture 12: Parallel Pandas. Slicing and Aggregation. Indexing.

(14 October) Dask Projects and Weak Scaling

Lecture 13: Weak Scaling

Lecture 13p1: Game of Life in Dask

Lecture 13p2: Dataframe Exercise

(19 October) Midterm: No Class

(21 October) Distributed Memory Hardware and Supercomputers

Lecture 14: Flynn’s taxonomy.

Lecture 14b: Top 500

(26 October) MPI Introduction

Lecture 15: Supercomputers of the world and trends in architecture. MPI Overview, the MPI Runtime Environment.

(28 October) MPI Messaging

Lecture 16: MPI Synchronous Messaging, Asynchronous I/O, and Barriers.

  • Reading:
    • MPI Tutorial, Lawrence Livermore National Lab
    • Appendix B, Patterns for Parallel Programming

(2 November) Introduction to Map/Reduce

Lecture 17: The Google Parallel computing environment, functional programming concepts applied to large-scale parallelism, text processing.
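
The functional idea behind Map/Reduce can be sketched in a single Python process. This is a toy illustration only, not Google's implementation and not the Hadoop! API; the names `map_phase`, `reduce_phase`, and `word_count` are invented for the example, and the sort stands in for the framework's shuffle step.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_phase(word, counts):
    """Reduce: fold all counts emitted for one key into a single total."""
    return word, reduce(lambda a, b: a + b, counts, 0)


def word_count(lines):
    # Apply the mapper to every input record independently.
    pairs = [kv for line in lines for kv in map_phase(line)]
    # Stand-in for shuffle/sort: bring equal keys next to each other.
    pairs.sort(key=itemgetter(0))
    # Call the reducer once per distinct key.
    return dict(
        reduce_phase(key, (count for _, count in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    )


if __name__ == "__main__":
    print(word_count(["a b a", "B c"]))
```

Because each call to the mapper touches one record and each call to the reducer touches one key, both phases could in principle run on many machines at once; that independence is what the sketch is meant to show.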

(4 November) HPC Checkpoints and I/O Performance

Lecture 18

NEW PROJECT: Hadoop! and Spark lectures are in gigantum.com/jhupp/jhupp-lectures-spark

(9 November) Hadoop!

Lecture 19: Hadoop! programming, the WordCount tutorial, and the Hadoop! toolchain.

(11 November) Triangle Counting in Hadoop!

Lecture 20: Friends-of-friends running example. The M/R sorting guarantee and combiners.

(16 November) Hadoop! Streaming

Lecture 21: Streaming execution of Map/Reduce and sort equivalence.

(18 November) Introduction to Spark

Lecture 22: Spark and Resilient Distributed Datasets.

Spark Programming (this lecture was not given)

Lecture 23: Online coding of Friends-of-Friends in Spark.

You are not responsible for this material.

(30 November) GPU Architecture

Lecture 24: The evolution of GPU computing, from graphics pipeline to GPGPU to CUDA. GPU hardware.

Two lectures (24a and 24b) were recorded and are available on Blackboard through Panopto.

(2 December) Roofline

Lecture 25: The roofline performance model and off-chip bandwidth.
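
In its usual form, the roofline model bounds attainable performance by the smaller of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved from off-chip memory). The sketch below assumes that form; the function names and the machine numbers are hypothetical and do not describe any particular GPU.

```python
def attainable_gflops(peak_gflops: float,
                      bandwidth_gb_s: float,
                      intensity_flops_per_byte: float) -> float:
    """Roofline bound: min(compute roof, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gb_s * intensity_flops_per_byte)


def ridge_point(peak_gflops: float, bandwidth_gb_s: float) -> float:
    """Intensity at which a kernel stops being bandwidth-bound."""
    return peak_gflops / bandwidth_gb_s


if __name__ == "__main__":
    peak, bandwidth = 1000.0, 100.0  # hypothetical machine: GFLOP/s and GB/s
    for intensity in (0.5, 2.0, 10.0, 20.0):
        bound = attainable_gflops(peak, bandwidth, intensity)
        print(f"intensity={intensity:5.1f} FLOP/byte -> {bound:7.1f} GFLOP/s")
    print("ridge point:", ridge_point(peak, bandwidth), "FLOP/byte")
```

Kernels whose intensity is below the ridge point are limited by off-chip bandwidth, while kernels above it are limited by peak compute.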

(7 December) GPU Programming: CUDA

Lecture 26: Shared memory, threads, warps, barriers.