Parallel Programming Fall 2021 (EN 601.320/420/620)

Syllabus in standard CS/JHU/ABET format. The material on this page mirrors that information.

Course Description

This course prepares the programmer to tackle the massive data sets and huge problem sizes of modern scientific, cloud, and enterprise computing. Students taking this course will abandon the comfort of serial algorithmic thinking and learn to harness the power of cutting-edge software and hardware technologies. The issue of parallelism spans many architectural levels. Even “single server” systems must parallelize computation in order to exploit the inherent parallelism of recent multi-core processors and many-core accelerators. The course will examine different forms of parallelism using a variety of frameworks, including Python multiprocessing, Python joblib, Spark, Dask, Ray, and OpenMP. Programming will be mostly in Python with some optimized code written in C++. The course is suitable for second-year undergraduate CS majors and for undergraduate and graduate students from other science and engineering disciplines who have prior programming experience.

Prerequisites:

  • Intermediate Programming (EN 601.120 or the equivalent)
  • Data Structures (EN 601.226 or the equivalent)
  • Computer Systems Fundamentals (EN 601.333 or the equivalent)

Comments on the 2021 Edition

I have significantly revamped the course this year with new examples and a change in focus. The course will focus more exclusively on parallel computing for data science, moving away from high-performance scientific computing. I am also going to try to have all programming assignments and examples in Python, avoiding the R programming language as much as possible.

Students are responsible for all material and announcements on this course Web page and Piazza.

Academic Conduct

The guidelines of Johns Hopkins’ undergraduate ethics policy and graduate student conduct policy apply to all activities associated with this course. Additionally, students are subject to the Computer Science Academic Integrity Code.

In addition, the specific ethics guidelines for this course are: Students are encouraged to consult with each other and even collaborate on all programming assignments. This means that students may look at each other’s code, pair program, and even help each other debug. Any code that was written together must have a citation (in code comments) that indicates who developed the code. Any code excerpted from outside sources must have a citation to the source (in code comments). Each assignment involves questions that analyze the assignment and connect the program to course concepts. The answers to these questions must be prepared independently by each student and must be work that is solely their own.

Clarification and Amendment: For any homework that is done in pairs or teams, all team members must be listed as collaborators in comments in the source code AND in any submitted documents (PDFs and notebooks). You should also specify the nature and scope of the collaboration: shared programming, discussion, debugging, or consulting. Failure to state a collaboration and its scope is an ethics violation and will be treated as such.

Schedule

MW 4:30 pm - 5:30 pm, https://wse.zoom.us/j/99616813994. The Zoom link with password can be found in Blackboard.

Course Staff

Instructor

Randal Burns, randal@jhu.edu, http://www.cs.jhu.edu/~randal/

Teaching Assistants

Brian Choi, bchoi11@jhu.edu

Brian Wheatman, wheatman@cs.jhu.edu

Course Assistants

Course Goals

Specific outcomes for this course are that students will be able to:

  • Take a computational task and construct an implementation that maximizes parallelism.
  • Analyze and instrument an implementation of a computer program for its speedup, scaleup, and parallel efficiency.
  • Reason about the loss of parallel efficiency and attribute that loss to factors, including startup costs, interference, and skew.
  • Work with a diverse set of programming tools for different parallel environments, including cloud computing, high-performance computing, multicore, and GPU accelerators.
  • Analyze how locality, latency, and coherency in the memory hierarchy influence parallel efficiency and improve program design based on the properties of memory.
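
As a rough illustration of the instrumentation named in the outcomes above, the following is a minimal sketch, not course-provided code. It assumes the common definitions speedup S = T_serial / T_parallel, efficiency E = S / p for p workers, and scaleup = T(1 worker, size N) / T(p workers, size p*N). The prime-counting workload, the helper names (`count_primes`, `timed`, `run_parallel`), and the choice of 4 workers are all hypothetical choices made for this example.

```python
import time
from multiprocessing import Pool


def speedup(t_serial, t_parallel):
    """Speedup S = T_serial / T_parallel for a fixed problem size."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, workers):
    """Efficiency E = S / p; 1.0 corresponds to perfect linear speedup."""
    return speedup(t_serial, t_parallel) / workers


def scaleup(t_base, t_scaled):
    """Scaleup = T(1 worker, size N) / T(p workers, size p*N); 1.0 is ideal."""
    return t_base / t_scaled


def count_primes(bounds):
    """Hypothetical CPU-bound task: count primes in [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def timed(fn, *args):
    """Return (result, elapsed wall-clock seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def run_parallel(chunks, workers):
    """Map the task over chunks with a process pool and combine the counts."""
    with Pool(processes=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    limit, workers = 200_000, 4  # assumed sizes, chosen only for the demo
    step = limit // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]

    serial_count, t_serial = timed(count_primes, (0, limit))
    parallel_count, t_parallel = timed(run_parallel, chunks, workers)
    assert serial_count == parallel_count  # both versions must agree

    print(f"serial   {t_serial:.3f} s")
    print(f"parallel {t_parallel:.3f} s on {workers} workers")
    print(f"speedup    {speedup(t_serial, t_parallel):.2f}")
    print(f"efficiency {parallel_efficiency(t_serial, t_parallel, workers):.2f}")
```

The measured efficiency will typically fall below 1.0 for two reasons that this sketch makes visible. Pool creation is included in the parallel timing, which is a startup cost. The equal-width ranges are deliberately naive: trial division is more expensive for larger numbers, so the last chunk takes longest, which is a small instance of skew.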

This course will address the following CSAB ABET Criterion 3 Student Outcomes. Graduates of the program will have an ability to:

  1. Analyze a complex computing problem and to apply principles of computing and other relevant disciplines to identify solutions.
  2. Design, implement, and evaluate a computing-based solution to meet a given set of computing requirements in the context of the program’s discipline.
  3. Apply computer science theory and software development fundamentals to produce computing-based solutions.

Grading

There is no specific formula for grading. Course staff look at several indexes of class performance and then factor in class participation and evidence of learning trajectory to make a final decision that is subjective, but informed by statistics. There is no curve. Each student earns a grade that reflects their individual learning and performance in the class.

In Fall 2020, the University implemented a choice for undergraduates to take courses pass/fail or for grades. Course staff will grade all assignments and give a letter grade to each student. The student’s election will not affect or change our process.

Requests for Regrades

A student should only request a regrade of an assessment if a technical error was made in grading. In this case, the student must clearly document the technical error associated with a specific problem and submit a written request for a regrade. Once an assessment grade is released, you will have two weeks to submit a regrade request.

Textbooks used in Course

The course does not follow a textbook. Lectures will refer to specific material from the following books. For the O’Reilly books, you must first access one through the library proxy. After that, the links will work.

Mattson, T. G., B. A. Sanders, and D. L. Massingill. Patterns for Parallel Programming. Addison-Wesley, 2004. This text is available online to Hopkins students at https://learning.oreilly.com/library/view/patterns-for-parallel/0321228111/

Matloff, N. Parallel Computing for Data Science. CRC Press, 2015. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.467.9918&rep=rep1&type=pdf

Herlihy, M. and N. Shavit. The Art of Multiprocessor Programming. Morgan Kaufmann, 2008. This text is available online to Hopkins students at https://learning.oreilly.com/library/view/the-art-of/9780123973375/

Midterm Exam: Take home. Distributed on October 20, 2021.

Final Exam: TBA. The final exam will likely be an untimed take-home exam.