Assignment #2: Cellular Automata in MPI

Due date: We suggest you have this completed by the end of spring break, Monday 27 March 2019, 11:59:00 pm. We will make example solutions available by Tuesday, March 27th. Please be sure to format your submission correctly as described on piazza. Watch out corrections and justifications on Piazza. Note that we are not assigning grades for projects, so submission is optional and solely for the purposes of feedback.

The goal of this assignment is to learn to use MPI for lock-step time-series simulation with geometric decomposition. This is a common paradigm for MPI programs in the sciences.

We will explore these issues in a simulation of cellular automata. Cellular automata are of wide interest. Wolfram puts forth that simple automata exhibit the complexity that underlies natural phenomena. This is tenet of the field of digital physics (

To keep it simple, we will focus on Conway’s game of life. One of the earliest and best studied 2-d cellular automata.

Game of Life

The following resources may be helpful in understanding the game of life.

An Implementation

Submit an MPI program which is a completion of the gameoflife.c code stub. We will use a Makefile to compile your code, much like this:


all: gol
	$(CC) gameoflife.c -o gameoflife
	rm -f gameoflife

Do not alter the code that prints the final output. You should print after every iteration, following the format in the code stub, or our grade scripts will not be able to parse your output.

The program should implement the following:

  • A 16×16 square grid of life that wraps around both the x and y axis, i.e squares x,0 and x,15 are adjacent as are squares 0,y and 15,y.
  • The glider pattern from should start in the upper left corner.
  • In the end, your simulation should run for 64 iterations so that the glider returns to its original location.
    • Initially, test your code using fewer iterations by specifying a lower number on the cmd line (code stub has cmd line parameter to specify # of iterations)
    • Once it works for lower iterations, try your code using the full 64 iterations
  • On every iteration, you should output the configuration of the board to standard out from process 0.
    • This means that you need to communicate the grid updates from processes 1..n-1 to process 0.
    • In practice, one would only collect the global data structure at the end of the simulation or every kth time step.
  • A data decomposition that supports 1,2,4,8, and 16 processors based on slicing the array horizontally.
    • E.g., for 4 processors, P0 gets rows 0-3, P1 gets 4-7, P2 gets 8-11, P3 gets 12-15.
  • Implement a safe, deadlock-free messaging discipline using send and receive.

Using MPI

You can install the MPI libraries on the VM or your personal computer. Recall that you can run multiple MPI processes on your machine even though you may have only 1 or 2 processing elements.


Submit answers to the following questions in a typeset PDF document with separate pages for each subquestion.

  1. Please characterize (describe your answers):

    a. The deadlock-free approach of your solution.

    b. The efficiency of your implementation using big-O notation (define all variables used and show work leading to final expressions), in terms of the (i) Number of messages passed per process for a single iteration of the game, (ii) The amount of memory used per process for a single iteration of the game.

    c. The dependencies between the decomposed parts of the problem i.e., inter-process data dependencies.

  2. Consider an alternative spatial decomposition in which we divide the grid recursively into smaller squares, e.g. an n x n grid may consist of 4 (n/2) x (n/2) squares or 16 (n/4) x (n/4) squares or 64 (n/8) x (n/8) squares. What are the relative advantages and disadvantages of this decomposition when compared with the horizontal slicing of the space in your implementation:

    a. On the number and size of MPI messages in the simulation? (Give a big-O expression for each decomposition for the size and number of messages as a function of the number of processes.)

    b. On the memory overhead at each processor? (Again, give a big-O expression for each decomposition.)

    c. For the new decomposition, describe a deadlock-free messaging discipline for transmitting the top, bottom, left, and right sides and the top left, top right, bottom left and bottom right corners among neighboring partitions. Pseudocode or a drawing will be helpful in describing the protocol.

Warning: Define all variables and show work leading to final expressions used in big-O expressions.


Submission is optional and only for feedback. If choose to do so, send this email to as an attachment, or provide us an invitation to a private github repo with your solution. Simply provide your gameoflife.c file and a solutions pdf containing your answers.