Assignment #2: Cellular Automata in MPI
Due date: We suggest you have this completed by the end of spring break, Monday 27 March 2019, 11:59:00 pm. We will make example solutions available by Tuesday, March 27th. Please be sure to format your submission correctly as described on piazza. Watch out corrections and justifications on Piazza. Note that we are not assigning grades for projects, so submission is optional and solely for the purposes of feedback.
The goal of this assignment is to learn to use MPI for lockstep timeseries simulation with geometric decomposition. This is a common paradigm for MPI programs in the sciences.
We will explore these issues in a simulation of cellular automata. Cellular automata are of wide interest. Wolfram puts forth that simple automata exhibit the complexity that underlies natural phenomena. This is tenet of the field of digital physics (http://en.wikipedia.org/wiki/Cellular_automata).
To keep it simple, we will focus on Conway’s game of life. One of the earliest and best studied 2d cellular automata.
Game of Life
The following resources may be helpful in understanding the game of life.
 http://psoup.math.wisc.edu/mcell/whatis_life.html gives a simple and intuitive description of the rules of Life and how to evaluate them on a grid. In fact, I used this example to debug my program, making sure that I implemented the rules correctly.
 http://www.shodor.org/cserd/Resources/Tutorials/MPIExamples/Life.php provides an MPI implementation of Life. I found this to be not very helpful at all, but you are welcome to use it as a reference. Were you to turn this in, you would get no points.
 http://www.bitstorm.org/gameoflife/ provides another description of the game of life. The applet there includes many common life patterns that you may use to evaluate your code. You will actually turn in some code that implements the glider pattern.
An Implementation
Submit an MPI program which is a completion of the gameoflife.c code stub. We will use a Makefile
to compile your code, much like this:
CC=mpicc
all: gol
gol:
$(CC) gameoflife.c o gameoflife
clean:
rm f gameoflife
Do not alter the code that prints the final output. You should print after every iteration, following the format in the code stub, or our grade scripts will not be able to parse your output.
The program should implement the following:
 A 16×16 square grid of life that wraps around both the x and y axis, i.e squares
x,0
andx,15
are adjacent as are squares0,y
and15,y
.  The glider pattern from http://www.bitstorm.org/gameoflife/ should start in the upper left corner.
 In the end, your simulation should run for 64 iterations so that the glider returns to its original location.
 Initially, test your code using fewer iterations by specifying a lower number on the cmd line (code stub has cmd line parameter to specify # of iterations)
 Once it works for lower iterations, try your code using the full 64 iterations
 On every iteration, you should output the configuration of the board to standard out from process 0.
 This means that you need to communicate the grid updates from processes 1..n1 to process 0.
 In practice, one would only collect the global data structure at the end of the simulation or every kth time step.
 A data decomposition that supports 1,2,4,8, and 16 processors based on slicing the array horizontally.
 E.g., for 4 processors, P0 gets rows 03, P1 gets 47, P2 gets 811, P3 gets 1215.
 Implement a safe, deadlockfree messaging discipline using send and receive.
Using MPI
You can install the MPI libraries on the VM or your personal computer. Recall that you can run multiple MPI processes on your machine even though you may have only 1 or 2 processing elements.
Questions
Submit answers to the following questions in a typeset PDF document with separate pages for each subquestion.

Please characterize (describe your answers):
a. The deadlockfree approach of your solution.
b. The efficiency of your implementation using bigO notation (define all variables used and show work leading to final expressions), in terms of the (i) Number of messages passed per process for a single iteration of the game, (ii) The amount of memory used per process for a single iteration of the game.
c. The dependencies between the decomposed parts of the problem i.e., interprocess data dependencies.

Consider an alternative spatial decomposition in which we divide the grid recursively into smaller squares, e.g. an n x n grid may consist of 4 (n/2) x (n/2) squares or 16 (n/4) x (n/4) squares or 64 (n/8) x (n/8) squares. What are the relative advantages and disadvantages of this decomposition when compared with the horizontal slicing of the space in your implementation:
a. On the number and size of MPI messages in the simulation? (Give a bigO expression for each decomposition for the size and number of messages as a function of the number of processes.)
b. On the memory overhead at each processor? (Again, give a bigO expression for each decomposition.)
c. For the new decomposition, describe a deadlockfree messaging discipline for transmitting the top, bottom, left, and right sides and the top left, top right, bottom left and bottom right corners among neighboring partitions. Pseudocode or a drawing will be helpful in describing the protocol.
Warning: Define all variables and show work leading to final expressions used in bigO expressions.
Submission
Submission is optional and only for feedback.
If choose to do so, send this email to dnb@cs.jhu.edu as an attachment, or provide us an invitation to a private github repo with your solution. Simply provide your gameoflife.c
file and a solutions pdf
containing your answers.