Assignment #1: Cellular Automata in MPI
Due date: Thursday 2 March 2017, 11:59:00 pm.
The goal of this assignment is to learn to use MPI for lockstep timeseries simulation with geometric decomposition. This is a common paradigm for MPI programs in the sciences.
We will explore these issues in a simulation of cellular automata. Cellular automata are of wide interest. Wolfram puts forth that simple automata exhibit the complexity that underlies natural phenomena. This is tenet of the field of digital physics (http://en.wikipedia.org/wiki/Cellular_automata).
To keep it simple, we will focus on Conway’s game of life. One of the earliest and best studied 2d cellular automata.
Game of Life
The following resources may be helpful in understanding the game of life.
 http://psoup.math.wisc.edu/mcell/whatis_life.html gives a simple and intuitive description of the rules of Life and how to evaluate them on a grid. In fact, I used this example to debug my program, making sure that I implemented the rules correctly.
 http://www.shodor.org/cserd/Resources/Tutorials/MPIExamples/Life.php provides an MPI implementation of Life. I found this to be not very helpful at all, but you are welcome to use it as a reference. Were you to turn this in, you would get no points.
 http://www.bitstorm.org/gameoflife/ provides another description of the game of life. The applet there includes many common life patterns that you may use to evaluate your code. You will actually turn in some code that implements the glider pattern.
An Implementation
Submit an MPI program which is a completion of the gameoflife.c code stub. We will use a Makefile
to compile your code, much like this:
CC=mpicc
all: gol
gol:
$(CC) gameoflife.c o gameoflife
clean:
rm r gameoflife
Do not alter the code that prints the final output. You should print after every iteration, following the format in the code stub, or our grade scripts will not be able to parse your output.
The program should implement the following:
 A 16×16 square grid of life that wraps around both the x and y axis, i.e squares
x,0
andx,15
are adjacent as are squares0,y
and15,y
.  The glider pattern from http://www.bitstorm.org/gameoflife/ should start in the upper left corner. Here are my data structures.
 In the end, your simulation should run for 64 iterations so that the glider returns to its original location.
 Initially, test your code using fewer iterations by specifying a lower number on the cmd line (code stub has cmd line parameter to specify # of iterations)
 Once it works for lower iterations, try your code using the full 64 iterations
 On every iteration, you should output the configuration of the board to standard out from process 0.
 This means that you need to communicate the grid updates from processes 1..n1 to process 0.
 In practice, one would only collect the global data structure at the end of the simulation or every kth time step.
 A data decomposition that supports 1,2,4,8, and 16 processors based on slicing the array horizontally.
 E.g., for 4 processors, P0 gets rows 03, P1 gets 47, P2 gets 811, P3 gets 1215.
 Implement a safe, deadlockfree messaging discipline using send and receive.
Using MPI
You can install the MPI libraries on the VM or your personal computer. Recall that you can run multiple MPI processes on your machine even though you may have only 1 or 2 processing elements.
Questions
Submit answers to the following questions in a typeset PDF document.

Please characterize (describe your answers):
a. The deadlockfree approach of your solution and its efficiency.
b. The dependencies between the decomposed parts of the problem.

Consider an alternative spatial decomposition in which we divide the grid recursively into smaller squares, e.g. an n x n grid may consist of 4 (n/2) x (n/2) squares or 16 (n/4) x (n/4) squares or 64 (n/8) x (n/8) squares. What are the relative advantages and disadvantages of this decomposition when compared with the horizontal slicing of the space in your implementation:
a. On the number and size of MPI messages in the simulation? (Give an expression for each decomposition for the size and number of messages as a function of the number of processes.)
b. On the memory overhead at each processor? (Again, give an expression for each decomposition.)
c. On the flexibility of decomposition?
d. For the new decomposition, describe a deadlockfree messaging discipline for transmitting the top, bottom, left, and right sides and the top left, top right, bottom left and bottom right corners among neighboring partitions. Pseudocode or a drawing will be helpful in describing the protocol.
Submission
Submission will be through Gradescope. Simply upload your gameoflife.c
file to the project 1 assignment.
Submit your solutions PDF file to Project 1 Solutions PDF.