## Assignment #1: Cellular Automata in MPI

Due date: Thursday 2 March 2017, 11:59:00 pm.

The goal of this assignment is to learn to use MPI for lock-step time-series simulation with geometric decomposition. This is a common paradigm for MPI programs in the sciences.

We will explore these issues in a simulation of cellular automata. Cellular automata are of wide interest. Wolfram puts forth that simple automata exhibit the complexity that underlies natural phenomena. This is tenet of the field of digital physics (http://en.wikipedia.org/wiki/Cellular_automata).

To keep it simple, we will focus on Conway’s game of life. One of the earliest and best studied 2-d cellular automata.

### Game of Life

The following resources may be helpful in understanding the game of life.

#### An Implementation

Submit an MPI program which is a completion of the gameoflife.c code stub. We will use a Makefile to compile your code, much like this:

CC=mpicc

all: gol
gol:
\$(CC) gameoflife.c -o gameoflife
clean:
rm -r gameoflife


Do not alter the code that prints the final output. You should print after every iteration, following the format in the code stub, or our grade scripts will not be able to parse your output.

The program should implement the following:

• A 16×16 square grid of life that wraps around both the x and y axis, i.e squares x,0 and x,15 are adjacent as are squares 0,y and 15,y.
• The glider pattern from http://www.bitstorm.org/gameoflife/ should start in the upper left corner. Here are my data structures.
• In the end, your simulation should run for 64 iterations so that the glider returns to its original location.
• Initially, test your code using fewer iterations by specifying a lower number on the cmd line (code stub has cmd line parameter to specify # of iterations)
• Once it works for lower iterations, try your code using the full 64 iterations
• On every iteration, you should output the configuration of the board to standard out from process 0.
• This means that you need to communicate the grid updates from processes 1..n-1 to process 0.
• In practice, one would only collect the global data structure at the end of the simulation or every kth time step.
• A data decomposition that supports 1,2,4,8, and 16 processors based on slicing the array horizontally.
• E.g., for 4 processors, P0 gets rows 0-3, P1 gets 4-7, P2 gets 8-11, P3 gets 12-15.

#### Using MPI

You can install the MPI libraries on the VM or your personal computer. Recall that you can run multiple MPI processes on your machine even though you may have only 1 or 2 processing elements.

#### Questions

Submit answers to the following questions in a typeset PDF document.

b. The dependencies between the decomposed parts of the problem.

2. Consider an alternative spatial decomposition in which we divide the grid recursively into smaller squares, e.g. an n x n grid may consist of 4 (n/2) x (n/2) squares or 16 (n/4) x (n/4) squares or 64 (n/8) x (n/8) squares. What are the relative advantages and disadvantages of this decomposition when compared with the horizontal slicing of the space in your implementation:

a. On the number and size of MPI messages in the simulation? (Give an expression for each decomposition for the size and number of messages as a function of the number of processes.)

b. On the memory overhead at each processor? (Again, give an expression for each decomposition.)

c. On the flexibility of decomposition?

d. For the new decomposition, describe a deadlock-free messaging discipline for transmitting the top, bottom, left, and right sides and the top left, top right, bottom left and bottom right corners among neighboring partitions. Pseudocode or a drawing will be helpful in describing the protocol.

#### Submission

Submission will be through Gradescope. Simply upload your gameoflife.c file to the project 1 assignment.

Submit your solutions PDF file to Project 1 Solutions PDF.