The Dining Philosophers (CS 300, Parallel and Distributed Computing (PDC)<br/><span style='font-size: 65%'>Due Friday, January 9, 2015</span>)

Laboratory exercises

Login to a cluster admin node. Create two subdirectories ~/PDC/lab3 and ~/PDC/lab3/job, then change your default directory to lab3.

Starting a synchronization server. We will use MPI to program one process as a synchronization server, which can be used to insure correct interprocess communication.

Make a copy of the file Sync.h, which defines an enumerated type and two classes. This file implements a simple client-server mechanism using MPI.

The enumerated type SyncMessageType defines two named integers QUIT and PRINT, to be used as MPI tags.
The class SyncServer consists of just one method run() that receives MPI messages, takes an action, and sends a message in reply.
The class Sync has methods for sending and receiving to a process that is running SyncServer. The constructor for Sync takes one argument, the rank of a process running SyncServer.
Note: Be sure to use the rank of the SyncServer process for this constructor call. The argument of the Sync() constructor will be the destination for messages, so if you fill in the rank of the process calling Sync(), that process will be sending and receiving from itself -- which risks deadlock!

Write an MPI application dp.cpp that tests Sync.h...

Head node (rank 0) should start SyncServer::run()...
Other nodes should create an instance of Sync with constructor argument 0, then should carry out the following steps:
1. Send a PRINT message with string "hello" to the server
2. Receive an ACK message from the server
3. Send a QUIT message (with empty string) to server
4. Receive ACK from server.
Note: Only one of the non-head nodes will get as far as sending its QUIT message before the program terminates.

Add an "echo" service to Sync.h, in which the server sends an ACK message with the same content in response to an ECHO message... Test, showing enough diagnostic output to verify that your new service works.

Change dp.cpp so the head node runs on the highest-ranked node instead of the node with rank 0...

Add a "chopstick" service to Sync.h, which allocates an array chop[] of int with one element for each process other than the server, representing whether that chopstick is available (value NONE) or is allocated by a process (value is rank of that process). Note: Use #define to define a preprocessor constant NONE to have the value -1, for code readability.

Here are some suggestions for this step.

Add two new message types, named GETCHOP and PUTCHOP, for requesting a chopstick and releasing one, resp.
If there are nprocs processes in your job, there will be nprocs-1 "philosopher" (non-server) processes, and therefore nprocs-1 chopsticks. So, your chopstick array chop[] should have size nproc-1 ints.
You can allocate your array chop[] in your main() program, and pass chop and its length to the server as arguments for run(). Or, you can allocate chop[] as a local variable in run() and calculate its length using MPI::COMM_WORLD.Get_size() MPI_Comm_size().
The "philosophers" will have rank 0 to nprocs-2, and the server will have rank nprocs-1.
You can represent that philosopher's left chopstick by chop[p] and that philosopher's right chopstick by chop[p+1], except when p = nprocs-2, in which case the right chopstick would be chop[0]. Under this strategy, a formula for the right chopstick would be chop[(p+1)%(nprocs-1)].
You can store a philosopher's rank p in an array element chop[k] when p possesses chopstick k.
Your server code for handling PUTCHOP and GETCHOP requests can determine the requesting process's rank p as status.source() status.MPI_SOURCE. The message's content can consist of a one-character string, either "R" for the Right chopstick (chop[p]) or "L" for the Left chopstick (chop[(p+1)%(nprocs-1)]).
For GETCHOP requests:
- If the requested chopstick is unassigned (value NONE), then assign p to that array location and send an empty ACK message (containing zero-length string) to the process with rank p, to indicate that the request succeeded.
- If the requested chopstick is already assigned to a process with a different rank, send an ACK message containing the string "NO" to the process with rank p.
- If the requested chopstick is already assigned to that same process with rank p, send an ACK message containing the string "ERROR" to the process with rank p, because no process should be requesting the same chopstick twice without releasing it between requests.
For PUTCHOP requests:
- If the requested chopstick is already assigned to that same process with rank p, then assign NONE to that array location and send an empty ACK message to that process with rank p, to indicate that the request succeeded.
- Otherwise, send an ACK message containing the string "ERROR" to the process with rank p, because no process should be trying to put back a chopstick that it doesn't possess.

Implement a simulation of the dining philosophers problem, with random thinking and eating time and 5 philosophers (thus 6 processes, one for the synchronization server)

The function stosleep() below will cause a process to pause execution for a given number of milliseconds (thousandths of a second). For example, the call stosleep(1500) will pause for 1.5 seconds. (This uses a system call select; system calls are discussed in an Operating Systems course.)
```
#include <sys/select.h>

/* pause execution for msec thousandths of a second, on Linux */
void stosleep(long msec) {
  struct timeval tv;
  tv.tv_sec = msec/1000;
  tv.tv_usec = 1000 * msec % 1000000;
  select(0, NULL, NULL, NULL, &tv);
}
```
You can use stosleep() to simulate your philosophers thinking and eating. For example, the call
```
stosleep(rand()%2000)
```
will cause a random delay of up to 2 sec. Be sure to call srand() with an argument that depends on rank in order to get different random delays for different philosophers.
Use a simple (if incorrect) algorithm for philosophers attempting to obtain chopsticks, such as the following.
```
repeat forever
   think
   pick up left chopstick (as soon as it's available)
   pick up right chopstick (as soon as it's available)
   eat
   return left chopstick
   return right chopstick
```
Check the ACK message for a GETCHOP request to determine when a chopstick is available: if that ACK message is empty (status.Get_length(MPI::BYTE) = 0 MPI_Get_count() delivers a non-zero value with data type MPI_BYTE), then the chopstick was obtained; a non-empty ACK message means the chopstick wasn't available (or an error). If the chopstick isn't available,
- print a message "Blocked process p" "Process p retrying for its C chopstick" (where C is the chopstick L or R being sought). (where p is the philosopher's rank), then
- repeatedly wait for a short time (stosleep(100) waits for 0.1 sec) and try again, until the chopstick becomes available, and finally
- print a message "Unblocked process p" "Process p now possesses chopstick C".
.

Try running your simulation, with output from each philosopher process labelled according to rank indicating each step of the algorithm.

Note: This algorithm never stops, so use scancel to delete the job after letting it run for a short while. You can check on progress by viewing the job's output file while the job is running.

Patterns in The Dining Philosophers.

Besides Message Passing, the chopstick service provides an example of a new pattern at the mid level.

The array chop[] illustrates the Shared Array pattern, in which multiple processes or threads can use the same array data structure safely, without danger of race conditions.

In dp.cpp, the synchronization process that runs the SyncServer code prevents any race conditions. No race conditions can occur because

only the synchronization process has direct access to the array elements, and
that synchronization process is programmed to carry out any philosopher process's request completely before proceeding to any other requests, which insures that no two requests will interfere with each other's array access through unfortunate timing.

This synchronization strategy for protecting the array chop[] implements a new low-level pattern.

The Mutual Exclusion pattern appears when no two processes or threads can access shared resources at the same time.

Here, the array chop[] is the shared resource among the philosopher processes, and the synchronization process guarantees that only one of those philosopher processes can access that array at a time. Mutual Exclusion is a low-level pattern, focused much more on hardware resources than on the features of an application, but is quite unlike the other low-level patterns we have seen (Message Passing and Collective Communication).

There is also a high level pattern at work in dp.cpp.

In Task Decomposition, a programming problem is approached by dividing the work into multiple tasks that can be carried out simultaneously.

In dp.cpp, there are two kinds of tasks: the synchronization service; and the philosopher tasks. Only one synchronization task is used, whereas multiple philosopher tasks are needed.

Our two high-level patterns, Data Decomposition and Task Decomposition, concern strategies for finding the opportunities for parallelism in a programming problem. The mid- and low-level patterns focus on implementing and executing those decompositions.

Make a copy deadlock.cpp of your dining philosophers simulation dp.cpp, and modify the copy in order to cause a deadlock.

One way to do this is to take out the random delays for eating and thinking, and instead add delays between picking up left and right chopsticks, so that all philosophers will pick up their left chopsticks and be unable to obtain their right chopsticks.

Note: Since our philosophers don't actually block when they can't get a desired chopstick, but keep retrying (say, every 0.1 sec), then our simulated "deadlock" occurs when a set of philosophers gets stuck repeatedly trying GETCHOP requests.

The program deadlock.cpp above does not truly deadlock, because philosophers do not block when a chopstick is unavailable. Revise deadlock.cpp to create a true deadlock by adding a new type of request GETCHOP_BLOCKING to your synchronization server, with the following behavior (where the process with rank p is requesting the chopstick with index c):

If the requested chopstick is unassigned (value -1), then assign p to that array location and send an empty ACK message (containing zero-length string) to the process with rank p, to indicate that the request succeeded.
If the requested chopstick is already assigned to a process with a different rank, then assign p to an array element waiting[c], and do not send an ACK message to the process with rank p at this time.
If the requested chopstick is already assigned to that same process with rank p, send an ACK message containing the string "ERROR" to the process with rank p, because no process should be requesting the same chopstick twice without releasing it between requests.

The array waiting[] should be an array of int with one element for each chopstick (same length as chop[]). Initialize all the elements of waiting[] with the constant value NONE = -1 to indicate that no process is waiting for any of the chopsticks at the beginning of the program. Note: Because each chopstick is shared by only two processes, the array waiting[] only needs one memory location per chopstick for this problem. For a more general problem, a whole queue of waiting processes might be needed per chopstick.

The PUTCHOP algorithm will now need to send ACK messages to any waiting processes when a chopstick is returned. Here is the revised algorithm for a PUTCHOP request by process q with chopstick c (additions in boldface):

If the requested chopstick number is out of range (negative or beyond the chop[] array), send an ACK message containing the string "ERROR in chopstick number".
If the requested chopstick is already assigned to that same process with rank q, then

if waiting[c] holds the rank of a process waiting on that chopstick, then assign that rank waiting[c] to the location chop[c], send an empty ACK message to that process waiting[c], and assign NONE = -1 to the array location waiting[c].

Otherwise, assign -1 to chop[c].

Then send an empty ACK message to that process with rank q, to indicate that its PUTCHOP request succeeded.
Otherwise, send an ACK message containing the string "ERROR" to the process with rank q, because no process should be trying to put back a chopstick that it doesn't possess.

Try to program a simulation of dining philosophers that has starvation. See the second algorithm in the page on race conditions for an algorithm with starvation; use stosleep() as needed to time your computation in order to produce a starvation situation.

Try to program a simulation of dining philosophers that has a race condition. See the third algorithm in the page on race conditions for an algorithm with a race condition. You can use stosleep to force "bad timing" and cause the race.

The philosopher processes need shared access to an array myturn[] for this algorithm. Add this array to the synchronization server's memory, and add new message types READ_MYTURN and WRITE_MYTURN for assigning or reading from that shared array.
- The content of a READ_MYTURN message can be an index for the desired element in the shared array.
- The content of a WRITE_MYTURN message can consist of two integers, an index and a value to assign to that location of the shared array (0 for false, 1 for true).
Implement myturn[] to satisfy the Shared Array pattern, including Mutual Exclusion, like the array chop[]. The race condition in this problem should occur in the philosopher algorithms, not in the mechanism for array access.
Note: In order to read integers from the content of a message, you can use C++'s stringstream class. If buff holds a string (such as the content of a message), then
```
   #include <sstream>
   ...
   stringstream ss(buff);
```
defines a stringstream object ss which behaves like an input stream, extracting characters from the string buff. For example, you can read an integer argument from ss using the >> operator:
```
   int val;
   ss >> val;
```
The algorithm also requires a way for processes to block themselves (sleep()) and other processes to unblock other sleeping processes (wakeup). You can implement this as a service in the synchronization server as follows:
- Add two new message types, SLEEP and WAKEUP.
- Add an array sleeping[] with one element for each philosopher process, similar to chop[]. Initialize all the elements of sleeping[] to 0 (representing False) during initialization of your SyncServer::run() method, before entering the loop for processing requests from philosophers.
- For a SLEEP request from a philosopher with rank p:
  - Assign 1 to sleeping[p]. Do not send an ACK message to philosopher process p at this time! That philosopher process will block on the MPI_Recv() call, awaiting that ACK message, thus implementing the sleep operation.
- A WAKEUP request should have one integer argument, the rank q of a philosopher process. For WAKEUP requests:
  - Obtain the integer q from the message content using stringstream, as described above. This is the usual way to convert strings to integers in C++.
  - If q is not the rank of a philosopher (i.e, is not in the range 0 to nprocs-2), send an ACK message to the philosopher process p that requested WAKEUP, with a non-empty content to indicate an error. Then, break from the WAKEUP code to avoid carrying out the rest of these steps.
    
    Note: That philosopher p should check the content of the ACK message it receives so it can detect the error it made and print an error message. Then, that philosopher p might as well call MPI_Abort() to shut down the MPI job, so you can correct the bug. (In other circumstances, it may be possible to recover from an error, but not in this situation.)
  - Assign the value 0 (representing False) to sleeping[q].
  - Send an ACK message (with empty content, indicating no error) to the philosopher process p that requested WAKEUP.
  - Then, send an ACK message (with empty content) to the philosopher process q being woken up.
Note: The order of those last two ACK messages is very important for correctness: If philosopher q becomes unblocked before philosopher p does, then it's conceivable that some call wakeup(p) will cause a lost wakeup between the those two ACK messages that wouldn't have been lost if the ACK for q had correctly occured before the ACK for p.

Correctness with low-level patterns.

In general, it is more difficult to program correctly with low-level patterns than with higher-level patterns. The ordering of ACK messages above illustrates why this is true: Two ACK messages need to take place at about the same time, and simply executing them in the wrong order leads to a race condition. This could also be described as an error in the Mutual Exclusion pattern: When those ACK messages execute in the wrong order, two different processes could have access to the shared resource sleeping[] at the same time, leaving open a possible race condition logical error.

In contrast, the mid-level pattern Shared Array includes correct "thread-safe" concurrency, so programs using an implementation of Shared Array need not include precautions to avoid race conditions from multiple processes accessing a single array element at about the same time. (Students who have taken Hardware Design might recognize the same kind of comparison between programming a loop or an if construct in a high level language vs. in assembly language: a programmer must pay careful attention to instruction order, branch/jump conditions, destinations of branches, etc., at th low level of assembly programming.)

Our quick implementation of sleep() and wakeup() using Message Passing illustrates the power and flexibility of that pattern. But a programmer must give extra attention to program correctness with a pattern at such a low level as Message Passing.

Deliverables

Submit your work using git on the cluster head node you used. Include the cluster name in your commit message, e.g.,

    mist$  cd ~/PDC
    mist$  git pull origin master
    mist$  git add -A
    mist$  git commit -m "Lab 3 submit (mist)"
    mist$  git push origin master

Also, fill out this form to report on your work.

See lab1 Deliverables section if you need to set up the cluster you used for git.

If you did your work on two different clusters, submit work from both of them using one of the strategies in lab1 Deliverables.

This lab is due by Friday, January 9, 2015.

Preliminary material

Laboratory exercises

Deliverables