CS 499 Homework 3: Unix IPC and Shared Memory Machine Basics
- Due: Sunday 4/3/2016 by 11:59 pm
- Approximately 10% of total grade
- Submit to Blackboard zip of code with Word/PDF report
- You may work in groups of 2 and submit one assignment per group.
- Code Distribution: distrib-hw3.zip
CHANGELOG: Empty
1 Overview
This assignment is divided into 3 parts to cover
- Unix System V IPC
- Basic Shared Memory Architecture Theory
- OpenMP Basics
Each problem has a written component which you should complete in either a Word or PDF document and turn in along with any code you submit.
2 (40%) Problem 1: IPC Heat
Convert the heat.c program to use System V Interprocess
Communication (IPC) calls to parallelize it. This program should be
familiar to you from the prior programming assignments.
2.1 Implementation Notes
The restrictions on your implementation are as follows.
- Name your program ipc_heat.c and adhere to the following usage
  pattern, which takes a final argument giving the number of
  processes to use in the computation.

    > ipc_heat
    usage: ipc_heat max_time width print #PROCS
      max_time: int
      width: int
      print: 1 print output, 0 no printing
      #PROCS: int, number of processes to use

    > ipc_heat 5 8 1 2
       |    0    1    2    3    4    5    6    7
    ---+-------------------------------------------------
      0| 20.0 50.0 50.0 50.0 50.0 50.0 50.0 10.0
      1| 20.0 35.0 50.0 50.0 50.0 50.0 30.0 10.0
      2| 20.0 35.0 42.5 50.0 50.0 40.0 30.0 10.0
      3| 20.0 31.2 42.5 46.2 45.0 40.0 25.0 10.0
      4| 20.0 31.2 38.8 43.8 43.1 35.0 25.0 10.0
- As was the case for the MPI version of heat, ipc_heat only needs
  to work when the number of processes used evenly divides the number
  of columns in the heat matrix.
- Use fork() to spawn child processes to parallelize the program.
  Spawn a number of children equal to the last command line argument
  given. Take care that children do not spawn additional child
  processes, which is a common mistake. A sketch of this pattern
  appears after this list.
- Use a block of shared memory for the Heat Matrix that processes
  will access. You will likely want to allocate and attach this
  shared memory prior to spawning children so that they can all
  access it. Keep in mind that calls like shmget() can only allocate
  1D blocks of memory, so you may wish to set up an array of pointers
  into it to simulate a 2D-like array using a loop like the
  following.

    // allocate shared memory with shmget()
    // attach memory to shm_ptr using shmat()
    // Below code makes mat look like a 2D array
    mat = malloc(rows * sizeof(double *));
    for(i=0; i<rows; i++){
      mat[i] = shm_ptr + cols*i;
    }
- Use either semaphores or message queues to coordinate updates to the Heat Matrix between processes.
- If you employ message queues, keep in mind that you will definitely want to use the "tag" mechanism, which allows messages to be put into a queue with an integer associated with them that may correspond to the recipient. I found it useful to have two message queues, one for messages to the right and one for messages to the left.
- If you use semaphores, you may wish to use an array of semaphores or several arrays for left/right boundary values.
- At the end of the computation, a single process should print out the entire Heat Matrix, as was the case in the MPI version. This will be easiest if a block of shared memory is utilized rather than having each process privately allocate its own section of the heat matrix.
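To make the fork()/shared-memory pattern above concrete, here is a
minimal sketch of the overall skeleton. It is not the required
design: the sizes, the 0600 permissions, and the do_child_work()
helper are illustrative placeholders, and the per-time-step
coordination you must add is omitted.

    // sketch: parent allocates shared memory, forks workers, waits, cleans up
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <sys/wait.h>

    #define ROWS 4                  // illustrative sizes only
    #define COLS 8
    #define NPROCS 2

    void do_child_work(int id, double **mat){  // placeholder for heat updates
      for(int j = id*(COLS/NPROCS); j < (id+1)*(COLS/NPROCS); j++){
        mat[0][j] = id;             // touch only this child's columns
      }
    }

    int main(void){
      // one 1D shared block big enough for the whole matrix
      int shm_id = shmget(IPC_PRIVATE, ROWS*COLS*sizeof(double), 0600);
      double *shm_ptr = shmat(shm_id, NULL, 0);

      // pointer array built before fork() so children inherit it; the
      // pointers stay valid because the attachment is inherited too
      double **mat = malloc(ROWS * sizeof(double *));
      for(int i=0; i<ROWS; i++){
        mat[i] = shm_ptr + COLS*i;
      }

      for(int p=0; p<NPROCS; p++){  // only the parent ever calls fork()
        pid_t pid = fork();
        if(pid == 0){               // child: do work, then exit so it
          do_child_work(p, mat);    // never falls through and forks again
          exit(0);
        }
      }

      for(int p=0; p<NPROCS; p++){  // parent waits for all children
        wait(NULL);
      }

      for(int i=0; i<ROWS; i++){    // a single process prints the result
        for(int j=0; j<COLS; j++){ printf("%4.1f ", mat[i][j]); }
        printf("\n");
      }

      shmdt(shm_ptr);               // detach and remove the segment
      shmctl(shm_id, IPC_RMID, NULL);
      return 0;
    }

For the coordination itself, semget()/semop() provide (arrays of)
semaphores, while msgget()/msgsnd()/msgrcv() provide message queues.
The "tag" mechanism mentioned above is the mtype field of the message
struct; a hedged sketch follows, where boundary_msg and its payload
size are again illustrative, not prescribed.

    // sketch: using mtype as a recipient tag in a System V message queue
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    typedef struct {
      long mtype;                   // the "tag"; receivers select on this
      double col[4];                // illustrative payload: a boundary column
    } boundary_msg;

    // sender: tag the message with the recipient's index + 1
    // (mtype must be positive, so index 0 maps to tag 1)
    void send_boundary(int qid, int recipient, boundary_msg *m){
      m->mtype = recipient + 1;
      msgsnd(qid, m, sizeof(m->col), 0);
    }

    // receiver: ask the queue only for messages tagged for me
    void recv_boundary(int qid, int me, boundary_msg *m){
      msgrcv(qid, m, sizeof(m->col), me + 1, 0);
    }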
2.2 Timing Script
Use the provided script time-heat.sh
to produce a table of times for
various sizes of heat matrices and numbers of processors. You should
run your code on zeus.vse.gmu.edu
as it has 4 physical processors
(8 counting hardware "hyperthreading") to demonstrate any potential
speedups.
    > make
    gcc -g -o heat heat.c
    gcc -g -o ipc_heat ipc_heat.c

    > time-heat.sh
     rows  cols  p  time
     1000  1000  1  0.02
     1000  1000  2  0.01
     1000  1000  4  0.01
     ...
    20000 10000  4  0.84
    20000 10000  8  0.98
2.3 What to Turn In
Submit both your code ipc_heat.c and a Word/PDF document which
discusses the following issues regarding this program (and contains
answers to the remaining problems).
- Describe the overall design of your ipc_heat program. Discuss how
  you spawned processes and how many total processes you utilized for
  the program.
- Include discussion of the System V IPC mechanisms you used to
  coordinate cooperating processes to finish the heat calculation.
- Include a table generated using the time-heat.sh script provided,
  as run on zeus.vse.gmu.edu.
- Discuss whether the timings from zeus indicate it is possible to
  speed up the heat computation using IPC, or whether it suffers from
  the same communication overhead as the MPI version, negating the
  value of multiple cooperating processes.
3 (30%) Problem 2: Basics of Shared Memory Architecture
Answer the following questions from the textbook in your Homework Write-up.
3.1 Grama 2.7
What are the major differences between message-passing and shared-address-space computers? Also outline the advantages and disadvantages of the two.
3.2 Grama 2.8
Why is it difficult to construct a true shared-memory computer? What
is the minimum number of switches for connecting P processors to a
shared memory with M words (where each word can be accessed
independently)?
3.3 Grama 2.9
Of the four PRAM models (EREW, CREW, ERCW, and CRCW), which model is the most powerful? Why? Define what you mean by "most powerful" here.
4 (30%) Problem 3: OpenMP Patternlets
Courtesy of the fine folks at CS in Parallel, the code distribution
for this HW has a directory called OpenMP-patternlets which contains
a series of codes with which to experiment. Each code contains
instructions on the intended steps to take to learn from it, mostly
involving commenting or uncommenting lines and observing outputs.
Answer the following questions based on your experience with this code. You may need to do some additional research to answer some of these questions, perhaps examining the required OpenMP Tutorial reading. Put your answers in your Homework Write-up.
- The #pragma omp parallel directive launches multiple threads to
  perform a computation. How many threads are used by default? How
  does one adjust the number of threads launched? There are at least
  two ways to change the number of threads, which you should
  describe. (A brief syntax refresher appears after this list.)
- How do threads in OpenMP obtain a unique identifier and determine
  the total number of threads being used?
- OpenMP provides several easy ways to distribute multiple loop
  iterations over cooperating threads. Describe these and make sure
  to discuss the differences in how loop iterations are distributed
  to the different threads.
- When multiple threads are altering the same shared variable in a
  parallel loop, the integrity of the variable's value can be
  compromised. Describe some ways that this can be avoided for loops
  that sum or count. Give a few code examples drawn from the
  exercises.
- Describe OpenMP's notion of a private variable. Demonstrate its
  use and the effects of making a variable private during a parallel
  loop.
- Several exercises deal with the critical and atomic section
  facilities. Do some research and report on the difference between
  these two. Describe their relative strengths and limitations.
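As a syntax refresher before you begin, here is a minimal sketch of
the basic parallel region, assuming gcc with the -fopenmp flag; the
request for 4 threads is an illustrative choice, and
omp_set_num_threads() is only one of the mechanisms the first
question above asks about.

    // sketch: basic parallel region with per-thread id and team size
    #include <stdio.h>
    #include <omp.h>

    int main(void){
      omp_set_num_threads(4);       // one way to request a thread count
      #pragma omp parallel          // fork a team of threads here
      {
        int id  = omp_get_thread_num();   // this thread's unique id
        int num = omp_get_num_threads();  // total threads in the team
        printf("hello from thread %d of %d\n", id, num);
      }                             // implicit barrier and join here
      return 0;
    }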