High Performance Scientific Computing   AMath 483/583 Class Notes   Spring Quarter, 2011

Homework 4

# Homework 5¶

Note

Due date modified!

Due Friday, May 13, 2011, by 11:00pm PDT, by pushing to your bitbucket repository.

The goals of this homework are to

• gain experience with recursive subroutines and write a Fortran program from scratch.
• gain some more practice with OpenMP and coarse-grain parallelism.

Make sure you update your clone of the class repository before doing this assignment:

```\$ cd \$CLASSHG
\$ hg pull -u     # pulls and updates```

You should also create a directory \$MYHG/homeworks/homework5 to work in since this is where your modified versions of the files will eventually need to be.

Within in this directory you should have two subdirectories parta and partb for Part A and B below.

## Assignment:¶

### Part A¶

2. Write a recursive subroutine movedisks in a module movedisks_mod.f90 and a main program testtowers.f90 that solves this problem for any number of disks n and prints out the sequence of moves required to move n disks from rod 1 to rod 3. This program should give the following output, for example, where ‘disk 1` is the smallest disk and disk n is the largest one:

```\$ make test
./testtowers.exe
Input n
1
Move disk   1 from rod 1 to rod 3

\$ make test
./testtowers.exe
Input n
2
Move disk   1 from rod 1 to rod 2
Move disk   2 from rod 1 to rod 3
Move disk   1 from rod 2 to rod 3

\$ make test
./testtowers.exe
Input n
3
Move disk   1 from rod 1 to rod 3
Move disk   2 from rod 1 to rod 2
Move disk   1 from rod 3 to rod 2
Move disk   3 from rod 1 to rod 3
Move disk   1 from rod 2 to rod 1
Move disk   2 from rod 2 to rod 3
Move disk   1 from rod 1 to rod 3```

Hints:

• Your recursive subroutine should be of the form

```recursive subroutine movedisks(n1, n2, rod_start, rod_end)
implicit none
integer, intent(in) :: n1, n2, rod_start, rod_end
integer :: rod_other```

and should print out the moves required to move disks n1 through n2 from rod_start to rod_end.

• If there is only one disk to move (n1==n2) then it can print the required move.

• If there are more disks to move then it should

• move all disks but the bottom one from rod_start to rod_other (by recursively calling movedisks)
• print a line saying to move the bottom disk from rod_start to rod_end
• move all disks but the bottom one from rod_other to rod_end (by recursively calling movedisks)
• rod_other = 6 - (rod_start + rod_end) is the other rod from the three available rods.

### Part B¶

Modify your code from Homework 4 Part B so that it uses OpenMP with coarse-grain parallelism. The module quadrature_mod should be modified in the following ways...

1. You can delete the trapezoid subroutine and modify only simpson.

2. nthreads should still be a module variable that can be set in the main program. The subroutine simpson should decide how to split up the n intervals between threads and should include loops on i based on thread-specific values istart and iend to sum up the values needed to compute the Simpson approximation to the integral. Each thread should compute its own private int_thread based on its portion of the interval [a,b] and then add int_thread into the shared value integral.

See \$CLASSHG/openmp/pisum2.f90 for some ideas on how to convert your previous code into this form.

3. Have each thread compute a private a_thread and b_thread that are the endpoints of the interval over which this thread is approximating the integral, and print these values out as in the sample output below.

4. Eventually test it with k=10000 and n = 50000000 (the same problem as in Homework 4 Part b.

In this case running the code should give something like:

```Using OpenMP with   2 threads
Thread  0 will take i =         1 through i =  25000000
Thread  1 will take i =  25000001 through i =  50000000

Simpson:        n =  50000000
approx =  0.9819716972381E+01
true =    0.9819716972381E+01
error =  -0.465E-12
g was evaluated  50000001 times by thread 0
g was evaluated  50000001 times by thread 1
6.75 real        12.65 user         0.04 sys```

The first part is printed out by subroutine simpson, the second part by the main program in testquad.f90.

5. Your module quadrature_mod should also work if we provide a different main program and/or functions module for grading purposes.

Make sure your directory \$MYHG/homeworks/homework5 contains the following subdirectories and files:

• parta
• testtowers.f90
• movedisks_mod.f90
• Makefile
• partb