UW AMath High Performance Scientific Computing
AMath 483/583 Class Notes
Spring Quarter, 2011

Table Of Contents

Previous topic

Homework 4

Next topic

Downloading and installing software for this class

This Page

Homework 5


Due date modified!

Due Friday, May 13, 2011, by 11:00pm PDT, by pushing to your bitbucket repository.

The goals of this homework are to

  • gain experience with recursive subroutines and write a Fortran program from scratch.
  • gain some more practice with OpenMP and coarse-grain parallelism.

Make sure you update your clone of the class repository before doing this assignment:

$ hg pull -u     # pulls and updates

You should also create a directory $MYHG/homeworks/homework5 to work in since this is where your modified versions of the files will eventually need to be.

Within in this directory you should have two subdirectories parta and partb for Part A and B below.


Part A

  1. Read the Wikipedia page http://en.wikipedia.org/wiki/Tower_of_Hanoi about the Towers of Hanoi game.

  2. Write a recursive subroutine movedisks in a module movedisks_mod.f90 and a main program testtowers.f90 that solves this problem for any number of disks n and prints out the sequence of moves required to move n disks from rod 1 to rod 3. This program should give the following output, for example, where ‘disk 1` is the smallest disk and disk n is the largest one:

    $ make test
     Input n
    Move disk   1 from rod 1 to rod 3
    $ make test
     Input n
    Move disk   1 from rod 1 to rod 2
    Move disk   2 from rod 1 to rod 3
    Move disk   1 from rod 2 to rod 3
    $ make test
     Input n
    Move disk   1 from rod 1 to rod 3
    Move disk   2 from rod 1 to rod 2
    Move disk   1 from rod 3 to rod 2
    Move disk   3 from rod 1 to rod 3
    Move disk   1 from rod 2 to rod 1
    Move disk   2 from rod 2 to rod 3
    Move disk   1 from rod 1 to rod 3


  • Read the description of the recursive algorithm on the Wikipedia page.

  • Your recursive subroutine should be of the form

    recursive subroutine movedisks(n1, n2, rod_start, rod_end)
        implicit none
        integer, intent(in) :: n1, n2, rod_start, rod_end
        integer :: rod_other

    and should print out the moves required to move disks n1 through n2 from rod_start to rod_end.

  • If there is only one disk to move (n1==n2) then it can print the required move.

  • If there are more disks to move then it should

  • move all disks but the bottom one from rod_start to rod_other (by recursively calling movedisks)
  • print a line saying to move the bottom disk from rod_start to rod_end
  • move all disks but the bottom one from rod_other to rod_end (by recursively calling movedisks)
  • rod_other = 6 - (rod_start + rod_end) is the other rod from the three available rods.

Part B

Modify your code from Homework 4 Part B so that it uses OpenMP with coarse-grain parallelism. The module quadrature_mod should be modified in the following ways...

  1. You can delete the trapezoid subroutine and modify only simpson.

  2. nthreads should still be a module variable that can be set in the main program. The subroutine simpson should decide how to split up the n intervals between threads and should include loops on i based on thread-specific values istart and iend to sum up the values needed to compute the Simpson approximation to the integral. Each thread should compute its own private int_thread based on its portion of the interval [a,b] and then add int_thread into the shared value integral.

    See $CLASSHG/openmp/pisum2.f90 for some ideas on how to convert your previous code into this form.

  3. Have each thread compute a private a_thread and b_thread that are the endpoints of the interval over which this thread is approximating the integral, and print these values out as in the sample output below.

  4. Eventually test it with k=10000 and n = 50000000 (the same problem as in Homework 4 Part b.

    In this case running the code should give something like:

    Using OpenMP with   2 threads
     points_per_thread =     25000000
    Thread  0 will take i =         1 through i =  25000000
       Corresponding to a_thread = -0.100000E+02 and b_thread = -0.350000E+01
    Thread  1 will take i =  25000001 through i =  50000000
       Corresponding to a_thread = -0.350000E+01 and b_thread =  0.300000E+01
    Simpson:        n =  50000000
       approx =  0.9819716972381E+01
       true =    0.9819716972381E+01
       error =  -0.465E-12
       g was evaluated  50000001 times by thread 0
       g was evaluated  50000001 times by thread 1
            6.75 real        12.65 user         0.04 sys

    The first part is printed out by subroutine simpson, the second part by the main program in testquad.f90.

  5. Your module quadrature_mod should also work if we provide a different main program and/or functions module for grading purposes.

Make sure your directory $MYHG/homeworks/homework5 contains the following subdirectories and files:

  • parta
  • testtowers.f90
  • movedisks_mod.f90
  • Makefile
  • partb
  • testquad.f90
  • quadrature_mod.f90
  • functions_mod.f90
  • Makefile

When you are done, don’t forget to use the hg add, hg commit, and hg push commands to push these to your bitbucket repository for grading.