Running MPI on the ITE Cluster
(Updated May 19, 2008)

Please read the entire FAQ.



    Q: What is the address of the cluster?
    A: There are 6 nodes in the cluster. You can log on to any one of
       them from an on-campus machine. Their addresses are:
           mpi-node0.ite.gmu.edu
           mpi-node1.ite.gmu.edu
           ...
           mpi-node5.ite.gmu.edu

    Q: How can I log on to the cluster?
    A: Use ssh to log on to any of these machines with your GMU login
       name and password.

    Q: How can I copy files to and from the cluster?
    A: Use scp or sftp. These are the only supported methods.
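
       As a rough sketch of the scp case, the two helpers below copy a
       file to and from your home directory on one node. The names
       push_file, pull_file, NODE and SCP are made up for this example
       and are not part of the cluster setup. Setting SCP=echo previews
       the command instead of running it.

```shell
# Hypothetical helpers; NODE defaults to the first cluster node.
NODE=${NODE:-mpi-node0.ite.gmu.edu}

# Copy a local file into your home directory on the cluster.
push_file() {
    ${SCP:-scp} "$1" "$NODE:"
}

# Copy a file from your cluster home directory into the current directory.
pull_file() {
    ${SCP:-scp} "$NODE:$1" .
}
```

       For example, "push_file myprog.c" uploads a source file. scp
       assumes your login name is the same on both machines; if it is
       not, put user@ in front of the host name.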

    Q: I can log on from a lab at the university but I can't from home or
       from wireless on campus.
    A: For security reasons, direct access to the cluster is permitted
       only from the on-campus wired network. To log on from off campus
       or from a laptop on the campus wireless network, first log on to
       mason.gmu.edu (or any other on-campus server) and then ssh to
       the cluster from there.
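
       The two-step login can be folded into one command. This is only
       a sketch: cluster_ssh and the SSH variable are hypothetical names
       used for illustration, and it assumes your login name is the same
       on your own machine, on mason.gmu.edu and on the cluster.

```shell
# Hypothetical helper: hop through mason.gmu.edu to reach a cluster node.
# Set SSH=echo to preview the command instead of running it.
cluster_ssh() {
    node=${1:-mpi-node0.ite.gmu.edu}
    # -t allocates a terminal on the gateway so the second ssh is interactive.
    ${SSH:-ssh} -t mason.gmu.edu ssh "$node"
}
```

       Calling "cluster_ssh mpi-node1.ite.gmu.edu" will normally prompt
       for your password twice, once for each hop.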

    Q: How do I start LAM on a single machine?
    A: lamboot -v

    Q: How do I start LAM on multiple machines?
    A: Create a file that lists the short names of the machines you want to use
       (example below) and pass it to lamboot:

        -bash-3.00$ cat hostfile
        node0
        node1
        node2
        node3
        node4
        #node5   (this machine is commented out)
        -bash-3.00$ lamboot -v hostfile  
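
       If you change the number of nodes often, the host file can be
       generated rather than edited by hand. A minimal sketch, assuming
       the short names always follow the nodeN pattern shown above;
       make_hostfile is a hypothetical helper, not a LAM command.

```shell
# Write a host file listing node0 .. node(COUNT-1), one short name per line.
# Usage: make_hostfile COUNT [FILE]   (FILE defaults to "hostfile")
make_hostfile() {
    count=$1
    out=${2:-hostfile}
    : > "$out"                      # create or truncate the file
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "node$i" >> "$out"
        i=$((i + 1))
    done
}

make_hostfile 5 hostfile            # node0 .. node4, as in the example above
cat hostfile
```

       The resulting file is then passed to lamboot exactly as before
       (lamboot -v hostfile).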

    Q: When LAM starts I have to enter my password for each node, and even
       then lamboot fails on the remote nodes. What should I do?
    A: Use public key authentication so that you can log on without
       typing your password every time.
       First, generate a public/private key pair:
       First, generate a public/private key pair:

          -bash-3.00$ ssh-keygen -t dsa
          Generating public/private dsa key pair.
          Enter file in which to save the key (/home/jradziko/.ssh/id_dsa): (Hit RETURN)
          Enter passphrase: (Hit RETURN)
          Enter same passphrase again: (Hit RETURN)
          Your identification has been saved in /home/jradziko/.ssh/id_dsa.
          Your public key has been saved in /home/jradziko/.ssh/id_dsa.pub.
          The key fingerprint is:
          15:e7:cc:34:fa:d5:88:7c:69:ab:e5:23:9a:19:d2:18 jradziko@mpi-node1.ite.gmu.edu
          -bash-3.00$

       Leave the passphrase blank so that lamboot can log on to the other
       nodes without prompting. Your public key is in ~/.ssh/id_dsa.pub
       and your private key is in ~/.ssh/id_dsa.

       Add your public key to authorized keys:

          -bash-3.00$ cd ~/.ssh
          -bash-3.00$ cat id_dsa.pub >> authorized_keys
          -bash-3.00$ chmod a+r authorized_keys

       Then log on to each node once with ssh so that the node is added to
       your list of known hosts.

       You need to follow the above steps only once.
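
       The known-hosts step can be done in one loop instead of six
       separate logins. A sketch only: visit_nodes and the SSH variable
       are hypothetical names, and it assumes you connect with the same
       short names you list in your host file.

```shell
# Hypothetical helper: connect to each named host once.
# Set SSH=echo to preview the commands instead of running them.
visit_nodes() {
    for host in "$@"; do
        # Running a trivial remote command is enough to trigger the
        # host-key prompt the first time you connect.
        ${SSH:-ssh} "$host" hostname || echo "could not reach $host" >&2
    done
}

# Usage: visit_nodes node0 node1 node2 node3 node4 node5
```

       Answer "yes" each time ssh asks whether to continue connecting.
       If the keys are set up correctly, no password prompt appears.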

    Q: What are the different ways to compile an MPI program?
    A: Type "man lamcc" to get the answer.

    Q: How do I run an MPI C program after I've started LAM?
    A: Compile your program with mpicc.
       Use mpirun -np N to start N copies of your program.
       Halt the LAM daemon with lamhalt before you try to log out!

          -bash-3.00$ mpicc -o myprog myprog.c
          -bash-3.00$ mpirun -np 4 myprog
          -bash-3.00$ lamhalt -v
          -bash-3.00$ 
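
       Because forgetting lamhalt is easy, the whole cycle can be wrapped
       in one function that always halts LAM at the end. This is a sketch
       under assumptions: run, run_job and DRY_RUN are hypothetical names,
       and only the commands and flags already shown in this FAQ are used.

```shell
# With DRY_RUN set, print each command instead of executing it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: run_job HOSTFILE NPROCS PROGRAM   (expects PROGRAM.c to exist)
run_job() {
    hostfile=$1
    np=$2
    prog=$3
    run lamboot -v "$hostfile" || return 1
    # Only start the program if it compiled.
    run mpicc -o "$prog" "$prog.c" && run mpirun -np "$np" "$prog"
    status=$?
    run lamhalt -v      # always halt LAM, even if the compile or run failed
    return "$status"
}
```

       For example, "DRY_RUN=1 run_job hostfile 4 myprog" lists the four
       commands without running them; drop DRY_RUN=1 to run them for real.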

    Q: My program didn't finish correctly. What do I do?
    A: Use lamclean to remove processes and messages from all nodes before
       starting the program again.

          -bash-3.00$ lamclean -v

    Q: When I try to log out, the system hangs.
    A: Make sure you've halted LAM (lamhalt -v) before logging out.