DMFT-MPI Home Page
This page provides information about current and former releases, available
platforms, required libraries, and current benchmarks for the MPI version of
the DMFT-QMC program (dynamical mean-field theory, solved by quantum Monte
Carlo). At present, the code itself is not public; therefore all links to the
code are local.
Authors
serial version:
Martin Ulmke (original author)
Karsten Held (general AB phases, reorganisation)
Nils Blümer (F-term)
Jan Schlipf (t-t' hopping, reorganisation)
Joachim Wahle (reorganisation)
parallel version:
Nils Blümer
Features
... to be filled ...
Release notes
Current version is 1.0 (17 Sep 1999): DMFT-MPI.uum, DMFT-MPI.F, libDMFT-MPI.F (all links local)
General remarks about the distribution method
The program is distributed as a single file, which is a compressed,
uuencoded, self-executing tar file. After downloading, the file has to be
made executable (chmod +x filename) and executed (./filename). This unpacks
all necessary files into the current directory and starts compilation for the
detected platform. Up to comments, the header of the distribution file looks
like this:
#!/bin/csh -f
uudecode $0
chmod 644 DMFT-MPI.tar.gz
gunzip -c DMFT-MPI.tar.gz | tar -xvf -
rm $0 DMFT-MPI.tar.gz
./Configure > options.arch
make
exit
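In practice, obtaining and building the program thus reduces to the following
steps (shown for the current release file DMFT-MPI.uum; how the file is
downloaded is up to you):
chmod +x DMFT-MPI.uum
./DMFT-MPI.uum
After this, the executable DMFT-MPI should be found in the current directory,
together with options.arch as written by Configure.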
The script named Configure holds the information relevant to compiling and
linking Fortran programs on all platforms on which the program has been
tested. It is intended to be universal in the sense that it can also be used
for other projects. The variables defined in Configure are used in the
Makefile, which is likewise easy to generalize.
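As a rough illustration only (not taken from the distribution; the variable
names FC, FFLAGS and LDFLAGS are assumptions), a Makefile built on top of
Configure could look like this:
include options.arch          # written by ./Configure > options.arch
DMFT-MPI: DMFT-MPI.F libDMFT-MPI.F
	$(FC) $(FFLAGS) -o DMFT-MPI DMFT-MPI.F libDMFT-MPI.F $(LDFLAGS)
(Note that the command line of a make rule must be indented with a tab.)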
Available platforms
The program has been tested on the IBM RS/6000, IBM RS/6000 SP, CRAY T90,
CRAY T3E/1200, and Siemens/Fujitsu VPP 700 (see the platform-specific run
commands under Usage below).
Compilation
The program contains several compiler macro variables (usually set by the
Makefile in the form -D<VARNAME> or -D<VARNAME>=<VALUE>):
- MPI: if set, the Message Passing Interface (MPI) standard is used to build a parallel program; otherwise a serial version is built.
- SPRNG: if set, the Scalable Parallel Pseudo-Random Number Generator (SPRNG) library is used for random number generation (various algorithms are available; default: 48-bit Linear Congruential Generator); otherwise the Lewis-Goodman-Miller algorithm (from Numerical Recipes) is used. At present, parallel execution requires SPRNG.
- LMAX: maximum number of time slices; if not set (or set to 0), a default of 129 is used in the program. The best way to set it is to compile with make SIZE=<size>, as shown in the sketch below.
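For illustration (a sketch only; the actual compiler name and options are
provided by Configure and differ between platforms, so the expanded line
below is an assumption), building with a maximum of 200 time slices looks
like
make SIZE=200
# the Makefile then passes -DLMAX=200 (together with -DMPI and -DSPRNG, if
# selected) to the Fortran preprocessor; the resulting compile line might
# resemble: f77 -O -DMPI -DSPRNG -DLMAX=200 -c DMFT-MPI.F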
Usage
The program expects input on stdin and writes results to stdout, to several
files with names of the form <run-prefix>.<type-suffix>, and to input.tmp;
the run-prefix is set from the input on stdin. If an old self energy is to be
used as initialization, it is expected in the file <run-prefix>.self.in (see
the sketch below). An example of valid input (run-prefix=MPI_L050) is here
(local link).
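For example (a sketch; old.self stands for any previously stored, converged
self energy and is a hypothetical file name), a run with run-prefix MPI_L050
is initialized by
cp old.self MPI_L050.self.in
before the program is started as described next.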
The serial version of the program (compiled without MPI) is run interactively
as follows (e.g., on our RS/6000 cluster):
DMFT-MPI < <run-prefix>.in > <run-prefix>.out
On the individual platforms the code is run as follows (provided the
executable DMFT-MPI is in the path); <numproc> is the number of processors
requested. Batch-files are linked for each platform; a rough sketch of such a
file is given after the list:
- IBM RS/6000 (serial):
DMFT-MPI < <run-prefix>.in > <run-prefix>.out
(batch-file)
- IBM RS/6000 SP:
poeauth -procs 4 -hfile /ptmp/nilsb/host.list && poe /ptmp/nilsb/Test/DMFT-MPI_SP_050 < MPI_L050.in > MPI_L050.out -procs 4 -hfile /ptmp/nilsb/host.list
(batch-file)
- CRAY T90 (serial, no interactive usage):
(batch-file)
- CRAY T3E/1200:
mpprun -n <numproc> DMFT-MPI < <run-prefix>.in > <run-prefix>.out
(batch-file)
- Siemens/Fujitsu VPP 700:
JOBEXEC -em simplex -vp <numproc> -mem 256 /b/b5101bn/DMFT-MPI/DMFT-MPI_VPP_116 < /b/b5101bn/DMFT-MPI/Benchmarks-VPP/MPI_L116.in > /b/b5101bn/DMFT-MPI/Benchmarks-VPP/MPI_L116.out
(batch-file)
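The actual batch-files linked above are local. As a rough sketch of what such
a file might contain, e.g. for the IBM RS/6000 SP under LoadLeveler (the task
counts, file names and the omitted class/network settings are assumptions and
depend on the local installation):
#!/bin/csh
#@ job_type       = parallel
#@ node           = 1
#@ tasks_per_node = 4
#@ output         = MPI_L050.out
#@ error          = MPI_L050.err
#@ queue
poe /ptmp/nilsb/Test/DMFT-MPI_SP_050 < MPI_L050.in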
Benchmarks

(Benchmark figure; an enlarged version and a table of the data are available
as local links.)
For optimal memory management, the compiler variable LMAX has been chosen
equal to the problem sizes (numbers of time slices) L=50, 80, 116, 200.
Converged self energies were used as input; the fact that only small
variations are observed (as expected when starting from a converged solution)
also validates the code. More information about the input is given here
(local link).
Available computing resources
- IBM RS/6000, local cluster: numerous machines (some Power2, some Power3) belonging to the dep. of physics; see LoadLeveler (command: xloadl) for the current list. Reserved for us: qmc (Power2, 512 MB main memory).
- IBM RS/6000 SP, local: all 14 nodes for the dep. of physics; 4 of them for interactive use (news group, current configuration: local links).
- CRAY T3E/1200, NIC-ZAM Juelich: 512 nodes, 10500 KE/month (3000 CPU-hours) for our group (hac04).
- CRAY T90, NIC-ZAM Juelich: 10 nodes (originally 16), 5750 KE/month (50 CPU-hours at normal priority) for our group (hac04).
- CRAY T90, LRZ München: 4 nodes, 6000 CPU-s/day for the dep. of physics.
- Siemens/Fujitsu VPP 700, LRZ München: 54 nodes, 139 CPU-h/day for the dep. of physics.
Nils Blümer. Last changed: 01-Oct-13 11:26