nils-uni@bluemer.name

Prof. Dr. Nils Blümer

Note: much of the information on these pages is out of date! For current information about Nils Blümer, see the KU web pages.

Computing

Essential prerequisites for successful work in computational physics are, obviously, access to sufficient computing resources and a command of the techniques (high-performance computing, IT management) needed to use them efficiently. More generally, information management is an important task in leading research groups and collaborations, particularly in science (with its constant turnover of team members), as well as in self-organization. Below, I discuss some aspects that are particularly relevant for my group or might be of general interest.

Group HPC cluster

My group uses and operates its own HPC Linux cluster (housed in the computing center) with more than 200 CPU cores, currently with the following hardware:

main file server: Supermicro 4U chassis with 24 SAS/SATA slots, 2x Intel Xeon Harpertown Quad Core E5430, 16 GB RAM, ARECA RAID SAS controller 20x ARC-1680ix with 20x 1 TB SAS hard disks (SEAGATE ST31000640SS), 2x 900 W redundant power supply, 2x 10 GbE (CX4), IPMI 2 interface
2 nodes, each with 2x Intel Xeon L5410, 16 GB, IPMI (in 1x Supermicro Twin 6015TC-T-10G)
16 nodes, each with 2x Intel Xeon Nehalem E5520, 8 GB, IPMI (in 4x Supermicro Twin2 6026TT-BTRF)
8 nodes, each with 2x Intel Xeon Westmere E5620, 8 GB, IPMI (in 2x Supermicro Twin2 6026TT-HTRF)
main network: HP ProCurve Switch 2900-48G (J9050A): 48 GBit-ports, 2x 10 GbE Ports (CX4), 2 optical 10 GbE ports (X2)
service network (IPMI): LevelOne Switch FastEthernet Switch 48Port+2PortGBE+1PortSFP (unmanaged)
update installed in 11/2013: new 1U file server Supermicro 1018D-73MTF with 1x Intel Xeon E3-1240V3, 16 GB RAM, 8x SAS controller LSI 2308, 8x 960 GB Crucial M500 SSD, 2x 10 GbE (CX4), IPMI 2 interface

Setup:

Fully scalable "diskless" setup of compute nodes: custom Debian images loaded in ramdisks, using the Tivoli Provisioning Manager for OS Deployment (formerly Rembo boot server), 500 GB hard disks per node only for logs and scratch data
SLURM batch system
Intel Cluster Studio for Linux with Open MPI, GNU Compiler Collection
Monitoring using Ganglia and Nagios
Disk images and support by Q-Leap Networks GmbH (Dr. Roland Fehrenbacher)

Previous clusters (with P. van Dongen)

16 nodes with 2x AMD AthlonMP 1200, 2 GB, Tyan S2460 (year 2001-2006)
7 nodes with 2x AMD AthlonMP 2200, 1 GB, Tyan S2466 (year 2002-2006)
4 nodes with 2x AMD Opteron 244, 2 GB, Rioworks HDAMA (year 2003)
10 nodes with 2x AMD Opteron 246, 2 GB, Tyan S2882 (year 2004)
4 nodes with 2x AMD Opteron 270, 2 GB (year 2005)
8 nodes with 2x AMD Opteron 2216, 4 GB (year 2006)
file servers (years 2001, 2005)
Rembo boot server, PBS Pro / OpenPBS queueing system

Group HPC cluster in 2003

Group HPC cluster ~ 2008

Information management

TWiki/Foswiki for collaboration, group calendar, documentation of resources, programs, procedures, research data management, grading, etc.
Subversion version control system for code development and collaborative writing of papers (while preserving all needed figures and corresponding data sets) and proposals
rsnapshot for (local and remote) snapshots, e.g., of user data
Unison file synchronizer (e.g. for two-way synchronization between laptops and NFS and between web servers)
Sympa based mailing lists (with archives etc.)
central department seminar web page (implemented using PHP+MySQL by Markus Himmerich, now administered by Andreas Nussbaumer)

Usage of supercomputers and central HPC clusters

My experience with supercomputers and central HPC clusters dates back to 1995 and includes the following machines:

Cray Y-MP, Cray T90, Cray T3E, IBM p690 Power4 Cluster Jump, IBM BlueGene/P JUGENE, Intel Xeon cluster JUROPA at KFA Jülich/FZ Jülich (NIC/JSC)
SGI PowerChallenge and Connection Machine CM-5 at the NCSA
IBM SP2 at the Cornell Theory Center
Cray T90, Fujitsu VPP700 (with contributions to specification) at LRZ Munich
IBM SP at the University of Augsburg
Linuxcluster LC1, LC2 (with contributions to specification and benchmarking), and Mogon (with contributions to specification) at the ZDV Mainz

To get a better overview of jobs at different sites, I created a portal for supercomputer batch queues.

Code development

Research codes and tools

DMFT+HF-QMC code (Release 1 of 1999)
real-space DMFT+HF-QMC code (serial version plus scripts and code for maximum entropy analytic continuation; see documentation and lecture notes)
Documentation of determinantal QMC code by Fakher Assaad (snapshot from our Twiki/Foswiki platform).

Sample codes and templates

Demonstration programs for constructing field lines and solving the Poisson equation (year 2000; see also my more recent lecture on field theory).
Program for generating generic and non-reversal random walks; computes probabilities for self-avoiding walks (SAWs): random_walk.c
Simple molecular dynamics code for a Lennard-Jones fluid (template): MD_LJ_v05a.c, code: MD_LJ.c, both use pointer_utils.c; for explanations, see comp-sim_hw5.pdf
Monte Carlo code for the 2D square-lattice Ising model with Metropolis single-spin flips or Wolff cluster updates: mc_Ising_2D.c
Statistical analysis of a time series (taking autocorrelation into account; controlled via the command line: -h for help): stats_template.c (highlighted html), stats_template.c, stats_v1_4.c (highlighted html), stats_v1_4.c; Windows package (with cygwin1.dll): stats_windows.zip.
Simple random number generator: random_nums_v4.c (highlighted html), random_nums_v4.c
MC simulation of the 2D Ising model (template): mc_Ising_2D_template2a.c (highlighted html), mc_Ising_2D_template2a.c
Code for polynomial least-squares fits: polyfit7.c (for now for illustration only; it refers to non-free modules and therefore cannot be compiled!)

See also course pages on computer simulations and numerical methods listed on my lectures page.

Parallelization, tuning, porting, and benchmarking

(to be continued, sample benchmark results shown below)

Miscellaneous

Linux work sample (assignment used to select a Linux administrator for the Institut für Physik in May 2009)

Print version: http://dmft.org/Bluemer/computing.de.shtml?print

Last modified: 25-Nov-13