Monitoring HPC
September 4th, 2008 | by jamshed zaidi |
What is HPC?
HPC stands for high performance computing, and it is used to get results efficiently when solving demanding engineering tasks. The term normally refers to systems working at the teraflop scale, that is 10^12 (10 to the power 12) floating-point operations per second, and supercomputers also operate at this level of speed. These deployments are typically needed for complex calculations in military, bioinformatics and research environments.

So far so good; now on to our main topic, HPC monitoring tools. The open source world is always there to help in these situations: Ganglia is a high performance computing monitoring tool used to monitor clusters and grids. It has two parts, Gmond and Gmetad.
What it can provide:
- Monitoring of clusters with around 2000 nodes
- Supports software and hardware HPC solutions
- Its efficient algorithms and data structures put very little load on individual nodes
- Uses the External Data Representation (XDR) format for the metric data that gmond exchanges over UDP, with cluster state served as XML over TCP.
- Gmond (Ganglia Monitoring Daemon) runs on each individual node of a specific cluster; it listens for, announces, monitors and answers any change in the cluster state via unicast or multicast.
- Gmetad (Ganglia Meta Daemon) represents a set of clusters: it periodically polls the tree of clusters, parses the XML it receives and, if it finds any change, updates its data and makes the result available to clients as XML over TCP.
- It also has a PHP-based web front end, which helps administrators and users monitor the HPC environment and provides reports on request.
- Gmetad requires RRDtool (the Round Robin Database tool).
- Ganglia supports a wide range of platforms: Linux, Unix, BSD distributions, SPARC, HP-UX, Windows XP/2003 Server, AIX and many others. Get the implementation details from here.
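
To make the Gmond/XML part of the list above a little more concrete, here is a minimal Python sketch of how a client might read the cluster state that a gmond daemon publishes as XML over TCP and compute a simple average. Treat it as an illustration under assumptions rather than Ganglia's own tooling: the sample document is a simplified, made-up example of the GANGLIA_XML / CLUSTER / HOST / METRIC layout, the cluster and host names are hypothetical, and port 8649 is assumed to be the TCP port your gmond is configured to listen on.

```python
import socket
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample of the XML a gmond daemon reports.
# A real dump carries a DTD and many more attributes and metrics.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.0" SOURCE="gmond">
  <CLUSTER NAME="demo-cluster">
    <HOST NAME="node01" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.5" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
    <HOST NAME="node02" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="1.5" TYPE="float" UNITS=""/>
      <METRIC NAME="cpu_num" VAL="8" TYPE="uint16" UNITS="CPUs"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def fetch_cluster_xml(host, port=8649, timeout=5.0):
    """Read the full XML dump from a gmond TCP port.

    Port 8649 is an assumption; use whatever your gmond is configured with.
    The daemon sends its XML and then closes the connection.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Return bytes so the XML parser can honour the declared encoding.
    return b"".join(chunks)


def parse_metrics(xml_data):
    """Return {cluster: {host: {metric_name: value_as_string}}}."""
    root = ET.fromstring(xml_data)
    result = {}
    for cluster in root.iter("CLUSTER"):
        hosts = result.setdefault(cluster.get("NAME"), {})
        for host in cluster.iter("HOST"):
            metrics = hosts.setdefault(host.get("NAME"), {})
            for metric in host.iter("METRIC"):
                metrics[metric.get("NAME")] = metric.get("VAL")
    return result


def average_metric(parsed, cluster, metric):
    """Average a numeric metric over all hosts of a cluster that report it."""
    values = [float(m[metric]) for m in parsed[cluster].values() if metric in m]
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    # Swap SAMPLE_XML for fetch_cluster_xml("your-gmond-host") on a real cluster.
    parsed = parse_metrics(SAMPLE_XML)
    print(average_metric(parsed, "demo-cluster", "load_one"))
```

The parsing is kept separate from the network call so you can try it on the sample first and only then point fetch_cluster_xml at a live node.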

One Response to “Monitoring HPC”
By debbie walsh on Sep 5, 2008
For those interested in learning more about high performance computing, you can go to http://www.hpcwire.com; free e-newsletters are also available for registered site guests.