High Performance Computing on Clouds
Mentor: Steffen Möller
Summary: Template environment and helper tools to support distributed batch processing in a Debian cloud environment
- good general system administration skills (required, to understand how images are prepared)
- shell, Python and/or Perl programming (requirement)
- basics in Debian packaging (optional, to improve issues you find)
- C/Java (optional, helpful to understand Torque/Eucalyptus)
The page HighPerformanceComputing (HPC) summarises a series of tools that are routine for sites providing clusters of machines for large compute jobs. Every hand that helps make Debian fit the needs of that vibrant community better is certainly appreciated. This project shall pick a particular topic: batch processing in cloud environments. The advent of Cloud computing frees the user from the need to maintain the machines physically. The user pays for the time needed and starts or stops virtual machines with self-defined images.
This makes the whole setup very dynamic, and most regular HPC tools are not prepared for that. Consequently, users now find it difficult to prepare a large number of instances quickly and then to distribute their computations among them, since the instances don't know about each other. In addition, the batch system's configuration has to be informed about newly available resources. For this project the student shall use Eucalyptus to create one (or multiple) images to build a cluster on demand featuring the Torque queueing system (or SGE or ... ). The student can set up a Eucalyptus installation themselves, and/or the mentor can offer access to one.
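As an illustration of what informing the batch system about newly available resources could involve, here is a minimal Python sketch that renders the contents of a Torque nodes file (one `hostname np=N` line per compute node) from a list of running instances. The function name `render_torque_nodes` and the example hostnames are hypothetical. How the list of instances is obtained from Eucalyptus, and where the file is written, are left open here, since that is exactly what the project is meant to work out.

```python
from typing import Iterable, Tuple


def render_torque_nodes(instances: Iterable[Tuple[str, int]]) -> str:
    """Render Torque nodes-file text from (hostname, cores) pairs.

    Hypothetical helper for illustration only. Each instance becomes one
    line of the form 'hostname np=N'. Duplicate hostnames are skipped so
    that an instance reported twice is registered only once.
    """
    lines = []
    seen = set()
    for hostname, cores in instances:
        if hostname in seen:
            continue  # already listed, keep the first entry
        if cores < 1:
            raise ValueError(f"{hostname}: core count must be at least 1")
        seen.add(hostname)
        lines.append(f"{hostname} np={cores}")
    # End with a newline when there is at least one node.
    return "\n".join(lines) + ("\n" if lines else "")
```

A helper tool could call this whenever instances are started or stopped, write the result to the nodes file of the Torque server, and then make the server re-read its configuration. The exact file location and reload mechanism depend on the Torque packaging and are not assumed here.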
The project will further strengthen the link of our distribution with Amazon/Eucalyptus cloud installations. Side benefits include further improvements to the packaging of Torque, Eucalyptus and associated tools and libraries. The project shall conclude with a short paper on what has been achieved, and shall generally improve the Cloud/HPC documentation in the wiki.
The code produced shall (as appropriate) be distributed with the packages for Eucalyptus and/or the packages of the batch system used (Torque preferred). This will ensure that the development is available to the community and will underline the Debianiness of this project.