Differences between revisions 1 and 41 (spanning 40 versions)
Revision 1 as of 2008-05-27 23:09:18
Size: 3734
Comment: first skeletal draft for gravitational wave debian cluster
Revision 41 as of 2008-06-02 17:35:57
Size: 13294
Editor: ?CarstenAulbert
Comment: Added link to VIRGO (sorry for the additional link)
||<tablewidth="100%">~-[:DebianWiki/EditorGuide#translation:Translation(s)]: none-~ || This text is open for contributions until 2nd June 2008, 18:00 GMT. To edit this page, follow through the [:Teams/Publicity/DebianTimesTeam/Guidelines:Debian Times team guidelines] for publishing workflow. || (!) ["/Discussion"] ||
= Debian GNU/Linux 32.8 TFlops cluster at Max Planck Institute for Gravitational Physics =

The [http://www.aei.mpg.de/english/research/teams/observationalRelativity/index.html Observational Relativity and Cosmology Research Group] is a team of scientists working at the [http://www.aei.mpg.de/hannover-en/66-contemporaryIssues/home/index.html Hannover Branch] of the [http://www.aei.mpg.de/english/contemporaryIssues/home/index.html Max Planck Institute for Gravitational Physics] (Albert Einstein Institute) in [http://en.wikipedia.org/wiki/Hannover Hannover], [http://en.wikipedia.org/wiki/Germany Germany].
Their goal is the direct detection of [http://en.wikipedia.org/wiki/Gravitational_wave gravitational waves], which were first [http://www.einstein-online.info/en/ predicted] by Albert Einstein. They work with friends and colleagues in the [http://www.ligo.org/ LIGO] Scientific Collaboration and [http://www.virgo.infn.it/ VIRGO].

The massive computing effort necessary for this research is provided by a [http://www.debian.org/distrib/ Debian GNU / Linux] cluster of 1342 nodes called ATLAS.
With more than 10 TB of RAM, approximately 1.3 PB of storage and a special network able to transfer almost 4 days' worth of DVD movies each second, the cluster achieves a measured performance of 32.8 TFlops.
This performance ranks the ATLAS Debian GNU / Linux supercomputer 4th in Germany, 11th in Europe and 34th worldwide, at a cost of EUR 1.8m (~ US$ 2.8m).



== The benefits of using Debian GNU / Linux on the ATLAS, Merlin and Morgane supercomputer clusters ==
The ATLAS Debian GNU / Linux cluster was designed and built, and is managed, by [http://www.aei.mpg.de/hannover-en/09-staff/00-details/fehrmann/index.html Dr. Henning Fehrmann] and [http://www.aei.mpg.de/hannover-en/09-staff/00-details/aulbert/index.html Dr. Carsten Aulbert], who have been using Debian GNU / Linux for years.

ATLAS has smaller brother and sister systems in [http://en.wikipedia.org/wiki/Potsdam Potsdam], Germany: [http://gw.aei.mpg.de/resources/computational-resources/merlin-morgane-dual-compute-cluster "Merlin" (1.3 TFlops) and "Morgane" (6 TFlops)], which also run Debian GNU / Linux and have been managed by [http://www.aei.mpg.de/english/php-Skripte/quMembPage/index.php?personKey=grunewald Dr. Steffen Grunewald] for many years; "the experience with them had been very, very good", according to Dr. Aulbert.

"Actually, with RH and its anaconda kickstart installer, for different types of machines (hardware and functionality-wise) I had one single master kickstart file that would have been run through cpp with proper defines set, to produce the actual kickstart file for a specific setup. While this allowed to maintain a single copy of install code, FAI with its class model was a major breakthrough, in readability, functionality, and maintainability. There's no way back now", said Dr. Grunewald.

Beyond FAI, there are [http://www.techforce.com.br/index.php/news/linux_blog/massive_installation_management_tools_p_1 other useful tools] for large-scale installation, deployment and management of Debian GNU / Linux machines in various scenarios.

"Debian features an extremely large set of packages, making it THE distro of choice for keeping us out of the hassle to package needed software ourselves", said Dr. Aulbert.

"Also Thomas Lange's [http://packages.debian.org/source/etch/fai FAI package] is extremely useful for automatic deployment of Debian [GNU / Linux]. For example, without much tweaking and using only two hosts, we were able to reinstall the cluster in about 2.5 hours and were only limited by those two servers' network connection."

"Two weeks ago I would have written something about the very good security support, given that the reaction to the OpenSSL stuff was very good. I could still do, but in reality we don't need security updates except for the exposed nodes such as head nodes. Everything else is just visible internally."

As additional benefits of using Debian GNU / Linux, he cited:

 * the simplicity of [http://wiki.debian.org/DebianDevelopment creating one's own packages]
 * how easily repositories can be set up (using the [http://packages.debian.org/etch/reprepro reprepro] package)
 * clean build environments ([http://packages.debian.org/etch/pbuilder pbuilder] and similar packages)
 * and, of course, the superb packaging infrastructure in general ([http://packages.debian.org/etch/dpkg dpkg], [http://packages.debian.org/etch/apt apt], [http://packages.debian.org/etch/aptitude aptitude], [http://packages.debian.org/etch/synaptic synaptic] and many [http://packages.debian.org/search?keywords=apt&searchon=names&suite=stable&section=all useful APT tools])

By using Debian GNU / Linux on its clusters, the [http://www.aei.mpg.de/english/research/teams/observationalRelativity/index.html Observational Relativity and Cosmology Research Group] reduced the amount of work needed on the hardware and software infrastructure compared to other scientific clusters running other distributions, allowing the group to focus on its objective of detecting gravitational waves.

"Personally, I like community distros more since they offer more long-term stability than a distro which is governed by the need of releasing often to generate revenue. Although on the downside it would be better for us to have a more settled [http://release.debian.org release plan] and / or some kind of [http://www.backports.org "stable and supported" backports] [for the specific software we use]", said Dr. Aulbert.

== Debian is continuously evolving ==

Currently, the [http://www.debian.org Debian Project] is refining its [http://release.debian.org/ release methods] to reach a more regular release cycle of about 18 months for what is the largest [http://qa.debian.org officially] maintained and [http://security.debian.org security]-supported distribution to date (24,000+ packages). The expected next release is on track as of May 2008.

The [http://www.backports.org Debian Backports] site has been actively maintained for 5 years by Debian Developers, who are the only ones allowed to upload packages to it. Requests for an official backport that is not [http://packages.debian.org/etch-backports/ already available] can be submitted to the [http://bugs.debian.org Debian Bug Tracking System] as wishlist bugs, and may include the patches the package's official Debian Developer maintainer needs to [http://packages.debian.org/etch-backports/ backport] the package from [http://packages.debian.org/testing/ Testing] to [http://packages.debian.org/stable/ Stable].

The Debian Project is discussing on [http://lists.debian.org/debian-devel/2008/05/threads.html its developers' mailing list] how to improve its [http://www.debian.org/security/audit/ auditing] and [http://qa.debian.org quality] processes, so that [http://www.debian.org/security/ security] and [http://qa.debian.org quality] issues in such a large set of packages are prevented at very early stages of development, beyond the Security Team's prompt reaction for released packages.


== About the ATLAS cluster ==

The ATLAS cluster, with a [http://en.wikipedia.org/wiki/LINPACK LINPACK]-measured performance of 32.8 TFlops and a theoretical peak of about 50 TFlops, consists of 1342 [http://supermicro.com/ Supermicro] compute nodes ([http://www.intel.com/cd/channel/reseller/asmo-na/eng/products/server/processors/q3200/feature/index.htm Intel Xeon 3220] quad-core, 2.4 GHz, 8 GB RAM, 500 GB Hitachi HDD, IPMI remote management), along with 31 data servers (2x [http://www.intel.com/cd/products/services/emea/deu/processors/xeon5000/344530.htm Intel Xeon E5345], 2.33 GHz, 16 GB RAM, [http://www.areca.com.tw/ Areca] 1261ML, 16 x 750 GB Hitachi HDD) plus 4 similar head nodes with 4 x 750 GB HDD. All of these run [http://www.debian.org/distrib/ Debian GNU / Linux] 4.0 Etch with a few modifications, such as a custom kernel and the Condor queuing system. Additional storage space is supplied by 13 [http://www.sun.com/servers/x64/x4500/ Sun Fire X4500] servers running Solaris 10. The system was built from off-the-shelf computers supplied by a German company, [http://www.pyramid.de/ Pyramid Computer GmbH].
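
The headline figures in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is a back-of-the-envelope check, not a benchmark: the node count, cores per node, clock rate, RAM per node and the 32.8 TFlops LINPACK result are taken from the text, while `FLOPS_PER_CYCLE = 4` (double-precision operations per core per clock cycle) is an assumption about this processor generation that the article does not state.

```python
# Back-of-the-envelope check of the ATLAS figures quoted in the text.
NODES = 1342            # compute nodes (from the text)
CORES_PER_NODE = 4      # quad-core Xeon 3220 (from the text)
CLOCK_GHZ = 2.4         # per-core clock rate (from the text)
RAM_PER_NODE_GB = 8     # RAM per compute node (from the text)
LINPACK_TFLOPS = 32.8   # measured performance (from the text)

# ASSUMPTION (not in the article): each core completes 4 double-precision
# floating-point operations per clock cycle.
FLOPS_PER_CYCLE = 4


def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak in TFlops: cores x clock x operations per cycle."""
    gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gflops / 1000.0


peak = peak_tflops(NODES, CORES_PER_NODE, CLOCK_GHZ, FLOPS_PER_CYCLE)
efficiency = LINPACK_TFLOPS / peak          # measured / theoretical
total_ram_tb = NODES * RAM_PER_NODE_GB / 1000.0

print(f"theoretical peak:   {peak:.1f} TFlops")
print(f"LINPACK efficiency: {efficiency:.0%}")
print(f"compute-node RAM:   {total_ram_tb:.1f} TB")
```

Under that assumption the peak comes out slightly above 50 TFlops, which is consistent with the "about 50 TFlops" quoted here, and the compute nodes alone account for more than 10 TB of RAM.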

One of the many special hardware components is the network from [http://www.wovensystems.com/ Woven Systems], a hierarchical, fully non-blocking network. The EFX 1000 core switch features 144 10 Gb/s CX4 ports and currently connects to 32 TRX100 edge switches, each of which features 48 1 Gb/s ports and 4 x 10 Gb/s uplinks; the network reaches 2880 Gb/s. The Sun Fire X4500 servers are also connected directly to the core switch.
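
The port counts in this paragraph can be tallied with a little arithmetic. The sketch below rests on three assumptions that the text does not state explicitly: each 10 Gb/s uplink of an edge switch occupies one core-switch port, each of the 13 Sun Fire X4500 servers occupies one core-switch port, and the quoted 2880 Gb/s counts every core port in both directions.

```python
# Port budget of the core switch, using the numbers quoted in the text.
CORE_PORTS = 144        # 10 Gb/s CX4 ports on the EFX 1000 (from the text)
PORT_SPEED_GBPS = 10    # speed of each core port (from the text)
EDGE_SWITCHES = 32      # TRX100 edge switches (from the text)
UPLINKS_PER_EDGE = 4    # 10 Gb/s uplinks per edge switch (from the text)
X4500_SERVERS = 13      # Sun Fire X4500 servers on the core (from the text)

# ASSUMPTION: one core port per edge uplink and one per X4500 server.
ports_for_edges = EDGE_SWITCHES * UPLINKS_PER_EDGE
ports_in_use = ports_for_edges + X4500_SERVERS
ports_free = CORE_PORTS - ports_in_use

# ASSUMPTION: the aggregate figure counts send and receive on every port.
aggregate_gbps = CORE_PORTS * PORT_SPEED_GBPS * 2

print(f"core ports used by edge uplinks: {ports_for_edges}")
print(f"core ports in use overall:       {ports_in_use} of {CORE_PORTS}")
print(f"core ports still free:           {ports_free}")
print(f"aggregate core bandwidth:        {aggregate_gbps} Gb/s")
```

If these assumptions hold, the edge uplinks and storage servers together fit within the 144 core ports, and the aggregate matches the 2880 Gb/s figure.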


According to Dr. Grunewald, the 180-node [http://gw.aei.mpg.de/resources/computational-resources/merlin-morgane-dual-compute-cluster Merlin] Debian GNU / Linux [http://wiki.debian.org/DebianBeowulf Beowulf] cluster (launched in 2002) initially ran an RPM-based distribution, but migrated to Debian GNU / Linux in 2004 after the vendor of that distribution changed its licensing model. The total computing power of the 360 CPU cores has been estimated at more than 1.3 TFlops peak; the data storage capacity is about 20 TB, mirrored.

The Morgane Debian GNU / Linux [http://wiki.debian.org/DebianBeowulf Beowulf] cluster, [http://gw.aei.mpg.de/resources/computational-resources/merlin-morgane-dual-compute-cluster/overview/tasks-structure-parameters/ consisting] of 615 compute nodes, 15 storage nodes, and some head nodes, was launched in December 2006. The total computing power of the 1230 CPU cores has been estimated at more than 6 TFlops peak; the data storage capacity is about 100 TB.

== About the Debian Project ==

Debian GNU / Linux is [http://www.debian.org/ports/#nonlinux one] of the [http://www.debian.org/intro/free free libre] operating systems ([http://www.debian.org/ports/#released GNU/Linux], [http://www.debian.org/ports/hurd GNU/Hurd], [http://www.debian.org/ports/netbsd/ GNU/NetBSD], [http://www.debian.org/ports/kfreebsd-gnu/ GNU/kFreeBSD]) developed by the [http://www.debian.org Debian Project]. It provides 18733+ [http://qa.debian.org officially] maintained [http://packages.debian.org packages] on [http://www.debian.org/ports 15 hardware platforms], from [http://www.debian.org/ports/arm/ cell phones] and [http://www.linux-sh.org network devices] to [http://www.debian.org/ports/s390/ mainframes] and [http://wiki.debian.org/DebianBeowulf supercomputers], and is developed by more than [http://asdfasdf.debian.net/~tar/bugstats/?8 two thousand] volunteers from [http://www.debian.org/devel/developers.loc all over the world] who [http://www.debian.org/devel/ collaborate] via [http://www.us.debian.org/support the internet].

Debian's dedication to [http://www.debian.org/intro/free Free Libre Open Source Software], its [http://www.debian.org/devel/constitution constitutional] non-profit nature, its [http://vote.debian.org/ open] and [http://en.wikipedia.org/wiki/Meritocracy meritocratic] development model, [http://www.debian.org/intro/organization organization] and social [http://www.techforce.com.br/index.php/news/linux_blog/scientific_study_about_debian_governance_and_organization governance] make it [http://www.debian.org/doc/manuals/project-history/ a first] among free libre operating system distributions.

The Debian project's key strengths are [http://www.debian.org/devel/people its volunteer base], its dedication to the [http://www.debian.org/social_contract Debian Social Contract] and the [http://www.debian.org/devel/constitution Debian Constitution], and its [http://wiki.debian.org/WhyDebianForDevelopers commitment] to [http://bugs.debian.org/release-critical/ provide the best] operating systems [http://release.debian.org/ attainable], following a strict quality [http://www.debian.org/doc/debian-policy policy], working with an established [http://qa.debian.org/ QA Team] and helpful [http://www.debian.org/users/ users] reporting [http://bugs.debian.org bugs, suggestions], [http://lists.debian.org exchanging ideas], and [http://wiki.debian.org registering experiences].

You can [http://www.debian.org/intro/help help] the Debian Project without [http://www.debian.org/devel/join joining] it, and [http://wiki.debian.org/DebianForNonCoderContributors even without being a programmer], by becoming a development or service [http://www.debian.org/partners/ partner] company or institution in the [http://www.debian.org/partners/partners Debian Partner Program], or simply by making [http://www.debian.org/donations donations] to the Debian Project.

Debian Project news, press releases and press coverage can be found on the official Debian wiki [http://wiki.debian.org/News News page]. PR contact: the [http://lists.debian.org/debian-publicity debian-publicity list].