Differences between revisions 38 and 39
Revision 38 as of 2009-04-08 07:02:42
Size: 8137
Editor: AlastairMcKinstry
Comment: Some Meteorological data formats
Revision 39 as of 2009-09-22 06:07:18
Size: 7825
Editor: FranklinPiat
Comment: Fix DebianBug links

An important component of scientific work is being able to take your data with you as you move from one position to another, and being able to work with the data files on the computer systems at your new institute. Similarly, it's vital to be able to exchange data files with colleagues or just read your own files in multiple different packages.

Therefore it is important to have standards-based data formats that are openly and well documented so that anyone can implement a reader and writer for the format. Please use this page to list:

  • the data formats you use
  • the Debian packages needed for working with the format
  • software used with that format that's not in Debian

hdf5

Hierarchical Data Format (HDF5) is an extremely flexible format, possibly too flexible for its own good.

  • Open Spec: YES
  • Packages

netCDF

Network Common Data Form (netCDF) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The netCDF-4 API can also store data in HDF5 files, which will probably take over from the older netCDF file formats.
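Because netCDF data can live either in a classic netCDF file or in an HDF5 file, it can help to check which container a file actually uses before choosing a tool. The following is a minimal sketch, assuming only the published file signatures ("CDF" followed by a version byte for classic files, and the 8-byte HDF5 signature at offset 0). The function name `sniff_container` is made up for this example, and HDF5 files with a user block (signature at a later offset) are not handled.

```python
# 8-byte signature found at the start of an HDF5 file (and so of a netCDF-4 file).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_container(header: bytes) -> str:
    """Classify a file from its first few bytes.

    Returns "hdf5", "netcdf-classic" or "unknown".
    Only offset 0 is inspected; this is a sketch, not a full detector.
    """
    if header.startswith(HDF5_SIGNATURE):
        return "hdf5"
    # Classic netCDF: the ASCII letters "CDF" plus a version byte
    # (1 = classic format, 2 = 64-bit offset format).
    if header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02"):
        return "netcdf-classic"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(8))
```

A file reported as "hdf5" may or may not follow netCDF-4 conventions; telling those apart needs a real HDF5 or netCDF library.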

FITS

Flexible Image Transport System (FITS) was developed for astronomy, but could be used by many disciplines. One notable feature is good support for World Coordinates, i.e. translation between pixel coordinates and physical coordinates such as longitude and latitude, frequency, or Stokes parameters (polarisation). Arbitrary numbers of dimensions are supported as well, but not as flexibly as in HDF5.

  • Open Spec: YES
  • Packages
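The World Coordinates idea can be illustrated for the simplest case, a linear axis such as frequency. FITS headers describe such an axis with the keywords CRPIXn (reference pixel), CRVALn (world value at that pixel) and CDELTn (increment per pixel). The sketch below assumes a purely linear axis; sky coordinates need projections that a dedicated WCS library should handle. The class name and the numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinearAxis:
    """One linear FITS axis described by CRPIX, CRVAL and CDELT."""

    crpix: float  # reference pixel (FITS pixel numbering starts at 1)
    crval: float  # world coordinate at the reference pixel
    cdelt: float  # world coordinate increment per pixel

    def pixel_to_world(self, pixel: float) -> float:
        # world = CRVAL + CDELT * (pixel - CRPIX)
        return self.crval + self.cdelt * (pixel - self.crpix)

    def world_to_pixel(self, world: float) -> float:
        # Inverse of pixel_to_world for a linear axis.
        return self.crpix + (world - self.crval) / self.cdelt


# Hypothetical frequency axis: 1.4 GHz at pixel 1, 1 MHz per channel.
freq_axis = LinearAxis(crpix=1.0, crval=1.4e9, cdelt=1.0e6)
```

With these made-up values, `freq_axis.pixel_to_world(11.0)` gives the frequency of the eleventh channel and `world_to_pixel` maps it back.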

Meteorological formats

BUFR

  • The WMO FM-94 BUFR, Binary Universal Form for the Representation of meteorological data, is a binary code designed to represent any meteorological data as a continuous binary stream. It has been designed for efficient exchange and storage of meteorological and oceanographic data. It is a self-defining, table-driven and very flexible data representation system, especially suited to huge volumes of data.
  • Open Spec: YES
  • Packages:
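What "table driven" means for BUFR can be sketched in a few lines. Each element in a message is identified by a 16-bit descriptor split into F (2 bits), X (6 bits) and Y (8 bits), and a table entry for that descriptor gives the scale, reference value and bit width needed to turn the packed integer into a physical value. The table below holds a single illustrative entry that is not copied from the official WMO tables, and the function names are made up for this example.

```python
def split_descriptor(raw: int):
    """Split a 16-bit BUFR descriptor into its (F, X, Y) parts."""
    f = (raw >> 14) & 0x3   # top 2 bits: descriptor type
    x = (raw >> 8) & 0x3F   # next 6 bits: class
    y = raw & 0xFF          # low 8 bits: entry within the class
    return f, x, y


# Illustrative table entry only (assumed values, not the official table).
EXAMPLE_TABLE_B = {
    (0, 12, 1): {"name": "temperature", "unit": "K",
                 "scale": 1, "reference": 0, "width": 12},
}


def decode_value(descriptor: int, packed: int, table=EXAMPLE_TABLE_B):
    """Convert a packed integer to a physical value using the table.

    Returns None when all bits are set, which marks a missing value.
    """
    entry = table[split_descriptor(descriptor)]
    if packed == (1 << entry["width"]) - 1:
        return None
    # value = (packed + reference) / 10**scale
    return (packed + entry["reference"]) / 10 ** entry["scale"]
```

The decoder itself knows nothing about temperature; replacing the table changes what the same bits mean, which is the point of a table-driven format.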

CREX

  • The WMO FM 95-XII CREX, Character form for the Representation and EXchange of data, is the standard WMO character code for meteorological and other data. It is a self-defining, table-driven and very flexible data representation system. It is especially useful where binary representation of data is not possible due to a lack of computer handling capabilities.

GRIB

  • "Gridded Binary" format for binary data, used by many forecast applications.
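GRIB exists in two editions in common use, and tools often need to know which one they are looking at. In both editions a message starts with the four ASCII characters "GRIB", and the eighth octet holds the edition number. A minimal sketch, with a made-up function name, that inspects only that indicator section:

```python
def grib_edition(message: bytes):
    """Return the edition number of a GRIB message, or None.

    Only the first 8 octets are inspected; nothing else is validated.
    """
    if len(message) < 8 or message[:4] != b"GRIB":
        return None
    return message[7]  # octet 8 (index 7) is the edition number
```

This does not check the trailing "7777" end marker or the message length, so it is a quick sniff rather than a validator.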

ODB

  • "Observation database" format used by Meteo France and others for the ALADIN forecasting system.

FA / LFI files

  • Used by Meteo France for the ALADIN forecasting system. ALADIN includes a tool, "gl", which can translate these files to GRIB format. FA files do not use a ".fa" extension, since that extension is used by the FASTA gene sequencing software.

SERVICE

  • Data format supported by cdo

EXTRA

  • Data format supported by cdo

IEG

  • Data format supported by cdo

XML variants

Name                   Open Spec?    Debian Packages
VOTable                YES
DASDNA                 YES
MIPE                   GPL           YES
GraphML
RNAML
BioMedCentral format   Do not know
MAGE-ML                Do not know
MODS                   Do not know

dicom

Digital Imaging and Communications in Medicine (DICOM) is a long-established format for medical imaging.

Chemical MIME/file types

Chemical MIME types can be introduced to the Linux desktop with chemical-mime-data. You will find most information about these MIME types and the project in the source of the package.

All chemical applications that can handle the freedesktop.org MIME specs (e.g. xdrawchem or openbabel) benefit from this package. Older desktops, e.g. GNOME <= 2.4 or KDE <= 3.x, are a bit harder to support because their magic databases are not extensible.

The MIME types are not part of the official shared-mime-info package, because they have never been registered with IANA (see also http://lists.freedesktop.org/archives/xdg/2005-May/006858.html).
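Since these types are not registered with IANA, generic tools may not know them out of the box. As a small illustration, the sketch below teaches Python's standard `mimetypes` module three chemical types. It only affects lookups made through this Python object and is unrelated to the freedesktop.org database that chemical-mime-data installs. The mapping is a short sample chosen for the example, not the package's full list.

```python
import mimetypes

# A small sample of chemical MIME types (assumed subset for illustration).
CHEMICAL_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".xyz": "chemical/x-xyz",
    ".cml": "chemical/x-cml",
}


def make_chemical_db() -> mimetypes.MimeTypes:
    """Return a private MimeTypes database that knows the sample types."""
    db = mimetypes.MimeTypes()
    for extension, mime_type in CHEMICAL_TYPES.items():
        db.add_type(mime_type, extension)
    return db


def guess_chemical_type(filename: str):
    """Guess the MIME type of a file name, or None if unknown."""
    mime_type, _encoding = make_chemical_db().guess_type(filename)
    return mime_type
```

Using a private `MimeTypes` instance keeps the additions out of the process-wide table, so other code in the same program is not affected.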

BioDAS

BioDAS is a Distributed Annotation System for genome work - more a protocol than a data format. It uses XML for the sequence data.
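To show what consuming the XML sequence data can look like, here is a minimal sketch using only Python's standard library. The sample document is hand-written and modelled on the DASDNA layout (a SEQUENCE element wrapping a DNA element); treat the exact element and attribute names as assumptions to check against the DAS specification. The function name is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, assumed to resemble a DAS "dna" response.
SAMPLE = """<DASDNA>
  <SEQUENCE id="chr1" start="1" stop="12" version="1.0">
    <DNA length="12">
      atttcttggcgt
    </DNA>
  </SEQUENCE>
</DASDNA>"""


def read_dasdna(xml_text: str):
    """Return a list of dicts, one per SEQUENCE element."""
    root = ET.fromstring(xml_text)
    segments = []
    for seq in root.iter("SEQUENCE"):
        dna = seq.find("DNA")
        # Drop the whitespace and line breaks used to lay out the bases.
        text = dna.text if dna is not None and dna.text else ""
        segments.append({
            "id": seq.get("id"),
            "start": int(seq.get("start")),
            "stop": int(seq.get("stop")),
            "bases": "".join(text.split()),
        })
    return segments
```

A real client would fetch the document over HTTP from a DAS server first; that step is left out here so the example stays self-contained.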

General

  • microformats may be a useful avenue to explore.

  • A raw digital camera format is essential for scientific imaging work. Debian has the ufraw package.

  • IPTC metadata looks interesting and has fairly open licence terms.

  • AAF is an interesting example of an advanced interchange format (for multimedia), with facilities for adding metadata and tracking change history. There's an SDK at SourceForge.

  • THREDDS is a data publication service for environmental science data. See also LDM and the Internet Data Distribution system.

  • xmedcon, a medical image conversion tool, may be useful to some.