Differences between revisions 36 and 37
Revision 36 as of 2009-04-06 09:08:52
Size: 5819
Editor: AlastairMcKinstry
Comment: Add netcdf
Revision 37 as of 2009-04-06 09:09:38
Size: 5818
Editor: AlastairMcKinstry
Comment:

An important part of scientific work is being able to take your data with you when you move from one position to another, and to work with those data files on the computer systems at your new institute. It is just as vital to be able to exchange data files with colleagues, or simply to read your own files in several different packages.

Therefore it is important to have standards-based data formats that are openly and well documented so that anyone can implement a reader and writer for the format. Please use this page to list:

  • the data formats you use
  • the Debian packages needed for working with the format
  • software used with that format that's not in Debian

HDF5

Hierarchical Data Format is an extremely flexible format, possibly too flexible for its own good.

  • Open Spec: YES
  • Packages

netCDF

Network Common Data Format (netCDF) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The netCDF-4 API can also store data in the HDF5 format, which will probably take over from the classic netCDF format.
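Because netCDF data may be stored either in the classic format or in HDF5, it can be handy to check which container a file actually uses. The following is a minimal sketch using only the Python standard library; the function name `sniff_format` and the returned labels are illustrative assumptions, and only the signature at byte offset 0 is checked (HDF5 also permits the signature at later offsets, which this sketch ignores).

```python
# Leading bytes that identify the container format.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"          # 8-byte HDF5 signature
NETCDF_CLASSIC_VERSIONS = (b"\x01", b"\x02", b"\x05")  # byte after "CDF"


def sniff_format(first_bytes: bytes) -> str:
    """Guess the container format from the first bytes of a file.

    Hypothetical helper: the labels returned here are for illustration only.
    """
    if first_bytes.startswith(HDF5_SIGNATURE):
        # A netCDF-4 file is an HDF5 file, so the two cannot be told
        # apart from the signature alone.
        return "HDF5 (possibly netCDF-4)"
    if first_bytes[:3] == b"CDF" and first_bytes[3:4] in NETCDF_CLASSIC_VERSIONS:
        return "netCDF classic"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))
```

Calling `sniff_file` on a data file gives a quick hint about which Debian packages are needed to read it, without loading any netCDF or HDF5 library.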

FITS

Flexible Image Transport System was developed for astronomy, but could be used by many disciplines. One notable feature is good support for World Coordinates, i.e. translation between pixel coordinates and physical coordinates such as longitude and latitude, frequency, or Stokes parameters (polarisation). Arbitrary numbers of dimensions are supported as well, but not as flexibly as in HDF5.

  • Open Spec: YES
  • Packages
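The World Coordinates feature mentioned for FITS can be illustrated for the simplest case, a linear axis, where the physical value is CRVAL + CDELT * (pixel - CRPIX). The sketch below uses only the Python standard library and assumes a header made of 80-character cards; the helper names (`make_card`, `parse_cards`, `pixel_to_world`) are hypothetical, and the parser is deliberately simplified (it would mishandle string values that contain a slash).

```python
def make_card(keyword: str, value: str) -> bytes:
    """Build one 80-character header card (illustrative helper)."""
    return f"{keyword:<8}= {value:>20}".ljust(80).encode("ascii")


def parse_cards(header: bytes) -> dict:
    """Parse 80-character header cards into a keyword -> raw value dict."""
    cards = {}
    text = header.decode("ascii")
    for start in range(0, len(text), 80):
        card = text[start:start + 80]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] != "= ":
            continue  # comment, history or blank card: no value to read
        # Simplification: everything after the first "/" is treated as a comment.
        cards[keyword] = card[10:].split("/", 1)[0].strip()
    return cards


def pixel_to_world(pixel: float, crpix: float, crval: float, cdelt: float) -> float:
    """Linear world-coordinate transform for a single axis."""
    return crval + cdelt * (pixel - crpix)


# Example: a frequency axis starting at 1.4 GHz with 1 MHz channels.
header = b"".join([
    make_card("CRPIX1", "1.0"),
    make_card("CRVAL1", "1.4E9"),
    make_card("CDELT1", "1.0E6"),
    b"END".ljust(80),
])
cards = parse_cards(header)
frequency = pixel_to_world(
    11,
    float(cards["CRPIX1"]),
    float(cards["CRVAL1"]),
    float(cards["CDELT1"]),
)
print(frequency)
```

Real FITS files support far more than this linear case (projections, rotations, multiple axes), so a dedicated library is the better choice for actual work.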

XML variants

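|| Name || Open Spec? || Debian Packages ||
|| VOTable || YES || ||
|| DASDNA || YES || ||
|| MIPE || GPL || YES ||
|| GraphML || || ||
|| RNAML || || ||
|| BioMedCentral format || Do not know || ||
|| MAGE-ML || Do not know || ||
|| MODS || Do not know || ||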

DICOM

Digital Imaging and Communications in Medicine is a long-established format for medical imaging.

Chemical MIME/file types

Chemical MIME types can be introduced to the Linux desktop with chemical-mime-data. You will find most information about these MIME types and the project in the source of the package.

All chemical applications that can handle the freedesktop.org MIME specs (e.g. xdrawchem or openbabel) benefit from this package. Older desktops such as GNOME <= 2.4 or KDE <= 3.x are a bit harder to support, because their magic databases are not extensible.

The MIME types are not part of the official shared-mime-info package/project, because they have never been registered with IANA (see also http://lists.freedesktop.org/archives/xdg/2005-May/006858.html).
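Since these types are not registered with IANA, generic tools usually do not know them until something adds them. As a rough illustration of the idea (not of how chemical-mime-data itself works), the sketch below registers a few chemical types in a private database using Python's standard `mimetypes` module. The three extension-to-type pairs are given from memory of the chemical MIME listing and should be treated as assumptions; the matching is by file extension only, whereas the freedesktop.org database can also use file contents.

```python
import mimetypes

# Assumed extension -> chemical MIME type pairs, for illustration only.
CHEMICAL_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".xyz": "chemical/x-xyz",
    ".cml": "chemical/x-cml",
}


def chemical_mime_db() -> mimetypes.MimeTypes:
    """Return a private MIME database extended with chemical types."""
    db = mimetypes.MimeTypes()
    for extension, mime_type in CHEMICAL_TYPES.items():
        db.add_type(mime_type, extension)
    return db


db = chemical_mime_db()
for name in ("caffeine.xyz", "1abc.pdb", "notes.txt"):
    mime_type, _encoding = db.guess_type(name)
    print(name, "->", mime_type)
```

Using a separate `MimeTypes` instance keeps the additions out of the interpreter-wide table, which mirrors the situation described above: the chemical types are an add-on rather than part of the official set.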

BioDAS

BioDAS is a Distributed Annotation System for genome work. It is more a protocol than a data format, and it uses XML for the sequence data.
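Because the sequence data arrives as XML, a client needs little more than an XML parser. The sketch below parses a small, hand-written response with Python's standard library. The element and attribute names (`DASSEQUENCE`, `SEQUENCE`, `id`, `start`, `stop`) are modeled on a DAS sequence response but should be treated as assumptions here, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample response; the names are assumptions, not taken from a live server.
SAMPLE_RESPONSE = """
<DASSEQUENCE>
  <SEQUENCE id="chr1" start="1" stop="12">
    atgcatgc
    atgc
  </SEQUENCE>
</DASSEQUENCE>
"""


def parse_sequences(xml_text: str) -> list:
    """Extract id, range and residues from every SEQUENCE element."""
    root = ET.fromstring(xml_text.strip())
    sequences = []
    for element in root.iter("SEQUENCE"):
        # Residues may be wrapped over several lines; drop all whitespace.
        residues = "".join((element.text or "").split())
        sequences.append({
            "id": element.get("id"),
            "start": int(element.get("start")),
            "stop": int(element.get("stop")),
            "residues": residues,
        })
    return sequences


for record in parse_sequences(SAMPLE_RESPONSE):
    print(record["id"], record["start"], record["stop"], record["residues"])
```

In a real client the XML text would come from an HTTP request to a DAS server rather than from a string constant.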

General

  • microformats may be a useful avenue to explore.

  • A raw digital camera format is essential for scientific imaging work. Debian has the ufraw package.

  • IPTC metadata looks interesting and has fairly open licence terms.

  • AAF is an interesting example of an advanced interchange format (for multimedia), with facilities for adding metadata and tracking change history. There's an SDK at SourceForge.

  • THREDDS is a data publication service for environmental science data. See also LDM and the Internet Data Distribution system.

  • xmedcon, a medical image conversion toolkit, may be useful to some.