Differences between revisions 1 and 69 (spanning 68 versions)
Revision 1 as of 2006-06-04 08:00:16
Size: 4537
Editor: PaulWise
Comment: Create unicode coverage of debian fonts page
Revision 69 as of 2018-03-19 05:40:51
Size: 4364
Editor: KessVargavind
Comment: Manjari font update
Line 1: Line 1:
This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using gucharmap. You should also read the ["DebianInstaller/GUIFonts"] page to find out about the fonts used by the installer.
#language en
This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using `fontforge`, `font-manager` and `gucharmap`. You should also read the [[DebianInstaller/GUIFonts]] page to find out about the fonts used by the installer.

Please add any fonts which might fill in some of these blocks to the [[Fonts/Missing]] page.

TODO: automatically create this page based on the latest unicode-data package as well as all fonts in Debian unstable, using the fonts review:

https://pkg-fonts.alioth.debian.org/review/
Line 5: Line 12:
Wikipedia languages whose names Firefox could not render. Inuktitut and Yi could not be rendered at all. Kannada and Sichuan Yi had some problems.
Firefox seems to render all Wikipedia languages:
Line 7: Line 14:
http://meta.wikimedia.org/wiki/List_of_Wikipedias  http://meta.wikimedia.org/wiki/List_of_Wikipedias
   [[https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/data/Names.php]]
Line 9: Line 18:
= Strange =

!DejaVu sans uses some of the private use area, is this valid?
Line 14: Line 20:
Most of the missing characters listed here are available as bitmap glyphs in ''Unifont'' or ''Unifont Upper''.
Line 15: Line 22:
 * Symbol modifier letters: U+02EF to U+02F2, U+02F4 to U+02F6, U+02F8 to U+02FF
 * Hebrew: U+05C6. Looks like several other characters in this block use this symbol - U+05A2, U+05C5 and U+05C7 also have the square with numbers in them.
 * Syriac: U+072D to U+072F, U+074D to U+074F
 * Devanagari: U+097D
 * Gujarati: U+0ABC, U+0AE1 to U+0AE3
 * Tamil: U+0BB6, U+0BE6, U+0BF3 to U+0BFA
 * Telugu: looks like it draws on 2 chars in some other sections - U+0C55 and U+25CC
 * Kannada: U+0C8C, U+0CBC, U+0CBD, U+0CE1 and draws on U+25CC
 * Sinhala: 18 missing chars, plus it draws on a few others. 25 chars missing parts in total.
 * Thai: 16 chars draw on non-printing character U+200D
 * Tibetan: U+0FD0 and U+0FD1 look buggy, as they show the letters 0FD0 and 0FD1 instead of being symbols. The font is Tibetan Machine Uni.
 * Ethiopic: U+1207, U+1247, U+1287
 * Phonetic extensions: 18 characters missing
 * Phonetic extensions supplement: 23
 * General punctuation: 5
 * Superscripts and subscripts: 5
 * Currency: German penny and former Argentinian currency (AUSTRAL)
 * Letterlike symbols: 19
 * Arrows: 26
 * Mathematical operators: 13
 * Misc. symbols: 14
 * Coptic: > 60
 * CJK radicals supplement: about 20
 * CJK symbols and punctuation: 15
 * Katakana: U+30FF
 * Enclosed CJK letters and month: about half
 * CJK compatibility: about a third
 * CJK Unified Ideographs: this is a huge block, with about an eighth not defined.
 * Arabic presentation forms A: about two thirds to three quarters empty
 * CJK compatibility forms: 5
 * Small form variants: U+FE58 SMALL EM DASH
 * Halfwidth and fullwidth forms: the Hangul ones, 20 or so
 * Linear B syllabary: 14
 * Linear B ideograms: about half
 * Aegean numbers: 9 - subunit ones
 * Deseret: 4 - capital/small EW and OI
 * '''Bengali''': U+09FC and U+09FD
 * '''Elbasan''': U+10500 and U+10502 to U+10527
 * '''Grantha''': U+11300 to U+11302, U+11305 to U+1130C, U+1130F, U+11310, U+11313 to U+11328, U+1132A to U+11330, U+11332, U+11333, U+11335 to U+11339, U+1133D to U+11344, U+11347, U+11348, U+1134B to U+1134D, U+11350, U+11357, U+1135D to U+11363, U+11366 to U+1136C and U+11370 to U+11374
 * '''Gujarati''': U+0AFA to U+0AFF
 * '''Ideographic symbols and punctuation''': U+16FE1
 * '''Kannada''': U+0C80
 * '''Khojki''': U+1123E
 * '''Limbu''': U+191D to U+191E
 * '''Old Hungarian''': U+10C81, U+10C89, U+10C8B, U+10C90 to U+10C92, U+10C94 to U+10C98, U+10C9D to U+10C9F, U+10CA1, U+10CA8, U+10CA9, U+10CAC, U+10CC0 to U+10CF2 and U+10CFA to U+10CFF
 * '''Old Permic''': U+10352, U+10353, U+1035A to U+1035E, U+10363 to U+10365, U+10368 to U+1036B and U+10371 to U+1037A
 * '''Tangut''': U+17002 to U+17004, U+17006 to U+1702F, U+17032 to U+17037 and U+17039 to U+187EC
 * '''Tangut components''': U+1880C to U+1880F, U+18811, U+18813, U+18816 to U+18818, U+1881C to U+1881F, U+18825, U+18826, U+18829, U+1882E to U+18830, U+18834 to U+1883C, U+1883E to U+1884C, U+1884E to U+18852, U+18857, U+18859, U+1885A, U+1885D, U+1885E, U+18860, U+18865, U+18867, U+18868, U+1886B to U+1886E, U+18870 to U+1887C, U+1887E to U+18881, U+18883 to U+1889B, U+1889D to U+188AF, U+188B1 to U+188C6, U+188C8 to U+188D6, U+188D9 to U+188E8, U+188EA to U+18919, U+1891E, U+18921 to U+18942, U+18946 to U+18959, U+1895B to U+18975, U+18977 to U+18AD4 and U+18AD6 to U+18AF2
 * '''Vedic extensions''': U+1CF7
Line 52: Line 36:
= Mostly empty =
= Completely empty blocks =
Line 54: Line 38:
Misc tech, Control pics, misc math symbols A, CJK Unified Ideographs Extension A & B, CJK Unified Ideographs supplement.
Most of these blocks have bitmap glyphs available in ''Unifont'' or ''Unifont Upper''.
Line 56: Line 40:
For the CJK parts, Arne Götje (高盛華) says that they are seldom used, that CJK daily use characters are supported and that he is working on the rest of them, but they won't be finished any time soon.
 * '''Ahom''' (U+11700...)
 * '''Bassa Vah''' (U+16AD0...)
 * '''Bhaiksuki''' (U+11C00...)
 * '''Caucasian Albanian''' (U+10530...)
 * '''Duployan''' (U+1BC00...)
 * '''Hatran''' (U+108E0...)
 * '''Khojki''' (U+11200...)
 * '''Khudawadi''' (U+112B0...)
 * '''Mahajani''' (U+11150...)
 * '''Manichaean''' (U+10AC0...)
 * '''Marchen''' (U+11C70...)
 * '''Masaram Gondi''' (U+11D00...)
 * '''Mende Kikakui''' (U+1E800...)
 * '''Miao''' (U+16F00...)
 * '''Modi''' (U+11600...)
 * '''Mongolian supplement''' (U+11660...)
 * '''Mro''' (U+16A40...)
 * '''Multani''' (U+11280...)
 * '''Nabataean''' (U+10880...)
 * '''Newa''' (U+11400...)
 * '''Nushu''' (U+1B170...)
 * '''Old North Arabian''' (U+10A80...)
 * '''Pahawh Hmong''' (U+16B00...)
 * '''Palmyrene''' (U+10860...)
 * '''Pau Cin Hau''' (U+11AC0...)
 * '''Psalter Pahlavi''' (U+10B80...)
 * '''Sharada''' (U+11180...)
 * '''Siddham''' (U+11580...)
 * '''Sora Sompeng''' (U+110D0...)
 * '''Soyombo''' (U+11A50...)
 * '''Sutton signwriting''' (U+1D800...)
 * '''Syriac supplement''' (U+0860...)
 * '''Takri''' (U+11680...)
 * '''Tirhuta''' (U+11480...)
 * '''Warang Citi''' (U+118A0...)
 * '''Zanabazar square''' (U+11A00...)
Line 58: Line 77:
= No Glyphs or almost empty =

Ethiopic supplement, Unified Canadian Aboriginal Syllabics, Myanmar, Ogham, Tagalog, Buhid, Tagbanwa, Mongolian, New Tai Lue, Kanbun, Ethiopic extended, Katakana phonetic extensions, Yijing hexagram symbols, Yi symbols, Yi radicals, ancient Greek numbers, Byzantine musical symbols, musical symbols, ancient Greek musical notation, Tai Xuan Jing Symbols, combining diacritical marks, OCR, supplemental arrows A & B, misc math symbols B, supplemental math operators, misc symbols and arrows, supplemental punctuation, ideographic description characters, modifier tone letters, variation selectors, mathematical alphanumeric symbols and variation selectors supplement and combining diacritical marks for symbols (only one character - from !FreeSerif).

For Myanmar, there is currently a proposal to update this block to support Mon and some other languages from Burma. Several sources offer fonts for this, although some intrude on parts of Unicode that are not yet defined. Burmese is in a state of flux at the moment.

Also Klingon uses U+!F8D0 to U+F8FF in the private use area, but there are no fonts for it in Debian.
Some conscripts in the [[http://www.evertype.com/standards/csur/|ConScript Unicode Registry]] (such as Klingon, Tengwar and Visible speech) have bitmap glyphs available in ''Unifont CSUR''.
Line 68: Line 81:
http://www.unifont.org/fontguide/
http://www.alanwood.net/unicode/fonts.html
http://en.wikipedia.org/wiki/Unicode_font
http://www.evertype.com/celtscript/
http://www.travelphrases.info/fonts.html
http://my.wikipedia.org/wiki/Wikipedia:Font
http://sourceforge.net/projects/prahita
http://sourceforge.net/projects/uniburma
http://www.evertype.com/celtscript/ogfont.html
http://www.gov.nu.ca/font.htm
http://www.gov.nu.ca/Nunavut/English/font/
http://www.evertype.com/celtscript/inuktitut.html
http://www.google.com/search?q=Inuktitut+fonts
http://www.evertype.com/celtscript/ogfont.html
 http://en.wikipedia.org/wiki/Unicode_font
 
 http://www.unicode.org/standard/supported.html
 
 http://www.unicode.org/standard/unsupported.html
 
 http://www.unifont.org/fontguide/
 
 http://www.alanwood.net/unicode/fonts.html
 
 http://www.travelphrases.info/fonts.html

This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using fontforge, font-manager and gucharmap. You should also read the DebianInstaller/GUIFonts page to find out about the fonts used by the installer.

Please add any fonts which might fill in some of these blocks to the Fonts/Missing page.

TODO: automatically create this page based on the latest unicode-data package as well as all fonts in Debian unstable, using the fonts review:

https://pkg-fonts.alioth.debian.org/review/
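
The automation described in the TODO could start from something like the following Python sketch. It is not an existing Debian tool. It assumes the block list is read from a file in the Unicode `Blocks.txt` format (the unicode-data package is assumed to install it as `/usr/share/unicode/Blocks.txt`), and that two sets of code points are already available: `assigned` (characters defined by the standard) and `covered` (characters for which at least one font has a glyph). The names `parse_blocks` and `classify_blocks` are hypothetical, and the sample data is a toy chosen to mirror a few entries on this page, not real font data.

```python
import re

# Matches Blocks.txt lines such as "0980..09FF; Bengali".
BLOCK_LINE = re.compile(r"^([0-9A-F]{4,6})\.\.([0-9A-F]{4,6});\s*(.+?)\s*$")


def parse_blocks(text):
    """Return a list of (start, end, name) tuples from Blocks.txt-style text."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        match = BLOCK_LINE.match(line)
        if match:
            start, end, name = match.groups()
            blocks.append((int(start, 16), int(end, 16), name))
    return blocks


def classify_blocks(blocks, assigned, covered):
    """Sort blocks into complete, incomplete and completely empty ones."""
    report = {"complete": [], "incomplete": {}, "empty": []}
    for start, end, name in blocks:
        wanted = {cp for cp in assigned if start <= cp <= end}
        if not wanted:
            continue  # nothing assigned in this block (in our data)
        missing = sorted(wanted - covered)
        if not missing:
            report["complete"].append(name)
        elif len(missing) == len(wanted):
            report["empty"].append(name)
        else:
            report["incomplete"][name] = missing
    return report


# Hand-written sample in the Blocks.txt line format (illustrative only).
SAMPLE_BLOCKS = """\
# sample excerpt
0980..09FF; Bengali
10500..1052F; Elbasan
11700..1173F; Ahom
"""

# Toy sets, NOT real Unicode or font data.
assigned = (
    set(range(0x0980, 0x0984)) | {0x09FC, 0x09FD}
    | set(range(0x10500, 0x10528))
    | set(range(0x11700, 0x1171A))
)
covered = set(range(0x0980, 0x0984)) | {0x10501}

report = classify_blocks(parse_blocks(SAMPLE_BLOCKS), assigned, covered)
print(report["empty"])
print(sorted(report["incomplete"]))
```

In a real run, `covered` would have to be collected from the installed font files, for example with a font library such as fontTools, and `assigned` from the Unicode character database. Both steps are left out here.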

Quick test

Firefox seems to render all Wikipedia languages:

Incomplete blocks

Most of the missing characters listed here are available as bitmap glyphs in Unifont or Unifont Upper.

  • Bengali: U+09FC and U+09FD

  • Elbasan: U+10500 and U+10502 to U+10527

  • Grantha: U+11300 to U+11302, U+11305 to U+1130C, U+1130F, U+11310, U+11313 to U+11328, U+1132A to U+11330, U+11332, U+11333, U+11335 to U+11339, U+1133D to U+11344, U+11347, U+11348, U+1134B to U+1134D, U+11350, U+11357, U+1135D to U+11363, U+11366 to U+1136C and U+11370 to U+11374

  • Gujarati: U+0AFA to U+0AFF

  • Ideographic symbols and punctuation: U+16FE1

  • Kannada: U+0C80

  • Khojki: U+1123E

  • Limbu: U+191D to U+191E

  • Old Hungarian: U+10C81, U+10C89, U+10C8B, U+10C90 to U+10C92, U+10C94 to U+10C98, U+10C9D to U+10C9F, U+10CA1, U+10CA8, U+10CA9, U+10CAC, U+10CC0 to U+10CF2 and U+10CFA to U+10CFF

  • Old Permic: U+10352, U+10353, U+1035A to U+1035E, U+10363 to U+10365, U+10368 to U+1036B and U+10371 to U+1037A

  • Tangut: U+17002 to U+17004, U+17006 to U+1702F, U+17032 to U+17037 and U+17039 to U+187EC

  • Tangut components: U+1880C to U+1880F, U+18811, U+18813, U+18816 to U+18818, U+1881C to U+1881F, U+18825, U+18826, U+18829, U+1882E to U+18830, U+18834 to U+1883C, U+1883E to U+1884C, U+1884E to U+18852, U+18857, U+18859, U+1885A, U+1885D, U+1885E, U+18860, U+18865, U+18867, U+18868, U+1886B to U+1886E, U+18870 to U+1887C, U+1887E to U+18881, U+18883 to U+1889B, U+1889D to U+188AF, U+188B1 to U+188C6, U+188C8 to U+188D6, U+188D9 to U+188E8, U+188EA to U+18919, U+1891E, U+18921 to U+18942, U+18946 to U+18959, U+1895B to U+18975, U+18977 to U+18AD4 and U+18AD6 to U+18AF2

  • Vedic extensions: U+1CF7
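
Lists like the ones above are tedious to maintain by hand. The following Python sketch shows one way a set of missing code points could be turned into the "U+XXXX to U+YYYY" notation used on this page. The function names are hypothetical. The rule that only runs of three or more consecutive code points are written as a range is an assumption, chosen because it reproduces most, but not all, of the entries above.

```python
def format_code_point(cp):
    """Format an integer code point as U+XXXX (at least four hex digits)."""
    return f"U+{cp:04X}"


def collapse_runs(code_points):
    """Group code points into (start, end) runs of consecutive values."""
    runs = []
    for cp in sorted(set(code_points)):
        if runs and cp == runs[-1][1] + 1:
            runs[-1][1] = cp  # extend the current run
        else:
            runs.append([cp, cp])  # start a new run
    return [(start, end) for start, end in runs]


def format_missing(code_points, min_run=3):
    """Render code points as e.g. 'U+10500 and U+10502 to U+10527'."""
    parts = []
    for start, end in collapse_runs(code_points):
        if end - start + 1 >= min_run:
            parts.append(f"{format_code_point(start)} to {format_code_point(end)}")
        else:
            # Short runs are listed one code point at a time.
            parts.extend(format_code_point(cp) for cp in range(start, end + 1))
    if len(parts) <= 1:
        return "".join(parts)
    return ", ".join(parts[:-1]) + " and " + parts[-1]


# Inputs taken from the Bengali and Elbasan entries above.
print(format_missing([0x09FC, 0x09FD]))
print(format_missing([0x10500] + list(range(0x10502, 0x10528))))
```

With `min_run=2`, two adjacent code points would instead be written as a range, which matches the style of the Limbu entry.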

Completely empty blocks

Most of these blocks have bitmap glyphs available in Unifont or Unifont Upper.

  • Ahom (U+11700...)

  • Bassa Vah (U+16AD0...)

  • Bhaiksuki (U+11C00...)

  • Caucasian Albanian (U+10530...)

  • Duployan (U+1BC00...)

  • Hatran (U+108E0...)

  • Khojki (U+11200...)

  • Khudawadi (U+112B0...)

  • Mahajani (U+11150...)

  • Manichaean (U+10AC0...)

  • Marchen (U+11C70...)

  • Masaram Gondi (U+11D00...)

  • Mende Kikakui (U+1E800...)

  • Miao (U+16F00...)

  • Modi (U+11600...)

  • Mongolian supplement (U+11660...)

  • Mro (U+16A40...)

  • Multani (U+11280...)

  • Nabataean (U+10880...)

  • Newa (U+11400...)

  • Nushu (U+1B170...)

  • Old North Arabian (U+10A80...)

  • Pahawh Hmong (U+16B00...)

  • Palmyrene (U+10860...)

  • Pau Cin Hau (U+11AC0...)

  • Psalter Pahlavi (U+10B80...)

  • Sharada (U+11180...)

  • Siddham (U+11580...)

  • Sora Sompeng (U+110D0...)

  • Soyombo (U+11A50...)

  • Sutton signwriting (U+1D800...)

  • Syriac supplement (U+0860...)

  • Takri (U+11680...)

  • Tirhuta (U+11480...)

  • Warang Citi (U+118A0...)

  • Zanabazar square (U+11A00...)

Some conscripts in the ConScript Unicode Registry (such as Klingon, Tengwar and Visible speech) have bitmap glyphs available in Unifont CSUR.

Some links