Differences between revisions 1 and 90 (spanning 89 versions)
Revision 1 as of 2006-06-04 08:00:16
Size: 4537
Editor: PaulWise
Comment: Create unicode coverage of debian fonts page
Revision 90 as of 2021-09-15 18:51:16
Size: 4486
Editor: ?KessVargavind
Comment: Unicode 14
Line 1: Line 1:
This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using gucharmap. You should also read the ["DebianInstaller/GUIFonts"] page to find out about the fonts used by the installer.
#language en
This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using `fontforge`, `font-manager` and `gucharmap`. You should also read the [[DebianInstaller/GUIFonts]] page to find out about the fonts used by the installer.

Please add any fonts which might fill in some of these blocks to the [[Fonts/Missing]] page.

TODO: automatically create this page based on the latest unicode-data package as well as all fonts in Debian unstable, using the fonts review:

https://pkg-fonts.alioth.debian.org/review/
Line 5: Line 12:
Wikipedia languages whose names Firefox could not render. Inuktitut and Yi could not be rendered at all. Kannada and Sichuan Yi had some problems.
Firefox seems to render all Wikipedia languages:
Line 7: Line 14:
http://meta.wikimedia.org/wiki/List_of_Wikipedias
http://meta.wikimedia.org/wiki/List_of_Wikipedias
   [[https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/data/Names.php]]
Line 9: Line 18:
= Strange =

!DejaVu Sans uses some of the private use area; is this valid?
Line 14: Line 20:
The missing characters are listed for each block; most of them are available as bitmap glyphs in ''Unifont'' or ''Unifont Upper''.
Line 15: Line 22:
 * Symbol modifier letters: U+02EF to U+02F2, U+02F4 to U+02F6, U+02F8 to U+02FF
 * Hebrew: U+05C6. Looks like several other characters in this block use this symbol - U+05A2, U+05C5 and U+05C7 also have the square with numbers in them.
 * Syriac: U+072D to U+072F, U+074D to U+074F
 * Devanagari: U+097D
 * Gujarati: U+0ABC, U+0AE1 to U+0AE3
 * Tamil: U+0BB6, U+0BE6, U+0BE6, U+0BF3 to U+0BFA
 * Telugu: looks like it draws on 2 chars in some other sections - U+0C55 and U+25CC
 * Kannada: U+0C8C, U+0CBC, U+0CBD, U+0CE1 and draws on U+25CC
 * Sinhala: 18 missing chars, plus it draws on a few others. 25 chars missing parts in total.
 * Thai: 16 chars draw on non-printing character U+200D
 * Tibetan: U+0FD0 and U+0FD1 look buggy, as they show the letters 0FD0 and 0FD1 instead of being symbols. The font is Tibetan Machine Uni.
 * Ethiopic: U+1207, U+1247, U+1287
 * Phonetic extensions: 18 characters missing
 * Phonetic extensions supplement: 23
 * General punctuation: 5
 * Superscripts and subscripts: 5
 * Currency: German penny and former Argentinian currency (austral)
 * Letterlike symbols: 19
 * Arrows: 26
 * Mathematical operators: 13
 * Misc. symbols: 14
 * Coptic: > 60
 * CJK radicals supplement: about 20
 * CJK symbols and punctuation: 15
 * Katakana: U+30FF
 * Enclosed CJK letters and months: about half
 * CJK compatibility: about a third
 * CJK Unified Ideographs: this is a huge block, with about an eighth not defined.
 * Arabic presentation forms A: about two thirds to three quarters empty
 * CJK compatibility forms: 5
 * Small form variants: U+FE58 SMALL EM DASH
 * Halfwidth and fullwidth forms: the Hangul ones, 20 or so
 * Linear B syllabary: 14
 * Linear B ideograms: about half
 * Aegean numbers: 9 - subunit ones
 * Deseret: 4 - capital/small EW and OI
 * '''Ahom''': U+11740 to U+11746
 * '''Arabic''': U+061D
 * '''Arabic extended-A''': U+08B5, U+08C8 to U+08D2 and U+08E2
 * '''Arabic presentation forms-A''': U+FBC2, U+FD40 to U+FD4F, U+FDCF and U+FDFE to U+FDFF
 * '''Balinese''': U+1B4C, U+1B7D and U+1B7E
 * '''Bopomofo extended''': U+31BC to U+31BF
 * '''Brahmi''': U+11070 to U+11075
 * '''Chakma''': U+11147
 * '''CJK unified ideographs''': U+9FF0 to U+9FFF
 * '''CJK unified ideographs extension A''': U+4DB6 to U+4DBF
 * '''CJK unified ideographs extension B''': U+2A6D7 to U+2A6DE and U+2A6DF
 * '''CJK unified ideographs extension C''': U+2B735 to U+2B738
 * '''CJK unified ideographs extension G''': U+30000 to U+30728, U+3072A to U+30EDC, U+30EDF to U+3106B and U+3106D to U+3134A
 * '''Combining diacritical marks extended''': U+1AC1 to U+1ACE
 * '''Combining diacritical marks supplement''': U+1DFA
 * '''Currency symbols''': U+20C0
 * '''Enclosed alphanumeric supplement''': U+1F10D to U+1F10F, U+1F16D to U+1F16F and U+1F1AD
 * '''Geometric shapes extended''': U+1F7F0
 * '''Glagolitic''': U+2C2F and U+2C5F
 * '''Hebrew''': U+05EF
 * '''Ideographic symbols and punctuation''': U+16FF0 and U+16FF1
 * '''Kaithi''': U+110C2
 * '''Kana extended-A''': U+1B11F to U+1B122
 * '''Kannada''': U+0CDD
 * '''Latin extended-D''': U+!A7C0, U+!A7C1, U+!A7D0, U+!A7D1, U+!A7D3, U+!A7D5 to U+!A7D9 and U+!A7F2 to U+!A7F4
 * '''Mongolian''': U+180F
 * '''Musical symbols''': U+1D1E9 and U+1D1EA
 * '''Oriya''': U+0B55
 * '''Supplemental punctuation''': U+2E53 to U+2E5D
 * '''Supplemental symbols and pictographs''': U+1F979 and U+1F9CC
 * '''Symbols and pictographs extended-A''': U+1FA7B, U+1FA7C, U+1FAA9 to U+1FAAC, U+1FAB7 to U+1FABA, U+1FAC3 to U+1FAC5, U+1FAD7 to U+1FAD9, U+1FAE0 to U+1FAE7 and U+1FAF0 to U+1FAF6
 * '''Tagalog''': U+170D, U+1715 and U+171F
 * '''Tangut components''': U+18AF3 to U+18AFF
 * '''Telugu''': U+0C3C and U+0C5D
 * '''Takri''': U+116B9
 * '''Transport and map symbols''': U+1F6DD to U+1F6DF
 * '''Vedic extensions''': U+1CFA
Line 52: Line 60:
= Mostly empty =
= Completely empty blocks =
Line 54: Line 62:
Misc tech, Control pics, misc math symbols A, CJK Unified Ideographs Extension A & B, CJK Unified Ideographs supplement.
Most of these blocks have bitmap glyphs available in ''Unifont'' or ''Unifont Upper''.
Line 56: Line 64:
For the CJK parts, Arne Götje (高盛華) says that they are seldom used, that CJK daily use characters are supported and that he is working on the rest of them, but they won't be finished any time soon.
 * '''Arabic extended-B''' (U+0870...)
 * '''Chorasmian''' (U+10FB0...)
 * '''Cypro-Minoan''' (U+12F90...)
 * '''Dives Akuru''' (U+11900...)
 * '''Ethiopic extended-B''' (U+1E7E0...)
 * '''Indic Siyaq numbers''' (U+1EC71...)
 * '''Kana extended-B''' (U+1AFF0...)
 * '''Khitan small script''' (U+18B00...)
 * '''Latin extended-F''' (U+10780...)
 * '''Latin extended-G''' (U+1DF00...)
 * '''Lisu supplement''' (U+11FB0)
 * '''Makasar''' (U+11EE0...)
 * '''Nandinagari''' (U+119A0...)
 * '''Old Uyghur''' (U+10F70...)
 * '''Ottoman Siyaq numbers''' (U+1ED00...)
 * '''Syriac supplement''' (U+0860...)
 * '''Tangsa''' (U+16A70...)
 * '''Tangut supplement''' (U+18D00...)
 * '''Toto''' (U+1E290...)
 * '''Unified Canadian Aboriginal Syllabics extended-A''' (U+11AB0...)
 * '''Vithkuqi''' (U+10570...)
 * '''Znamenny musical notation''' (U+1CF00...)
Line 58: Line 87:
= No Glyphs or almost empty =

Ethiopic supplement, Unified Canadian Aboriginal Syllabics, Myanmar, Ogham, Tagalog, Buhid, Tagbanwa, Mongolian, New Tai Lue, Kanbun, Ethiopic extended, Katakana phonetic extensions, Yijing hexagram symbols, Yi symbols, Yi radicals, ancient Greek numbers, Byzantine musical symbols, musical symbols, ancient Greek musical notation, Tai Xuan Jing symbols, combining diacritical marks, OCR, supplemental arrows A & B, misc math symbols B, supplemental math operators, misc symbols and arrows, supplemental punctuation, ideographic description characters, modifier tone letters, variation selectors, mathematical alphanumeric symbols, variation selectors supplement and combining diacritical marks for symbols (only one character - from !FreeSerif).

For Myanmar, there is currently a proposal to update this block to support Mon and some other languages from Burma. There are several sources for fonts for this, although some intrude on parts of Unicode that are not yet defined. Burmese is in a state of flux at the moment.

Also Klingon uses U+!F8D0 to U+F8FF in the private use area, but there are no fonts for it in Debian.
Some conscripts in the [[http://www.evertype.com/standards/csur/|ConScript Unicode Registry]] (such as Klingon, Tengwar and Visible speech) have bitmap glyphs available in ''Unifont CSUR''.
Line 68: Line 91:
http://www.unifont.org/fontguide/
http://www.alanwood.net/unicode/fonts.html
http://en.wikipedia.org/wiki/Unicode_font
http://www.evertype.com/celtscript/
http://www.travelphrases.info/fonts.html
http://my.wikipedia.org/wiki/Wikipedia:Font
http://sourceforge.net/projects/prahita
http://sourceforge.net/projects/uniburma
http://www.evertype.com/celtscript/ogfont.html
http://www.gov.nu.ca/font.htm
http://www.gov.nu.ca/Nunavut/English/font/
http://www.evertype.com/celtscript/inuktitut.html
http://www.google.com/search?q=Inuktitut+fonts
http://www.evertype.com/celtscript/ogfont.html
 http://en.wikipedia.org/wiki/Unicode_font
 
 http://www.unicode.org/standard/supported.html
 
 http://www.unicode.org/standard/unsupported.html
 
 http://www.unifont.org/fontguide/
 
 http://www.alanwood.net/unicode/fonts.html
 
 http://www.travelphrases.info/fonts.html

This page aims to document the Unicode coverage of all the fonts (including non-free ones) in Debian. It was created using fontforge, font-manager and gucharmap. You should also read the DebianInstaller/GUIFonts page to find out about the fonts used by the installer.

Please add any fonts which might fill in some of these blocks to the Fonts/Missing page.

TODO: automatically create this page based on the latest unicode-data package as well as all fonts in Debian unstable, using the fonts review:

https://pkg-fonts.alioth.debian.org/review/
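
As a possible starting point for the TODO above, the sketch below parses block ranges from text in the Blocks.txt format of the Unicode Character Database and looks up the block of a code point. The function names and the sample excerpt are illustrative assumptions and not part of any existing Debian tool; reading the real file shipped by the unicode-data package is left out.

```python
import re

# Matches lines such as "0590..05FF; Hebrew" in Blocks.txt-style text.
BLOCK_LINE = re.compile(r"^([0-9A-Fa-f]{4,6})\.\.([0-9A-Fa-f]{4,6});\s*(.+?)\s*$")

# Hypothetical two-line excerpt, used only for illustration.
SAMPLE = """\
# Blocks-style sample
0590..05FF; Hebrew
1700..171F; Tagalog
"""


def parse_blocks(text):
    """Return a list of (first, last, name) tuples from Blocks.txt-style text."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = BLOCK_LINE.match(line)
        if match:
            first, last, name = match.groups()
            blocks.append((int(first, 16), int(last, 16), name))
    return blocks


def block_name(code_point, blocks):
    """Return the name of the block containing code_point, or None."""
    for first, last, name in blocks:
        if first <= code_point <= last:
            return name
    return None


if __name__ == "__main__":
    blocks = parse_blocks(SAMPLE)
    print(block_name(0x05EF, blocks))
```

A linear scan is enough for the few hundred blocks Unicode defines; a sorted list with bisection would only matter for much larger inputs.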

Quick test

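Firefox seems to render all Wikipedia languages:

http://meta.wikimedia.org/wiki/List_of_Wikipedias

https://phabricator.wikimedia.org/diffusion/MW/browse/master/languages/data/Names.php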

Incomplete blocks

The missing characters are listed for each block; most of them are available as bitmap glyphs in Unifont or Unifont Upper.

  • Ahom: U+11740 to U+11746

  • Arabic: U+061D

  • Arabic extended-A: U+08B5, U+08C8 to U+08D2 and U+08E2

  • Arabic presentation forms-A: U+FBC2, U+FD40 to U+FD4F, U+FDCF and U+FDFE to U+FDFF

  • Balinese: U+1B4C, U+1B7D and U+1B7E

  • Bopomofo extended: U+31BC to U+31BF

  • Brahmi: U+11070 to U+11075

  • Chakma: U+11147

  • CJK unified ideographs: U+9FF0 to U+9FFF

  • CJK unified ideographs extension A: U+4DB6 to U+4DBF

  • CJK unified ideographs extension B: U+2A6D7 to U+2A6DE and U+2A6DF

  • CJK unified ideographs extension C: U+2B735 to U+2B738

  • CJK unified ideographs extension G: U+30000 to U+30728, U+3072A to U+30EDC, U+30EDF to U+3106B and U+3106D to U+3134A

  • Combining diacritical marks extended: U+1AC1 to U+1ACE

  • Combining diacritical marks supplement: U+1DFA

  • Currency symbols: U+20C0

  • Enclosed alphanumeric supplement: U+1F10D to U+1F10F, U+1F16D to U+1F16F and U+1F1AD

  • Geometric shapes extended: U+1F7F0

  • Glagolitic: U+2C2F and U+2C5F

  • Hebrew: U+05EF

  • Ideographic symbols and punctuation: U+16FF0 and U+16FF1

  • Kaithi: U+110C2

  • Kana extended-A: U+1B11F to U+1B122

  • Kannada: U+0CDD

  • Latin extended-D: U+A7C0, U+A7C1, U+A7D0, U+A7D1, U+A7D3, U+A7D5 to U+A7D9 and U+A7F2 to U+A7F4

  • Mongolian: U+180F

  • Musical symbols: U+1D1E9 and U+1D1EA

  • Oriya: U+0B55

  • Supplemental punctuation: U+2E53 to U+2E5D

  • Supplemental symbols and pictographs: U+1F979 and U+1F9CC

  • Symbols and pictographs extended-A: U+1FA7B, U+1FA7C, U+1FAA9 to U+1FAAC, U+1FAB7 to U+1FABA, U+1FAC3 to U+1FAC5, U+1FAD7 to U+1FAD9, U+1FAE0 to U+1FAE7 and U+1FAF0 to U+1FAF6

  • Tagalog: U+170D, U+1715 and U+171F

  • Tangut components: U+18AF3 to U+18AFF

  • Telugu: U+0C3C and U+0C5D

  • Takri: U+116B9

  • Transport and map symbols: U+1F6DD to U+1F6DF

  • Vedic extensions: U+1CFA
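
Entries in the notation used by this list could be generated rather than typed by hand. The following is a minimal sketch, assuming the missing code points of a block are already known as integers; `format_ranges` is a hypothetical helper name. Unlike some entries above, it always collapses adjacent code points into a single "to" range.

```python
def format_ranges(code_points):
    """Collapse code points into 'U+XXXX to U+YYYY' runs joined by ', ' and ' and '."""
    runs = []
    for cp in sorted(set(code_points)):
        if runs and cp == runs[-1][1] + 1:
            runs[-1][1] = cp        # extend the current run
        else:
            runs.append([cp, cp])   # start a new run
    parts = []
    for first, last in runs:
        if first == last:
            parts.append(f"U+{first:04X}")
        else:
            parts.append(f"U+{first:04X} to U+{last:04X}")
    if len(parts) > 1:
        return ", ".join(parts[:-1]) + " and " + parts[-1]
    return "".join(parts)


if __name__ == "__main__":
    # Code points taken from the Arabic extended-A entry in the list above.
    missing = [0x08B5, 0x08E2] + list(range(0x08C8, 0x08D3))
    print("Arabic extended-A: " + format_ranges(missing))
```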

Completely empty blocks

Most of these blocks have bitmap glyphs available in Unifont or Unifont Upper.

  • Arabic extended-B (U+0870...)

  • Chorasmian (U+10FB0...)

  • Cypro-Minoan (U+12F90...)

  • Dives Akuru (U+11900...)

  • Ethiopic extended-B (U+1E7E0...)

  • Indic Siyaq numbers (U+1EC71...)

  • Kana extended-B (U+1AFF0...)

  • Khitan small script (U+18B00...)

  • Latin extended-F (U+10780...)

  • Latin extended-G (U+1DF00...)

  • Lisu supplement (U+11FB0)

  • Makasar (U+11EE0...)

  • Nandinagari (U+119A0...)

  • Old Uyghur (U+10F70...)

  • Ottoman Siyaq numbers (U+1ED00...)

  • Syriac supplement (U+0860...)

  • Tangsa (U+16A70...)

  • Tangut supplement (U+18D00...)

  • Toto (U+1E290...)

  • Unified Canadian Aboriginal Syllabics extended-A (U+11AB0...)

  • Vithkuqi (U+10570...)

  • Znamenny musical notation (U+1CF00...)
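
The split between incomplete and completely empty blocks can be expressed as a small set comparison. This is a sketch under stated assumptions: `assigned` is the set of code points Unicode assigns, `covered` is the union of code points covered by all installed fonts, and both the function name and the sample data are hypothetical.

```python
def classify_block(first, last, assigned, covered):
    """Classify the block [first, last] by how many assigned code points are covered."""
    wanted = {cp for cp in assigned if first <= cp <= last}
    have = wanted & covered
    if not wanted:
        return "unassigned"
    if not have:
        return "completely empty"
    if have == wanted:
        return "complete"
    return "incomplete"


if __name__ == "__main__":
    # Hypothetical data: a block spanning U+1E290 to U+1E2BF in which the
    # first 31 code points are assigned and no font covers any of them.
    assigned = set(range(0x1E290, 0x1E2AF))
    print(classify_block(0x1E290, 0x1E2BF, assigned, set()))
```

Comparing against assigned code points instead of the whole block range avoids reporting a block as incomplete merely because it contains reserved positions.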

Some conscripts in the ConScript Unicode Registry (such as Klingon, Tengwar and Visible speech) have bitmap glyphs available in Unifont CSUR.

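Some links

http://en.wikipedia.org/wiki/Unicode_font

http://www.unicode.org/standard/supported.html

http://www.unicode.org/standard/unsupported.html

http://www.unifont.org/fontguide/

http://www.alanwood.net/unicode/fonts.html

http://www.travelphrases.info/fonts.html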