North America's Indigenous Languages
This page is for developers interested in developing language support for North America's indigenous languages. It lists the information we have and ideas on how to bring support for those languages into Debian.
0. Introduction
North America has over 100 tribes, each with its own language and dialects. To date, none of those languages is supported by any operating system on the market. All of the Aboriginal languages have an ISO 639-3 code, and many of them are documented in Ethnologue as well as at Languagegeek. We can create locales for these languages and get their orthographies into fontconfig.
The languages are written in Latin characters with diacritics, in Syllabics, or in Cherokee script. The idea is to create an .orth file for fontconfig for each language. These files can then be submitted to fontconfig, and the data can be used to create xkb keyboard layouts. The keyboard layouts can in turn be used to type in the locale information.
Once all of the font, keyboard, and locale information is entered, then the real work of translating the operating system begins.
1. Languages Overview
Approximately forty-five First Nations, Inuktitut, and Métis languages are spoken in Canada. The languages with the greatest number of speakers are Cree-Montagnais-Naskapi (60,000); Ojibwe (40,000); Inuktitut (20,000); Chipewyan (4,000-12,000); Mi’kmaq (3,000-5,000); Mohawk (3,800); Assiniboine (3,600); Slave (3,000); Babine, Dogrib, Carrier, Chilcotin, and Blackfoot (2,000 each); Gitksan and Malecite (1,000 each); Gwich’in (500 in Canada, 700 in Alaska); and Nisgha (700-1000). 1
Language (en) | Language (native) | ISO639-3 | Proposed Locale | Fontconfig | xkb | Remarks
Western Abenaki | | | abe_CA | | ca(abe) |
Algonquin | | | alq_CA | | ca(alq) |
Assiniboine | | | asb_CA | | ca(asb) |
Attikamek-Cree | | | atj_CA | | ca(atj) |
Siksika–Kainaa–Piikani | | | bla_CA | | ca(bla) |
Bella Coola | | | blc_CA | | ca(blc) |
Central Carrier, Southern Carrier | | | caf_CA | | ca(caf) |
Cayuga | | | cay_CA | | ca(cay) |
Chipewyan | | | chp_CA | | ca(chp) |
Cherokee | | | chr_CA | | ca(chr) |
Cheyenne | | | chy_CA | | ca(chy) |
Klallam | | | clm_CA | | ca(clm) |
Comox/Homalco/Klahoose/Sliammon | | | coo_CA | | ca(coo) |
Southern East Cree | | | crj_CA | | ca(crj) |
Plains Cree | | | crk_CA | | ca(crk) |
Northern East Cree | | | crl_CA | | ca(crl) |
Moose Cree | | | crm_CA | | ca(crm) |
Swampy Cree | | | csw_CA | | ca(csw) |
Woods Cree | | | cwd_CA | | ca(cwd) |
Tlicho | | | dgr_CA | | ca(dgr) |
Inuktitut, Eastern | | | esb_CA | | ca(esb) |
Inuktitut, Western | | | esb_CA | | ca(esb) |
Inupiaq | | | esi_CA | | ca(esi) |
Gitksan | | | git_CA | | ca(git) |
Kutchin | | | gwi_CA | | ca(gwi) |
Hän | | | haa_CA | | ca(haa) |
Haida Language | | | hdn_CA | | ca(hdn) |
Heiltsuk | | | hei_CA | | ca(hei) |
Halq’eméylem, Hǝn̓q̓ǝmin̓ǝm̓, Hul’q’umín’um’ | | | hur_CA | | ca(hur) |
Inuvialukton | | | ikt_CA | | ca(ikt) |
Kanza | | | ksk_CA | | ca(ksk) |
Kutenai | | | kut_CA | | ca(kut) |
Kwakiutl | | | kwk_CA | | ca(kwk) |
Lillooet | | | lil_CA | | ca(lil) |
Teton | | | lkt_CA | | ca(lkt) |
Lushootseed | | | lut_CA | | ca(lut) |
Menominee | | | mez_CA | | ca(mez) |
Mandan | | | mhq_CA | | ca(mhq) |
Micmac | | | mic_CA | | ca(mic) |
Miwok | | | mkq_CA | | ca(mkq) |
Innu-Montagnais | | | moe_CA | | ca(moe) |
Mohawk | | | moh_CA | | ca(moh) |
Creek | | | mus_CA | | ca(mus) |
Navajo | | | nav_CA | | ca(nav) |
Nishga | | | ncg_CA | | ca(ncg) |
Nootka | | | noo_CA | | ca(noo) |
Naskapi | | | nsk_CA | | ca(nsk) |
Ojibwa | | | ojb_CA | | ca(ojb) |
Oji-Cree | | | ojs_CA | | ca(ojs) |
Oneida | | | one_CA | | ca(one) |
Onondaga | | | ono_CA | | ca(ono) |
Odawa | | | otw_CA | | ca(otw) |
Potawatomi | | | pot_CA | | ca(pot) |
Sahtú Got’ine, K’áshogot’ine, Shihgot’ine | | | scs_CA | | ca(scs) |
Seneca | | | see_CA | | ca(see) |
Sekani | | | sek_CA | | ca(sek) |
Shuswap | | | shs_CA | | ca(shs) |
Stoney | | | sto_CA | | ca(sto) |
Straits Salish | | | str_CA | | ca(str) |
Thompson | | | thp_CA | | ca(thp) |
Tlingit | | | tli_CA | | ca(tli) |
Tsimshian | | | tsi_CA | | ca(tsi) |
Tuscarora | | | tus_CA | | ca(tus) |
Wiyot | | | wiy_CA | | ca(wiy) |
Southern Slavey | | | xsl_CA | | ca(xsl) |
Yokuts | | | yok_CA | | ca(yok) |
2. Discussion
2.1. General Information
- Some people have been doing work for the Inuktitut and Ojibway languages.
- Ubuntu Secwepemc Translators are translating into Secwepemctsin, with work also being submitted into Debian.
- Orthographies should be submitted to fontconfig so we can figure out proper font coverage for each language.
- ttf-lg-aboriginal is a font that should work for most languages.
2.2. Keyboard Layout Ideas
The default keyboard layout in the US and Canada is the standard US English layout (104 keys). There are three types of writing among the North American Aboriginal/Indigenous population: Roman orthographies, Syllabics, and Cherokee script.
The Roman Orthographies can use an include(us) statement, with a third-level chooser to type the accents.
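As a rough illustration of the include-plus-third-level idea, the following Python sketch prints an xkb_symbols stanza that builds on the US layout and puts accented letters on the third level (reached with the right Alt key). The variant name "demo", the group name, and the two key assignments are hypothetical placeholders and do not describe any real orthography; the stanza follows the general shape of the files shipped in xkeyboard-config, which should be consulted for the exact conventions.

```python
def xkb_symbols(variant, name, keys):
    """Render an xkb_symbols stanza as text.

    keys maps an XKB key name (for example "AD03", the E key) to four
    keysyms: plain, Shift, third level, Shift + third level.
    """
    lines = [
        "partial alphanumeric_keys",
        f'xkb_symbols "{variant}" {{',
        '    include "us(basic)"',
        f'    name[Group1] = "{name}";',
    ]
    for key, levels in sorted(keys.items()):
        if len(levels) != 4:
            raise ValueError(f"key {key} needs exactly four keysyms")
        syms = ", ".join(levels)
        lines.append(f"    key <{key}> {{ [ {syms} ] }};")
    # Make the right Alt key the third-level chooser.
    lines.append('    include "level3(ralt_switch)"')
    lines.append("};")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical assignments, for illustration only.
    demo_keys = {
        "AD03": ["e", "E", "eacute", "Eacute"],
        "AC01": ["a", "A", "aacute", "Aacute"],
    }
    print(xkb_symbols("demo", "Demo (Roman orthography)", demo_keys))
```

Generating the stanza from a table keeps the per-language data (which letters need which accents) separate from the xkb boilerplate, so the same table could later be reused when writing the .orth file.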
There is an Inuktitut keyboard available in xkeyboard-config. It remains to be seen whether this keyboard can also be used for the other languages that use syllabics.
For the Cherokee script there are two keyboard layouts available:
- http://arapaho.nsuok.edu/~lindee/cherokee/ - this has been created and is part of xkeyboard-config in testing
- http://www.cherokee.org/Extras/Downloads/Font/Keyboard.htm
3. Fonts with support for the special characters
Font | Debian pkg | Remarks
ttf-lg-aboriginal | |
ttf-sil-charis | |
ttf-dejavu | |
The best font coverage for Aboriginal languages comes from the Debian package ttf-lg-aboriginal. With fc-list :lang=chr, the Aboriginal Series font is the only font that shows up. There is also a Cherokee font from the Cherokee Nation; its status still needs to be determined.
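The fc-list query mentioned above can be repeated for any language code. The sketch below is a minimal Python wrapper, assuming fc-list is installed and on the PATH; the function names are made up for this example, and the parsing only splits comma-separated family names without handling escaped characters.

```python
import shutil
import subprocess


def parse_families(fc_list_output):
    """Return the sorted, de-duplicated family names from fc-list output."""
    families = set()
    for line in fc_list_output.splitlines():
        line = line.strip()
        if line:
            # fc-list separates alternative family names with commas.
            families.update(part.strip() for part in line.split(","))
    return sorted(families)


def families_for_lang(lang):
    """Ask fontconfig which installed fonts claim coverage of `lang`."""
    result = subprocess.run(
        ["fc-list", f":lang={lang}", "family"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_families(result.stdout)


if __name__ == "__main__":
    if shutil.which("fc-list") is None:
        print("fc-list not found; install fontconfig to run this check")
    else:
        for code in ("chr", "shs"):
            print(code, families_for_lang(code))
```

An empty list for a language code means that no installed font is reported as covering it, which is a quick way to see where an .orth file or a font is still missing.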
4. Locale Data Collection
Currently the only locale that has been created and accepted upstream is shs_CA. Locales for nsk_CA, kut_CA, and mic_CA exist but have not been submitted upstream.
5. Fontconfig
Fontconfig has a directory for .orth files; each file lists the Unicode code points necessary to display a language. GDM uses fontconfig to determine whether a given language is displayable.
There is currently an shs.orth file upstream, and a few more patches have been submitted. Patches should be made against the Debian packages until the .orth files work their way through the release cycle.
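To make the idea of an .orth file concrete, here is a small Python sketch that turns a string of required characters into lines of hexadecimal code points, collapsing consecutive code points into ranges. The output format (one code point or range per line, "#" for comments) is modelled on the existing files in fontconfig's fc-lang directory; the sample characters are placeholders, not the orthography of any particular language, and a real file should be checked against the upstream examples.

```python
def fmt(lo, hi):
    """Format one code point, or an inclusive range, as lowercase hex."""
    return f"{lo:04x}" if lo == hi else f"{lo:04x}-{hi:04x}"


def orth_lines(chars):
    """Return .orth-style lines covering every character in `chars`."""
    points = sorted({ord(c) for c in chars})
    lines = []
    start = prev = None
    for cp in points:
        if start is None:
            start = prev = cp
        elif cp == prev + 1:
            prev = cp  # extend the current run
        else:
            lines.append(fmt(start, prev))
            start = prev = cp
    if start is not None:
        lines.append(fmt(start, prev))
    return lines


if __name__ == "__main__":
    # Placeholder character set, for illustration only.
    sample = "abcdefghijklmnopqrstuvwxyz" + "áéíóú"
    print("# Demo orthography (placeholder)")
    for line in orth_lines(sample):
        print(line)
```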
Surveyors
Neskie Manuel <neskiem AT gmail DOT com>