Method to edit non-UTF-8 encoded text files under UTF-8 vim

Vim running in a UTF-8 environment can handle text in any known language. But not all text data is stored in UTF-8. To edit a non-UTF-8 encoded text file, you need to convert its contents with iconv before and after editing. The encodings available for the conversion can be listed with the iconv -l command. (Encoding names are case insensitive, but hyphen and underscore are treated as different characters.) Since converting encodings manually in the shell is quite cumbersome, I describe an easier way to do the conversion from within vim.
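As a rough illustration of the manual workflow described above, the following shell sketch converts a legacy file to UTF-8, leaves room for the edit, and converts the result back. The file names and the ISO-8859-1 sample text are assumptions made for this example; substitute the encoding your file actually uses.

```shell
#!/bin/sh
# Sketch only: round-trip a legacy-encoded file through UTF-8 with iconv.
# All file names below are hypothetical.
set -eu
workdir=$(mktemp -d)

# Create a sample ISO-8859-1 file: "caf" followed by byte 0xE9 (e with acute).
printf 'caf\351\n' > "$workdir/legacy.txt"

# Before the edit: convert the legacy file to UTF-8.
iconv -f ISO-8859-1 -t UTF-8 "$workdir/legacy.txt" > "$workdir/edit.txt"

# ... edit "$workdir/edit.txt" with vim here ...

# After the edit: convert the UTF-8 text back to the legacy encoding.
iconv -f UTF-8 -t ISO-8859-1 "$workdir/edit.txt" > "$workdir/legacy.new"

# With no edit in between, the round trip reproduces the original bytes.
cmp "$workdir/legacy.txt" "$workdir/legacy.new" && echo "round trip OK"
```

Two conversions and a temporary file for every edit is the overhead that the vim-based methods below avoid.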

As a matter of fact, standard vim in a UTF-8 environment is set to fileencodings=ucs-bom,utf-8,default,latin1. When reading a file, vim tries each encoding in this list in order and uses the first one that converts without error. latin1 comes last and always succeeds, so a file in any other encoding ends up being read as latin1 and displayed as unreadable text. In this case, you can read it correctly by reloading the file with the correct encoding (for example, :e ++enc=euc-jp).
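The fallback behaviour can be imitated outside vim with iconv. The sketch below is not what vim runs internally; it only assumes that iconv rejects invalid UTF-8 in the same spirit, and it uses a hypothetical sample file holding two kanji in EUC-JP.

```shell
#!/bin/sh
# Sketch only: mimic the utf-8 -> latin1 fallback with iconv.
# The sample file name is hypothetical.
set -u
tmp=$(mktemp -d)

# Two kanji (U+65E5 U+672C) in EUC-JP: bytes C6 FC CB DC.
printf '\306\374\313\334\n' > "$tmp/sample.txt"

# Step 1: try UTF-8. The bytes are not valid UTF-8, so this attempt fails.
if iconv -f UTF-8 -t UTF-8 "$tmp/sample.txt" > /dev/null 2>&1; then
  utf8_ok=yes
else
  utf8_ok=no
fi
echo "utf-8 attempt succeeded: $utf8_ok"

# Step 2: latin1 accepts any byte sequence, so it "succeeds",
# but the result is four unrelated accented Latin letters.
iconv -f ISO-8859-1 -t UTF-8 "$tmp/sample.txt" > "$tmp/as-latin1.txt"

# Step 3: decoding with the correct encoding recovers the kanji.
iconv -f EUC-JP -t UTF-8 "$tmp/sample.txt" > "$tmp/as-eucjp.txt"

cat "$tmp/as-latin1.txt" "$tmp/as-eucjp.txt"
```

Step 3 corresponds to reloading the file in vim with the correct encoding.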

Method to edit non-UTF-8 encoded text files from commandline

Method to edit non-UTF-8 encoded text files via GUI

You can add menus to gvim for handling non-UTF-8 encoded files by adding the following script to ~/.vimrc. You should customize it for the encodings you need.

From the File menu, select "Reload with ++enc..." to reload the file in a particular encoding, or "Save with ++enc..." to save the file in a particular encoding.

"
" Menu:                 Access to old encodings and conversion
" Translated By:        Osamu AOKI  <osamu@debian.org>
" Last Change:          30-Dec-2006.
if has('iconv')
  " Check iconv version
  let support_jisx0213 = (iconv("\x87\x64\x87\x6a", 'cp932', 'euc-jisx0213') ==# "\xad\xc5\xad\xcb") ? 1 : 0
  " Reload with ++enc ...

  an 10.328.100.100 &File.&Reload\ with\ ++enc\.\.\..&SJIS<Tab>fenc=cp932 :e ++enc=cp932<CR>
  if !support_jisx0213
    an 10.328.100.110 &File.&Reload\ with\ ++enc\.\.\..EUC&JP<Tab>fenc=euc-jp :e ++enc=euc-jp<CR>
    an 10.328.100.120 &File.&Reload\ with\ ++enc\.\.\..J&IS<Tab>fenc=iso-2022-jp :e ++enc=iso-2022-jp<CR>
  else
    an 10.328.100.110 &File.&Reload\ with\ ++enc\.\.\..EUC&JP<Tab>fenc=euc-jisx0213 :e ++enc=euc-jisx0213<CR>
    an 10.328.100.120 &File.&Reload\ with\ ++enc\.\.\..J&IS<Tab>fenc=iso-2022-jp-3 :e ++enc=iso-2022-jp-3<CR>
  endif
  an 10.328.100.130 &File.&Reload\ with\ ++enc\.\.\..EUC&KR<Tab>fenc=euckr :e ++enc=euckr<CR>
  an 10.328.100.140 &File.&Reload\ with\ ++enc\.\.\..&GB2312(zh_CN)<Tab>fenc=gb2312 :e ++enc=gb2312<CR>
  an 10.328.100.150 &File.&Reload\ with\ ++enc\.\.\..&BIG5(zh_TW)<Tab>fenc=big5 :e ++enc=big5<CR>
  an 10.328.100.200 &File.&Reload\ with\ ++enc\.\.\..-SEPRELOAD1- <Nop>
  an 10.328.100.201 &File.&Reload\ with\ ++enc\.\.\..latin&1<Tab>fenc=latin1 :e ++enc=latin1<CR>
  an 10.328.100.202 &File.&Reload\ with\ ++enc\.\.\..latin&2<Tab>fenc=latin2 :e ++enc=latin2<CR>
  an 10.328.100.203 &File.&Reload\ with\ ++enc\.\.\..latin&3<Tab>fenc=latin3 :e ++enc=latin3<CR>
  an 10.328.100.204 &File.&Reload\ with\ ++enc\.\.\..latin&4<Tab>fenc=latin4 :e ++enc=latin4<CR>
  an 10.328.100.205 &File.&Reload\ with\ ++enc\.\.\..latin&5<Tab>fenc=latin5 :e ++enc=latin5<CR>
  an 10.328.100.206 &File.&Reload\ with\ ++enc\.\.\..latin&6<Tab>fenc=latin6 :e ++enc=latin6<CR>
  an 10.328.100.207 &File.&Reload\ with\ ++enc\.\.\..latin&7<Tab>fenc=latin7 :e ++enc=latin7<CR>
  an 10.328.100.208 &File.&Reload\ with\ ++enc\.\.\..latin&8<Tab>fenc=latin8 :e ++enc=latin8<CR>
  an 10.328.100.209 &File.&Reload\ with\ ++enc\.\.\..latin&9<Tab>fenc=latin9 :e ++enc=latin9<CR>
  an 10.328.100.210 &File.&Reload\ with\ ++enc\.\.\..latin1&0<Tab>fenc=latin10 :e ++enc=latin10<CR>
  an 10.328.100.800 &File.&Reload\ with\ ++enc\.\.\..-SEPRELOAD2- <Nop>
  an 10.328.100.900 &File.&Reload\ with\ ++enc\.\.\..&UTF-8<Tab>fenc=utf-8 :e ++enc=utf-8<CR>

  " Save with ++enc as ...
  an 10.360.120.100 &File.&Save\ with\ ++enc\.\.\..&SJIS<Tab>fenc=cp932 :browse confirm saveas ++enc=cp932<CR>
  if !support_jisx0213
    an 10.360.120.110 &File.&Save\ with\ ++enc\.\.\..EUC&JP<Tab>fenc=euc-jp :browse confirm saveas ++enc=euc-jp<CR>
    an 10.360.120.120 &File.&Save\ with\ ++enc\.\.\..J&IS<Tab>fenc=iso-2022-jp :browse confirm saveas ++enc=iso-2022-jp<CR>
  else
    an 10.360.120.110 &File.&Save\ with\ ++enc\.\.\..EUC&JP<Tab>fenc=euc-jisx0213 :browse confirm saveas ++enc=euc-jisx0213<CR>
    an 10.360.120.120 &File.&Save\ with\ ++enc\.\.\..J&IS<Tab>fenc=iso-2022-jp-3 :browse confirm saveas ++enc=iso-2022-jp-3<CR>
  endif
  an 10.360.120.130 &File.&Save\ with\ ++enc\.\.\..EUC&KR<Tab>fenc=euckr :browse confirm saveas ++enc=euckr<CR>
  an 10.360.120.140 &File.&Save\ with\ ++enc\.\.\..&GB2312(zh_CN)<Tab>fenc=gb2312 :browse confirm saveas ++enc=gb2312<CR>
  an 10.360.120.150 &File.&Save\ with\ ++enc\.\.\..&BIG5(zh_TW)<Tab>fenc=big5 :browse confirm saveas ++enc=big5<CR>
  an 10.360.120.200 &File.&Save\ with\ ++enc\.\.\..-SEPSAVE1- <Nop>
  an 10.360.120.201 &File.&Save\ with\ ++enc\.\.\..latin&1<Tab>fenc=latin1 :browse confirm saveas ++enc=latin1<CR>
  an 10.360.120.202 &File.&Save\ with\ ++enc\.\.\..latin&2<Tab>fenc=latin2 :browse confirm saveas ++enc=latin2<CR>
  an 10.360.120.203 &File.&Save\ with\ ++enc\.\.\..latin&3<Tab>fenc=latin3 :browse confirm saveas ++enc=latin3<CR>
  an 10.360.120.204 &File.&Save\ with\ ++enc\.\.\..latin&4<Tab>fenc=latin4 :browse confirm saveas ++enc=latin4<CR>
  an 10.360.120.205 &File.&Save\ with\ ++enc\.\.\..latin&5<Tab>fenc=latin5 :browse confirm saveas ++enc=latin5<CR>
  an 10.360.120.206 &File.&Save\ with\ ++enc\.\.\..latin&6<Tab>fenc=latin6 :browse confirm saveas ++enc=latin6<CR>
  an 10.360.120.207 &File.&Save\ with\ ++enc\.\.\..latin&7<Tab>fenc=latin7 :browse confirm saveas ++enc=latin7<CR>
  an 10.360.120.208 &File.&Save\ with\ ++enc\.\.\..latin&8<Tab>fenc=latin8 :browse confirm saveas ++enc=latin8<CR>
  an 10.360.120.209 &File.&Save\ with\ ++enc\.\.\..latin&9<Tab>fenc=latin9 :browse confirm saveas ++enc=latin9<CR>
  an 10.360.120.210 &File.&Save\ with\ ++enc\.\.\..latin1&0<Tab>fenc=latin10 :browse confirm saveas ++enc=latin10<CR>
  an 10.360.120.800 &File.&Save\ with\ ++enc\.\.\..-SEPSAVE2- <Nop>
  an 10.360.120.900 &File.&Save\ with\ ++enc\.\.\..&UTF-8<Tab>fenc=utf-8 :browse confirm saveas ++enc=utf-8<CR>
endif

" filler to avoid the line above being recognized as a modeline
" filler
" filler
" filler

Input characters not found on the keyboard

See (English) and (Japanese) for examples.