Russification Fundamentals
Encodings
An encoding is a way of assigning numeric codes to the characters in a character set.
Several encodings are used for representing Cyrillic characters.
The two most popular are:
- KOI8-R
  KOI8-R is the encoding registered for Internet use in
  Registration of a Cyrillic Character Set (RFC 1489).
  It is also the de facto Cyrillic standard for e-mail and NNTP news,
  and the standard encoding on UNIX systems.
- CP1251
  CP1251 is MS Windows Code Page 1251,
  the Microsoft standard encoding for Cyrillic characters.
  It is the de facto standard on MS Windows platforms.
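To make the difference concrete, here is a minimal sketch that asks Python's built-in `koi8_r` and `cp1251` codecs for the numeric code of the same letter. The variable names are only illustrative.

```python
# The same character gets a different numeric code in each encoding.
LETTER = "\u0410"  # CYRILLIC CAPITAL LETTER A

koi8r_code = LETTER.encode("koi8_r")[0]   # single byte in KOI8-R
cp1251_code = LETTER.encode("cp1251")[0]  # single byte in CP1251

print(f"KOI8-R: 0x{koi8r_code:02X}")
print(f"CP1251: 0x{cp1251_code:02X}")
```

The two printed codes differ, which is why text must always travel together with knowledge of its encoding.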
Fonts
A font is a table of glyphs, one for each character in a character set.
Since glyphs are assigned to the numeric codes of characters,
every font is built for a particular encoding
(there are CP1251 fonts and there are KOI8-R fonts).
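What happens when text and font disagree can be simulated without any fonts at all. The sketch below assumes that decoding bytes with the wrong codec is a fair stand-in for drawing KOI8-R text with a CP1251 font; the sample word is arbitrary.

```python
# Simulate showing KOI8-R text with a CP1251 font:
# the codes stay the same, but each one selects a different glyph.
text = "Привет"

raw = text.encode("koi8_r")      # bytes as stored in a KOI8-R file
misread = raw.decode("cp1251")   # what a CP1251 font would display

print(misread)                   # unreadable mix of Cyrillic letters

# Nothing is lost: reading the same bytes as KOI8-R restores the text.
restored = misread.encode("cp1251").decode("koi8_r")
print(restored)
```

The garbled output is still made of Cyrillic letters, just the wrong ones, which is the typical symptom of a font and encoding mismatch.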
Keyboard Layouts
A keyboard layout is a table that establishes the correspondence
between keyboard keys and the characters they generate.
Since characters are represented by their numeric codes,
it is important to know which encoding
a given keyboard layout table uses.
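As a rough illustration, a layout can be modelled as a lookup table from keys to characters, with the encoding deciding which codes finally come out. `SAMPLE_LAYOUT` and `type_keys` are hypothetical names, and the four-key table is a made-up fragment, not a complete layout.

```python
# Hypothetical fragment of a layout table: Latin key -> Cyrillic character.
SAMPLE_LAYOUT = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "k": "\u043a",  # CYRILLIC SMALL LETTER KA
    "t": "\u0442",  # CYRILLIC SMALL LETTER TE
}

def type_keys(keys, layout, encoding):
    """Return the bytes produced by pressing `keys`; unmapped keys pass through."""
    chars = "".join(layout.get(key, key) for key in keys)
    return chars.encode(encoding)

# The same key presses produce different codes under each encoding.
koi8 = type_keys("kot", SAMPLE_LAYOUT, "koi8_r")
cp = type_keys("kot", SAMPLE_LAYOUT, "cp1251")
print(koi8.hex(), cp.hex())
```

A real layout driver stores the codes directly rather than converting on the fly, but the dependency on one particular encoding is the same.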
There are two popular Cyrillic keyboard layouts.
JTsUKENG
Standard Russian typewriter keyboard layout.
Default in most computer systems.
Lower register:
Upper register (shifted):
YaWERTY
Phonetic Cyrillic keyboard layout in which
the Cyrillic character 'A' is assigned to the Latin key 'A',
the Cyrillic character 'O' is assigned to the Latin key 'O', and so on;
the characters ',' and '.' retain their positions.
Very convenient if you have to type in both Russian and English.
Lower register:
Upper register:
Great Encoding War
Yes, there is a war out there between KOI8-R and CP1251.
You do not want to be a casualty of this war.
The best way to achieve this is:
- On MS Windows systems, use only CP1251 fonts and keyboard layouts internally.
  Since all the good Internet software for Windows
  (Netscape Navigator 4.0 and Microsoft Internet Explorer 3.0)
  knows that KOI8-R is the standard external encoding,
  it will automatically convert your mail/news from CP1251 to KOI8-R
  when you send it and from KOI8-R to CP1251 when you receive it.
- On UNIX systems, use KOI8-R fonts and keyboard layouts.
  You do not need any encoding/decoding in your applications.
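The conversion that Windows mail and news software performs at the boundary can be sketched in a few lines. This is only an illustration using Python's built-in codecs, not a description of how any particular program does it; the function names are hypothetical, and `errors="replace"` is an assumption about how to treat characters that exist in one encoding but not the other.

```python
def cp1251_to_koi8r(data: bytes) -> bytes:
    """Convert outgoing text from the internal CP1251 to the external KOI8-R."""
    text = data.decode("cp1251", errors="replace")
    return text.encode("koi8_r", errors="replace")

def koi8r_to_cp1251(data: bytes) -> bytes:
    """Convert incoming text from the external KOI8-R to the internal CP1251."""
    text = data.decode("koi8_r", errors="replace")
    return text.encode("cp1251", errors="replace")

# Round trip: what is typed on Windows, sent out, and received back.
typed = "Привет".encode("cp1251")
outgoing = cp1251_to_koi8r(typed)
incoming = koi8r_to_cp1251(outgoing)
print(incoming == typed)
```

On a UNIX system that uses KOI8-R throughout, neither function is needed, since the internal and external encodings are the same.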