How to help your computer read and write Russian

I.   Introduction



Encodings

An encoding is a way of assigning numeric codes to the characters in a character set. Several encodings are used to represent Cyrillic characters. The two most popular are:

KOI8-R
KOI8-R is the encoding registered for Internet use in Registration of a Cyrillic Character Set (RFC 1489, an informational RFC).
It is also the de facto Cyrillic standard for e-mail and NNTP news. In addition, it is the standard encoding on UNIX systems.

 

CP1251
CP1251 is MS Windows Code Page 1251.
It is Microsoft's standard encoding for Cyrillic characters.
It is also the de facto standard on MS Windows platforms.
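
To make the difference concrete, here is a minimal Python sketch that prints the numeric codes each encoding assigns to the same two characters. It assumes Python's built-in `koi8_r` and `cp1251` codecs; the helper name `codes_for` is made up for this illustration.

```python
def codes_for(text, encoding):
    """Return the numeric codes that `encoding` assigns to each character."""
    return list(text.encode(encoding))

word = "Да"  # Cyrillic capital De, Cyrillic small a

koi8_codes = codes_for(word, "koi8_r")
cp1251_codes = codes_for(word, "cp1251")

print("KOI8-R:", koi8_codes)    # [228, 193]
print("CP1251:", cp1251_codes)  # [196, 224]
```

The characters are the same, but the codes differ, so a program has to know which encoding a piece of text uses before it can read it correctly.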

Fonts

A font is a table of glyphs, one for each character in a character set.
Since glyphs are assigned to numeric character codes, every font is built for a particular encoding (there are CP1251 fonts and there are KOI8-R fonts).
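
The effect of showing text with a font built for the wrong encoding can be imitated in Python. This is only a sketch: it stands in for a real font by using Python's `koi8_r` and `cp1251` codecs as the code-to-glyph table, and the names `glyph_for` and `render` are hypothetical.

```python
def glyph_for(code, font_encoding):
    """Look up the glyph a font of the given encoding holds at `code`."""
    return bytes([code]).decode(font_encoding)

def render(data, font_encoding):
    """Draw every numeric code in `data` with a font of the given encoding."""
    return "".join(glyph_for(code, font_encoding) for code in data)

# Text stored as KOI8-R codes.
data = "Привет".encode("koi8_r")

right_font = render(data, "koi8_r")  # font matches the data
wrong_font = render(data, "cp1251")  # font built for the other encoding

print(right_font)  # Привет
print(wrong_font)  # рТЙЧЕФ
```

The codes in `data` never change; only the table used to turn them into glyphs does, which is why a mismatched font produces readable letters in the wrong places rather than an error.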

Keyboard Layouts

A keyboard layout is a table that maps keyboard keys to the characters they generate. Since characters are represented by their numeric codes, it is important to know which encoding a given keyboard layout table uses.
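
A layout table can be sketched as a simple mapping from keys to characters, followed by an encoding step that produces the numeric codes. The fragment below is illustrative only: it covers just the seven keys spelled out by the layout name JTsUKENG (Latin keys Q W E R T Y U), and the names `JTSUKENG_FRAGMENT` and `type_keys` are invented for this example.

```python
# Partial table: Latin key -> Cyrillic character (first seven top-row keys).
JTSUKENG_FRAGMENT = {
    "q": "й", "w": "ц", "e": "у", "r": "к",
    "t": "е", "y": "н", "u": "г",
}

def type_keys(keys, layout, encoding):
    """Translate key presses to characters, then to codes in `encoding`.

    Keys missing from the table are passed through unchanged.
    """
    text = "".join(layout.get(key, key) for key in keys)
    return text, list(text.encode(encoding))

text, koi8_codes = type_keys("qwerty", JTSUKENG_FRAGMENT, "koi8_r")
_, cp1251_codes = type_keys("qwerty", JTSUKENG_FRAGMENT, "cp1251")

print(text)          # йцукен
print(koi8_codes)    # [202, 195, 213, 203, 197, 206]
print(cp1251_codes)  # [233, 246, 243, 234, 229, 237]
```

The same key presses yield different codes depending on the encoding the layout table targets, which is the reason the encoding of a layout matters.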

There are two popular Cyrillic keyboard layouts.

JTsUKENG

Standard Russian typewriter keyboard layout. Default in most computer systems.

Lower register:

Upper register (shifted):

YaWERTY

Phonetic Cyrillic keyboard layout in which the Cyrillic character 'A' is assigned to the Latin key 'A', the Cyrillic character 'O' is assigned to the Latin key 'O', and the characters ',' and '.' retain their positions. Very convenient if you have to type in both Russian and English.

Lower register:

Upper register:

Great Encoding War

Yes, there is a war out there between KOI8-R and CP1251.
You do not want to be a casualty of this war.
The best way to achieve this is:


Credits to Siber Systems Inc.