encodings_and_locales

Differences

This shows you the differences between two versions of the page.

encodings_and_locales [2014/04/05 13:21]
sgrubnick
encodings_and_locales [2014/04/05 13:28] (current)
sgrubnick
Line 9: Line 9:
 Example of "old style" encodings:
   * CP437       (very common for scripts)
-  * ISO-8859-1  (very common for non-English european languages)
+  * ISO-8859-1  (very common for non-English European languages)
   * KOI8-R      (very common for Russian)
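
To make the "old style" problem concrete, here is a minimal Python sketch (not anything EPIC itself does) that decodes one and the same byte under each of the three encodings listed above. The function name ''decode_all'' is made up for this illustration; the codec names are the ones Python ships with.

```python
# Illustrative sketch only: "decode_all" is a hypothetical helper name.
OLD_STYLE = ("cp437", "iso-8859-1", "koi8-r")


def decode_all(raw: bytes) -> dict:
    """Decode the same raw bytes under every old-style encoding listed above."""
    return {name: raw.decode(name) for name in OLD_STYLE}


if __name__ == "__main__":
    # A single byte above 127 means something different in each encoding.
    for name, text in decode_all(bytes([0xE9])).items():
        print(f"{name:12} {text!r}")
```

The byte never changes; only the interpretation does. That is why the receiving side has to know which encoding the sender used.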
  
Line 16: Line 16:
 ====New Style====
 The "new style" uses a model where every character maps to an integer code point (Unicode), which is then converted into 1 to 4 bytes (UTF-8) in a way that maximizes compatibility with the "old style" for ASCII.  Although there are many incompatible ways to convert Unicode to data, UTF-8 is the one used on Unix.
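
A small Python sketch of the idea, assuming nothing beyond the standard library: it prints the code point of a few sample characters and how many bytes each one takes in UTF-8. Under the current UTF-8 definition (RFC 3629) no code point needs more than 4 bytes; longer sequences belonged to the original, wider definition. The helper name ''utf8_length'' is made up for this example.

```python
# Illustrative sketch only: "utf8_length" is a hypothetical helper name.
def utf8_length(ch: str) -> int:
    """Return how many bytes one character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    # One sample each from the 1-, 2-, 3- and 4-byte ranges.
    for ch in ("A", "é", "€", "😀"):
        print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s)")

    # Pure ASCII text is byte-for-byte identical in ASCII and in UTF-8.
    print("hello".encode("utf-8") == "hello".encode("ascii"))
```

The last line is the compatibility point: a file that contains only ASCII is already valid UTF-8.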
 +
 +=====Encodings and Iconv=====
 +EPIC uses the iconv system to convert between encodings.  You can use any encoding that is supported by your system's iconv!  On my system, I can see all of the supported encodings with
 +
 +     iconv --list
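
For a feel of what such a conversion does, here is a rough Python sketch. It uses Python's own codecs rather than iconv, so the encoding names it accepts may differ from what your iconv lists, and it is not how EPIC calls iconv internally. The function name ''convert'' is made up for this example.

```python
# Illustrative sketch only: "convert" is a hypothetical helper, and it relies on
# Python's built-in codecs, not on the system's iconv.
def convert(data: bytes, source: str, target: str) -> bytes:
    """Re-encode data from one encoding to another.

    Step 1 decodes the bytes into Unicode text using the source encoding.
    Step 2 encodes that text into bytes using the target encoding.
    """
    return data.decode(source).encode(target)


if __name__ == "__main__":
    latin1 = "café".encode("iso-8859-1")
    utf8 = convert(latin1, "iso-8859-1", "utf-8")
    print(latin1, "->", utf8)
```

Going through Unicode in the middle is what lets any encoding be converted to any other, as long as the characters exist on both sides.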
 +
  
 =====Locales=====
-The software you use needs to know what encoding you are using in order to properly handle your input and output.  In Unix, this is done with "locales".
+Now maybe you understand what encoding you are using.  But you have to tell the software you run about it.
+In Unix, this is done with "locales".
  
 You can see the list of locales available on your system with the
Line 26: Line 33:
 A locale looks like //language//_//country//.//encoding//.
 
-Some exampls:
+Some examples:
 
 ^ Encoding Name ^ Encoding Explanation ^
Line 44: Line 51:
 Then every program I run knows that I am using UTF-8 as my character encoding.
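
As a rough illustration of what "knowing" means here, the Python sketch below splits a locale name into its //language//, //country// and //encoding// parts, and picks the locale name a program would consult for character handling. It assumes the usual POSIX precedence (LC_ALL first, then LC_CTYPE, then LANG). The names ''Locale'', ''parse_locale'' and ''effective_ctype'' are made up for this example, and real programs normally leave this work to the C library.

```python
# Illustrative sketch only: Locale, parse_locale and effective_ctype are
# hypothetical names; real programs usually call setlocale() instead.
import os
from typing import Mapping, NamedTuple, Optional


class Locale(NamedTuple):
    language: str
    country: Optional[str]
    encoding: Optional[str]


def parse_locale(name: str) -> Locale:
    """Split 'language_country.encoding' into its parts.

    A trailing '@modifier' (as in 'de_DE.UTF-8@euro') is ignored here.
    """
    name = name.partition("@")[0]
    base, _, encoding = name.partition(".")
    language, _, country = base.partition("_")
    return Locale(language, country or None, encoding or None)


def effective_ctype(env: Mapping[str, str]) -> Optional[str]:
    """Return the locale name used for character handling, if any is set.

    Assumed precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return None


if __name__ == "__main__":
    name = effective_ctype(os.environ)
    print(name, parse_locale(name) if name else None)
```

Because LC_ALL wins over everything else, setting it is the bluntest way to make sure every program sees the same encoding.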
  
 +Some programs, like GNU Screen, have problems with UTF-8.  People have reported good success after completely shutting down the GNU Screen session (not just detaching!), setting LC_ALL, and then starting a new screen session.  You could also use tmux, which appears to handle UTF-8 very well.
  
 +Some programs, like XTerm, can support either the old or new style, based on menu options.  I really should create a good document discussing that, as well as other popular terminal emulators.
  
 +Your font also plays a role.  It's one thing for the software to know what encoding you're using, but if you use an incorrect font for that encoding, you still won't see what you expect.  I need to document what I know about using an appropriate UTF-8 font.
  
 =====See also=====
-This page is closely related to the [[encoding]] command.
+This page deals with how YOU tell EPIC what YOU are using.  But other people on IRC will probably be using different encodings.  The [[encoding]] command is how you tell EPIC what other people are using.
encodings_and_locales.txt · Last modified: 2014/04/05 13:28 by sgrubnick