why_no_utf8

Differences

This shows you the differences between two versions of the page.

why_no_utf8 [2008/12/10 19:06] (current)
Line 1: Line 1:
 +======Why doesn't EPIC support utf8?======
 +A question that comes up frequently is whether epic supports utf8 and, if 
 +it does not, when it will be supported. The simple answer is that it does 
 +not support utf8 because nobody in the epic community has experience 
 +converting programs to use utf8. Interested contributors are therefore 
 +having to learn the unicode way of doing things as they go along, which is 
 +much slower than if someone who had done this before stepped in and helped 
 +us write the code to implement the many design changes.
 +
 +Converting from ascii to unicode is very invasive to a program, and there are 
 +important questions to consider when you ask what it really means to support 
 +utf8. The list below is not exhaustive, but it gives you an idea of the size 
 +of the effort.
 +
 +=====Column Counting=====
 +UTF8 breaks from the longstanding tradition that one byte equals one glyph 
 +equals one column on the screen. In utf8 a single character can take from one 
 +to four bytes, and some characters occupy two columns or none at all. This 
 +affects things like column counting, which is important for the input line 
 +and for line wrapping. Much code has to be rewritten for this.
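 +
 +As a rough illustration of why byte counts, character counts, and column 
 +counts diverge, here is a minimal Python sketch. It is not epic code: the 
 +display_width helper is hypothetical, and it only approximates what a 
 +terminal does (wide East Asian characters count as two columns, combining 
 +marks as zero, everything else as one).
 +

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate how many terminal columns `text` occupies.

    Hypothetical helper for illustration only; real terminals have
    more special cases (control characters, emoji sequences, ...).
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous glyph.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters take two columns.
            width += 2
        else:
            width += 1
    return width


samples = [
    "hello",                 # plain ascii
    "na\u00efve",            # one precomposed accented letter
    "e\u0301",               # 'e' followed by a combining acute accent
    "\u65e5\u672c\u8a9e",    # three wide CJK characters
]

for s in samples:
    # bytes in utf8, number of code points, estimated columns
    print(len(s.encode("utf-8")), len(s), display_width(s))
```

 +
 +For the last sample the three numbers are 9 bytes, 3 characters, and 6 
 +columns, so no single count can stand in for the others when wrapping lines 
 +or positioning the cursor on the input line.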
 +
 +=====Talking to people who can't do utf8=====
 +The historical way of handling national character sets is to use code pages, 
 +which map 128 glyphs into code points 128-255. Normally this is handled by 
 +the user's terminal emulator, so epic has never had to worry about the details.
 +There will always be irc users who aren't using utf8 clients, so the client 
 +will always need to support a remote target (channel or user) that can't do 
 +utf8. If you exchange messages and you're using utf8 and the other person 
 +isn't, then every character outside plain ascii will be garbled. To really 
 +support utf8, the client must be able to convert FROM utf8 TO any other 
 +encoding, and vice versa.
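 +
 +To make the garbling and the needed conversion concrete, here is a small 
 +Python sketch. The per-target table and the two helper functions are 
 +assumptions made up for this example; they do not describe an existing epic 
 +setting or command.
 +

```python
# Hypothetical table: which encoding each remote target expects.
TARGET_ENCODINGS = {
    "#utf8-chat": "utf-8",
    "#oldschool": "iso-8859-1",
}
DEFAULT_ENCODING = "utf-8"  # assumption for targets not listed above


def encode_for_target(target: str, message: str) -> bytes:
    """Turn internal unicode text into the bytes this target expects."""
    encoding = TARGET_ENCODINGS.get(target, DEFAULT_ENCODING)
    # Characters the target encoding cannot express become '?'.
    return message.encode(encoding, errors="replace")


def decode_from_target(target: str, raw: bytes) -> str:
    """Turn bytes received from this target into internal unicode text."""
    encoding = TARGET_ENCODINGS.get(target, DEFAULT_ENCODING)
    return raw.decode(encoding, errors="replace")


message = "caf\u00e9"  # "cafe" with an acute accent on the e

utf8_bytes = encode_for_target("#utf8-chat", message)    # b'caf\xc3\xa9'
latin1_bytes = encode_for_target("#oldschool", message)  # b'caf\xe9'

# What a latin-1 user sees if the utf8 bytes are sent unconverted:
# the single accented letter turns into two unrelated characters.
garbled = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes, latin1_bytes, garbled)
```

 +
 +Note that the conversion is lossy in one direction: text that the target's 
 +code page cannot represent has to be replaced or dropped, which is a design 
 +decision in its own right.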
 +
 +=====Using utf8 when you don't have a utf8 terminal emulator=====
 +Additionally, there will always be epic users who aren't using utf8 terminal 
 +emulators. These users would still like to be able to join utf8 channels and 
 +have everything Just Work. For them, epic must be able to convert FROM any 
 +input encoding TO utf8 and back again.
 +
 +=====Scripts, /echo, and backwards compatibility=====
 +Finally, once you open the door to unicode, you're talking about being able 
 +to support any encoding. How will this impact things like scripts? How will 
 +the /echo output of your script look if you encode the script in utf8 but the 
 +person who uses it doesn't use a utf8 terminal emulator? We see this problem 
 +today when people write scripts using the default vga code page of the linux 
 +console, and those scripts look all weird when you use them with a latin-1 
 +font. So there needs to be some way for scripts to convert between encodings.
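 +
 +The code page problem can be reproduced in a few lines of Python. This 
 +sketch assumes the script was written in CP437 (the usual vga code page) and 
 +that the script's encoding is known somehow; render_script_text is a 
 +hypothetical helper, not an epic function.
 +

```python
def render_script_text(raw: bytes, script_encoding: str,
                       terminal_encoding: str) -> bytes:
    """Re-encode script output for the terminal the user actually has.

    Hypothetical helper: it assumes the script's own encoding is known.
    """
    text = raw.decode(script_encoding, errors="replace")
    return text.encode(terminal_encoding, errors="replace")


# The top edge of a box, as a script author would type it in CP437.
box_top = b"\xda\xc4\xc4\xbf"

# What the author meant: four box-drawing characters.
intended = box_top.decode("cp437")

# What a latin-1 user sees when the bytes are shown unconverted:
# accented capital letters and an inverted question mark.
misread = box_top.decode("iso-8859-1")

# Converted for a utf8 terminal, the box survives intact.
for_utf8_terminal = render_script_text(box_top, "cp437", "utf-8")

# latin-1 has no box-drawing characters, so each one degrades to '?'.
for_latin1_terminal = render_script_text(box_top, "cp437", "iso-8859-1")

print(for_utf8_terminal, for_latin1_terminal)
```

 +
 +Even with conversion in place, some script output has no equivalent in the 
 +user's encoding, so script authors would still need a fallback for such cases.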
 +
 +======Summary======
 +I'm not trying to suggest that epic will never have proper unicode support, 
 +but to help you understand that this is not a simple matter. There is a large 
 +amount of code to be written, and the lack of any outside assistance means 
 +the work will be slow and steady. Eventually it will happen, but the only way 
 +to make it happen sooner is to help us write the code or recruit someone who 
 +will help us write the code.
 +
  
why_no_utf8.txt · Last modified: 2008/12/10 19:06 (external edit)