Archive of UserLand's first discussion group, started October 5, 1998.

Easy road to Double Byte in Frontier Land

Author:Lixian B. Chiu
Posted:4/14/1999; 1:50:02 PM
Topic:Easy road to Double Byte in Frontier Land
Msg #:5016

While busy patching Frontier 6 to handle Chinese DG and search, I found an extremely easy way to use double-byte text in Frontier. The beauty of this method is that it doesn't break any of the website framework's functions. Compared to MWU, it causes far fewer headaches for Frontier users who want to manage a double-byte language website.

The trick is to use UTF-8. UTF-8 is backward compatible with ASCII: every byte of a multi-byte UTF-8 character has its high bit set, so non-ASCII characters can never be confused with the 128 standard ASCII characters. What that means is that almost all of the special characters Frontier uses are safe under UTF-8 (with one exception, the chevron [«]; but I think we have been supposed to use "//" instead since Frontier 5).
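
To make the ASCII-safety point concrete, here is a minimal sketch in Python (not UserTalk). It assumes Big5 as the "classic" double-byte encoding, which is only an illustrative choice, and the helper name ascii_bytes_inside is hypothetical. It collects any ASCII-range bytes that show up inside the encoded form of non-ASCII characters.

```python
def ascii_bytes_inside(text, encoding):
    """Return the ASCII-range bytes found inside encoded non-ASCII characters."""
    leaked = bytearray()
    for ch in text:
        if ord(ch) < 128:
            continue  # plain ASCII characters are not the concern here
        # Any byte below 0x80 could be mistaken for an ASCII character.
        leaked.extend(b for b in ch.encode(encoding) if b < 0x80)
    return bytes(leaked)

sample = "功"  # Big5 (assumed legacy encoding) stores this as bytes A5 5C

# In Big5 the second byte is 0x5C, the same value as an ASCII backslash.
print(ascii_bytes_inside(sample, "big5"))

# In UTF-8 every byte of a multi-byte character is 0x80 or above.
print(ascii_bytes_inside(sample, "utf-8"))
```

The Big5 case shows the kind of collision that can confuse code scanning for special ASCII characters; the UTF-8 case leaves nothing in the ASCII range.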

To use UTF-8 in Frontier, you need a Mac. If anyone knows a way to translate double byte to and from Unicode, please let me know. You also need an OSAX called "TEC" (http://www.bekkoame.ne.jp/~iimori/sw/TECOSAX.html). Yes, it's from Hideaki Iimori, the same person who brought us MWU.

Anyway, here is how to use TEC to make a double-byte web site. First of all, make sure you turn off the pref "isoFilter". Then, in your firstFilter script, write a few lines of code to convert adrPageTable^.adrobject^ from the classic double-byte encoding to "UTF-8" (you can write it any way you want; it's very simple, and if anyone is interested, I can publish mine). Then, in your finalFilter, add a line (preferably at the end of the finalFilter script) to convert adrPageTable^.renderedText from "UTF-8" to the classic encoding that you want. You should also convert adrPageTable^.adrobject^ back to the original encoding so that you can edit it easily.
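
The flow of those two filters can be sketched as follows. This is Python rather than UserTalk, it does not use the TEC OSAX, and it is only an illustration under assumptions: Big5 stands in for the classic encoding, the page table is a plain dictionary, and the names transcode, first_filter, render and final_filter are hypothetical.

```python
LEGACY = "big5"  # assumption: the classic double-byte encoding in use

def transcode(data, src, dst):
    """Re-encode a byte string from one character encoding to another."""
    return data.decode(src).encode(dst)

def first_filter(page):
    # Before rendering: page object goes from the legacy encoding to UTF-8.
    page["adrObject"] = transcode(page["adrObject"], LEGACY, "utf-8")

def render(page):
    # Stand-in for the website framework, which only adds ASCII markup.
    page["renderedText"] = b"<p>" + page["adrObject"] + b"</p>"

def final_filter(page):
    # After rendering: rendered text goes back to the legacy encoding.
    page["renderedText"] = transcode(page["renderedText"], "utf-8", LEGACY)
    # Restore the page object too, so it stays editable in the legacy encoding.
    page["adrObject"] = transcode(page["adrObject"], "utf-8", LEGACY)

def build(source_bytes):
    """Run one page through first filter, rendering, and final filter."""
    page = {"adrObject": source_bytes}
    first_filter(page)
    render(page)
    final_filter(page)
    return page

page = build("功能".encode(LEGACY))
print(page["renderedText"].decode(LEGACY))
```

Only the rendering step ever sees UTF-8; the stored page object and the final output both end up in the legacy encoding.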

If you don't want to do so many conversions during rendering, you can skip the firstFilter and "pre-convert" all your web page objects into "UTF-8". This way, the only conversion you will need is for the renderedText. When you need to edit a page object, just convert it back to the classic encoding manually.

This method can also be used in Frontier 6's mainResponder. If no search engine is involved, you only need to change a few things. For example, with htmlInterfaces.people, I reuse almost all the default scripts. The only thing I need to change is to catch the posted data and convert it before it enters mainResponder.
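
Converting posted data before it reaches the responder might look like the sketch below. It is Python, not the mainResponder code itself, and it assumes a URL-encoded form body whose percent-escapes carry Big5 bytes; the function name posted_data_to_utf8 and the field name are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode

def posted_data_to_utf8(body, legacy="big5"):
    """Re-encode a URL-encoded form body from a legacy encoding to UTF-8."""
    # Decode the percent-escapes as bytes in the legacy encoding.
    pairs = parse_qsl(body, keep_blank_values=True, encoding=legacy)
    # Write the same fields back out with UTF-8 percent-escapes.
    return urlencode(pairs, encoding="utf-8")

# %A5%5C is the Big5 byte pair for one Chinese character.
print(posted_data_to_utf8("name=%A5%5C&id=42"))
```

Fields that contain only ASCII pass through unchanged, so only the double-byte values are affected.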

(P.S. It wouldn't be too difficult to write the code conversion in UserTalk, since Unicode conversion is nothing more than character mapping, and the maps are publicly accessible. Of course, a UserTalk solution won't be as fast as a C/C++ solution, but I can't program in those languages. If anyone wants to write a cross-platform DLL to do that and needs help, please let me know; I will do as much as I can to help.)
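
As a rough illustration of the "character mapping" idea, here is a table-driven converter sketched in Python. The two-entry table is a tiny excerpt chosen for the example (a real table has thousands of entries), Big5 is again an assumed encoding, and the names BIG5_TO_UNICODE, utf8_bytes and big5_to_utf8 are hypothetical.

```python
# Excerpt only: Big5 byte pair -> Unicode code point.
BIG5_TO_UNICODE = {
    0xA440: 0x4E00,
    0xA55C: 0x529F,
}

def utf8_bytes(cp):
    """Encode one code point (up to U+FFFF in this sketch) as UTF-8."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("code point outside the range handled by this sketch")

def big5_to_utf8(data):
    """Convert Big5 bytes to UTF-8 using only the lookup table."""
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            out.append(lead)  # ASCII passes through unchanged
            i += 1
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated double-byte character")
        code = (lead << 8) | data[i + 1]
        out += utf8_bytes(BIG5_TO_UNICODE[code])  # KeyError if unmapped
        i += 2
    return bytes(out)

print(big5_to_utf8(b"A\xa5\x5c\xa4\x40").decode("utf-8"))
```

The same two steps, a table lookup followed by fixed bit arithmetic, are all a UserTalk or DLL version would need; the reverse direction uses the inverted table.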




This page was archived on 6/13/2001; 4:49:21 PM.

© Copyright 1998-2001 UserLand Software, Inc.