unicode (UTF8) support patch (alpha)

gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

Hi everybody,
Some information from the readme:

This patch enables partial Unicode support (Unicode strings in UTF-8 encoding) in OpenTTD. For now, Unicode characters from 0x000 to 0x7FF are supported, which is enough for Russian, Greek and all the Central European languages that use a different charset. Codes from the basic ASCII table (0x00 to 0x7F) and the OpenTTD special codes (0x80 to 0xBF) are mapped to the original OpenTTD charset. UTF-8 encoded characters from 0x080 to 0x7FF are mapped to the extended charset (Unicode space).
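
The mapping described above could be sketched roughly like this. It is a minimal Python illustration, not the patch's actual C code: the function name `decode_char` and the `"original"` / `"unicode"` tags are hypothetical, and it assumes that a lone byte in 0x80-0xBF can be treated as an OpenTTD special code because such a byte never starts a valid UTF-8 sequence.

```python
def decode_char(data: bytes, pos: int):
    """Decode one character starting at data[pos].

    Returns (kind, value, next_pos). kind is "original" for codes that map
    to the original OpenTTD charset and "unicode" for code points in the
    extended space (0x080-0x7FF). All names here are illustrative only.
    """
    b0 = data[pos]
    if b0 < 0x80:
        # Basic ASCII table.
        return ("original", b0, pos + 1)
    if b0 < 0xC0:
        # 0x80-0xBF is a UTF-8 continuation byte; on its own it is assumed
        # to be an OpenTTD special code.
        return ("original", b0, pos + 1)
    if 0xC2 <= b0 <= 0xDF:
        # Lead byte of a two-byte sequence (0xC0 and 0xC1 would be overlong).
        if pos + 1 >= len(data):
            raise ValueError("truncated UTF-8 sequence")
        b1 = data[pos + 1]
        if b1 & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte")
        code_point = ((b0 & 0x1F) << 6) | (b1 & 0x3F)
        return ("unicode", code_point, pos + 2)
    # Three- and four-byte sequences (0x800 and above) are outside the
    # range this sketch handles.
    raise ValueError("unsupported lead byte 0x%02X" % b0)
```

For example, the two bytes 0xD0 0xAF decode to code point 0x42F (Cyrillic capital "Ya"), while a plain "A" stays in the original charset.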

I tested it with Russian fonts, because I had no other fonts :)). I can't really read the result, because I don't know Russian, but I found some words ("russian", "english") with web dictionaries.
In this test binary the fonts are loaded at the basic Cyrillic alphabet positions in the Unicode table (0x410-0x44F).
This patch reserves 0x1680 (5760) sprites, so there may be too little space left to load your own newgrf sprites (I didn't raise NUM_SPRITES).
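
A quick check of that sprite budget, as a Python sketch. It rests on an assumption the readme does not state: that one sprite is reserved per extended code point (0x080-0x7FF) for each of three font sizes. The slot layout in `sprite_slot` is purely hypothetical and is not taken from the patch.

```python
FIRST_EXTENDED = 0x080  # first code point of the extended charset
LAST_EXTENDED = 0x7FF   # last code point of the extended charset
FONT_SIZES = 3          # assumption: three font sizes, one glyph set each


def reserved_sprites(first=FIRST_EXTENDED, last=LAST_EXTENDED, sizes=FONT_SIZES):
    """Number of sprites needed for one glyph per code point per font size."""
    return (last - first + 1) * sizes


def sprite_slot(code_point, size_index,
                first=FIRST_EXTENDED, last=LAST_EXTENDED, sizes=FONT_SIZES):
    """Hypothetical layout: one contiguous block of slots per font size."""
    if not first <= code_point <= last:
        raise ValueError("code point outside the extended charset")
    if not 0 <= size_index < sizes:
        raise ValueError("unknown font size index")
    per_size = last - first + 1
    return size_index * per_size + (code_point - first)


print(hex(reserved_sprites()), reserved_sprites())
```

Under that assumption the count comes out at (0x7FF - 0x080 + 1) * 3 = 5760, which matches the 0x1680 figure.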

This patch is not tested very well; I used only a few strings to test its basic functionality (I don't know Russian :) ). For example, I didn't test the big and tiny characters.

So I need help testing this, plus more translated strings in Unicode with UTF-8 encoding, and other .grf fonts (especially for Central European languages, i.e. the Latin-2 charset).

See the files and screenshots for more information.

I hope this patch will be useful for people who don't use the Latin-1 charset and that, after bugfixing, it will be in one of the next OpenTTD releases.

gpsoft.
Attachments
screenshots with russian strings
russian2.png (283.29 KiB) Viewed 5622 times
screenshots with russian strings
russian1.png (283.1 KiB) Viewed 5622 times
unicode.zip
Patch, compiled win32 binary and some other files.
(536.57 KiB) Downloaded 254 times
}T{Reme [Q_G]
Route Supervisor
Posts: 389
Joined: 04 Feb 2004 23:24

Post by }T{Reme [Q_G] »

Great stuff :) Hmmm, have you tested the program with Japanese / Chinese as well? Just wondering if that would work.
Siggy not gonna work unless someone allows javascripting...
Dextro
Chief Executive
Posts: 701
Joined: 12 Jan 2005 21:56
Location: Lisboa, Portugal

Post by Dextro »

This is something that has been missing for some time now, and I remember a thread about its development somewhere in these forums :?
Uncle Dex Says: Follow the KISS Principle!
Hadez
Traffic Manager
Posts: 217
Joined: 22 Jul 2004 21:25
Location: Jablonec nad Nisou, Czech republic

Post by Hadez »

FYI the topic is here: http://tt-forums.net/viewtopic.php?t=9988. Hope to see the Czech and Slovak characters in the game soon :-)
gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

}T{Reme [Q_G] wrote: Great stuff :) Hmmm, have you tested the program with Japanese / Chinese as well? Just wondering if that would work.
No, I tested only Russian fonts; I have no other fonts. But I think Japanese characters have code positions in the Unicode table higher than 0x800.
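
To illustrate the point about code positions, here is a small Python sketch. The helper name `utf8_length` and the choice of sample characters are mine; the code points themselves are standard Unicode values (U+0410 Cyrillic capital A, U+3042 Hiragana A, U+4E00 the CJK ideograph for "one").

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 needs to encode the given code point."""
    if code_point < 0:
        raise ValueError("negative code point")
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    if code_point <= 0x10FFFF:
        return 4
    raise ValueError("beyond the Unicode range")


SAMPLES = {
    "Cyrillic capital A": 0x0410,
    "Hiragana A": 0x3042,
    "CJK ideograph 'one'": 0x4E00,
}

for name, cp in SAMPLES.items():
    in_range = cp <= 0x7FF  # the range the patch currently supports
    print(f"{name}: U+{cp:04X}, {utf8_length(cp)} byte(s), supported: {in_range}")
```

The Japanese samples lie well above 0x7FF and need three UTF-8 bytes each, so they fall outside the two-byte range handled so far.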

But it can be solved in other ways. Supporting true Unicode (full 16-bit chars) is no problem, but I need more space for sprites. Another way is to do some relocation; I can do that without problems, but then it is not true Unicode.

Please tell me something about Chinese and Japanese fonts. How many characters does your alphabet use? I know nothing about this.
gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

Hadez wrote: FYI the topic is here: http://tt-forums.net/viewtopic.php?t=9988. Hope to see the Czech and Slovak characters in the game soon :-)
Yes, I hope so too. This is the reason why I am doing this (I am from Slovakia) :) But I can't find ISO 8859-2 fonts (= .grf sprites), and that is the problem. Somebody must make them, but I don't know how to create proper grf icons.
Hadez
Traffic Manager
Posts: 217
Joined: 22 Jul 2004 21:25
Location: Jablonec nad Nisou, Czech republic

Post by Hadez »

Maybe you could ask the devs?
}T{Reme [Q_G]
Route Supervisor
Posts: 389
Joined: 04 Feb 2004 23:24

Post by }T{Reme [Q_G] »

Using the charmap program distributed by default with Windows should help you sort out problems with making grf files for any language you have installed on your system (Start -> Programs -> Accessories -> System Tools).

Yes, I do think adding character sets as large as Japanese and Chinese will be a problem if you are going to use sprites to draw text. Is it possible to rewrite the code so it uses the system's printing functions to draw strings on the screen instead of using sprites? If you are using true UTF-8 encoding and map the .lang files to the correct character space, it should be.
gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

It is not possible to use system fonts, because they are OS specific, so we would have problems on other platforms (Linux, OS/2, Mac OS). I don't want to make the game incompatible between operating systems. The second reason not to use the system's fonts is the problem with font sizes: we need 3 font sizes with specific heights.

So we will use grf charsets and our own fonts. I think it is not difficult to make a charset; possibly we can make it with some automatic tools. But first I need to learn something about drawing fonts.
I will ask the developers after I get home; I have no IRC access now.

About encoding: I am using UTF-8 encoding now, but only the range from 0x0000 to 0x07FF. Of course, it is no problem to do the full 16-bit range (from 0x0000 to 0xFFFF; UTF-8 can encode values above 16 bits, but all charsets are covered in 16 bits). But I need to allocate more space for sprites, and that needs a lot of recoding.

Maybe I will use some different encoding internally, with character remapping from Unicode, but I am sure the input lang files will use true UTF-8 encoding.
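
One possible shape for such a remapping, as a Python sketch rather than anything from the patch: only the Unicode ranges that actually have glyphs get compact internal indices, so no sprite slots are spent on unused code points. The range list, `build_remap` and `glyph_index` are all hypothetical names chosen for illustration; the block boundaries are the standard Unicode ones.

```python
# Hypothetical list of Unicode ranges for which a font .grf provides glyphs.
GLYPH_RANGES = [
    (0x00A0, 0x017F),  # Latin-1 Supplement (printable part) + Latin Extended-A
    (0x0370, 0x03FF),  # Greek and Coptic
    (0x0400, 0x04FF),  # Cyrillic
]


def build_remap(ranges):
    """Map each covered code point to a compact, gap-free internal index."""
    table = {}
    next_index = 0
    for first, last in ranges:
        for code_point in range(first, last + 1):
            table[code_point] = next_index
            next_index += 1
    return table


def glyph_index(table, code_point, fallback=None):
    """Internal glyph index for a code point, or fallback if it has no glyph."""
    return table.get(code_point, fallback)


remap = build_remap(GLYPH_RANGES)
print(len(remap), glyph_index(remap, 0x0410), glyph_index(remap, 0x3042))
```

With these three ranges the table holds 624 entries instead of the 1920 code points between 0x080 and 0x7FF, while the lang files themselves can stay in plain UTF-8.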
}T{Reme [Q_G]
Route Supervisor
Posts: 389
Joined: 04 Feb 2004 23:24

Post by }T{Reme [Q_G] »

Hmm... I dunno, but I'm pretty sure it's possible to release a custom TrueType (or fixed) font along with OpenTTD and load this file (it should work platform-independently, by distributing multiple formats of the font file). I've seen many games use custom font sets.

Don't get me wrong, I agree with your comment about differences in font shapes and sizes. I'm just thinking about those "too many sprites" problems people have been having recently.

Just poked around in the SDL docs and found that the whole system is already there: http://www.libsdl.org/cgi/docwiki.cgi/SDL_5fttf
gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

It is easier to do it with .grf files.
The "too many sprites" problems are easy to fix in the future.
Please answer my question about the number of characters in the Chinese and Japanese charsets.
Nanaki13
Traffic Manager
Posts: 151
Joined: 08 Jan 2005 16:08

Post by Nanaki13 »

I have a couple of Japanese fonts on my system, but I'm no expert on the matter. People say that there are over 40K characters, but only around 2K-3K are in daily use.
Those fonts do have a LOT of chars in them.
orudge
Administrator
Posts: 24930
Joined: 26 Jan 2001 20:18
Skype: orudge
Location: Banchory, UK

Post by orudge »

Weren't these all things that were worked on and/or solved by Pipian? It seems a shame to let all that work go to waste; perhaps someone should get in touch with Pipian... old topic here.
gpsoft
Engineer
Posts: 26
Joined: 25 Sep 2004 19:40
Location: Bratislava, Slovakia

Post by gpsoft »

orudge wrote: Weren't these all things that were worked on and/or solved by Pipian? It seems a shame to let all that work go to waste; perhaps someone should get in touch with Pipian... old topic here.
I saw that topic, but he didn't release any patch for it.
Is there any patch released in the SourceForge patch system? I can't find any other patch or information about Unicode.
But I'll try to contact Pipian; I hope he visits this site. Thank you for this information.
Pipian
Engineer
Posts: 122
Joined: 10 Jul 2004 02:25

Post by Pipian »

Found myself a bit busier than expected when coding the Unicode section. I still have the bitmap fonts prepared (don't worry, they'll handle all the languages that most major fonts can handle, like Slovak and Greek and so forth), and the outline for the bitmap rendering. I just never had the time to finish working it in with the string rendering. Seems like a good idea, but it would be nice if we could extend it to the entire charset (as I was trying to do before I found myself overwhelmed). I'd be willing to trade out the existing code if someone else has a bit more time to finish it up...