Difference between revisions of "Talk:LCL Unicode Support"

From Free Pascal wiki
(Borut's thoughts)

Revision as of 15:16, 22 May 2006

Some questions / things unclear

I want to understand this better and have the following question:

  • Suppose the LCL encodes all its strings as UTF-8. Could you then display, for instance, Cyrillic or Arabic on a Russian or Arabic version of Win98, i.e. without using the *W WinAPI calls? As far as I know, displaying both together would be impossible then, because no code page supports both character sets. Please correct me if I am wrong.

Maybe it is best if we create a new widgetset, win32u, that uses the Unicode (*W) functions instead of the ANSI (*A) functions. Vincent 14:33, 20 May 2006 (CEST)

No, it can't, and actually it can't even with the *W API calls! Win98 does not support the *W calls natively. They are provided by the Unicode interface layer, i.e. unicows, which does nothing special either: it just converts UTF-16 to ANSI and back as required.
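To illustrate the point about a single code page, here is a minimal sketch in Python. It is not unicows and not LCL code; it only mimics a *W to *A style conversion with Python's built-in codecs. The helper name `to_ansi` is made up for this example, and cp1251 / cp1256 are assumed as the Russian and Arabic Windows code pages.

```python
def to_ansi(text: str, codepage: str) -> bytes:
    """Mimic a wide-to-ANSI mapping: anything outside the code page becomes '?'."""
    return text.encode(codepage, errors="replace")

russian = "Привет"
arabic = "مرحبا"
mixed = russian + " " + arabic

# On a "Russian" system (cp1251) only the Cyrillic part survives.
seen_on_russian = to_ansi(mixed, "cp1251").decode("cp1251")

# On an "Arabic" system (cp1256) only the Arabic part survives.
seen_on_arabic = to_ansi(mixed, "cp1256").decode("cp1256")

# UTF-16, as used by the *W calls on WinNT, keeps both scripts.
seen_on_nt = mixed.encode("utf-16-le").decode("utf-16-le")
```

In this sketch the characters that the target code page lacks are replaced by question marks, so mixed Cyrillic and Arabic text only survives on the UTF-16 path.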

We can create that new widgetset too. It may make the code clearer, and maybe allow some special optimizations as well! But somebody would have to make sure that everything else stays in sync between the two interfaces. Also, since I did almost exactly the same for the wince interface, I can tell you that the code needing conversion is no more than 5% of the total, so we would be duplicating a lot of code, which is not good!

Maybe we should have a common interface, shared between win32a, win32u and wince as well, and move the functions/code that need modification to another place. Roozbeh

The goal is to enable the LCL to support Unicode on WinNT+ without breaking any existing code and without departing from the Lazarus spirit. This means relying on UTF-8 for strings internally; on either UTF-8 or ISO pages for application strings (the ISO-pages option is for backward compatibility; I would prefer a system-wide global variable so that the developer can define this at application start); and on either the *A or the *W API (with the appropriate conversions UTF-8<-->ISO and UTF-8/ISO<-->UTF-16), depending on the machine on which the developed application runs.
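A rough sketch of that conversion choice, written in Python only to show the data flow. Nothing here is actual LCL code: the function name `prepare_for_api`, the `use_wide` flag and the default code page cp1252 are assumptions made up for this illustration.

```python
def prepare_for_api(utf8_text: bytes, use_wide: bool,
                    ansi_codepage: str = "cp1252") -> bytes:
    """Convert an internal UTF-8 string into the buffer an API call would get.

    use_wide=True  stands for the *W path (UTF-16LE, two-byte terminator).
    use_wide=False stands for the *A path (one code page, one-byte terminator).
    """
    text = utf8_text.decode("utf-8")
    if use_wide:
        return text.encode("utf-16-le") + b"\x00\x00"
    # Characters missing from the code page are replaced by '?'.
    return text.encode(ansi_codepage, errors="replace") + b"\x00"

caption = "äö".encode("utf-8")          # internal UTF-8 string
wide_buf = prepare_for_api(caption, use_wide=True)
ansi_buf = prepare_for_api(caption, use_wide=False)
```

The flag would be decided once, depending on the system the application runs on, so the rest of the code only ever sees UTF-8.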

However, we have to be aware of the performance penalty: it affects not just the (resource) strings and string constants of the application GUI but also, for instance, DB-aware components. In this respect the Tnt approach is much better. Still, achieving a uniform Unicode-supporting LCL (and RTL) for all Lazarus target platforms would, IMHO, really make the difference for Lazarus.

This means enabling the existing components to adapt their operation to the capabilities of the Windows system on which the application is running. It also means (we should not forget this) Unicode-enabling/cloning/modifying lots of non-visual units too, all file system communication in the first place. Remember that on WinNT something like C:\äöüćčšбвгд\филе.txt is a valid path/file name! One can find lots of such stuff in Tnt.
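The file name case can be checked with a small Python sketch. It only uses Python's built-in codecs, not any Windows or LCL call; the helper name `fits_codepage` is made up, and cp1250, cp1251 and cp1252 are picked merely as three common Windows code pages.

```python
path = r"C:\äöüćčšбвгд\филе.txt"

def fits_codepage(name: str, codepage: str) -> bool:
    """Return True if every character of name exists in the given code page."""
    try:
        name.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False

# None of these single code pages covers all characters of the path.
ansi_results = {cp: fits_codepage(path, cp)
                for cp in ("cp1250", "cp1251", "cp1252")}

# UTF-8 (internal) and UTF-16 (what the *W calls take) both keep it intact.
utf8 = path.encode("utf-8")
wide = path.encode("utf-16-le")
```

So a file routine that goes through an *A call cannot name this file under any of these code pages, which is why the non-visual units need the same treatment as the widgets.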

Roozbeh, as you have already found, unicows just provides the mapping of *W API calls to *A API calls on old systems for applications using *W, with a single target ISO page (no mixing of Cyrillic and Arabic); it does not provide any Unicode capability on those systems.

Vincent, regarding the win32u option: I have thought of that myself, but am not sure. Roozbeh has made a good point, with which I agree in principle. Maybe cloning win32 to win32u and starting to work there would just be safer at first, while we learn and gather experience. Later we could merge, if that then seems better. Just my views. Borut 16:16, 22 May 2006 (CEST)