Re: [E-devel] Evas Utf-8 patch



On Wed, 1 Nov 2006 09:48:54 +0100 Simon TRENY <simon.treny@free.fr> babbled:

> On Wed, 1 Nov 2006 08:42:00 +0900,
> Carsten Haitzler (The Rasterman) <raster@rasterman.com> wrote :
> 
> > On Tue, 31 Oct 2006 23:21:46 +0100 Simon TRENY <simon.treny@free.fr>
> > babbled:
> > 
> > > Hi,
> > > 
> > > Here is a patch that changes the behaviour of
> > > evas_common_font_utf8_get_next() and
> > > evas_common_font_utf8_get_prev(): for now, these functions return 0
> > > at the first invalid UTF-8 char met, and do not continue further.
> > > With this patch, the functions will try to return a char code even
> > > if the char is invalid and they will only return 0 at the end (or
> > > at the start for get_prev()) of the string.
> > > 
> > > The returned code will probably be incorrect, but at least when Evas
> > > renders the string, the text won't be cut at the position of the
> > > first invalid char. This happens quite often when E17 is used with
> > > an accented language: accented window titles are often incomplete
> > > because the title is often encoded in 8-bit extended ASCII (at
> > > least for apps in French). It also helps when you have to render
> > > text whose encoding you don't know (metadata, subtitles, ...).
> > > 
> > > Note that with this patch, evas_common_font_utf8_get_next() will
> > > return a valid char code if the string is encoded in 8-bit extended
> > > ASCII.
> > 
> > i think this is more a matter that with e we need to take whatever
> > local encoding is used for icccm props (if the app isn't using utf8 for
> > its string encodings of properties like title) and convert it to utf8.
> > ecore_x SHOULD be doing that, so it ALWAYS presents utf8 text, but
> > does not.
> > 
> > fair enough to be more forgiving of malformed utf8 strings - but the
> > problem just changes from being cut off to garbage in the middle of
> > the string.
> 
> Thanks, I see that you applied the patch :)
> Another solution to avoid garbage could be to skip every malformed
> character. If the string is malformed, I see no other solution than
> either to try to guess what the character is (which is what this patch
> does, but it may result in garbage) or to skip the char.
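
to make the two options concrete, here is a rough sketch of a forgiving
decoder. this is NOT the evas code or the applied patch: the function
name utf8_next_lenient() and the bad_byte_policy enum are made up for
illustration, and overlong forms and surrogates are not rejected to keep
it short. "guess" returns the offending byte as the char code, "skip"
drops it and moves on. either way 0 only comes back at the end of the
string.

```c
#include <stddef.h>

/* Hypothetical policy switch for this sketch; not part of Evas. */
enum bad_byte_policy { BAD_BYTE_GUESS, BAD_BYTE_SKIP };

/*
 * Decode one code point from buf starting at *pos and advance *pos.
 * Returns 0 only when the terminating NUL is reached.
 */
static int
utf8_next_lenient(const unsigned char *buf, size_t *pos,
                  enum bad_byte_policy policy)
{
   for (;;)
     {
        unsigned char c = buf[*pos];
        int len = 0, code = 0, i;

        if (c == 0) return 0;                      /* end of string */
        if (c < 0x80) { (*pos)++; return c; }      /* plain 7-bit ASCII */

        /* Work out the sequence length from the lead byte. */
        if ((c & 0xe0) == 0xc0)      { len = 2; code = c & 0x1f; }
        else if ((c & 0xf0) == 0xe0) { len = 3; code = c & 0x0f; }
        else if ((c & 0xf8) == 0xf0) { len = 4; code = c & 0x07; }
        /* otherwise: stray continuation byte or invalid lead, len stays 0 */

        /* Every following byte must look like 10xxxxxx. A NUL fails this
         * check too, so we never read past the end of the string. */
        for (i = 1; i < len; i++)
          {
             unsigned char cc = buf[*pos + i];
             if ((cc & 0xc0) != 0x80) { len = 0; break; }
             code = (code << 6) | (cc & 0x3f);
          }
        if (len > 0) { *pos += len; return code; }

        /* Malformed: consume exactly one byte so we always make progress. */
        (*pos)++;
        if (policy == BAD_BYTE_GUESS) return c;    /* byte value as code */
        /* BAD_BYTE_SKIP: loop around and try the next byte */
     }
}
```

with guess, a latin-1 "café" happens to come out right because the byte
0xe9 is also the unicode code point for é. for other 8-bit encodings the
guess would be garbage, and skip would simply lose the char.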

the real solution is to know the encoding of the title string if it is not
already a utf8 one - and convert.
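
as a minimal sketch of the convert step, assuming the title is known to
be iso-8859-1 (latin-1): latin1_to_utf8() below is a hypothetical helper,
not an ecore_x function. for arbitrary source encodings something like
iconv would be needed instead of this hand-rolled loop.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns a newly allocated string (caller frees), or NULL on
 * allocation failure.
 */
static char *
latin1_to_utf8(const char *in)
{
   const unsigned char *p = (const unsigned char *)in;
   /* Each Latin-1 byte becomes at most two UTF-8 bytes. */
   char *out = malloc(strlen(in) * 2 + 1);
   char *o = out;

   if (!out) return NULL;
   for (; *p; p++)
     {
        if (*p < 0x80)
          *o++ = (char)*p;                     /* ASCII passes through */
        else
          {
             /* U+0080..U+00FF encode as 110000xx 10xxxxxx */
             *o++ = (char)(0xc0 | (*p >> 6));
             *o++ = (char)(0x80 | (*p & 0x3f));
          }
     }
   *o = '\0';
   return out;
}
```

this only works because latin-1 byte values map 1:1 onto the first 256
unicode code points. it does nothing useful if the real encoding is
something else, which is why the encoding has to be known first.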

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    raster@rasterman.com
裸好多
Tokyo, Japan (東京 日本)