Re: [E-devel] Evas Utf-8 patch

On Wed, 1 Nov 2006 08:42:00 +0900,
Carsten Haitzler (The Rasterman) <raster@rasterman.com> wrote :

> On Tue, 31 Oct 2006 23:21:46 +0100 Simon TRENY <simon.treny@free.fr>
> babbled:
> > Hi,
> > 
> > Here is a patch that changes the behaviour of
> > evas_common_font_utf8_get_next() and
> > evas_common_font_utf8_get_prev(): currently, these functions return
> > 0 at the first invalid UTF-8 char they meet, and go no further.
> > With this patch, the functions will try to return a char code even
> > if the char is invalid, and they will only return 0 at the end (or
> > at the start for get_prev()) of the string.
> > 
> > The returned code will probably be incorrect, but at least, when
> > Evas renders the string, the text won't be cut at the position of
> > the first invalid char. This happens quite often when we use E17 in
> > an accented language: accented window titles are often incomplete
> > because the title is often encoded in 8-bit ASCII (at least for
> > apps in French). It also helps when you have to render text and
> > you don't know the encoding (metadata, subtitles, ...).
> > 
> > Note that with this patch, evas_common_font_utf8_get_next() will
> > return a valid char code if the string is encoded in 8-bit ASCII.
> i think this is more a matter that with e we need to convert whatever
> local encoding is used for icccm props to utf8 (if the app isn't
> using utf8 for its string encodings of properties like title).
> ecore_x SHOULD be doing that, so it ALWAYS presents utf8 text, but
> does not.

> fair enough to be more forgiving of malformed utf8 strings - but the
> problem just changes from being cut off to garbage in the middle of
> the string.

Thanks, I see that you applied the patch :)
Another solution to avoid garbage could be to skip every malformed
character. If the string is malformed, I see only two options: try to
guess what the character is (which is what this patch does, but it may
result in garbage), or skip the char.
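
To make the two options concrete, here is a rough, standalone sketch. It
is not the code from the patch: utf8_get_next_lenient(), the
Utf8_Bad_Policy enum and utf8_decode_strict() are names made up for this
mail, and the "guess" used here (take the offending byte as its own
code point, i.e. assume ISO-8859-1) is only one possible heuristic.

```c
#include <stddef.h>

/* Hypothetical policy names for this sketch; not part of Evas. */
typedef enum { UTF8_BAD_GUESS, UTF8_BAD_SKIP } Utf8_Bad_Policy;

/* Decode one well-formed UTF-8 sequence starting at s.
 * Returns its length in bytes (1..4) and stores the code point in *code,
 * or returns 0 if the sequence is malformed. Never reads past a NUL. */
static int
utf8_decode_strict(const unsigned char *s, int *code)
{
   /* smallest code point allowed for each length (rejects overlong forms) */
   static const int min[] = { 0, 0, 0x80, 0x800, 0x10000 };
   int len, i, c;

   if (s[0] < 0x80) { *code = s[0]; return 1; }
   else if ((s[0] & 0xe0) == 0xc0) { len = 2; c = s[0] & 0x1f; }
   else if ((s[0] & 0xf0) == 0xe0) { len = 3; c = s[0] & 0x0f; }
   else if ((s[0] & 0xf8) == 0xf0) { len = 4; c = s[0] & 0x07; }
   else return 0; /* stray continuation byte or invalid lead byte */

   for (i = 1; i < len; i++)
     {
        /* a NUL terminator also fails this test, so we stop in time */
        if ((s[i] & 0xc0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3f);
     }
   if ((c < min[len]) || (c > 0x10ffff) || ((c >= 0xd800) && (c <= 0xdfff)))
     return 0; /* overlong, out of range, or a surrogate */
   *code = c;
   return len;
}

/* Return the next char code of buf at *iindex and advance *iindex.
 * Returns 0 only at the end of the string. On a malformed byte:
 *   UTF8_BAD_GUESS: return the byte value itself (ISO-8859-1 assumption)
 *   UTF8_BAD_SKIP:  drop the byte and carry on with the next one */
int
utf8_get_next_lenient(const char *buf, int *iindex, Utf8_Bad_Policy policy)
{
   const unsigned char *s = (const unsigned char *)buf;
   int code, len;

   while (s[*iindex])
     {
        len = utf8_decode_strict(s + *iindex, &code);
        if (len > 0)
          {
             *iindex += len;
             return code;
          }
        /* malformed: consume exactly one byte so we always make progress */
        code = s[*iindex];
        (*iindex)++;
        if (policy == UTF8_BAD_GUESS) return code;
     }
   return 0;
}
```

With the 8-bit string "caf\xe9 ok", the GUESS policy yields 'c', 'a',
'f', 0xe9, ' ', 'o', 'k', while the SKIP policy yields the same sequence
without 0xe9. Valid UTF-8 input gives the same result under both
policies.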