Unicode characters

Richard Laager rlaager at wiktel.com
Wed Nov 21 17:11:46 EST 2007


On Tue, 2007-11-20 at 10:40 +0200, Francois Botha wrote:
> I refer to http://pidgin.im/pipermail/tracker/2007-October/016770.html

Referring to an archived ticket e-mail instead of the ticket URL itself
is dumb. For everyone else, here's the right URL:
http://developer.pidgin.im/ticket/3535

That said, this ticket is entirely unrelated to your problem.

> I did a simple test on IRC with the unicode character ê, U+00EA.  On
> Windows, this is entered either as ALT-136 or ALT-0234.

If you're talking about entering things into Pidgin with alt codes, I
don't know if that'll work. Pidgin is written with GTK+, which does
Unicode inserts with Ctrl-Shift-u.

> If I send the character, my colleagues can view it without problem
> (they're on MIRC).  If they send the characters, I see only question
> marks.
> 
> I attach a screenshot and a wireshark sniff session.  Seems the
> correct "ea" characters are being received by my Pidgin.

A bare 0xEA byte is not UTF-8. See:
http://www.fileformat.info/info/unicode/char/00ea/index.htm

LATIN SMALL LETTER E WITH CIRCUMFLEX is U+00EA, which is encoded in
UTF-8 as 0xC3 0xAA. You can see in the trace that this is what you're
sending.
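
To make the byte values concrete, here is a small Python sketch (Python is
my choice for illustration; it has nothing to do with how Pidgin is
implemented) that encodes U+00EA both ways and checks whether a lone 0xEA
byte is valid UTF-8:

```python
# U+00EA, LATIN SMALL LETTER E WITH CIRCUMFLEX
ch = "\u00ea"

utf8_bytes = ch.encode("utf-8")         # two bytes: 0xC3 0xAA
latin1_bytes = ch.encode("iso-8859-1")  # one byte: 0xEA

print(utf8_bytes.hex())    # c3aa
print(latin1_bytes.hex())  # ea

# A lone 0xEA byte announces a multi-byte UTF-8 sequence but has no
# continuation bytes after it, so a strict UTF-8 decoder rejects it.
try:
    b"\xea".decode("utf-8")
    lone_ea_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_ea_is_valid_utf8 = False

print(lone_ea_is_valid_utf8)  # False
```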

If your buddy is using ISO-8859-1 (iso-latin1) as their encoding, that
would explain why they're sending 0xEA. In this case, you may want to
set your encoding preference (in that IRC account's options in the
account editor) to "UTF-8,ISO-8859-1".  If they're using Pidgin, then
apparently the alt insert bytes are getting sent over the wire literally
or something.
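
As a rough illustration of what an ordered encoding list such as
"UTF-8,ISO-8859-1" buys you, here is a minimal Python sketch. It assumes
the encodings are tried left to right and the first one that decodes
cleanly wins. The function name decode_with_fallback is made up for this
example, and this is not Pidgin's actual code:

```python
def decode_with_fallback(raw, encodings="UTF-8,ISO-8859-1"):
    """Decode raw bytes with the first listed encoding that accepts them.

    Hypothetical helper for illustration only.
    """
    for name in encodings.split(","):
        try:
            return raw.decode(name.strip())
        except UnicodeDecodeError:
            continue
    # Nothing in the list matched: substitute U+FFFD for undecodable
    # bytes, which is roughly what the question marks amount to.
    return raw.decode("ascii", errors="replace")


print(decode_with_fallback(b"\xc3\xaa"))  # valid UTF-8, decoded as UTF-8
print(decode_with_fallback(b"\xea"))      # not UTF-8, falls back to ISO-8859-1
```

Both calls produce the same character, U+00EA. With only "UTF-8" in the
list, the second call would give a replacement character instead.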

Richard