Optimize jabber_id_new()

Ethan Blanton elb at pidgin.im
Tue Jun 30 12:33:04 EDT 2009


Mark Doliner spake unto us the following wisdom:
> The jabber_id_new() function in libpurple/protocols/jabber/jutil.c is
> pretty expensive.  It creates a JabberID struct given the string
> version of a Jabber username (i.e. it splits
> "mark.doliner@gmail.com/Home" into "mark.doliner", "gmail.com", and
> "Home").  It also lowercases the node and domain, does utf8
> normalization, and does stringprep validation to ensure the JID is
> comprised only of characters allowed by the XMPP RFC.
> 
> We've optimized this function at Meebo.  In our testing we found that
> the vast majority of JIDs are made of these characters: a-z A-Z 0-9 @
> / { | } ~ . [ \ ] ^ _ ;  And so we do a quick first pass over the
> given string.  If the string contains only these characters then we
> skip g_utf8_normalize() and skip stringprep and only lowercase the
> node and domain.  Otherwise we do everything.
> 
> How do people feel about me checking this change into the jabber code
> in libpurple?  Meebo probably has a larger percentage of
> English-speaking users than Pidgin, so maybe our results are unfairly
> biased.  Does anyone know how common non-ASCII JIDs are?

This seems very reasonable to me.  If the "expensive" checks are
expensive enough that Meebo cares about them, we should avoid them
when they are unnecessary.  If it turns out that the short-circuit
checks are too expensive when JIDs *are* non-ASCII (which, looking at
the source, I doubt), we can revisit this.  I share your and
Daniel's intuition that most JIDs will be ASCII anyway.
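
For anyone following along, the quick first pass described in the
quoted message could look roughly like the following.  This is only a
minimal sketch under my own assumptions, not the Meebo patch: the
helper names are hypothetical, it uses plain C rather than glib, and
the accepted character set is taken verbatim from the list quoted
above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from jutil.c): returns true if every byte
 * of str is one of  a-z A-Z 0-9 @ / { | } ~ . [ \ ] ^ _ ;
 * A caller could then skip normalization and stringprep.
 */
static bool jid_uses_only_fast_chars(const char *str)
{
    const unsigned char *p;

    if (str == NULL)
        return false;

    for (p = (const unsigned char *)str; *p != '\0'; p++) {
        unsigned char c = *p;

        if ((c >= 'a' && c <= 'z') ||
            (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9') ||
            (c >= '[' && c <= '_') ||   /* [ \ ] ^ _ */
            (c >= '{' && c <= '~') ||   /* { | } ~   */
            c == '@' || c == '/' || c == '.' || c == ';')
            continue;

        return false;   /* anything else takes the full, slow path */
    }
    return true;
}

/*
 * Hypothetical helper: ASCII-lowercase the node and domain in place.
 * Everything from the first '/' onward (the resource) is untouched.
 */
static void jid_lowercase_bare_part(char *jid)
{
    char *p;

    for (p = jid; *p != '\0' && *p != '/'; p++) {
        if (*p >= 'A' && *p <= 'Z')
            *p = (char)(*p - 'A' + 'a');
    }
}
```

If the first function returns true, the caller would only need the
second; otherwise it would fall through to the existing code.  Note
that with the set exactly as quoted, a JID containing '-' would still
take the slow path in this sketch.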

I say commit it as-is.  :-)

Ethan

-- 
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
		-- Cesare Beccaria, "On Crimes and Punishments", 1764