Optimize jabber_id_new()
Mark Doliner
mark at kingant.net
Tue Jun 30 14:51:27 EDT 2009
On Tue, Jun 30, 2009 at 11:04 AM, Paul Aurich<paul at darkrain42.org> wrote:
> And Mark Doliner spoke on 06/30/2009 01:31 AM, saying:
> <snip>
>> In other words: How does everyone feel about the attached patch?
>
> Like Daniel and Ethan, I think this is fine, with a few comments (see below).
Cool. I'll make a few changes and check something in.
> + for (c = str; *c != '\0'; c++) {
> + switch (*c) {
> + case '@':
> + if (!at)
> + at = c;
> + break;
> + case '/':
> + if (!at) {
> + /*
> + * If at is NULL then c is still pointing to the node
> + * name, and node names can not contain forward slashes
> + */
> + return NULL;
> + }
> + if (!slash)
> + slash = c;
> + break;
> + default:
>
> This introduces a few corner cases where invalid JIDs would get through (I
> think something like "foo@@bar.com"). That's probably not too much of an
> issue. It's also technically valid to have a JID of the form
> "domain/resource", which this would fail to validate (again, not actually
> too big of an issue).
Yeah, probably not too common, but I think correctness here is pretty important.
> I'd envision something like this (assuming enum { NODE, DOMAIN, RESOURCE }
> state), which I don't think should add much overhead, but addresses both of
> those issues:
>
> case '@':
> if (state == NODE && !at) {
> at = c;
> state = DOMAIN;
> } else if (state == DOMAIN) {
> /* An '@' isn't valid in the domain portion of a JID */
> return NULL;
> }
> break;
>
> case '/':
> if (!slash) {
> slash = c;
> state = RESOURCE;
>
> if (*(c+1) == '\0')
> /*
> * A JID containing '/' cannot have a 0-length
> * resource.
> */
> return NULL;
> }
> break;
>
> For the needs_validation == FALSE case, I would recommend moving to
> g_ascii_strdown().
I think these changes make sense. I went with g_utf8_strdown()
instead of g_ascii_strdown() because the former takes a length
argument, and we want to dup and lowercase pieces of the string. I'll
check if g_utf8_strdown() is significantly more expensive, and if it
is then maybe write our own g_ascii_strdown_len().
I just realized that there might also be a problem where we allow some
characters in the node that should only be allowed in the resource.
I'll try to unit test this before checking it in.
-Mark
More information about the Devel
mailing list