Optimize jabber_id_new()

Mark Doliner mark at kingant.net
Tue Jun 30 14:51:27 EDT 2009


On Tue, Jun 30, 2009 at 11:04 AM, Paul Aurich<paul at darkrain42.org> wrote:
> And Mark Doliner spoke on 06/30/2009 01:31 AM, saying:
> <snip>
>> In other words: How does everyone feel about the attached patch?
>
> Like Daniel and Ethan, I think this is fine, with a few comments (see below).

Cool.  I'll make a few changes and check something in.

> +       for (c = str; *c != '\0'; c++) {
> +               switch (*c) {
> +                       case '@':
> +                               if (!at)
> +                                       at = c;
> +                               break;
> +                       case '/':
> +                               if (!at) {
> +                                       /*
> +                                        * If at is NULL then c is still pointing to the node
> +                                        * name, and node names can not contain forward slashes
> +                                        */
> +                                       return NULL;
> +                               }
> +                               if (!slash)
> +                                       slash = c;
> +                               break;
> +                       default:
>
> This introduces a few corner cases where invalid JIDs would get through (I
> think something like "foo@@bar.com"). That's probably not too much of an
> issue. It's also technically valid to have a JID of the form
> "domain/resource", which this would fail to validate (again, not actually
> too big of an issue).

Yeah, probably not too common, but I think correctness here is pretty important.

> I'd envision something like this (assuming enum { NODE, DOMAIN, RESOURCE }
> state), which I don't think should add much overhead, but addresses both of
> those issues:
>
> case '@':
>        if (state == NODE && !at) {
>                at = c;
>                state = DOMAIN;
>        } else if (state == DOMAIN) {
>                /* An '@' isn't valid in the domain portion of a JID */
>                return NULL;
>        }
>        break;
>
> case '/':
>        if (!slash) {
>                slash = c;
>                state = RESOURCE;
>
>                if (*(c+1) == '\0')
>                        /*
>                         * A JID containing '/' cannot have a 0-length
>                         * resource.
>                         */
>                        return NULL;
>        }
>        break;
>
> For the needs_validation == FALSE case, I would recommend moving to
> g_ascii_strdown().

I think these changes make sense.  I went with g_utf8_strdown()
instead of g_ascii_strdown() because the former takes a length
argument, and we want to dup and lowercase pieces of the string.  I'll
check if g_utf8_strdown() is significantly more expensive, and if it
is then maybe write our own g_ascii_strdown_len().

I just realized that there might also be a problem where we allow some
characters in the node that should only be allowed in the resource.
I'll try to unit test this before checking it in.

-Mark


More information about the Devel mailing list