security review and patches for libpurple
Dan Auerbach
dtauerbach at eff.org
Sun Jul 17 18:04:42 EDT 2011
On 07/17/2011 02:54 PM, Ethan Blanton wrote:
> Dan Auerbach spake unto us the following wisdom:
>>> I also note a number of changes of the form:
>>>
>>> buffer = malloc(some expression);
>>> [...]
>>> -strcpy(buffer, source);
>>> +g_strlcpy(buffer, source, some expression);
>>>
>>> This is not substantially superior to the preexisting strcpy. It may
>>> silence some automated tools, but it introduces an additional way in
>>> which the code in question can be wrong. Specifically, the expression
>>> which is duplicated in the malloc and strlcpy can diverge with
>>> disastrous consequences. Assuming the expression is correct, the
>>> strcpy implementation is more robust than the strlcpy implementation.
>>> If the expression is incorrect, all bets are off in both
>>> implementations without more contextual information.
>> I understand the concern about the expressions diverging. However, I
>> think the benefits of following the standard of using strlcpy outweigh
>> these concerns. There are two points that I would raise in support:
>>
>> 1. There are many other ways in which the code might change in which
>> leaving strcpy would actually be worse than strlcpy. For example,
>> code might be copied and pasted such that malloc was no longer
>> invoked before strcpy. So I'd say analysis above that strcpy is
>> more robust is true modulo certain assumptions about how the code
>> may change with time, but that without those assumptions it's not
>> as clear as to which implementation is safer in the long run.
> Sure. I'm not sure, on a whole, the strlcpy is really safer, though,
> without more work (see below).
>
>> 2. More generally, there may be situations in which the semantics of
>> an "unsafe" function such as strcpy are more desirable for a number
>> of reasons. But I think it still is good to exercise discipline and
>> stick to safe functions, even if they have locally undesirable
>> effects, for the long-term security and stylistic consistency of
>> the code base.
> Maybe. That depends a *lot* on context. This is a real disadvantage
> of static analysis tools -- the person reading the output has to take
> a careful look at what the code actually does. There are other errors
> in the patches which I believe are the result of static analysis
> limitations (such as a +1 added to a malloc where it doesn't belong).
> I bring this up as a general "gotcha" to keep an eye on, not as
> specific criticism.
I agree that looking at the code carefully is essential to good
application of static analysis tools. And though I tried to do this,
clearly my brain was off some of the time :) But I'd still say that
eliminating unsafe functions is a good idea, independent of the valid
limitations of static analysis.
>> That said, there are other ways to change the code that use safe
>> functions and also avoid or minimize the concern above, and looking
>> into this is certainly worthwhile. For example, there could be a
>> wrapper that handles the malloc call and saves the mallocd size
>> "some expression" to another variable, used later in strlcpy. Even
>> the small step of setting "var = some expression", and then passing
>> in var both to malloc and to strlcpy helps to mitigate the concern,
>> and should be implemented. That we failed to implement these easy
>> mitigation strategies is an error, and in the future we will try to
>> be more diligent in making the patches as good as possible.
> In quite a few places, I have added a variable to store the results of
> the expression. Things like:
>
> +len = strlen(str) + 1;
> -buf = malloc(strlen(str) + 1);
> -g_strlcpy(buf, str, strlen(str) + 1);
> +buf = malloc(len);
> +g_strlcpy(buf, str, len);
>
> In general, this is a much superior solution. This was actually the
> point of my note, not that strlcpy is *inferior* to strcpy. Bounds
> checks with on-the-spot calculated bounds must be used *very*
> carefully, whereas bounds which are tied by the compiler to other
> manipulations and allocations are much safer.
>
> The real cake is, of course, sizeof() where applicable. :-)
Ah I see, yes that makes sense of course, and yes, sizeof is ideal.
> I have corrected these patterns in the patches where they occurred,
> and I have corrected fencepost errors where they occurred. I've
> applied about half of the patches with or without some modification.
> Some of the patches have more serious errors or make assumptions I
> need to validate before they can be applied. I hope to send a list of
> those patches and my specific concerns "soon".
>
> Thanks for your continued input,
> Ethan
Thanks for looking at this in detail and applying the superior solution
where possible. I should confess that the strlcpy part of the series of
patches we sent is probably the most error-prone, relied most heavily on
static analysis, and given the errors you've found, probably should be
taken more as a guide.
Dan
More information about the security
mailing list