Using different hash algorithm in purple_util_get_image_checksum()

Ethan Blanton elb at pidgin.im
Wed Jul 1 12:46:02 EDT 2009


Mark Doliner spake unto us the following wisdom:
> On Tue, Jun 30, 2009 at 7:56 PM, Ethan Blanton<elb at pidgin.im> wrote:
> > A 32-bit hash brings collisions into the realm of possibility, as it
> > requires only about 2^16 icons in the store for the probability of
> > collision to rise to about 50%.
> >
> > We don't necessarily need a SHA-sized hash, but we need more than
> > Adler32.  I suggest, however, that there's not much in between that
> > will have *practical* implementations which are faster than SHA.  (SHA
> > implementations in crypto libraries are heavily optimized, as are many
> > Adler32 implementations.  I am unaware of, for example, a 64-bit
> > checksum or hashing function with similar widespread optimization.)
> > It's possible that AES-128 would demonstrate benefit, or even DES with
> > a 56-bit key; you might want to benchmark.
> 
> At Meebo we chose to create two 32-bit hashes, one from one chunk of
> the icon data and one from a different chunk, then concat them to
> create a 64-bit checksum.  Vijay from Meebo tells me that he seems to
> remember the sha-1 implementation in glib 2.16 is much faster than the
> one in our cipher.c.

This is a good compromise.  And, in fact, can probably be made "secure
enough", particularly as the actual hash values are not leaked outside
of Meebo.  It sounds to me like a good solution for 3.x is maybe to
allow for Pidgin's CAS to be offloaded to a UI-provided function.
This would let installations such as Meebo (with differing concerns
from single-instance Pidgin) take more complicated addressing measures.

We could quite possibly fruitfully make the specific hash functions we
use conditional on available library implementations.

> > (In reality, I submit that we *do* need a cryptographically secure hash
> >  function.  While this is not always the case for content-addressed
> >  storage, in our case we are storing objects which are provided by
> >  untrusted users from the network.  It is relatively easy to
> >  reverse-collide an Adler32 checksum, and possible for an individual
> >  to reverse-collide a DES "hashed" block.  This would mean that a
> >  malicious user with a timing advantage could pollute your icon store
> >  and prevent legitimate buddy icons from being fetched.  I sketch:
> >
> >  Mallory sees that Alice has changed her buddy icon, but Bob is not
> >  online.  Mallory computes an icon which collides with Alice's in the
> >  Pidgin buddy icon CAS.  When Bob comes online, Mallory rapidly sends
> >  Bob this new buddy icon information as his own buddy icon, storing it
> >  in Bob's CAS.  When Bob engages in conversation with Alice, Mallory's
> >  buddy icon is now shown for Alice.
> >
> >  The saving grace here is ... nobody cares that much about buddy
> >  icons.  Particularly now that Hulu is around.)
> 
> Yeah, we thought about all of this and didn't deem it important enough
> for us to keep using sha-1.  As you point out it is relatively easy to
> create a chunk of data with an Adler32 checksum that matches another
> chunk of data, and so it would be relatively easy to make someone's
> icon stop appearing.  But it would be harder to replace someone's icon
> with a different image because you would have to create a chunk of
> data that 1. has the same checksum and 2. is a valid image.

... which, with a 24-bit color uncompressed bitmap, is likely a
reasonable thing to do.  Compressed images are, of course, another
matter.

> I'm hoping a few other devs weight in, but it sounds like we want to
> stick with sha-1.  I'm not sure the performance advantage of md5 over
> sha1 justifies the code changes that would be required.  And the
> checksumming matters a lot less when you're running one instance on
> your personal computer than when you have a few thousand users on a
> single server.

Agreed, re: MD5.  Using a more optimized version of SHA, if available,
seems like a good idea, however.

Ethan

-- 
The laws that forbid the carrying of arms are laws [that have no remedy
for evils].  They disarm only those who are neither inclined nor
determined to commit crimes.
		-- Cesare Beccaria, "On Crimes and Punishments", 1764
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 481 bytes
Desc: Digital signature
URL: <http://pidgin.im/pipermail/devel/attachments/20090701/aa762145/attachment.sig>


More information about the Devel mailing list