Open main menu

lensowiki β

Changes

Computer Science/61b/Homework/hw6

191 bytes added, 21:56, 26 April 2007
A tutorial on collision probability: formatting
are good.
If you have <code>N </code> buckets and a good (pseudorandom) hash function, the probabilityof any two keys colliding is <code>1/N</code>. So when you have <code>i </code> keys in the table andinsert key <code>i + 1</code>, the probability that the new key does NOT '''not''' collide with anyold key is <code>(1 - 1/N)^<sup>i</sup></code>. If you insert <code>n </code> distinct items, the expected numberthat WON'T ''won't''' collide is
n-1 sum (1 - 1/N)^<sup>i </sup> = N - N (1 - 1/N)^<sup>n,</sup> i=0
so the expected number of collisions is<code>n - N + N (1 - 1/N)<sup>n</sup></code>.
n - N + N (1 - 1/N)^n. Now, for any <code>n </code> and <code>N </code> you test, you can just plug them into this formula and see
if the number of collisions you're getting is in the ballpark of what you
should expect to get. For example, if you have <code>N </code> = 100 buckets and <code>n </code> = 100
keys, expect about 36.6 collisions.
1,277
edits