====Compression functions====
Besides the [http://www.cs.berkeley.edu/~jrs/61bf06/lec/22 lecture notes], compression functions are also covered in Section 9.2.4 of Goodrich and Tamassia. Unfortunately, they make the erroneous claim that, for a hash code <code>i</code> and an <code>N</code>-bucket hash table, the compression functions they suggest are good.
If you have <code>N</code> buckets and a good (pseudorandom) hash function, the probability of any two keys colliding is <code>1/N</code>. So when you have <code>i</code> keys in the table and insert key <code>i + 1</code>, the probability that the new key does '''not''' collide with any old key is <code>(1 - 1/N)<sup>i</sup></code>. If you insert <code>n</code> distinct items, the expected number that '''won't''' collide is
<code>sum<sub>i=0</sub><sup>n-1</sup> (1 - 1/N)<sup>i</sup> = N - N (1 - 1/N)<sup>n</sup></code> (a finite geometric series),
so the expected number of collisions is <code>n - N + N (1 - 1/N)<sup>n</sup></code>.
Now, for any <code>n</code> and <code>N</code> you test, you can just plug them into this formula and see
if the number of collisions you're getting is in the ballpark of what you
should expect to get. For example, if you have <code>N</code> = 100 buckets and <code>n</code> = 100
keys, expect about 36.6 collisions.
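
If you want to automate that check, here is a minimal Java sketch (not part of the homework skeleton; the class and method names are made up for illustration) that evaluates the formula above so you can compare it with the collision counts your table actually produces:

<pre>
// Minimal sketch: evaluates the expected-collision formula
//   n - N + N (1 - 1/N)^n
// so you can compare it with the collisions your hash table reports.
public class ExpectedCollisions {

  /** Expected collisions when n keys with a pseudorandom hash code
   *  are inserted into a table with nBuckets buckets. */
  static double expectedCollisions(int n, int nBuckets) {
    return n - nBuckets + nBuckets * Math.pow(1.0 - 1.0 / nBuckets, n);
  }

  public static void main(String[] args) {
    // The example from the text: 100 keys, 100 buckets.
    System.out.println(expectedCollisions(100, 100));  // prints roughly 36.6
  }
}
</pre>

Actual counts will fluctuate around this expectation, so you are only looking for numbers in the same ballpark.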