Publikationsansicht

Linear hash functions (1999)

Abstract
Consider the set H of all linear (or affine) transformations between two vector spaces over a finite field F. We study how good H is as a class of hash functions, namely we consider hashing a set S of size n into a range having the same cardinality n by a randomly chosen function from H and look at the expected size of the largest hash bucket. H is a universal class of hash functions for any finite field, but with respect to our measure different fields behave differently. If the finite field F has n elements then there is a bad set S ⊂ F 2 of size n with expected maximal bucket size Ω(n1/3). If n is a perfect square then there is even a bad set with largest bucket size always at least √ n. (This is worst possible, since with respect to a universal class of hash functions every set of size n has expected largest bucket size below √ n + 1/2.) If, however, we consider the field of two elements then we get much better bounds. The best√previously known upper bound on the expected size of the largest bucket for this class was log O(2 n). We reduce this upper bound to O(log n log log n). Note that this is not far from the guarantee for a random function. There, the average largest bucket would be Θ(log n / log log n). In the course of our proof we develop a tool which may be of independent interest. Suppose we have a subset S of a vector space D over Z2, and consider a random linear mapping of D to a smaller vector space R. If the cardinality of S is larger than cɛ|R | log |R | then with probability 1 − ɛ, the image of S will cover all elements in the range. 1

Details der Publikation
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=?doi=10.1.1.125.6956
Quelle http://www.cs.tau.ac.il/~nogaa/PDFS/linhash13.pdf
Mitarbeiter CiteSeerX
Archiv CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Typ text
Sprache Englisch
Verknüpfungen 10.1.1.117.9396, 10.1.1.85.7143, 10.1.1.88.2974