22/06/2020

Nearly optimal static Las Vegas succinct dictionary

Huacheng Yu

Keywords: Las Vegas algorithm, succinct data structure, locally decodable source coding, dictionary

Abstract: Given a set $S$ of $n$ (distinct) keys from key space $[U]$, each associated with a value from $\Sigma$, the static dictionary problem asks to preprocess these (key, value) pairs into a data structure, supporting value-retrieval queries: for any given $x \in [U]$, valRet($x$) must return the value associated with $x$ if $x \in S$, or return $\bot$ if $x \notin S$. The special case where $|\Sigma| = 1$ is called the membership problem. The "textbook" solution is to use a hash table, which occupies linear space and answers each query in constant time. On the other hand, the minimum possible space to encode all (key, value) pairs is only $\mathrm{OPT} := \lceil \lg_2 \binom{U}{n} + n \lg_2 |\Sigma| \rceil$ bits, which could be much less. In this paper, we design a randomized dictionary data structure using $\mathrm{OPT} + \lg n + O(\lg\lg\lg\lg\lg U)$ bits of space, and it has expected constant query time, assuming the query algorithm can access an external lookup table of size $n^{0.001}$. The lookup table depends only on $U$, $n$ and $|\Sigma|$, and not the input. Previously, even for membership queries and $U \le n^{O(1)}$, the best known data structure with constant query time requires $\mathrm{OPT} + n/\lg n$ bits of space by Pagh (SIAM J. Comput. 2001) and Pătraşcu (FOCS 2008); the best known using $\mathrm{OPT} + n^{0.999}$ space has query time $O(\lg n)$; the only known non-trivial data structure with $\mathrm{OPT} + n^{0.001}$ space has $O(\lg n)$ query time and requires a lookup table of size $\ge n^{2.99}$ (!). Our new data structure answers open questions by Pătraşcu and Thorup.
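
To make the gap between the information-theoretic bound and the textbook solution concrete, the following is a minimal Python sketch. It only evaluates the OPT formula from the abstract and models valRet with an ordinary hash table; it is not the data structure of the paper. The names `opt_bits` and `val_ret` are hypothetical, chosen for illustration, and `None` stands in for ⊥.

```python
from math import comb


def opt_bits(U: int, n: int, sigma: int) -> int:
    """Evaluate OPT = ceil(lg C(U, n) + n * lg(sigma)) exactly.

    The number of possible inputs is N = C(U, n) * sigma**n, and for an
    integer N >= 1, ceil(lg N) equals (N - 1).bit_length().
    """
    num_inputs = comb(U, n) * sigma ** n
    return (num_inputs - 1).bit_length()


def val_ret(table: dict, x):
    """Textbook hash-table baseline: the value for x, or None (for ⊥) if x is not a key."""
    return table.get(x)


# Membership problem (|Σ| = 1) with U = 8 and n = 2: C(8, 2) = 28 possible sets.
print(opt_bits(8, 2, 1))  # 5

# Storing each of 1000 keys from a 32-bit universe verbatim costs 1000 * 32 bits;
# OPT is smaller because the order of the keys carries no information.
print(opt_bits(2**32, 1000, 1), "vs", 1000 * 32)
```

The exact integer computation avoids floating-point rounding in the ceiling, at the cost of building a large intermediate integer, which is fine for illustration but not for very large $n$.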

The talk and the corresponding paper were published at the STOC 2020 virtual conference.

Similar Papers