Synopsis
Judy1 - maps an Index (word) to a bit
JudyL - maps an Index (word) to a Value (word/pointer)
JudySL - maps an Index (null terminated string) to a Value
JudyHS - maps an Index (array-of-bytes) of Length to a Value
Description
The Judy family of functions supports fully dynamic arrays. These
arrays may be indexed by a 32- or 64-bit word (depending on processor
word size), a null terminated string or an array-of-bytes plus length.
A dynamic array (sparsely populated) can also be thought of as a
mapping function or associative memory.
A Word_t is a typedef unsigned long int in
Judy.h and must be the same size as sizeof(void *) I.E.
a pointer.
Judy1 functions: Index is a
Word_t and Value is just a bit or simply a flag
that Index is present or missing from the array. This can be
thought of as a huge bitmap.
JudyL functions: Index is a
Word_t and Value is a Word_t. This makes
JudyL a pure word-to-word/pointer mapper. JudySL and
JudyHL are based on this property of JudyL.
JudySL functions: Index is a
null-terminated string and Value is a Word_t.
JudyHS functions: Index is an
array-of-bytes of length: Length. Value is a
Word_t. This new addition (May 2004) to Judy is a hybird using
the best features of hashing and Judy methods. The author believes
JudyHS is a good replacement for a hashing method when resizing
the hash table is done during population growth. A correctly tuned
hash method with a static hash table size and population is
unbeatable for speed. However, JudyHS will perform better than
a hashing method with smaller and larger populations than the optimum
hash table size. JudyHS does not have a degenerate performance
case where knowledge of the hash algorithm can be exploited. (I.E.
JudyHS does not use a linked list to handle hash collisions, it uses a
tree of JudyL arrays and a virtual hash table size of 4
billion).
Judy arrays are both speed- and memory-efficient, with
no tuning or configuration required, across a wide range of index set
types (sequential, periodic, clustered, random). Judy's speed and
memory usage are typically better than other data storage models such
as skiplists, linked lists, binary, ternary, b-trees, or even hashing,
and improves with very large data sets.
A Judy array is created merely by defining a null pointer and then
storing (inserting) the first element into the array under that
pointer. The memory used by a Judy array is nearly proportional to
the population (number of elements).
Judy has two Application Program Interfaces (APIs): a C macro
interface, and a function call interface. Because the macro forms are
sometimes faster and have a simpler error handling interface than the
equivalent functions, they are the preferred way of using the Judy
functions.
Since an initial (empty) Judy array is represented by a null pointer,
it is possible to construct an array of Judy arrays. In other words,
a Judy array's Values (except Judy1) can be pointers to other
Judy arrays. This makes it very simple to construct an array with an
arbitrary number of dimensions or Index sizes. (JudySL and
JudyHS are implemented using JudyL this way).
Detailed Description:
|