@Bibtex-file{Theory/hash.bib,
  title =        "Bibliography on Hashing",
  author =       "Nelson H. F. Beebe",
  address =      "Center for Scientific Computing\\ Department of
                 Mathematics\\ University of Utah\\ Salt Lake City, UT
                 84112\\ USA",
  email =        "beebe@math.utah.edu (Internet)",
  supported =    "yes",
  keywords =     "hashing",
  abstract =     "This bibliography records publications on the subject
                 of hashing, i.e., algorithms for lookup of keys in
                 large lists in (on average) constant time.\par For
                 static collections of text, such as data on CD ROMs,
                 minimal perfect hash functions are of considerable
                 interest, and the reader's attention is drawn to the
                 important breakthroughs represented by the work of E.
                 Fox and collaborators (1988--1992), which now permit
                 derivation of hash functions for collections of
                 millions of keys, instead of at most a few hundred with
                 the methods of earlier work. \par Witten, Moffat, and
                 Bell (Witten:1994:MGC) describe very recent work on
                 minimal ordered perfect hash functions, that is, ones
                 in which entries are stored in some predefined order,
                 such as alphabetical; this makes enumeration of a
                 sorted key list trivial. The methods of their book are
                 implemented in software (retrievable on the Internet)
                 for solving the full text search problem: given a word,
                 or word, find all documents in a large collection that
                 contain that word. Their software also supports Boolean
                 search (find A and B or C and not D), and query ranked
                 search (given a list of several words, find documents
                 containing them, and rank them by the number of
                 matches).",
  readme =       "These references have been extracted from a very large
                 computer science bibliography collection on
                 ftp.ira.uka.de in /pub/bibliography to which many
                 people of have contributed. The snapshot of this
                 collection was taken on 5-May-1994, and it consists of
                 441 BibTeX files, 2,672,675 lines, 205,289 entries, and
                 6,375 <at>String{} abbreviations, occupying 94.8MB of
                 disk space. \par Regrettably, the quality of many of
                 those bibliography files is low, with incomplete
                 bibliographic data (missing author initials, page
                 numbers, titles, proceedings cross-references, ....)
                 and spelling and typing errors. Also, because the
                 collection came from many sources, there is much
                 duplication, and I had to spend much longer than I
                 expected identifying duplicates, and merging them
                 manually into single entries with maximal bibliographic
                 information. \par I have corrected all spelling errors
                 that I could identify with the help of two separate
                 spelling programs, though this is difficult with
                 multi-lingual text. The list of spelling exceptions
                 (i.e. words believed to be correctly spelled, but
                 absent from the spelling program dictionaries) is kept
                 in the companion file, hash.sok. \par I have supplied
                 publisher, ISBN, LCCN, page number data to the extent
                 possible with the resources of the U.S. Library of
                 Congress catalog, and other university catalogs
                 accessible on the Internet, particularly the University
                 of California MELVYL catalog, and the Stanford
                 University RLIN catalog (thanks to the willow software
                 from the University of Washington). Their availability
                 is gratefully acknowledged. \par For books published
                 since 1972, when the International Standard Book
                 Numbering system was introduced, ISBNs are particularly
                 important, because they are unique numbers that
                 identify the country group, publisher, and book;
                 bookstores routinely request ISBNs from their
                 customers. \par More than 235 of these references are
                 papers in conference proceedings, and regrettably, for
                 about 30 of them, I have been unable to locate an exact
                 reference to the conference volume in the various
                 on-line library catalogs that I consulted. This is
                 disappointing, because it suggests that the papers will
                 be largely inaccessible. \par Missing data are
                 indicated throughout by question marks. Approximately a
                 third of the bibliographic entries contain them,
                 sigh... \par I will be very grateful to users of this
                 bibliography who can supply me with corrected
                 conference proceedings data for future editions of this
                 bibliography, as well as for new entries. Despite the
                 very large collection from which this data was
                 extracted, more than half of the papers in my personal
                 files of papers on hashing were absent from that
                 collection. Also, most of the references from Knuth's
                 exhaustive study (Knuth:1973:ACP), and from the books
                 by Vitter and Chen (Vitter:1987:DAC), Pieprzyk and
                 Sadeghiyan (Pieprzyk:1993:DHA), and Devroye
                 (Devroye:1986:LNB) were absent, and have been included
                 below. \par Because of my dissatisfaction with the
                 completeness of many of these entries, I have assigned
                 a major version number of 0 to this bibliography,
                 rather than the more usual 1. A substantial amount of
                 updating work remains to be done to remedy this
                 situation, and bring this bibliography up to the
                 standards which should be expected of professionals in
                 the field. This bibliography is nevertheless being made
                 available in its present state in the belief that it
                 will be useful to many people.",
}
