Provided by: libqdbm-dev_1.8.78-12.1build2_amd64 bug

NAME

       QDBM - quick database manager

OVERVIEW

       QDBM  is  a  library  of routines for managing a database.  The database is a simple data file containing
       records, each is a pair of a key and a value.  Every key and value is serial bytes with variable  length.
       Both binary data and character string can be used as a key and a value.  There is neither concept of data
       tables nor data types.  Records are organized in hash table or B+ tree.

       As  for  database  of hash table, each key must be unique within a database, so it is impossible to store
       two or more records with a key overlaps.  The following access methods  are  provided  to  the  database:
       storing  a  record  with  a  key  and  a value, deleting a record by a key, retrieving a record by a key.
       Moreover, traversal access to every key are provided, although the  order  is  arbitrary.   These  access
       methods  are  similar  to  ones  of  DBM  (or  its  followers: NDBM and GDBM) library defined in the UNIX
       standard.  QDBM is an alternative for DBM because of its higher performance.

       As for database of B+ tree, records whose keys are duplicated can be stored.  Access methods of  storing,
       deleting, and retrieving are provided as with the database of hash table.  Records are stored in order by
       a  comparing  function  assigned  by  a  user.   It  is possible to access each record with the cursor in
       ascending or descending order.  According to this mechanism, forward  matching  search  for  strings  and
       range search for integers are realized.  Moreover, transaction is available in database of B+ tree.

EFFECTIVE IMPLEMENTATION OF HASH DATABASE

       QDBM  is  developed  referring  to  GDBM for the purpose of the following three points: higher processing
       speed, smaller size of a database file, and  simpler  API.   They  have  been  achieved.   Moreover,  the
       following  three  restrictions  of traditional DBM: a process can handle only one database, the size of a
       key and a value is bounded, a database file is sparse, are cleared.

       QDBM uses hash algorithm to retrieve records.  If a bucket array has sufficient number of  elements,  the
       time  complexity  of  retrieval  is  `O(1)'.  That is, time required for retrieving a record is constant,
       regardless of the scale of a database.  It is also the same about storing  and  deleting.   Collision  of
       hash  values  is managed by separate chaining.  Data structure of the chains is binary search tree.  Even
       if a bucket array has unusually scarce elements, the time complexity of retrieval is `O(log n)'.

       QDBM attains improvement in retrieval by loading RAM with the whole of a bucket array.  If a bucket array
       is on RAM, it is possible to access a region of a target record by about one path of file operations.   A
       bucket  array  saved  in a file is not read into RAM with the `read' call but directly mapped to RAM with
       the `mmap' call.  Therefore, preparation time on connecting to a database is very short, and two or  more
       processes can share the same memory map.

       If  the  number of elements of a bucket array is about half of records stored within a database, although
       it depends on characteristic of the input, the probability of collision of hash  values  is  about  56.7%
       (36.8%  if  the  same,  21.3%  if  twice, 11.5% if four times, 6.0% if eight times).  In such case, it is
       possible to retrieve a record by two or less paths of file operations.  If it is made into a  performance
       index,  in  order  to  handle  a  database  containing one million of records, a bucket array with half a
       million of elements is needed.  The size of each element is 4 bytes.  That is, if  2M  bytes  of  RAM  is
       available, a database containing one million records can be handled.

       QDBM provides two modes to connect to a database: `reader' and `writer'.  A reader can perform retrieving
       but  neither  storing  nor deleting.  A writer can perform all access methods.  Exclusion control between
       processes is performed when connecting to a database by file locking.  While a writer is connected  to  a
       database, neither readers nor writers can be connected.  While a reader is connected to a database, other
       readers can be connect, but writers can not.  According to this mechanism, data consistency is guaranteed
       with simultaneous connections in multitasking environment.

       Traditional  DBM provides two modes of the storing operations: `insert' and `replace'.  In the case a key
       overlaps an existing record, the insert mode keeps the existing value, while the replace mode  transposes
       it to the specified value.  In addition to the two modes, QDBM provides `concatenate' mode.  In the mode,
       the  specified value is concatenated at the end of the existing value and stored.  This feature is useful
       when adding a element to a value as an array.  Moreover, although DBM has a method to fetch out  a  value
       from  a database only by reading the whole of a region of a record, QDBM has a method to fetch out a part
       of a region of a value.  When a value is treated as an array, this feature is also useful.

       Generally speaking, while succession of updating, fragmentation of available regions occurs, and the size
       of a database grows rapidly.  QDBM deal with this problem by coalescence of dispensable regions and reuse
       of them, and featuring of optimization of a database.  When overwriting a record with a value whose  size
       is  greater  than the existing one, it is necessary to remove the region to another position of the file.
       Because the time complexity of the operation depends on the size of the region  of  a  record,  extending
       values successively is inefficient.  However, QDBM deal with this problem by alignment.  If increment can
       be put in padding, it is not necessary to remove the region.

       As  for  many  file systems, it is impossible to handle a file whose size is more than 2GB.  To deal with
       this problem, QDBM provides a directory  database  containing  multiple  database  files.   Due  to  this
       feature,  it is possible to handle a database whose total size is up to 1TB in theory.  Moreover, because
       database files can be deployed on multiple disks, the speed of updating operations  can  be  improved  as
       with  RAID-0  (striping).   It is also possible for the database files to deploy on multiple file servers
       using NFS and so on.

USEFUL IMPLEMENTATION OF B+ TREE DATABASE

       Although B+ tree database is slower than hash database, it features ordering access to each record.   The
       order  can  be  assigned  by users.  Records of B+ tree are sorted and arranged in logical pages.  Sparse
       index organized in B tree that is multiway balanced tree are maintained for each page.   Thus,  the  time
       complexity of retrieval and so on is `O(log n)'.  Cursor is provided to access each record in order.  The
       cursor  can  jump  to  a  position  specified  by a key and can step forward or backward from the current
       position.  Because each page is arranged as double linked list, the time complexity of stepping cursor is
       `O(1)'.

       B+ tree database is implemented, based on above hash database.  Because each page of B+ tree is stored as
       each record of hash database, B+  tree  database  inherits  efficiency  of  storage  management  of  hash
       database.   Because  the  header  of  each  record  is  smaller  and alignment of each page is calculated
       statistically, in most cases, the size of database file is cut by half compared to one of hash  database.
       Although  operation  of  many pages are required to update B+ tree, QDBM expedites the process by caching
       pages and reducing file operations.  In most cases, because whole  of  the  sparse  index  is  cached  on
       memory, it is possible to retrieve a record by one or less path of file operations.

       B+ tree database features transaction mechanism.  It is possible to commit a series of operations between
       the  beginning and the end of the transaction in a lump, or to abort the transaction and perform rollback
       to the state before the transaction.  Even if  the  process  of  an  application  is  crushed  while  the
       transaction, the database file is not broken.

       In  case  that  QDBM  is built with ZLIB, LZO, or BZIP2 enabled, a lossless data-compression library, the
       content of each page of B+ tree is compressed and stored in a file.  Because each record in  a  page  has
       similar  patterns,  high  efficiency  of  compression is expected due to the Lempel-Ziv algorithm and the
       like.  In case handling text data, the size of a database is reduced to about 25%.  If  the  scale  of  a
       database  is  large  and  disk  I/O  is  the bottleneck, featuring compression makes the processing speed
       improved to a large extent.

SIMPLE BUT VARIOUS INTERFACES

       QDBM provides very simple APIs.  You can perform database I/O as  usual  file  I/O  with  `FILE'  pointer
       defined  in  ANSI  C.   In  the  basic API of QDBM, entity of a database is recorded as one file.  In the
       extended API, entity of a database is recorded as several files in one directory.  Because the  two  APIs
       are very similar with each other, porting an application from one to the other is easy.

       APIs which are compatible with NDBM and GDBM are also provided.  As there are a lot of applications using
       NDBM  or  GDBM, it is easy to port them onto QDBM.  In most cases, it is completed only by replacement of
       header including (#include) and re-compiling.  However, QDBM can not handle database files  made  by  the
       original NDBM or GDBM.

       In  order  to  handle  records  on  memory  easily,  the  utility  API is provided.  It implements memory
       allocating functions, sorting functions, extensible datum, array list, hash map, and so on.  Using  them,
       you can handle records in C language cheaply as in such script languages as Perl or Ruby.

       B+  tree database is used with the advanced API.  The advanced API is implemented using the basic API and
       the utility API.  Because the advanced API is also similar to the basic API and the extended API,  it  is
       easy to learn how to use it.

       In  order  to  handle  an  inverted  index which is used by full-text search systems, the inverted API is
       provided.  If it is easy to handle an inverted index of documents,  an  application  can  focus  on  text
       processing  and  natural  language  processing.   Because this API does not depend on character codes nor
       languages, it is possible to implement a full-text search system which can respond  to  various  requests
       from users.

       Along  with  APIs  for  C,  QDBM provides APIs for C++, Java, Perl, and Ruby.  APIs for C are composed of
       seven kinds: the basic API, the extended API, the  NDBM-compatible  API,  the  GDBM-compatible  API,  the
       utility  API,  the advanced API, and the inverted API.  Command line interfaces corresponding to each API
       are also provided.  They are useful for  prototyping,  testing,  debugging,  and  so  on.   The  C++  API
       encapsulates  database  handling  functions of the basic API, the extended API, and the advanced API with
       class mechanism of C++.  The Java API has native methods calling the basic API, the extended API, and the
       advanced API with Java Native Interface.  The Perl API has methods calling the basic  API,  the  extended
       API,  and the advanced API with XS language.  The Ruby API has method calling the basic API, the extended
       API, and the advanced API as modules of Ruby.  Moreover, CGI scripts for administration of databases  and
       full-text search are provided.

WIDE PORTABILITY

       QDBM is implemented being based on syntax of ANSI C (C89) and using only APIs defined in ANSI C or POSIX.
       Thus,  QDBM  works on most UNIX and its compatible OSs.  As for C API, checking operations have been done
       at least on Linux 2.2, Linux 2.4, FreeBSD 4.8, FreeBSD 5.0, SunOS 5.7, SunOS 5.8, SunOS 5.9, HP-UX 11.00,
       Cygwin 1.3.10, Mac OS X 10.2, and RISC OS 5.03.  Although a database file created by QDBM depends on byte
       order of the processor, to do with it, utilities to dump data in format  which  is  independent  to  byte
       orders are provided.

BUILDING

       For  building  a  program  using  QDBM,  the  program should be linked with a library file `libqdbm.a' or
       `libqdbm.so'.  For example, the following command is executed to build `sample' from `sample.c'.

              gcc -I/usr/local/include -o sample sample.c -L/usr/local/lib -lqdbm

AUTHOR

       QDBM is written by Mikio Hirabayashi.  You can contact the author by e-mail to <mikio@fallabs.com>.   Any
       suggestion or bug report is welcome to the author.

COPYRIGHT

       Copyright(c) 2000-2003 Mikio Hirabayashi

       QDBM is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General
       Public  License  as  published  by  the Free Software Foundation; either version 2 of the License, or any
       later version.

       QDBM is distributed in the hope that it will be useful,  but  WITHOUT  ANY  WARRANTY;  without  even  the
       implied  warranty  of  MERCHANTABILITY  or  FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General
       Public License for more details.

       You should have received a copy of the GNU Lesser General Public License along with QDBM; if  not,  write
       to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

SEE ALSO

       depot(3), curia(3), relic(3), hovel(3), cabin(3), villa(3), odeum(3), ndbm(3), gdbm(3)

Man Page                                           2004-04-22                                            QDBM(3)