Ubuntu Manpage: dirfile — a filesystem-based database format for time-ordered binary data

Provided by: libgetdata-doc_0.11.0-13_all

NAME

       dirfile — a filesystem-based database format for time-ordered binary data

DESCRIPTION

The dirfile database format is designed to provide a fast, simple format for storing and reading binary
time-ordered data. Dirfiles can be read using the GetData Library, which provides a reference
implementation of these Standards.

The dirfile database is centred around one or more time-ordered data streams (time streams). Each time
stream is written to the filesystem in a separate file, as binary data. The name of each binary file
corresponds to the time stream's field name. Dirfiles support binary data fields for signed and unsigned
integer types of 8 to 64 bits, as well as single and double precision floating-point real or complex data
types.
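
As a rough illustration of the layout described above, the following Python sketch writes one raw time
stream to a flat binary file named after its field and reads it back. The field name temperature, the
little-endian 64-bit float sample type and the helper names are assumptions made for this example. In
a real dirfile the data type and byte order are declared in the format specification, and the GetData
Library remains the supported way to access the data.

```python
import os
import struct
import tempfile


def write_raw_field(dirfile_dir, field_name, samples, fmt="<d"):
    """Append samples to the binary file named after the field."""
    path = os.path.join(dirfile_dir, field_name)
    with open(path, "ab") as f:
        for value in samples:
            f.write(struct.pack(fmt, value))
    return path


def read_raw_field(dirfile_dir, field_name, fmt="<d"):
    """Read every sample from the binary file named after the field."""
    size = struct.calcsize(fmt)
    path = os.path.join(dirfile_dir, field_name)
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // size
    return [struct.unpack_from(fmt, data, i * size)[0] for i in range(count)]


def demo_raw_field():
    """Round-trip three samples through a temporary directory."""
    with tempfile.TemporaryDirectory() as d:
        write_raw_field(d, "temperature", [1.5, 2.5, 3.5])
        return read_raw_field(d, "temperature")
```

The sketch deliberately omits the format file, so the directory it produces is not a complete dirfile.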

Two time streams may have different constant sampling frequencies, and mechanisms exist within the
dirfile format to ensure these time streams remain properly sequenced in time.

To do this, the time streams in the dirfile are subdivided into frames. Each frame contains a fixed
integer number of samples of each time stream. Two time streams in the same dirfile may have different
numbers of samples per frame, but the number of samples per frame of any given time stream is fixed.

When synchronous retrieval of data from more than one time stream is required, position in the dirfile
can be specified in frames, which will ensure synchronicity.
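
The relationship between frames and samples can be sketched with a little arithmetic. In the Python
example below, the field names fast and slow and their samples-per-frame values are hypothetical. The
point is only that the same frame range maps to a different sample range for each time stream.

```python
def sample_range(first_frame, num_frames, samples_per_frame):
    """Return (first_sample, num_samples) covering the requested frames."""
    return first_frame * samples_per_frame, num_frames * samples_per_frame


# Hypothetical fields: 'fast' has 20 samples per frame, 'slow' has 1.
spf = {"fast": 20, "slow": 1}

# Request frames 100 through 104 from both fields.
ranges = {name: sample_range(100, 5, n) for name, n in spf.items()}
```

Here ranges["fast"] covers 100 samples starting at sample 2000, while ranges["slow"] covers 5 samples
starting at sample 100. Both spans represent the same interval of time.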

The binary files are all located in one or more filesystem directories, rooted around a central
directory, known as the dirfile directory. The dirfile as a whole may be referred to by its dirfile
directory path.

Included in the dirfile along with the time streams is the dirfile format specification, which is one or
more ASCII text files containing the dirfile database metadata. The primary file is the file called
format located in the dirfile directory. This file, and any additional files that it names, fully specify
the dirfile's metadata. For the syntax of these files, see dirfile-format(5).

Version 3 of the Dirfile Standards introduced the large dirfile extension. This extension added the
ability to distribute the dirfile metadata among multiple files (called fragments) in addition to the
format file, as well as the ability to house portions of the database in subdirfiles. These subdirfiles
may be fully fledged dirfiles in their own right, but may also be contained within a larger, parent
dirfile. See dirfile-format(5) for information on specifying these subdirfiles.

In addition to the raw fields on disk, the dirfile format specification may also specify derived fields
which are calculated by performing simple element-wise operations on one or more input fields. Derived
fields behave identically to raw fields when read via GetData. See dirfile-format(5) for a complete list
of derived field types. Dirfiles may also contain constant scalar fields, both numerical and character
string, which are likewise described in dirfile-format(5).

Dirfiles are designed to be written to and read simultaneously. The dirfile specification dictates that
one particular raw field (specified either explicitly or implicitly by the dirfile metadata) is to be
used as the reference field: all other vector fields are assumed to have at least as many frames as the
reference field has, and the size (in frames) of the reference field is used as the size of the dirfile
as a whole.
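
A minimal sketch of how the size of a dirfile could be derived from an unencoded reference field
follows. The file name reference, the 32-bit integer sample type and the value of 10 samples per frame
are assumptions for the example, as is the choice to count only complete frames. A file that is still
being written may end part-way through a frame.

```python
import os
import struct
import tempfile


def frames_in_raw_file(path, samples_per_frame, sample_size):
    """Number of complete frames stored in an unencoded raw file."""
    return os.path.getsize(path) // (samples_per_frame * sample_size)


def demo_frame_count():
    """Write 45 four-byte samples and count frames of 10 samples each."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "reference")
        with open(path, "wb") as f:
            f.write(struct.pack("<45i", *range(45)))
        return frames_in_raw_file(path, samples_per_frame=10, sample_size=4)
```

With 45 samples on disk and 10 samples per frame, the sketch reports four complete frames and ignores
the trailing five samples.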

Version 6 of the Dirfile Standards added the ability to encode the binary files on disk. Each fragment
may have its own encoding scheme. Most commonly, encodings are used to compress the data files to save
space. See dirfile-encoding(5) for information on encoding schemes.

Complex Number Storage Format
Version 7 of the Dirfile Standards added support for complex valued data. Two types of complex valued
data are supported by the Dirfile Standards:

• A 64-bit complex number consisting of an IEEE-754 standard 32-bit single precision floating point real
part and an IEEE-754 standard 32-bit single precision floating point imaginary part, and

• A 128-bit complex number consisting of an IEEE-754 standard 64-bit double precision floating point
real part and an IEEE-754 standard 64-bit double precision floating point imaginary part.

No integer-type complex numbers are supported.

Unencoded complex numbers are stored on disk in "Fortran order", that is with the IEEE-754 real part
followed by the IEEE-754 imaginary part. The specified endianness of the two components follows that of
purely real floating point numbers. Endianness does not affect the ordering of the real and imaginary
parts. This format also conforms to the C99 and C++11 standards.
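
The storage order described above can be illustrated with Python's struct module. This is a sketch
for a single 128-bit complex sample. The helper names are made up for the example and are not part of
GetData.

```python
import struct


def pack_complex128(z, byte_order="<"):
    """Pack one complex sample as real part followed by imaginary part."""
    return struct.pack(byte_order + "dd", z.real, z.imag)


def unpack_complex128(buf, byte_order="<"):
    """Unpack one 16-byte complex sample stored real part first."""
    re, im = struct.unpack(byte_order + "dd", buf)
    return complex(re, im)


z = complex(1.5, -2.0)
little = pack_complex128(z, "<")  # little-endian components
big = pack_complex128(z, ">")     # big-endian components
```

In both byte orders the first eight bytes hold the real part and the last eight hold the imaginary
part. Only the byte order within each component differs.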

To aid in using complex valued data, dirfile field codes may contain a representation suffix which
specifies a function to apply to the complex valued data to map it into purely real data. See
dirfile-format(5).
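
The idea of mapping complex valued data into purely real data can be sketched as follows. The four
mappings and their labels are assumptions chosen for illustration. The actual set of representations
and the suffix codes that select them are defined in dirfile-format(5), not here.

```python
import cmath

# Hypothetical labels; the real suffix codes are given in dirfile-format(5).
REPRESENTATIONS = {
    "real": lambda z: z.real,
    "imaginary": lambda z: z.imag,
    "modulus": abs,
    "argument": cmath.phase,
}


def apply_representation(samples, name):
    """Map each complex sample to a real value using the named function."""
    func = REPRESENTATIONS[name]
    return [func(z) for z in samples]
```

For example, apply_representation([3 + 4j], "modulus") yields a list holding the single real value 5.0.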

AUTHORS

       The Dirfile format was created by C. B. Netterfield <netterfield@astro.utoronto.ca>. It is now
       maintained by D. V. Wiebe <getdata@ketiltrout.net>.
