Provided by: debuginfod_0.190-1.1ubuntu0.1_amd64 bug

NAME

       debuginfod - debuginfo-related http file-server daemon

SYNOPSIS

       debuginfod [OPTION]... [PATH]...

DESCRIPTION

       debuginfod  serves debuginfo-related artifacts over HTTP.  It periodically scans a set of directories for
       ELF/DWARF files and their associated source code, as well as archive files containing the above, to build
       an index by their buildid.  This index is used when remote clients use the HTTP webapi,  to  fetch  these
       files by the same buildid.

       If  a  debuginfod  cannot  service  a  given  buildid  artifact request itself, and it is configured with
       information about upstream debuginfod servers,  it  queries  them  for  the  same  information,  just  as
       debuginfod-find  would.   If  successful,  it locally caches then relays the file content to the original
       requester.

       Indexing the given PATHs proceeds using multiple threads.  One  thread  periodically  traverses  all  the
       given  PATHs  logically  or  physically (see the -L option).  Duplicate PATHs are ignored.  You may use a
       file name for a PATH, but source code indexing may be incomplete; prefer using a directory that  contains
       the binaries.  The traversal thread enumerates all matching files (see the -I and -X options) into a work
       queue.   A  collection  of scanner threads (see the -c option) wait at the work queue to analyze files in
       parallel.

       If the -F option is given, each file is scanned as an ELF/DWARF file.   Source  files  are  matched  with
       DWARF files based on the AT_comp_dir (compilation directory) attributes inside it.  Caution: source files
       listed  in  the  DWARF may be a path anywhere in the file system, and debuginfod will readily serve their
       content on demand.  (Imagine a doctored DWARF file that lists /etc/passwd as a source file.)  If this  is
       a concern, audit your binaries with tools such as:

       % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p'
       or
       % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^Line.number/p'
       or even use debuginfod itself:
       % debuginfod -vvv -d :memory: -F BINARY 2>&1 | grep 'recorded.*source'
       ^C

       If  any  of  the -R, -U, or -Z options is given, each file is scanned as an archive file that may contain
       ELF/DWARF/source files.  Archive files are recognized by extension.  If -R is  given,  ".rpm"  files  are
       scanned;  if -U is given, ".deb" and ".ddeb" files are scanned; if -Z is given, the listed extensions are
       scanned.  Because of complications such as DWZ-compressed debuginfo, may require two traversal passes  to
       identify  all  source code.  Source files for RPMs are only served from other RPMs, so the caution for -F
       does not apply.  Note that due to Debian/Ubuntu packaging policies & mechanisms,  debuginfod  cannot  re‐
       solve source files for DEB/DDEB at all.

       If no PATH is listed, or none of the scanning options is given, then debuginfod will simply serve content
       that it accumulated into its index in all previous runs, periodically groom the database, and federate to
       any  upstream  debuginfod  servers.  In passive mode, debuginfod will only serve content from a read-only
       index and federated upstream servers, but will not scan or groom.

OPTIONS

       -F     Activate ELF/DWARF file scanning.  The default is off.

       -Z EXT -Z EXT=CMD
              Activate an additional pattern in archive scanning.  Files with name extension  EXT  (include  the
              dot)  will  be processed.  If CMD is given, it is invoked with the file name added to its argument
              list, and should produce a common archive on its standard output.  Otherwise, the file is read  as
              if  CMD were "cat".  Since debuginfod internally uses libarchive to read archive files, it can ac‐
              cept a wide range of archive formats and compression modes.  The default  is  no  additional  pat‐
              terns.  This option may be repeated.

       -R     Activate  RPM patterns in archive scanning.  The default is off.  Equivalent to -Z .rpm=cat, since
              libarchive can natively process RPM archives.  If your version of libarchive is  much  older  than
              2020, be aware that some distributions have switched to an incompatible zstd compression for their
              payload.  You may experiment with -Z .rpm='(rpm2cpio|zstdcat)<' instead of -R.

       -U     Activate   DEB/DDEB   patterns   in   archive  scanning.   The  default  is  off.   Equivalent  to
              -Z .deb='dpkg-deb --fsys-tarfile' -Z .ddeb='dpkg-deb --fsys-tarfile'.

       -d FILE --database=FILE
              Set the path of the sqlite database used to store the index.  This file is disposable in the sense
              that a later rescan will repopulate data.  It will contain absolute file path names, so it may not
              be portable across machines.  It may be frequently  read/written,  so  it  should  be  on  a  fast
              filesystem.   It should not be shared across machines or users, to maximize sqlite locking perfor‐
              mance.  For quick testing the magic string ":memory:" can be used to use an  one-time  memory-only
              database.  The default database file is $HOME/.debuginfod.sqlite.

       --passive
              Set the server to passive mode, where it only services webapi requests, including participating in
              federation.   It performs no scanning, no grooming, and so only opens the sqlite database read-on‐
              ly.  This way a database can be safely shared between a active scanner/groomer server and multiple
              passive ones, thereby sharing service load.  Archive pattern options must still be given,  so  de‐
              buginfod can recognize file name extensions for unpacking.

       -D SQL --ddl=SQL
              Execute given sqlite statement after the database is opened and initialized as extra DDL (SQL data
              definition  language).  This may be useful to tune performance-related pragmas or indexes.  May be
              repeated.  The default is nothing extra.

       -p NUM --port=NUM
              Set the TCP port number (0 < NUM < 65536) on which debuginfod should listen, to service  HTTP  re‐
              quests.  Both IPv4 and IPV6 sockets are opened, if possible.  The webapi is documented below.  The
              default port number is 8002.

       -I REGEX --include=REGEX -X REGEX --exclude=REGEX
              Govern  the inclusion and exclusion of file names under the search paths.  The regular expressions
              are interpreted as unanchored POSIX extended REs, thus may include alternation.  They are evaluat‐
              ed against the full path of each file, based on its realpath(3) canonicalization.  By default, all
              files are included and none are excluded.  A file that matches both include and exclude  REGEX  is
              excluded.   (The  contents  of  archive files are not subject to inclusion or exclusion filtering:
              they are all processed.)  Only the last of each type of regular expression given is used.

       -t SECONDS --rescan-time=SECONDS
              Set the rescan time for the file and archive directories.  This is the amount of time the  traver‐
              sal  thread will wait after finishing a scan, before doing it again.  A rescan for unchanged files
              is fast (because the index also stores the file mtimes).  A time of zero is acceptable, and  means
              that only one initial scan should performed.  The default rescan time is 300 seconds.  Receiving a
              SIGUSR1 signal triggers a new scan, independent of the rescan time (including if it was zero), in‐
              terrupting a groom pass (if any).

       -r     Apply  the  -I  and  -X during groom cycles, so that most content related to files excluded by the
              regexes are removed from the index.  Not all content can be practically removed, so  eventually  a
              "-G" "maximal-groom" operation may be needed.

       -g SECONDS --groom-time=SECONDS
              Set  the  groom  time for the index database.  This is the amount of time the grooming thread will
              wait after finishing a grooming pass before doing it again.  A groom operation quickly rescans all
              previously scanned files, only to see if they are still present and current, so it can deindex ob‐
              solete files.  See also the DATA MANAGEMENT section.  The default groom time is 86400  seconds  (1
              day).   A  time  of zero is acceptable, and means that only one initial groom should be performed.
              Receiving a SIGUSR2 signal triggers a new grooming pass, independent of the groom time  (including
              if it was zero), interrupting a rescan pass (if any)..

       -G     Run an extraordinary maximal-grooming pass at debuginfod startup.  This pass can take considerable
              time, because it tries to remove any debuginfo-unrelated content from the archive-related parts of
              the  index.   It  should not be run if any recent archive-related indexing operations were aborted
              early.  It can take considerable space, because it finishes up with an sqlite "vacuum"  operation,
              which repacks the database file by triplicating it temporarily.  The default is not to do maximal-
              grooming.  See also the DATA MANAGEMENT section.

       -c NUM --concurrency=NUM
              Set  the concurrency limit for the scanning queue threads, which work together to process archives
              & files located by the traversal thread.  This important for controlling CPU-intensive  operations
              like  parsing  an  ELF  file and especially decompressing archives.  The default is related to the
              number of processors on the system and other constraints; the minimum is 1.

       -C -C=NUM --connection-pool --connection-pool=NUM
              Set the size of the pool of threads serving webapi queries.  The following  table  summarizes  the
              interpretaton of this option and its optional NUM parameter.
              no option, -C   use a fixed thread pool sized automatically
              -C=NUM          use a fixed thread pool sized NUM, minimum 2

              The  first  mode  is a simple and safe configuration related to the number of processors and other
              constraints. The second mode is suitable for  tuned  load-limiting  configurations  facing  unruly
              traffic.

       -L     Traverse  symbolic  links encountered during traversal of the PATHs, including across devices - as
              in find -L.  The default is to traverse the physical directory structure only, stay  on  the  same
              device,  and  ignore  symlinks  - as in find -P -xdev.  Caution: a loops in the symbolic directory
              tree might lead to infinite traversal.

       --fdcache-fds=NUM --fdcache-mbs=MB
              Configure limits on a cache that keeps recently extracted files from archives.  Up to NUM request‐
              ed files and up to a total of MB megabytes will be kept extracted, in order to avoid having to de‐
              compress their archives over and over again. The default NUM and MB values depend on  the  concur‐
              rency  of  the system, and on the available disk space on the $TMPDIR or /tmp filesystem.  This is
              because that is where the most recently used extracted files are kept.  Grooming cleans  out  this
              cache.

       --fdcache-prefetch-fds=NUM --fdcache-prefetch-mbs=MB
              --fdcache-prefetch=NUM2

              In addition to the main fdcache, up to NUM2 other files from an archive may be prefetched into an‐
              other  cache  before  they  are  even  requested.   Configure  how many file descriptors (fds) and
              megabytes (mbs) are allocated to the prefetch fdcache. If unspecified, these values depend on con‐
              currency of the system and on the available disk space on the $TMPDIR.   Allocating  more  to  the
              prefetch  cache  will  improve  performance in environments where different parts of several large
              archives are being accessed.  This cache is also cleaned out during grooming.

       --fdcache-mintmp=NUM
              Configure a disk space threshold for emergency flushing of the caches.  The filesystem holding the
              caches is checked periodically.  If the available space falls  below  the  given  percentage,  the
              caches  are  flushed, and the fdcaches will stay disabled until the next groom cycle.  This mecha‐
              nism, along a few associated /metrics on the webapi, are intended to give an operator notice about
              storage scarcity - which can translate to RAM scarcity if the disk happens to be on a RAM  virtual
              disk.  The default threshold is 25%.

       --forwarded-ttl-limit=NUM
              Configure  limits of X-Forwarded-For hops. if X-Forwarded-For exceeds N hops, it will not delegate
              a local lookup miss to upstream debuginfods. The default limit is 8.

       --disable-source-scan
              Disable scan of the dwarf source info of debuginfo sections.  If a setup has no access  to  source
              code, the source info is not required.

       --scan-checkpoint=NUM
              Run  a  synchronized  SQLITE  WAL  checkpoint  operation after every NUM completed archive or file
              scans.  This may slow down parallel scanning phase somewhat, but generate much smaller "-wal" tem‐
              porary files on busy servers.  The default is 256.  Disabled if 0.

       -v     Increase verbosity of logging to the standard error file descriptor.  May be repeated to  increase
              details.  The default verbosity is 0.

WEBAPI

       debuginfod's  webapi  resembles ordinary file service, where a GET request with a path containing a known
       buildid results in a file.  Unknown buildid / request combinations result in HTTP error codes.  This file
       service resemblance is intentional, so that an installation can take advantage of standard  HTTP  manage‐
       ment infrastructure.

       For most queries, some custom http headers are added to the response, providing additional metadata about
       the buildid-related response.  For example:

       % debuginfod-find -v debuginfo /bin/ls |& grep -i x-debuginfo
       x-debuginfod-size: 502024
       x-debuginfod-archive: /mnt/fedora_koji_prod/koji/packages/coreutils/9.3/4.fc39/x86_64/coreutils-debuginfo-9.3-4.fc39.x86_64.rpm
       x-debuginfod-file: /usr/lib/debug/usr/bin/ls-9.3-4.fc39.x86_64.debug

       X-DEBUGINFOD-SIZE
              The size of the file, in bytes.  This may differ from the http Content-Length: field (if present),
              due to compression in transit.

       X-DEBUGINFOD-FILE
              The full path name of the file related to the given buildid.

       X-DEBUGINFOD-ARCHIVE
              The full path name of the archive that contained the above file, if any.

              There are a handful of buildid-related requests.  In each case, the buildid is encoded as a lower‐
              case hexadecimal string.  For example, for a program /bin/ls, look at the ELF note GNU_BUILD_ID:

       % readelf -n /bin/ls | grep -A4 build.id
       Note section [ 4] '.note.gnu.buildid' of 36 bytes at offset 0x340:
       Owner          Data size  Type
       GNU                   20  GNU_BUILD_ID
       Build ID: 8713b9c3fb8a720137a4a08b325905c7aaf8429d

       Then the hexadecimal BUILDID is simply:

       8713b9c3fb8a720137a4a08b325905c7aaf8429d

   /buildid/BUILDID/debuginfo
       If  the  given  buildid is known to the server, this request will result in a binary object that contains
       the customary .*debug_* sections.  This may be a split debuginfo file as created by strip, or it  may  be
       an original unstripped executable.

   /buildid/BUILDID/executable
       If  the  given  buildid is known to the server, this request will result in a binary object that contains
       the normal executable segments.  This may be a executable stripped by strip, or it may be an original un‐
       stripped executable.  ET_DYN shared libraries are considered to be a type of executable.

   /buildid/BUILDID/source/SOURCE/FILE
       If the given buildid is known to the server, this request will result in a binary  object  that  contains
       the  source  file  mentioned.   The  path should be absolute.  Relative path names commonly appear in the
       DWARF file's source directory, but these paths are relative to individual  compilation  unit  AT_comp_dir
       paths,  and yet an executable is made up of multiple CUs.  Therefore, to disambiguate, debuginfod expects
       source queries to prefix relative path names with the CU compilation-directory, followed by  a  mandatory
       "/".

       Note: the caller may or may not elide ../ or /./ or extraneous /// sorts of path components in the direc‐
       tory  names.  debuginfod accepts both forms.  Specifically, debuginfod canonicalizes path names according
       to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing any // to / in the path.

       For example:
       #include <stdio.h>               /buildid/BUILDID/source/usr/include/stdio.h
       /path/to/foo.c                   /buildid/BUILDID/source/path/to/foo.c
       ../bar/foo.c AT_comp_dir=/zoo/   /buildid/BUILDID/source/zoo//../bar/foo.c

       Note: the client should %-escape characters in /SOURCE/FILE that are not shown as "unreserved" in section
       2.3 of RFC3986. Some characters that will be escaped include "+", "\", "$", "!", the  'space'  character,
       and ";". RFC3986 includes a more comprehensive list of these characters.

   /buildid/BUILDID/section/SECTION
       If  the  given  buildid  is  known  to  the server, the server will attempt to extract the contents of an
       ELF/DWARF section named SECTION from the debuginfo file matching BUILDID.  If the debuginfo file can't be
       found or the section has type SHT_NOBITS, then the server will attempt to extract the  section  from  the
       executable matching BUILDID.  If the section is successfully extracted then this request results in a bi‐
       nary  object of the section's contents.  Note that this result is the raw binary contents of the section,
       not an ELF file.

   /metrics
       This endpoint returns a Prometheus formatted text/plain dump of a variety of statistics about the  opera‐
       tion  of  the  debuginfod  server.  The exact set of metrics and their meanings may change in future ver‐
       sions.

DATA MANAGEMENT

       debuginfod stores its index in an sqlite database in a densely packed set of interlinked  tables.   While
       the  representation is as efficient as we have been able to make it, it still takes a considerable amount
       of data to record all debuginfo-related data of potentially a great many files.  This section offers some
       advice about the implications.

       As a general explanation for size, consider that debuginfod indexes  ELF/DWARF  files,  it  stores  their
       names  and  referenced source file names, and buildids will be stored.  When indexing archives, it stores
       every file name of or in an archive, every buildid, plus every source file name referenced from  a  DWARF
       file.   (Indexing archives takes more space because the source files often reside in separate subpackages
       that may not be indexed at the same pass, so extra metadata has to be kept.)

       Getting down to numbers, in the case of Fedora RPMs (essentially, gzip-compressed cpio files), the sqlite
       index database tends to be from 0.5% to 3% of their size.  It's larger for binaries  that  are  assembled
       out  of  a  great  many source files, or packages that carry much debuginfo-unrelated content.  It may be
       even larger during the indexing phase due to temporary sqlite write-ahead-logging files; these are check‐
       pointed (cleaned out and removed) at shutdown.  It may be helpful to apply tight -I or -X regular-expres‐
       sion constraints to exclude files from scanning that you know have no debuginfo-relevant content.

       As debuginfod runs in normal active mode, it periodically rescans its target  directories,  and  any  new
       content  found  is  added  to the database.  Old content, such as data for files that have disappeared or
       that have been replaced with newer versions is removed at a periodic grooming pass.  This means that  the
       sqlite files grow fast during initial indexing, slowly during index rescans, and periodically shrink dur‐
       ing  grooming.  There is also an optional one-shot maximal grooming pass is available.  It removes infor‐
       mation debuginfo-unrelated data from the archive content index such  as  file  names  found  in  archives
       ("archive  sdef"  records)  that  are  not referred to as source files from any binaries find in archives
       ("archive sref" records).  This can save considerable disk space.  However, it is  slow  and  temporarily
       requires  up  to twice the database size as free space.  Worse: it may result in missing source-code info
       if the archive traversals were interrupted, so that not all source file references were  known.   Use  it
       rarely to polish a complete index.

       You  should  ensure  that ample disk space remains available.  (The flood of error messages on -ENOSPC is
       ugly and nagging.  But, like for most other errors, debuginfod will resume when  resources  permit.)   If
       necessary, debuginfod can be stopped, the database file moved or removed, and debuginfod restarted.

       sqlite  offers  several  performance-related options in the form of pragmas.  Some may be useful to fine-
       tune the defaults plus the debuginfod extras.  The -D option may be useful to tell debuginfod to  execute
       the  given  bits  of  SQL  after  the  basic  schema  creation commands.  For example, the "synchronous",
       "cache_size", "auto_vacuum", "threads", "journal_mode" pragmas may be fun to  tweak  via  -D,  if  you're
       searching  for  peak performance.  The "optimize", "wal_checkpoint" pragmas may be useful to run periodi‐
       cally, outside debuginfod.  The default settings are performance- rather than reliability-oriented, so  a
       hardware  crash  might  corrupt the database.  In these cases, it may be necessary to manually delete the
       sqlite database and start over.

       As debuginfod changes in the future, we may have no choice but to change the database schema in an incom‐
       patible manner.  If this happens, new versions of debuginfod will issue SQL statements to drop all  prior
       schema  &  data,  and  start  over.   So, disk space will not be wasted for retaining a no-longer-useable
       dataset.

       In summary, if your system can bear a 0.5%-3% index-to-archive-dataset size ratio, and slow growth after‐
       wards, you should not need to worry about disk space.  If a system crash corrupts the  database,  or  you
       want to force debuginfod to reset and start over, simply erase the sqlite file before restarting debugin‐
       fod.

       In contrast, in passive mode, all scanning and grooming is disabled, and the index database remains read-
       only.   This  makes  the  database more suitable for sharing between servers or sites with simple one-way
       replication, and data management considerations are generally moot.

SECURITY

       debuginfod does not include any particular security features.  While it is robust with respect to inputs,
       some abuse is possible.  It forks a new thread for each incoming HTTP request, which could lead to a  de‐
       nial-of-service  in terms of RAM, CPU, disk I/O, or network I/O.  If this is a problem, users are advised
       to install debuginfod with a HTTPS reverse-proxy front-end that enforces site policies  for  firewalling,
       authentication, integrity, authorization, and load control.

       Front-end proxies may elide sensitive path name components in X-DEBUGINFOD-FILE/ARCHIVE response headers.
       For example, using Apache httpd's mod_headers, you can remove the entire directory name prefix:

       Header edit x-debuginfod-archive ".*/" ""

       When  relaying  queries to upstream debuginfods, debuginfod does not include any particular security fea‐
       tures.  It trusts that the binaries returned by the debuginfods are accurate.   Therefore,  the  list  of
       servers  should  include  only  trustworthy ones.  If accessed across HTTP rather than HTTPS, the network
       should be trustworthy.  Authentication information through the internal libcurl library is not  currently
       enabled.

ADDITIONAL FILES

       $HOME/.debuginfod.sqlite
              Default database file.

SEE ALSO

       debuginfod-find(1) sqlite3(1) https://prometheus.io/docs/instrumenting/exporters/

                                                                                                   DEBUGINFOD(8)