Provided by: maildir-utils_1.12.9-1_amd64

NAME

       mu-index - index e-mail messages stored in Maildirs

SYNOPSIS

       mu [COMMON-OPTIONS] index

DESCRIPTION

       mu  index is the mu command for scanning the contents of Maildir directories and storing the results in a
       Xapian database. The data can then be queried using mu-find(1).

       Before the first time you run mu index, you must run mu init to initialize the database.
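
       The order of the two commands can be sketched as follows. This is a minimal  sketch  under  assumptions:
       the  directories  are  throw-away  paths  created  with mktemp, the --maildir option is the one described
       in mu-init(1), and --muhome keeps the test database apart from any real one. The mu calls are  skipped
       when mu is not installed.

```shell
# Hypothetical throw-away locations; substitute your real Maildir.
maildir=$(mktemp -d)
muhome=$(mktemp -d)
mkdir -p "$maildir/inbox/cur" "$maildir/inbox/new" "$maildir/inbox/tmp"

if command -v mu >/dev/null 2>&1; then
    # One-time setup: create the database and record the Maildir root.
    mu init --muhome="$muhome" --maildir="$maildir"
    # Every later run only needs this.
    mu index --muhome="$muhome"
else
    echo "mu is not installed; skipping init and index"
fi
```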

       index understands Maildirs as defined by Daniel Bernstein  for  qmail(7).  In  addition,  it  understands
       recursive Maildirs (Maildirs within Maildirs) and Maildir++. It also supports VFAT-based Maildirs,  which
       use ! or ; as the separator instead of :.

       E-mail messages which are not stored in something resembling a maildir leaf-directory (cur and  new)  are
       ignored, as are the cache directories for notmuch and gnus, and any dot-directory.

       Symlinks  are  followed,  and  the directories can be spread over multiple filesystems; however note that
       moving files around is much faster when multiple filesystems are not involved. Be careful to avoid  self-
       referential symlinks!

       If  there  is  a  file  called  .noindex  in  a  directory, the contents of that directory and all of its
       subdirectories will be ignored. This can be useful to  exclude  certain  directories  from  the  indexing
       process, for example directories with spam-messages.

       If  there  is  a  file  called  .noupdate  in  a directory, the contents of that directory and all of its
       subdirectories will be ignored. This can be useful to speed things up if you  have  some  maildirs  that
       never change.

       .noupdate  does  not affect already-indexed messages: you can still search for them. .noupdate is ignored
       when you start indexing with an empty database (such as directly after mu init).

       There is also the option --lazy-check, which can greatly speed up indexing; see below for details.

       The first run of mu index may take a few minutes if you  have  a  lot  of  mail  (tens  of  thousands  of
       messages).  Fortunately, such a full scan needs to be done only once; after that it suffices to index the
       changes, which goes much faster. See the PERFORMANCE section below for more information.

       The optional `phase two' of the indexing-process is the removal of messages from the database  for  which
       there  is  no  longer  a  corresponding  file  in  the Maildir.  If you do not want this, you can use -n,
       --nocleanup.

       When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM (e.g., when you  press  Ctrl-C  during
       the  indexing  process), it attempts to shut down gracefully: it tries to save and commit data, close the
       database, and so on. If it receives another signal (e.g., when you press Ctrl-C once  more),  mu  index
       terminates immediately.

INDEX OPTIONS

   --lazy-check
       In  lazy-check  mode,  mu skips messages in any directory whose time-stamp (ctime) has not changed since
       the previous time that directory was checked.

       This is much faster than the non-lazy check, but won't update messages that  have  changed  (rather  than
       having  been  added or removed), since merely editing a message does not update the directory time-stamp.
       Of course, you can run mu index occasionally without --lazy-check, to pick up such messages.

       Furthermore, in lazy-check mode, files whose ctime is earlier than the time  at  which  the  previous
       indexing  operation completed are ignored. This helps in the use case where new messages appear in big
       maildirs.
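
       The directory time-stamp behaviour that lazy-check relies on can be observed with ordinary tools.  The
       sketch below is not mu code; it uses a throw-away directory and find -cnewer to show that  appending  to
       an existing file leaves the directory's ctime alone, while adding a file updates it. The  reference  file
       stands in for "the previous check" and is an assumption of this example. Note that editors which save
       by writing a new file and renaming it do touch the directory.

```shell
# Throw-away Maildir with one message; all paths here are hypothetical.
dir=$(mktemp -d)
mkdir -p "$dir/cur" "$dir/new"
echo "Subject: one" > "$dir/cur/msg1"

# Reference file standing in for the time of the previous check.
# The sleeps keep the time-stamps at least one second apart.
sleep 1
stamp=$(mktemp)
sleep 1

# Edit an existing message in place: the directory's ctime stays the same.
echo "X-Note: edited" >> "$dir/cur/msg1"
after_edit=$(find "$dir/cur" -maxdepth 0 -cnewer "$stamp")

# Add a new message: the directory's ctime is updated.
echo "Subject: two" > "$dir/cur/msg2"
after_add=$(find "$dir/cur" -maxdepth 0 -cnewer "$stamp")

echo "directory changed after edit: ${after_edit:-no}"
echo "directory changed after add:  ${after_add:-no}"
```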

   --nocleanup
       Disable the database cleanup that mu does by default after indexing.

   --reindex
       Perform a complete reindexing of all the messages in the maildir.

   --muhome
       Use a non-default directory to store and read the database, write the logs, etc.  By default, mu uses the
       XDG Base Directory Specification (e.g. on GNU/Linux  this  defaults  to  ~/.cache/mu  and  ~/.config/mu).
       Earlier versions of mu defaulted to ~/.mu, which now requires --muhome=~/.mu.

       The environment variable MUHOME can be used as an alternative to --muhome. The latter has precedence.

COMMON OPTIONS

   -d, --debug
       Makes  mu  generate  extra  debug information, useful for debugging the program itself. Debug information
       goes to the standard logging location; see mu(1).

   -q, --quiet
       Causes mu not to output informational messages and progress information to standard output, but  only  to
       the log file. Error messages will still be sent to standard error. Note that mu index is much faster with
       --quiet, so it is recommended you use this option when using mu from scripts etc.

   --log-stderr
       Causes  mu  to  output  log  messages  to  standard  error, in addition to sending them to the standard
       logging location.

   --nocolor
       Do not use ANSI colors. The environment variable NO_COLOR can be used as an alternative to --nocolor.

   -V, --version
       Prints mu version and copyright information.

   -h, --help
       Lists the various command line options.

ENCRYPTION

       mu index does not decrypt messages, and only the metadata (such as headers) of encrypted  messages  makes
       it  to  the database. mu view and mu4e can decrypt messages, but those work with the message directly and
       the information is not added to the database.

PERFORMANCE

   indexing in ancient times (2009?)
       As a non-scientific benchmark, a simple test on the author's machine (a Thinkpad X61s laptop using  Linux
       2.6.35 and an ext3 file system) with no existing database, and a maildir with 27273 messages:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              66,65s user 6,05s system 27% cpu 4:24,20 total

       (about 103 messages per second)

       A second run, which is the more typical use case when there is a database already, goes much faster:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              0,48s user 0,76s system 10% cpu 11,796 total

       (about 2312 messages per second by wall-clock time, or more than 56818 per second of user CPU time)

       Note  that  each  test flushes the caches first; a more common use case might be to run mu index when new
       mail has arrived; the cache may stay quite `warm' in that case:

              $ time mu index --quiet
              0,33s user 0,40s system 80% cpu 0,905 total

       which is more than 30000 messages per second.
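
       The rates can be recomputed from the time output by dividing the message count by  the  "total"  (wall-
       clock)  column. The helper below is an illustrative sketch; the function name rate is made up for this
       example, and the totals are converted to seconds by hand (4:24,20 is 264.20 seconds).

```shell
# rate MESSAGES SECONDS: print messages per second, rounded to a whole number.
rate() {
    LC_ALL=C awk -v n="$1" -v s="$2" 'BEGIN { printf "%.0f\n", n / s }'
}

rate 27273 264.20   # first run, no database, cold cache  -> 103
rate 27273 11.796   # second run, cold cache              -> 2312
rate 27273 0.905    # warm cache                          -> 30136
```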

   indexing in 2012
       As of June 2012, we did the same non-scientific benchmark,  this  time  with  an  Intel  i5-2500  CPU  @
       3.30GHz, an ext4 file system and a maildir with 22589 messages. We start without an existing database.

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              27,79s user 2,17s system 48% cpu 1:01,47 total

       (about 367 messages per second by wall-clock time, or about 813 per second of user CPU time)

       A second run, which is the more typical use case when there is a database already, goes much faster:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              0,13s user 0,30s system 19% cpu 2,162 total

       (about 10448 messages per second by wall-clock time, or more than 173000 per second of user CPU time)

   indexing in 2016
       As  of  July 2016, we did the same non-scientific benchmark, again with the Intel i5-2500 CPU @ 3.30GHz
       and an ext4 file system. This time, the maildir contains 72525 messages.

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              40,34s user 2,56s system 64% cpu 1:06,17 total

       (about 1099 messages per second).

   indexing in 2022
       A few years later and it is June 2022. There's a lot more happening during indexing, but indexing  became
       multi-threaded  and  machines  are faster; e.g. this is with an AMD Ryzen Threadripper 1950X (16 cores) @
       3.399GHz.

       The instructions are a little different since we have a proper repeatable benchmark now. After building,

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ THREAD_NUM=4 build/lib/tests/bench-indexer -m perf
              # random seed: R02Sf5c50e4851ec51adaf301e0e054bd52b
              1..1
              # Start of bench tests
              # Start of indexer tests
              indexed 5000 messages in 20 maildirs in 3763ms; 752 μs/message; 1328 messages/s (4 thread(s))
              ok 1 /bench/indexer/4-cores
              # End of indexer tests
              # End of bench tests

       Things are again a little faster, even though the indexer does a lot more now (text-normalization, and
       pre-generating message-sexps). A faster machine helps, too!

   recent releases
       Indexing the same 93000-message mail corpus with each of the last few releases:

                      ┌───────────────────────────────────────────────────────────────────────┐
                      │       release   time (sec)   notes                                    │
                      ├───────────────────────────────────────────────────────────────────────┤
                      │           1.4   160s                                                  │
                      │           1.6   178s                                                  │
                      │           1.8   97s                                                   │
                      │          1.10   120s         adds html indexing, sexp-caching         │
                      │ 1.11 (master)   96s          adds language-guessing, batch-size=50000 │
                      └───────────────────────────────────────────────────────────────────────┘

       Quite some variation!

       Over time new features / refactoring can change the timings quite a bit. At least  for  now,  the  latest
       code is both the fastest and the most featureful!

EXIT CODE

       This command returns 0 upon successful completion, or a non-zero exit code otherwise.

       0.  success

       2.  no matches found. Try a different query

       11. database schema mismatch. You need to re-initialize mu, see mu-init(1)

       19. failed to acquire lock. Some other program has exclusive access to the mu database

       99. caught an exception
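
       In a script, the exit code can be turned into a readable message. The function below is a sketch: its
       name, describe_mu_exit, is made up for this example, and it covers only the codes listed above; any
       other value is reported as a generic error. The mu call is skipped when mu is not installed.

```shell
# describe_mu_exit CODE: print the meaning of a mu index exit code,
# using only the codes listed in this section.
describe_mu_exit() {
    case "$1" in
        0)  echo "success" ;;
        2)  echo "no matches found" ;;
        11) echo "database schema mismatch; re-initialize with mu init" ;;
        19) echo "failed to acquire lock; another program holds the database" ;;
        99) echo "caught an exception" ;;
        *)  echo "other error (exit code $1)" ;;
    esac
}

# Usage: run the indexer quietly and report how it ended.
if command -v mu >/dev/null 2>&1; then
    mu index --quiet
    describe_mu_exit "$?"
fi
```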

REPORTING BUGS

       Please report bugs at https://github.com/djcb/mu/issues.

AUTHOR

       Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

COPYRIGHT

       This manpage is part of mu 1.12.9.

       Copyright   ©   2008-2025   Dirk-Jan   C.   Binnema.   License   GPLv3+:  GNU  GPL  version  3  or  later
       https://gnu.org/licenses/gpl.html. This is free software: you are free to  change  and  redistribute  it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO

       maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)

                                                                                                     MU INDEX(1)