Provided by: unikmer_0.18.8-1ubuntu0.1_amd64 

NAME
unikmer - Toolkit for nucleic acid k-mer analysis
DESCRIPTION
unikmer - Unique-Kmer Toolkit
unikmer is a toolkit for nucleic acid k-mer analysis. It provides functions including set operations on
k-mers, optionally with TaxIds but without count information.
K-mers are either encoded (k<=32) or hashed (arbitrary k) into 'uint64' and serialized in binary files
with the extension '.unik'.
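
To illustrate why k<=32 fits into a 'uint64', here is a minimal sketch of 2-bit-per-base packing. The
bit assignment (A=0, C=1, G=2, T=3) and the function names are assumptions made for illustration;
this page does not describe the exact encoding unikmer uses, and hashing for larger k is not shown.

```python
# Assumed 2-bit code per base; not taken from the unikmer documentation.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_kmer(kmer: str) -> int:
    """Pack a k-mer (1 <= k <= 32) into an integer that fits in 64 bits."""
    k = len(kmer)
    if not 1 <= k <= 32:
        raise ValueError("k must be in [1, 32] to fit in 64 bits")
    code = 0
    for base in kmer.upper():
        # Shift previous bases left by 2 bits and append the new base.
        code = (code << 2) | BASE_TO_BITS[base]  # KeyError on non-ACGT bases
    return code


def decode_kmer(code: int, k: int) -> str:
    """Unpack an integer produced by encode_kmer back into k-mer text."""
    bases = []
    for _ in range(k):
        bases.append(BITS_TO_BASE[code & 3])  # lowest 2 bits hold the last base
        code >>= 2
    return "".join(reversed(bases))


if __name__ == "__main__":
    code = encode_kmer("ACGT")
    print(code, decode_kmer(code, 4))
```

With 2 bits per base, 32 bases use exactly 64 bits, which is why longer k-mers have to be hashed
instead.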
TaxIds can be assigned when counting k-mers from genome sequences, and the LCA (Lowest Common Ancestor) is
computed during set operations, including union, intersection, set difference, unique and
repeated k-mers.
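
As a rough illustration of how an LCA can be combined with a set operation, the sketch below merges
two k-mer sets and replaces the TaxIds of shared k-mers with their lowest common ancestor. The toy
taxonomy, the dict-based representation and all function names are hypothetical; this is not
unikmer's implementation.

```python
def lineage(taxid, parent):
    """Return the path from taxid up to the root, inclusive."""
    path = [taxid]
    while parent[taxid] != taxid:  # the root is its own parent
        taxid = parent[taxid]
        path.append(taxid)
    return path


def lca(a, b, parent):
    """Lowest common ancestor of two TaxIds in a parent map."""
    ancestors = set(lineage(a, parent))
    for node in lineage(b, parent):  # walk upward from b; first hit is the LCA
        if node in ancestors:
            return node
    raise ValueError("no common ancestor")


def union_with_lca(set_a, set_b, parent):
    """Union of two {kmer_code: taxid} dicts; shared k-mers get the LCA."""
    merged = dict(set_a)
    for kmer, taxid in set_b.items():
        if kmer in merged:
            merged[kmer] = lca(merged[kmer], taxid, parent)
        else:
            merged[kmer] = taxid
    return merged


if __name__ == "__main__":
    # Hypothetical taxonomy: 1 is the root; 2 and 5 are its children;
    # 3 and 4 are children of 2.
    parent = {1: 1, 2: 1, 3: 2, 4: 2, 5: 1}
    a = {27: 3, 5: 3}
    b = {27: 4, 9: 5}
    print(union_with_lca(a, b, parent))
```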
Version: v0.17.2
Author: Wei Shen <shenwei356@gmail.com>
Documents  : https://shenwei356.github.io/unikmer
Source code: https://github.com/shenwei356/unikmer
Dataset (optional):
Manipulating k-mers with TaxIds needs taxonomy files from, e.g., the NCBI Taxonomy database. Please
extract "nodes.dmp", "names.dmp", "delnodes.dmp" and "merged.dmp" from
ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz into ~/.unikmer/ or some other directory; you can
later refer to that directory using the flag --data-dir or the environment variable UNIKMER_DB.
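
The extraction step can be sketched as below, assuming taxdump.tar.gz has already been downloaded
and that the four files sit at the top level of the archive. The function name and the fallback
order (explicit argument, then UNIKMER_DB, then ~/.unikmer) are assumptions for illustration only.

```python
import os
import shutil
import tarfile

TAXDUMP_FILES = ("nodes.dmp", "names.dmp", "delnodes.dmp", "merged.dmp")


def extract_taxdump(archive_path, data_dir=None):
    """Extract only the four taxonomy files into data_dir and return their names."""
    if data_dir is None:
        # Assumed fallback order; see the note above.
        data_dir = os.environ.get("UNIKMER_DB") or os.path.expanduser("~/.unikmer")
    os.makedirs(data_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for name in TAXDUMP_FILES:
            member = tar.extractfile(name)  # raises KeyError if the file is missing
            if member is None:
                raise ValueError(f"{name} is not a regular file in the archive")
            with member, open(os.path.join(data_dir, name), "wb") as out:
                shutil.copyfileobj(member, out)  # stream; names.dmp can be large
            extracted.append(name)
    return extracted
```

For example, `extract_taxdump("taxdump.tar.gz")` would place the files in the directory chosen by
the fallback order, which can then be passed to unikmer with --data-dir.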
For GTDB, use https://github.com/nick-youngblut/gtdb_to_taxdump for taxonomy conversion.
Note that TaxIds are represented using uint32 and stored in 4 or fewer bytes; all TaxIds should be
in the range [1, 4294967295].
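
The "4 or fewer bytes" figure follows from simple arithmetic on the largest TaxId that has to be
stored. The helper below is a hypothetical illustration of that arithmetic, not a description of
how unikmer chooses its on-disk width.

```python
def taxid_bytes(max_taxid: int) -> int:
    """Smallest number of whole bytes that can hold every TaxId up to max_taxid."""
    if not 1 <= max_taxid <= 0xFFFFFFFF:
        raise ValueError("TaxIds must be in the range [1, 4294967295]")
    # Round the bit length up to the next whole byte.
    return (max_taxid.bit_length() + 7) // 8


if __name__ == "__main__":
    for limit in (255, 65535, 16777215, 4294967295):
        print(limit, taxid_bytes(limit))
```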
Usage:
unikmer [command]
Available Commands:
common Find k-mers shared by most of multiple binary files
concat Concatenate multiple binary files without removing duplicates
count Generate k-mers (sketch) from FASTA/Q sequences
decode Decode encoded integer to k-mer text
diff Set difference of multiple binary files
dump Convert plain k-mer text to binary format
encode Encode plain k-mer text to integer
filter Filter low-complexity k-mers (experimental)
genautocomplete generate shell autocompletion script (bash|zsh|fish|powershell)
grep Search k-mers from binary files
head Extract the first N k-mers
help Help about any command
info Information of binary files
inter Intersection of multiple binary files
locate Locate k-mers in genome
merge Merge k-mers from sorted chunk files
num Quickly inspect number of k-mers in binary files
rfilter Filter k-mers by taxonomic rank
sample Sample k-mers from binary files
sort Sort k-mers in binary files to reduce file size
split Split k-mers into sorted chunk files
tsplit Split k-mers according to taxid
union Union of multiple binary files
uniqs Mapping k-mers back to genome and find unique subsequences
version Print version information and check for update
view Read and output binary format to plain text
Flags:
-c, --compact
write compact binary file with little loss of speed
--compression-level int
compression level (default -1)
--data-dir string
directory containing NCBI Taxonomy files, including nodes.dmp, names.dmp, merged.dmp and
delnodes.dmp (default "/home/nilesh/.unikmer")
-h, --help
help for unikmer
-I, --ignore-taxid
ignore taxonomy information
-i, --infile-list string
file listing input files (one file per line); if given, they are appended to the files from
command-line arguments
--max-taxid uint32
for smaller TaxIds, we can use less space to store TaxIds. default value is 1<<32-1, that's enough
for NCBI Taxonomy TaxIds (default 4294967295)
-C, --no-compress
do not compress binary file (not recommended)
--nocheck-file
do not check binary file, when using process substitution/named pipe
-j, --threads int
number of CPUs to use. (default value: 1 for single-CPU PC, 2 for others) (default 2)
--verbose
print verbose information
Use "unikmer [command] --help" for more information about a command.
AUTHOR
This manpage was written by Nilesh Patra for the Debian distribution and
can be used for any other usage of the program.
unikmer 0.18.3 August 2021 UNIKMER(1)