Provided by: patman_1.2.2+dfsg-8_amd64

NAME

       PatMaN - search for approximate patterns in DNA libraries

SYNOPSIS

       patman [ option | file ... ]

DESCRIPTION

       PatMaN searches for (small) patterns in (huge) DNA databases, allowing for some mismatches and optionally
       gaps.   Patterns  and  databases are read from one or more fasta(5) files listed as non-option arguments,
       depending on whether the -D or -P option last preceded them, and matched against each other.  The  output
       of PatMaN is a table containing one line for each match, consisting of tab-separated fields:

       •   name of database sequence,

       •   name of pattern,

       •   position of first matched base in database sequence (the sequence's first base has position 1),

       •   position of last matched base in database sequence,

       •   strand (+ for literal match, - for reverse complement),

       •   edit distance (number of mismatches plus number of gaps).
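
       The table layout above can be read back by a small script.  The following is a minimal sketch, not part
       of PatMaN: it assumes exactly the six tab-separated fields in the order listed, and the names Match and
       parse_match as well as the sample line are hypothetical.

```python
from typing import NamedTuple


class Match(NamedTuple):
    database: str  # name of database sequence
    pattern: str   # name of pattern
    start: int     # position of first matched base (1-based)
    end: int       # position of last matched base
    strand: str    # "+" for literal match, "-" for reverse complement
    edits: int     # number of mismatches plus number of gaps


def parse_match(line: str) -> Match:
    """Parse one tab-separated output line into a Match."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    db, pat, start, end, strand, edits = fields
    return Match(db, pat, int(start), int(end), strand, int(edits))


# Made-up sample line, for illustration only.
example = parse_match("chr1\tpattern_1\t1042\t1063\t+\t1\n")
matched_length = example.end - example.start + 1  # both ends are inclusive
```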

OPTIONS

       -V, --version
              Print version number and exit.

       -e num, --edits num
              Allow up to num mismatches and/or gaps per match.

       -g num, --gaps num
              Allow  up to num gaps per match.  Note that gaps count as mismatches, too, so the -e option should
              always be set at least as high as the -g option.  Allowing many  gaps  can  incur  a  considerable
              computational cost.

       -D, --databases
              Treat  the  following files as database.  Databases must be in fasta(5) format.  Multiple database
              files, including "-" for standard input, are allowed and are read in turn.

       -P, --patterns
              Treat the following files as patterns.  Pattern  files  must  be  in  fasta(5)  format.   Multiple
              pattern  files, including "-" for standard input, are allowed and are all read before touching the
              databases.

       -o file, --output file
              Redirect output to file.  The file name "-" causes output to be written to stdout, which  is  also
              the default.

       -a, --ambicodes
              Activate  the interpretation of ambiguity codes in patterns.  This results in the expansion of any
              pattern with ambiguity codes into  multiple  patterns  which  can  match  independently.   Compare
              Unknown Nucleotides below.
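
              The expansion can be pictured as follows.  This is a sketch of the idea only, not PatMaN's  actual
              implementation:  it  uses the standard IUPAC nucleotide codes, the names IUPAC and expand are
              hypothetical, and any character outside the table (such as X) raises a KeyError here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(pattern):
    """Return every concrete sequence an ambiguous pattern can stand for."""
    choices = [IUPAC[base] for base in pattern.upper()]
    return ["".join(combo) for combo in product(*choices)]


concrete = expand("AR")  # R stands for A or G
```

              The  number  of  expanded  patterns is the product of the choices at each position, so a pattern
              with several Ns grows quickly (each N multiplies the count by four).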

       -s, --singlestrand
              Deactivate matching of reverse complements.  Normally, PatMaN tries to match patterns both
              literally and after reverse-complementing them.  With this option set, only literal (forward)
              matches are considered.
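
              For reference, reverse-complementing a sequence means complementing every base and reversing  the
              result.   A  minimal  sketch,  handling  only  the  bases A, C, G and T; the name
              reverse_complement is hypothetical and this is not taken from PatMaN's source:

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


rc = reverse_complement("GATTACA")  # "TGTAATC"
```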

       -p num, --prefetch num
              Causes  num pointers to be prefetched in advance.  This feature can improve performance, if PatMaN
              has been compiled for a processor architecture that supports prefetching.  The optimum  value  for
              your particular setup has to be determined empirically, but the default should be reasonably good.

       -l len, --min-length len
              Only  consider  patterns  with  a  length  of  at  least len.  Use this if your pattern collection
              contains short sequences that you don't want lots of possible matches reported for.

       -x num, --chop3 num
              Cut off num bases from the 3' end of each pattern.  Use this for patterns  with  damaged,  edited,
              etc.  3'  ends  that should be ignored.  The chopped bases are neither matched nor included in the
              reported match regions.

       -X num, --chop5 num
              Cut off num bases from the 5' end of each pattern.  Use this for patterns  with  damaged,  edited,
              etc.  5'  ends  that should be ignored.  The chopped bases are neither matched nor included in the
              reported match regions.
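
              The  effect  of  -x  and  -X  on  a  single pattern can be sketched as plain string slicing.  The
              sketch assumes patterns are written 5' to 3' (so the 5' end is the start of  the  string);  the
              function name chop and its parameters are hypothetical.

```python
def chop(pattern, chop5=0, chop3=0):
    """Drop chop5 bases from the 5' end and chop3 bases from the 3' end."""
    if chop5 < 0 or chop3 < 0:
        raise ValueError("chop lengths must be non-negative")
    # Slicing yields an empty string if more bases are chopped than exist.
    return pattern[chop5:len(pattern) - chop3]


trimmed = chop("ACGTACGT", chop5=2, chop3=3)  # "GTA"
```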

       -A, --adenine-hack
              Allow adenine to be ignored in patterns.  This is essentially equivalent to not counting  gaps  in
              the  database,  as long as it was an A that was gapped.  Using -A can be computationally extremely
              expensive, both in terms of memory and time consumed.

       -q, --quiet
              Suppress warnings (about unrecognized characters in input sequences or missing input files).  Even
              without -q, at most one such warning is given per run.

       -v, --verbose
              Prints additional progress information to stderr.

       -d flags, --debug flags
              Sets debugging flags to flags.  Flags may be the logical OR of any of the following values, each of
              which causes some output to appear on stderr.  Some of the values may only work if PatMaN has been
              compiled in debug mode.  The default value is 1.

       1      Print warnings.  Equivalent to not setting -q.

       2      Print progress information.  Equivalent to setting -v.

       4      Dump the suffix trie of the patterns.  Only available in debug build.

       8      Count  number  of  visited nodes and print that number in each iteration.  Only available in debug
              build.

       16     Print total number of nodes fetched from memory after completing all databases.

       32     Output database sequence while it is being matched.
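
       Because each value is a distinct power of two, several flags combine into a  single  number  by  OR-ing
       (equivalently, adding) them.  A small sketch of the arithmetic; the member names are hypothetical labels
       for the values listed above and do not appear in PatMaN itself:

```python
from enum import IntFlag


class Debug(IntFlag):
    WARNINGS = 1        # print warnings
    PROGRESS = 2        # print progress information
    DUMP_TRIE = 4       # dump the suffix trie (debug build only)
    COUNT_NODES = 8     # count visited nodes (debug build only)
    TOTAL_NODES = 16    # print total number of nodes fetched
    ECHO_DATABASE = 32  # output database sequence while matching


flags = Debug.WARNINGS | Debug.PROGRESS | Debug.ECHO_DATABASE
value = int(flags)  # 1 + 2 + 32 = 35, the number to pass to -d
```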

NOTES

   Non-Option Arguments
       Non-option arguments (bare filenames) are treated as either database or pattern files, depending on
       whether the -D or -P option was the last to occur before the filename.  If neither -D nor -P was
       given, file names are treated as pattern files.  If no database was given, it is instead read from
       standard input.  Standard input can be explicitly given as either a database or a pattern file by using
       the filename "-".  A warning is given if standard input is selected implicitly as database; an error
       message is given if no pattern files have been named at all.
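
       These rules can be summarised in a few lines of code.  The sketch below only models -D, -P and bare
       filenames (all other options are left out), and the function name classify is hypothetical; it is an
       illustration of the rules as described here, not PatMaN's argument parser.

```python
def classify(args):
    """Split arguments into (databases, patterns) following the rules above."""
    databases, patterns = [], []
    target = patterns  # default while neither -D nor -P has been seen
    for arg in args:
        if arg in ("-D", "--databases"):
            target = databases
        elif arg in ("-P", "--patterns"):
            target = patterns
        else:
            target.append(arg)  # bare filename, including "-" for stdin
    if not patterns:
        raise ValueError("no pattern files named")
    if not databases:
        databases.append("-")  # database implicitly read from stdin
    return databases, patterns


# File names are made up for illustration.
dbs, pats = classify(["reads.fa", "-D", "genome.fa", "-P", "more.fa"])
```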

   Gapped Matching
       Allowing gaps often causes overlapping matches of single patterns at almost the same position.  PatMaN
       makes no attempt to filter these redundant matches.  Also note that allowing many gaps, and especially
       allowing an arbitrary number of gaps through the -A hack, can slow down PatMaN considerably and cause it
       to produce enormous amounts of output.  The use of some sort of post-processor to filter these is highly
       recommended.
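
       One possible post-processing policy, shown here purely as a sketch, is to group overlapping matches of
       the same pattern on the same database sequence and strand, and to keep only the match with the lowest
       edit distance from each group.  The policy and the name filter_overlaps are assumptions; PatMaN ships
       no such filter, and other policies may suit a given analysis better.

```python
def filter_overlaps(matches):
    """Keep the best match from each run of overlapping matches.

    matches: iterable of (database, pattern, start, end, strand, edits).
    """
    kept = []
    cluster_key, cluster_end, best = None, 0, None
    # Sort so that matches sharing sequence, pattern and strand are adjacent
    # and ordered by start position.
    ordered = sorted(matches, key=lambda m: (m[0], m[1], m[4], m[2], m[3]))
    for m in ordered:
        key = (m[0], m[1], m[4])
        if key == cluster_key and m[2] <= cluster_end:
            # Overlaps the current cluster: extend it, keep the lower distance.
            cluster_end = max(cluster_end, m[3])
            if m[5] < best[5]:
                best = m
        else:
            if best is not None:
                kept.append(best)
            cluster_key, cluster_end, best = key, m[3], m
    if best is not None:
        kept.append(best)
    return kept


# Made-up matches for illustration.
hits = [
    ("chr1", "p1", 100, 121, "+", 1),
    ("chr1", "p1", 101, 122, "+", 2),
    ("chr1", "p1", 99, 120, "+", 2),
    ("chr1", "p1", 500, 521, "+", 0),
    ("chr1", "p1", 100, 121, "-", 1),
]
filtered = filter_overlaps(hits)
```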

   Unknown Nucleotides
       Unknown  nucleotides  are most often encoded by the letter N.  If the --ambicodes option is not given, Ns
       in patterns are interpreted as unknown nucleotides and can never match without penalty.   If  --ambicodes
       is  given,  Ns  in  patterns  are  expanded just like the other ambiguity codes, and effectively work as
       wildcards.  Unknown nucleotides can still be encoded by an X and will never match anything.  The database
       is treated differently in that anything other than A, C, G,  T  and  U,  including  ambiguity  codes,  is
       treated as unknown and can never match without penalty.

FILES

       /etc/popt
              The system wide configuration file for popt(3).  PatMaN identifies itself as "patman" to popt.

       ~/.popt
              Per user configuration file for popt(3).

BUGS

       None known.

AUTHOR

       Kay Pruefer <pruefer@eva.mpg.de>
       Udo Stenzel <udo_stenzel@eva.mpg.de>

SEE ALSO

       popt(3), fasta(5)

Applications                                      JANUARY 2008                                         PATMAN(1)