Provided by: flash_1.2.11-2_amd64 bug

NAME

       flash - Fast Length Adjustment of SHort reads

SYNOPSIS

       flash [OPTIONS] MATES_1.FASTQ MATES_2.FASTQ

       flash [OPTIONS] --interleaved-input (MATES.FASTQ | -)

       flash [OPTIONS] --tab-delimited-input (MATES.TAB | -)

DESCRIPTION

       FLASH (Fast Length Adjustment of SHort reads) is an accurate and fast tool to merge paired-end reads that
       were  generated from DNA fragments whose lengths are shorter than twice the length of reads.  Merged read
       pairs result in unpaired longer reads, which are generally more desired in  genome  assembly  and  genome
       analysis processes.

       Briefly,  the  FLASH  algorithm  considers all possible overlaps at or above a minimum length between the
       reads in a pair and chooses the overlap that results  in  the  lowest  mismatch  density  (proportion  of
       mismatched  bases  in  the  overlapped region).  Ties between multiple overlaps are broken by considering
       quality scores at mismatch sites.  When building the merged sequence, FLASH computes a consensus sequence
       in   the   overlapped   region.    More   details   can   be   found   in   the   original    publication
       (http://bioinformatics.oxfordjournals.org/content/27/21/2957.full).

   Limitations of FLASH include:
              - FLASH cannot merge paired-end reads that do not overlap.

              -  FLASH  is  not  designed for data that has a significant amount of indel errors (such as Sanger
              sequencing data).  It is best suited for Illumina data.

MANDATORY INPUT

       The most common input to FLASH is two FASTQ files containing read  1  and  read  2  of  each  mate  pair,
       respectively, in the same order.

       Alternatively,  you  may provide one FASTQ file, which may be standard input, containing paired-end reads
       in  either  interleaved  FASTQ  (see  the  --interleaved-input  option)   or   tab-delimited   (see   the
       --tab-delimited-input option) format.  In all cases, gzip compressed input is autodetected.  Also, in all
       cases, the PHRED offset is, by default, assumed to be 33; use the --phred-offset option to change it.

OUTPUT

       The default output of FLASH consists of the following files:

       - out.extendedFrags.fastq
              The merged reads.

       - out.notCombined_1.fastq
              Read 1 of mate pairs that were not merged.

       - out.notCombined_2.fastq
              Read 2 of mate pairs that were not merged.

       - out.hist
              Numeric histogram of merged read lengths.

       - out.histogram
              Visual histogram of merged read lengths.

       FLASH also logs informational messages to standard output.  These can also be redirected to a file, as in
       the following example:

              $ flash reads_1.fq reads_2.fq 2>&1 | tee flash.log

       In addition, FLASH supports several features affecting the output:

              - Writing the merged reads directly to standard output (--to-stdout)

              - Writing gzip compressed output files (-z) or using an external
                 compression program (--compress-prog)

              - Writing the uncombined read pairs in interleaved FASTQ format

              (--interleaved-output)

              - Writing all output reads to a single file in tab-delimited format

              (--tab-delimited-output)

OPTIONS

       -m, --min-overlap=NUM
              The  minimum  required  overlap length between two reads to provide a confident overlap.  Default:
              10bp.

       -M, --max-overlap=NUM
              Maximum overlap length expected in approximately 90% of read pairs.  It is by default set to 65bp,
              which works well for 100bp reads generated from a 180bp library, assuming a normal distribution of
              fragment lengths.  Overlaps longer than the maximum overlap parameter are still considered as good
              overlaps, but the mismatch density (explained below) is  calculated  over  the  first  max_overlap
              bases  in the overlapped region rather than the entire overlap.  Default: 65bp, or calculated from
              the specified read length, fragment length, and fragment length standard deviation.

       -x, --max-mismatch-density=NUM
              Maximum allowed ratio between the number of mismatched base pairs and  the  overlap  length.   Two
              reads  will  not  be  combined  with  a given overlap if that overlap results in a mismatched base
              density higher than this value.  Note: Any occurence of an 'N' in either read is ignored  and  not
              counted  towards  the  mismatches or overlap length.  Our experimental results suggest that higher
              values of the maximum mismatch density yield larger numbers of correctly merged read pairs but  at
              the expense of higher numbers of incorrectly merged read pairs.  Default: 0.25.

       -O, --allow-outies
              Also try combining read pairs in the "outie" orientation, e.g.

       Read 1: <-----------
              Read 2:       ------------>

              as opposed to only the "innie" orientation, e.g.

       Read 1:
              <------------

              Read 2: ----------->

       FLASH uses the same parameters when trying each
              orientation.   If  a  read  pair  can  be  combined  in both "innie" and "outie" orientations, the
              better-fitting one will be chosen using the same scoring algorithm that FLASH normally uses.

       This option also causes extra .innie and .outie
              histogram files to be produced.

       -p, --phred-offset=OFFSET
              The smallest ASCII value of the characters used to represent quality  values  of  bases  in  FASTQ
              files.   It  should  be  set  to  either 33, which corresponds to the later Illumina platforms and
              Sanger platforms, or 64, which corresponds to the earlier Illumina platforms.  Default: 33.

       -r, --read-len=LEN

       -f, --fragment-len=LEN

       -s, --fragment-len-stddev=LEN
              Average read length, fragment length, and fragment  standard  deviation.   These  are  convenience
              parameters  only,  as  they  are  only  used  for  calculating the maximum overlap (--max-overlap)
              parameter.  The maximum overlap is calculated as the  overlap  of  average-length  reads  from  an
              average-size  fragment  plus 2.5 times the fragment length standard deviation.  The default values
              are -r 100, -f 180, and -s 18, so this works out to a maximum overlap of 65 bp.  If  --max-overlap
              is specified, then the specified value overrides the calculated value.

       If you do not know the standard deviation of the
              fragment  library,  you  can  probably  assume  that  the standard deviation is 10% of the average
              fragment length.

       --cap-mismatch-quals
              Cap quality scores assigned at mismatch locations to 2.  This was the default  behavior  in  FLASH
              v1.2.7  and earlier.  Later versions will instead calculate such scores as max(|q1 - q2|, 2); that
              is, the absolute value of the difference in quality scores, but at least 2.  Essentially, the  new
              behavior  prevents  a  low  quality base call that is likely a sequencing error from significantly
              bringing down the quality of a high quality, likely correct base call.

       --interleaved-input
              Instead of requiring files MATES_1.FASTQ and MATES_2.FASTQ, allow a single file  MATES.FASTQ  that
              has the paired-end reads interleaved.  Specify "-" to read from standard input.

       --interleaved-output
              Write the uncombined pairs in interleaved FASTQ format.

       -I, --interleaved
              Equivalent to specifying both --interleaved-input and --interleaved-output.

       -Ti, --tab-delimited-input
              Assume  the  input  is in tab-delimited format rather than FASTQ, in the format described below in
              '--tab-delimited-output'.  In this mode you should provide a single input file, each line of which
              must contain either a read pair (5 fields) or a single read (3 fields).  FLASH will try to combine
              the read  pairs.   Single  reads  will  be  written  to  the  output  file  as-is  if  also  using
              --tab-delimited-output;  otherwise  they  will  be  ignored.  Note that you may specify "-" as the
              input file to read the tab-delimited data from standard input.

       -To, --tab-delimited-output
              Write output in tab-delimited format (not FASTQ).  Each line will contain either a  combined  pair
              in  the  format  'tag  <tab>  seq <tab> qual' or an uncombined pair in the format 'tag <tab> seq_1
              <tab> qual_1 <tab> seq_2 <tab> qual_2'.

       -o, --output-prefix=PREFIX
              Prefix of output files.  Default: "out".

       -d, --output-directory=DIR
              Path to directory for output files.  Default: current working directory.

       -c, --to-stdout
              Write the combined reads to standard output.  In this mode, with FASTQ output  (the  default)  the
              uncombined  reads  are discarded.  With tab-delimited output, uncombined reads are included in the
              tab-delimited data written to standard output.  In both cases, histogram files  are  not  written,
              and informational messages are sent to standard error rather than to standard output.

       -z, --compress
              Compress  the  output  files  directly  with  zlib,  using  the gzip container format.  Similar to
              specifying --compress-prog=gzip and --suffix=gz, but may be slightly faster.

       --compress-prog=PROG
              Pipe the output through the compression program PROG, which will be called as `PROG  -c  -',  plus
              any  arguments  specified by --compress-prog-args.  PROG must read uncompressed data from standard
              input and write compressed data to standard output when invoked as noted above.   Examples:  gzip,
              bzip2, xz, pigz.

       --compress-prog-args=ARGS
              A  string  of  additional  arguments  that  will  be  passed  to the compression program if one is
              specified with --compress-prog=PROG.  (The arguments '-c  -'  are  still  passed  in  addition  to
              explicitly specified arguments.)

       --suffix=SUFFIX, --output-suffix=SUFFIX
              Use  SUFFIX as the suffix of the output files after ".fastq".  A dot before the suffix is assumed,
              unless an empty suffix is provided.  Default: nothing; or 'gz' if -z  is  specified;  or  PROG  if
              --compress-prog=PROG is specified.

       -t, --threads=NTHREADS
              Set  the  number  of  worker threads.  This is in addition to the I/O threads.  Default: number of
              processors.  Note: if you need FLASH's output to appear deterministically or in the same order  as
              the original reads, you must specify -t 1 (--threads=1).

       -q, --quiet
              Do not print informational messages.

       -h, --help
              Display this help and exit.

       -v, --version
              Display version.

AUTHOR

        This manpage was written by Andreas Tille for the Debian distribution and
        can be used for any other usage of the program.

flash 1.2.11                                        May 2020                                            FLASH(1)