Provided by: scoary_1.6.16-6_all

NAME

       scoary - pangenome-wide association studies

SYNOPSIS

       scoary  [-h]  [-t  TRAITS]  [-g  GENES]  [-n  NEWICKTREE]  [-s  START_COL]  [--delimiter  DELIMITER]  [-r
       RESTRICT_TO]  [-o  OUTDIR]  [-u]  [-p  P_VALUE_CUTOFF  [P_VALUE_CUTOFF  ...]]    [-c   [{I,B,BH,PW,EPW,P}
       [{I,B,BH,PW,EPW,P}  ...]]] [-m MAX_HITS] [--include_input_columns GRABCOLS] [-w] [--no-time] [-e PERMUTE]
       [--no_pairwise] [--collapse] [--threads THREADS] [--test] [--citation] [--version]

OPTIONS

   optional arguments:
       -h, --help
              show this help message and exit

   Input options:
       -t TRAITS, --traits TRAITS
              Input trait table (comma-separated-values). Trait presence is indicated by 1, trait absence by  0.
              Assumes strain names in the first column and trait names in the first row

       -g GENES, --genes GENES
              Input gene presence/absence table (comma-separated-values) from ROARY. Strain names must be equal
              to those in the trait table

       -n NEWICKTREE, --newicktree NEWICKTREE
              Supply a custom tree (Newick format) for phylogenetic analyses instead of calculating it
              internally.

       -s START_COL, --start_col START_COL
              The column in the gene presence/absence file at which individual strain info starts. Default=15.
              (1-based indexing)
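
              The 1-based column convention can be illustrated with a minimal Python sketch. It is not
              taken from Scoary itself; the function name split_header and the example header are
              hypothetical and only show how a 1-based START_COL maps onto a 0-based list index.

```python
def split_header(header, start_col=15):
    """Split a header row into metadata columns and strain columns.

    start_col is 1-based, as in the option description, so the first
    strain column sits at Python index start_col - 1.
    """
    if start_col < 1:
        raise ValueError("start_col is 1-based and must be >= 1")
    idx = start_col - 1
    return header[:idx], header[idx:]


# Hypothetical header: 14 metadata columns followed by two strains.
header = ["meta%d" % i for i in range(1, 15)] + ["strainA", "strainB"]
meta, strains = split_header(header)
print(len(meta), strains)
```

              With the default of 15, the first 14 columns are treated as metadata and everything from
              column 15 onwards as strains.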

       --delimiter DELIMITER
              The delimiter between cells in the gene presence/absence and trait files, as well  as  the  output
              file.

       -r RESTRICT_TO, --restrict_to RESTRICT_TO
              Use  if  you  only  want  to  analyze  a  subset  of  your  strains. Scoary will read the provided
              comma-separated table of strains and restrict analyses to these.

   Output options:
       -o OUTDIR, --outdir OUTDIR
              Directory to place output files. Default = .

       -u, --upgma_tree
              This flag will cause Scoary to write the calculated UPGMA tree to a newick file

       -p P_VALUE_CUTOFF [P_VALUE_CUTOFF ...], --p_value_cutoff P_VALUE_CUTOFF [P_VALUE_CUTOFF ...]
              P-value cut-off / alpha level. For Fisher's, Bonferroni's, and Benjamini-Hochberg's tests, SCOARY
              will  not report genes with higher p-values than this.  For empirical p-values, this is treated as
              an alpha level instead. I.e. 0.02 will filter all genes except the lower and upper percentile from
              this test. Run with "-p 1.0" to report all genes. Accepts standard form  (e.g.  1E-8).  Provide  a
              single  value  (applied  to  all)  or  exactly  as  many  values  as  correction  criteria  and in
              corresponding order. (See example under correction). Default = 0.05
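
              The rule "a single value applied to all, or exactly as many values as correction criteria"
              can be sketched in Python as follows. This is an illustration under assumptions, not
              Scoary's own code; the function name pair_cutoffs is hypothetical.

```python
def pair_cutoffs(corrections, cutoffs):
    """Pair each correction criterion with its p-value cutoff.

    A single cutoff is applied to every criterion; otherwise the two
    lists must have the same length and are matched in order.
    """
    if len(cutoffs) == 1:
        cutoffs = list(cutoffs) * len(corrections)
    if len(cutoffs) != len(corrections):
        raise ValueError("give one cutoff, or one cutoff per correction criterion")
    return dict(zip(corrections, cutoffs))


print(pair_cutoffs(["I", "EPW"], [0.1, 0.05]))  # one cutoff per criterion
print(pair_cutoffs(["I", "BH"], [0.05]))        # single cutoff applied to all
```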

       -c [{I,B,BH,PW,EPW,P} [{I,B,BH,PW,EPW,P} ...]], --correction [{I,B,BH,PW,EPW,P} [{I,B,BH,PW,EPW,P} ...]]
              Apply the indicated filtration measure. Allowed values are I, B,  BH,  PW,  EPW,  P.  I=Individual
              (naive) p-value. B=Bonferroni adjusted p-value. BH=Benjamini-Hochberg adjusted p. PW=Best (lowest)
              pairwise comparison. EPW=Entire range of pairwise comparison p-values.  P=Empirical  p-value  from
              permutations.  You  can  enter  as  many  correction  criteria  as  you  would like. These will be
              associated with the p_value_cutoffs you enter. For example "-c I EPW -p 0.1 0.05" will  apply  the
              following  cutoffs:  Naive  p-value  must  be  lower  than  0.1  AND  the entire range of pairwise
              comparison values are below 0.05 for this  gene.  Note  that  the  empirical  p-values  should  be
              interpreted  at  both  tails. Therefore, running "-c P -p 0.05" will apply an alpha of 0.05 to the
              empirical (permuted) p-values, i.e. it will filter everything  except  the  upper  and  lower  2.5
              percent of the distribution. Default = Individual p-value. (I)
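
              For readers unfamiliar with the B and BH criteria, the following Python sketch shows the
              textbook Bonferroni and Benjamini-Hochberg adjustments. It is an assumption that Scoary
              computes them in exactly this way; the function names are hypothetical.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Textbook Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


naive = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(naive))
print(benjamini_hochberg(naive))
```

              A gene is then kept only if every selected criterion falls below its paired cutoff.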

       -m MAX_HITS, --max_hits MAX_HITS
              Maximum number of hits to report. SCOARY will only report the top max_hits results per trait

       --include_input_columns GRABCOLS
              Grab columns from the input Roary file and put them in the output. Handles commas and ranges,
              e.g.  --include_input_columns 4,6,8,16-23. The special keyword ALL will include all relevant input
              columns in the output
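
              A column specification such as 4,6,8,16-23 could be expanded as in the Python sketch
              below. This is illustrative only: the function name parse_columns is hypothetical, ranges
              are assumed to be inclusive at both ends, and the ALL keyword is not handled.

```python
def parse_columns(spec):
    """Expand a spec like '4,6,8,16-23' into a list of column numbers."""
    cols = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # Assumption: a range 'lo-hi' includes both endpoints.
            lo, hi = part.split("-")
            cols.extend(range(int(lo), int(hi) + 1))
        else:
            cols.append(int(part))
    return cols


print(parse_columns("4,6,8,16-23"))
```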

       -w, --write_reduced
              Use with -r if you want Scoary to create a new gene presence/absence file from your reduced set of
              isolates. Note: Columns 1-14 (No. sequences, Avg group size nuc etc) in this file do  not  reflect
              the reduced dataset. These are taken from the full dataset.

       --no-time
              Output  file  in the form TRAIT.results.csv, instead of TRAIT_TIMESTAMP.csv. When used with the -w
              argument will output a reduced gene matrix in the  form  gene_presence_absence_reduced.csv  rather
              than gene_presence_absence_reduced_TIMESTAMP.csv

   Analysis options:
       -e PERMUTE, --permute PERMUTE
              Perform N permutations of the significant results post-analysis. Each permutation switches the
              phenotype labels and calculates a new p-value from the permuted dataset. After all N
              permutations are completed, the permuted p-values are sorted in ascending order, and the
              percentile of the original result in the permuted p-value distribution is reported.
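
              The label-switching idea can be sketched in Python as below. This is not Scoary's
              implementation: it assumes a two-sided Fisher's exact test as the per-permutation
              statistic, uses the common (hits + 1) / (N + 1) convention for the empirical value, and
              all function names are hypothetical.

```python
import random
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))


def table(gene, trait):
    """Count the four gene/trait presence combinations."""
    a = sum(1 for g, t in zip(gene, trait) if g and t)
    b = sum(1 for g, t in zip(gene, trait) if g and not t)
    c = sum(1 for g, t in zip(gene, trait) if not g and t)
    d = sum(1 for g, t in zip(gene, trait) if not g and not t)
    return a, b, c, d


def empirical_p(gene, trait, n_perm, seed=0):
    """Fraction of label permutations with a p-value at or below the original."""
    rng = random.Random(seed)
    original = fisher_two_sided(*table(gene, trait))
    labels = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # label switching of the phenotype
        if fisher_two_sided(*table(gene, labels)) <= original:
            hits += 1
    return (hits + 1) / (n_perm + 1)


gene = [1, 1, 1, 1, 0, 0, 0, 0]
trait = [1, 1, 1, 1, 0, 0, 0, 0]
print(empirical_p(gene, trait, n_perm=200))
```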

       --no_pairwise
              Do not perform pairwise comparisons. In this mode, Scoary will perform population structure-naive
              calculations only (Fisher's test, ORs etc). Useful for summary operations and exploring sets
              (genes unique in groups, intersections etc) but not causal analyses.

       --collapse
              Add this to collapse correlated genes (genes that have  identical  distribution  patterns  in  the
              sample) into merged units.
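
              Collapsing genes with identical distribution patterns can be pictured with the Python
              sketch below. It is only an illustration: the function name collapse, the example gene
              names, and the "--" separator used for merged unit names are all hypothetical.

```python
from collections import defaultdict


def collapse(presence):
    """Merge genes whose presence/absence pattern across strains is identical."""
    groups = defaultdict(list)
    for gene, pattern in presence.items():
        groups[tuple(pattern)].append(gene)
    # Each merged unit keeps one copy of the shared pattern.
    return {"--".join(genes): list(pattern) for pattern, genes in groups.items()}


presence = {
    "geneA": [1, 0, 1],
    "geneB": [1, 0, 1],  # same pattern as geneA
    "geneC": [0, 1, 1],
}
print(collapse(presence))
```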

   Misc options:
       --threads THREADS
              Number of threads to use. Default = 1

       --test Run Scoary on the test set in exampledata, overriding all other parameters.

       --citation
              Show citation information, and exit.

       --version
              Display Scoary version, and exit.

       by Ola Brynildsrud (olbb@fhi.no)

AUTHOR

       This manpage was written by Andreas Tille for the Debian distribution and can be used for any other usage
       of the program.

scoary 1.6.16                                     January 2019                                         SCOARY(1)