Provided by: qtltools_1.3.1+dfsg-4build3_amd64

NAME

       QTLtools - A complete tool set for molecular QTL discovery and analysis

SYNOPSIS

       QTLtools [MODE] [OPTIONS]

DESCRIPTION

       QTLtools  is  a complete tool set for molecular QTL discovery and analysis that is fast, user and cluster
       friendly.  QTLtools performs multiple key tasks such as  checking  the  quality  of  the  sequence  data,
       checking  that  sequence and genotype data match, quantifying and stratifying individuals using molecular
       phenotypes, discovering proximal or distal molQTLs and integrating them with  functional  annotations  or
       GWAS  data,  and  analyzing  allele  specific expression.  It utilizes HTSlib <http://www.htslib.org/> to
       quickly and efficiently handle common genomics file types like VCF, BCF, BAM, SAM, CRAM, BED,  and  GTF,
       and the Eigen C++ library <http://eigen.tuxfamily.org/> for fast linear algebra.

MODES

       bamstat      QTLtools  bamstat  --bam  [in.sam|in.bam|in.cram]  --bed  annotation.bed.gz --out output.txt
                    [OPTIONS]

                    Calculate basic QC metrics for BAM/SAM.

       mbv          QTLtools mbv --bam [in.sam|in.bam|in.cram] --vcf [in.vcf|in.vcf.gz|in.bcf] --out  output.txt
                    [OPTIONS]

                    Match BAM to VCF.

       pca          QTLtools pca --vcf [in.vcf|in.vcf.gz|in.bcf] | --bed in.bed.gz --out output.txt [OPTIONS]

                    Calculate principal components for a BED/VCF/BCF file.

       correct      QTLtools  correct  --vcf  [in.vcf|in.vcf.gz|in.bcf] | --bed in.bed.gz --cov covariates.txt |
                    --normal --out output.txt [OPTIONS]

                    Covariate correction of a BED or a VCF file.

       cis          QTLtools  cis   --vcf   [in.vcf|in.vcf.gz|in.bcf|in.bed.gz]   --bed   quantifications.bed.gz
                    [--nominal float | --permute integer | --mapping in.txt] --out output.txt [OPTIONS]

                    cis QTL analysis.

       trans        QTLtools   trans   --vcf  [in.vcf|in.vcf.gz|in.bcf|in.bed.gz]  --bed  quantifications.bed.gz
                    [--nominal | --permute | --sample integer | --adjust in.txt] --out output.txt [OPTIONS]

                    trans QTL analysis.

       fenrich      QTLtools fenrich --qtl significant_genes.bed  --tss  gene_tss.bed  --bed  TFs.encode.bed.gz
                    --out output.txt [OPTIONS]

                    Functional enrichment for QTLs.

       fdensity     QTLtools  fdensity  --qtl  significant_genes.bed  --bed  TFs.encode.bed.gz --out output.txt
                    [OPTIONS]

                    Functional density around QTLs.

       genrich      QTLtools genrich --qtl significant_genes.bed --tss  gene_tss.bed  --vcf  1000kg.vcf  --gwas
                    gwas_hits.bed --out output.txt [OPTIONS]

                    GWAS enrichment for QTLs.  This mode is deprecated and not supported; use rtc instead.

       rtc          QTLtools   rtc   --vcf   [in.vcf|in.vcf.gz|in.bcf|in.bed.gz]   --bed  quantifications.bed.gz
                    --hotspots hotspots_b37_hg19.bed [--gwas-cis | --gwas-trans | --mergeQTL-cis  |  --mergeQTL-
                    trans] variants_external.txt qtls_in_this_dataset.txt --out output.txt [OPTIONS]

                    Regulatory  Trait Concordance score analysis to test if two colocalizing variants are due to
                    the same functional effect.

       rtc-union    QTLtools     rtc-union     --vcf     [in.vcf|in.vcf.gz|in.bcf|in.bed.gz]     ...       --bed
                    quantifications.bed.gz ...  --hotspots hotspots_b37_hg19.bed --results qtl_results_files.txt
                    ...  [OPTIONS]

                    Find  the  union  of  QTLs  from  independent  datasets.   If  there  was  a  QTL in a given
                    recombination interval in one dataset, then find the best QTL (may or may not be genome-wide
                    significant) in the same recombination interval in all other datasets.

       extract      QTLtools extract [--vcf --bed --cov] relevant_file --out output_prefix [OPTIONS]

                    Data extraction mode.  Extract all the data from the provided files into one flat file.

       quan         QTLtools quan --bam [in.sam|in.bam|in.cram] --gtf  gene_annotation.gtf  --out-prefix  output
                    [OPTIONS]

                    Quantify gene and exon expression from RNAseq.

       ase          QTLtools   ase   --bam   [in.sam|in.bam|in.cram]   --vcf   [in.vcf|in.vcf.gz|in.bcf]   --ind
                    sample_name_in_vcf --mapq integer --out output.txt [OPTIONS]

                    Measure allele specific expression from RNAseq at transcribed heterozygous SNPs.

       rep          QTLtools   rep   --bed   quantifications.bed.gz   --vcf   [in.vcf|in.vcf.gz|in.bcf]    --qtl
                    qtls_external.txt --out output.txt [OPTIONS]

                    Replicate QTL associations in an independent dataset.

       gwas         QTLtools  gwas  --vcf [in.vcf|in.vcf.gz|in.bcf|in.bed.gz] --bed quantifications.bed.gz --out
                    output.txt [OPTIONS]

                    GWAS tests. Correlate all genotypes with all phenotypes.

GLOBAL OPTIONS

       QTLtools can read gzip, bgzip, and bzip2 files, and can output gzip and bzip2 files.  This  is  dependent
       on the input and output files' extension.  E.g. --out output.txt.gz will write a gzipped file.
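
       The extension-driven behaviour described above can be illustrated with a minimal  Python  sketch  for
       downstream  scripts  that  read  or  write  QTLtools files.  This is not QTLtools' own implementation;
       the helper name open_by_extension and the .bz2 suffix for bzip2 files are assumptions  made  for  the
       example.

```python
import bz2
import gzip


def open_by_extension(path, mode="rt"):
    """Choose an opener from the file name's extension (hypothetical helper)."""
    if path.endswith(".gz"):
        # bgzip output is gzip-compatible, so gzip.open can read it as well.
        return gzip.open(path, mode)
    if path.endswith(".bz2"):
        # Assumption: bzip2 files carry the conventional .bz2 suffix.
        return bz2.open(path, mode)
    # Anything else is treated as plain text.
    return open(path, mode)
```

       With this helper, open_by_extension("output.txt.gz") reads a  gzipped  results  file  the  same  way
       open_by_extension("output.txt") reads an uncompressed one.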

       The  following  are  common  options  that are shared across the modes.  Some of these do not apply to
       certain modes.

       --help Produces a description of options for a given mode.

       --seed integer
              Random seed for analyses that utilize randomness.   Useful  for  generating  replicable  results.
              Default=15112011.

       --log file
              Dump screen output to this file.

       --silent
              Disable screen output.

       --exclude-samples file
              List of samples to exclude.  One sample name per line.

       --include-samples file
              List of samples to include.  One sample name per line.

       --exclude-sites file
              List of variants to exclude.  One variant ID per line.

       --include-sites file
              List of variants to include.  One variant ID per line.

       --exclude-positions file
              List of positions to exclude from genotypes.  One chr position per line (separated by a space).

       --include-positions file
              List of positions to include from genotypes.  One chr position per line (separated by a space).

       --exclude-phenotypes file
              List of phenotypes to exclude.  One phenotype ID per line.

       --include-phenotypes file
              List of phenotypes to include.  One phenotype ID per line.

       --exclude-covariates file
              List of covariates to exclude.  One covariate name per line.

       --include-covariates file
              List of covariates to include.  One covariate name per line.

FILE FORMATS

       .bcf|.vcf|.vcf.gz
              These  files  are  used  for  genotype  data.   The  official  VCF  specification  is described at
              <https://samtools.github.io/hts-specs/VCFv4.2.pdf>.  The VCF/BCF files  used  with  QTLtools  must
              satisfy  this  spec's  requirements.   BCF  files  must  be  indexed  with  bcftools index  in.bcf
              <http://samtools.github.io/bcftools/bcftools.html>.  VCF  files  should  be  compressed  by  bgzip
              <http://www.htslib.org/doc/bgzip.html>    and    indexed    with    tabix    -p   vcf    in.vcf.gz
              <http://www.htslib.org/doc/tabix.html>.

       .bed|.bed.gz
              These files are used for phenotype data, and in certain modes they can also be used with the --vcf
              option, which can be used to correlate two molecular phenotypes.  The format used for QTLtools  is
              a  custom  UCSC  BED  format  <https://genome.ucsc.edu/FAQ/FAQformat.html#format1>,  which  has  6
              annotation columns followed by sample columns.  The header line must exist, and must begin with  a
              #  and  columns  must  be  tab  separated.  THIS  IS A DIFFERENT FILE FORMAT THAN THE ONE USED FOR
              FASTQTL, THUS FASTQTL BED FILES ARE INCOMPATIBLE WITH  QTLTOOLS.   Phenotype  BED  files  must  be
              compressed   by  bgzip  <http://www.htslib.org/doc/bgzip.html>  and  indexed  with  tabix  -p  bed
              in.bed.gz <http://www.htslib.org/doc/tabix.html>.  Missing values must be coded as NA.   Following
              is an example BED file:

              #chr  start  end    pid    gid    strand  sample1  sample2
              1     9999   10000  exon1  gene1  +       15       234
              1     9999   10000  exon2  gene1  +       11       134
              1     19999  20000  exon1  gene2  -       154      284
              1     19999  20000  exon2  gene2  -       112      301

              BED file's annotation columns' descriptions:
              1   Phenotype chromosome [string]
              2   Start position of the phenotype [integer, 0-based]
              3   End position of the phenotype [integer, 1-based]
              4   Phenotype ID [string]
              5   Phenotype group ID or any type of info about the phenotype [string]
              6   Phenotype strand [+/-]
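
              As a rough illustration of the layout described  above,  the  following  Python  sketch  parses
              uncompressed  phenotype BED text and checks the header, the 6 annotation columns, and the NA
              coding.  The function name parse_phenotype_bed and the returned dictionary keys are hypothetical
              and are not part of QTLtools.

```python
def parse_phenotype_bed(lines):
    """Parse QTLtools-style phenotype BED text into (samples, records)."""
    it = iter(lines)
    header = next(it).rstrip("\n").split("\t")
    if not header[0].startswith("#"):
        raise ValueError("header line must begin with '#'")
    if len(header) < 7:
        raise ValueError("expected 6 annotation columns followed by sample columns")
    samples = header[6:]
    records = []
    for line in it:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(header):
            raise ValueError("row width differs from header width")
        chrom, start, end, pid, gid, strand = fields[:6]
        # Missing values are coded as NA; they are represented as None here.
        values = [None if v == "NA" else float(v) for v in fields[6:]]
        records.append({
            "chr": chrom,
            "start": int(start),  # 0-based start
            "end": int(end),      # 1-based end
            "pid": pid,
            "gid": gid,
            "strand": strand,
            "values": dict(zip(samples, values)),
        })
    return samples, records
```

              For a phenotype at 1-based position 10000, the start column holds 9999 and  the  end  column
              holds 10000, as in the example file above.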

       .bam|.sam|.cram
              These  files  are  used  for  sequence  data.   The  official  SAM  specification  is described at
              <https://samtools.github.io/hts-specs/SAMv1.pdf>.  The SAM/BAM/CRAM files used with QTLtools  must
              satisfy  this spec's requirements.  SAM/BAM/CRAM files must be indexed with samtools index  in.bam
              <http://www.htslib.org/doc/samtools.html>.

       .gtf   These  files  are  used  for  gene  annotation.   The   file   specification   is   described   at
              <https://www.ensembl.org/info/website/upload/gff.html>.   The GTF files used must comply with this
              spec, and should have  the  gene_id,  transcript_id,  gene_name,  gene_type,  and  transcript_type
              attributes.  We recommend using gene annotations from GENCODE <https://www.gencodegenes.org/>.

       covariate files
              The  covariate  file contains the covariate data in simple text format.  The missing values should
              be encoded as NA.  Both quantitative  and  qualitative  covariates  are  supported.   Quantitative
              covariates  are assumed when only numeric values are provided.  Qualitative covariates are assumed
              when only non-numeric values are provided.  In practice, qualitative covariates with F factors are
              converted into F-1 binary covariates.  Following is an example of a covariate file:

              id   sample1  sample2  sample3
              PC1  -0.02    0.14     0.16
              PC2  0.01     0.11     0.10
              PC3  0.03     0.05     0.07
              COV  A        B        C
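
              The conversion of a qualitative covariate with F factors into F-1 binary covariates  can  be
              sketched  in Python as follows.  This only illustrates the idea; the function names, the naming
              of the generated columns, the choice of the first sorted level as the  reference,  and  the
              rejection of mixed rows are all assumptions, not QTLtools' documented behaviour.

```python
def is_number(text):
    """True if the text parses as a floating point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def encode_covariate(name, values):
    """Map one covariate row to {column name: numeric values (None for NA)}."""
    observed = [v for v in values if v != "NA"]
    if all(is_number(v) for v in observed):
        # Quantitative: only numeric values were provided.
        return {name: [None if v == "NA" else float(v) for v in values]}
    if any(is_number(v) for v in observed):
        # Assumption: a mix of numeric and non-numeric values is an error.
        raise ValueError(f"{name}: mixed numeric and non-numeric values")
    # Qualitative: F levels become F-1 binary indicator covariates.
    levels = sorted(set(observed))
    encoded = {}
    for level in levels[1:]:  # assumption: first sorted level is the reference
        encoded[f"{name}_{level}"] = [
            None if v == "NA" else (1.0 if v == level else 0.0) for v in values
        ]
    return encoded
```

              Under these assumptions the COV row of the example file, with factors A, B, and C, yields two
              binary covariates, one marking B and one marking C.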

       include/exclude files
              The various --{include,exclude}-{sites,samples,phenotypes,covariates}  options  require  a  simple
              text  file  which  lists  the  IDs of the desired type, one ID per line.  The include options will
              result in running the analyses only in this subset of IDs, whereas  exclude  options  will  remove
              these  IDs  from  the  analyses.  The IDs for --{include,exclude}-sites refer to the 3rd column in
              VCF/BCF  files,  --{include,exclude}-covariates  refer  to  the   1st   column   in   COV   files,
              --{include,exclude}-phenotypes refer to the 4th column in BED files and, when the --grp-best option
              is used, to the 5th column.  The --include-positions and --exclude-positions options require a text
              file  which lists the chromosomes and positions (separated by a space) of genotypes to be excluded
              or included. One position per line.
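
              The semantics of these list files can be  sketched  in  Python  as  below.   The  helper  names
              are  hypothetical, and applying an include list before an exclude list when both are given is
              an assumption made for the example rather than documented QTLtools behaviour.

```python
def read_id_list(lines):
    """Read one ID per line, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def read_position_list(lines):
    """Read one 'chr position' pair per line, separated by a space."""
    positions = set()
    for line in lines:
        if line.strip():
            chrom, pos = line.split()
            positions.add((chrom, int(pos)))
    return positions


def apply_filters(ids, include=None, exclude=None):
    """Keep only IDs in `include` (when given), then drop IDs in `exclude`."""
    kept = [i for i in ids if include is None or i in include]
    return [i for i in kept if exclude is None or i not in exclude]
```

              For example, an include list holding rs1 and rs3 restricts the IDs rs1, rs2, rs3 to rs1  and
              rs3, while an exclude list holding rs2 gives the same result by removal.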

IMPORTANT NOTES

       o BED files' start position is 0-based, whereas the end position is  1-based.   Positions  in  all  other
         files  used  in QTLtools are 1-based.  All positions provided as option arguments and filters, even the
         ones referring to BED files, must be 1-based.  1-based means the first base of  the  sequence  has  the
         position 1, whereas in 0-based the first position is 0.

       o Make sure the chromosome names are the same across all files.  If some files have e.g. chr1 and another
         has 1 as a chromosome name then these will be considered different chromosomes.

       o BED files used for FastQTL <http://fastqtl.sourceforge.net/> are not directly compatible with QTLtools.
         To  convert  a  FastQTL BED file to the format used in QTLtools you need to add 2 columns after the 4th
         column.

       o The quan mode in version 1.2 and above is not compatible with  the  quantifications  generated  by  the
         previous  versions.   This  is due to bug fixes and slight adjustments to the way we quantify.  Do not
         mix
         quantifications generated by earlier versions of QTLtools with quantifications  from  version  1.2  and
         above, as this will create a bias in your dataset.

       o Make sure you index all your genotype, phenotype, and sequence files.

       o Use BCF and BAM files for the best performance.
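
       The FastQTL conversion mentioned in the notes above (adding 2 columns after the 4th  column)  can  be
       sketched  in  Python  as follows.  The function name is hypothetical, and the fill values are
       assumptions: the group ID defaults to the phenotype ID and the strand defaults to +, which should  be
       replaced  with  real  annotation  where  it  matters.  The header names gid and strand follow the example
       BED file in FILE FORMATS.

```python
def fastqtl_to_qtltools(line, gid=None, strand="+"):
    """Insert the group ID and strand columns after the 4th column of one line."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#"):
        # Header line: add the names of the two new annotation columns.
        extra = ["gid", "strand"]
    else:
        # Assumption: reuse the phenotype ID (4th column) when no group ID is given.
        extra = [gid if gid is not None else fields[3], strand]
    return "\t".join(fields[:4] + extra + fields[4:]) + "\n"
```

       The converted file still has to be compressed with bgzip and indexed with tabix -p bed before use.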

EXAMPLE FILES

       exons.50percent.chr22.bed.gz  <http://jungle.unige.ch/QTLtools_examples/exons.50percent.chr22.bed.gz>
       exons.50percent.chr22.bed.gz.tbi   <http://jungle.unige.ch/QTLtools_examples/exons.50percent.chr22.bed.gz.tbi>
       gencode.v19.annotation.chr22.gtf.gz     <http://jungle.unige.ch/QTLtools_examples/gencode.v19.annotation.chr22.gtf.gz>
       gencode.v19.exon.chr22.bed.gz <http://jungle.unige.ch/QTLtools_examples/gencode.v19.exon.chr22.bed.gz>
       genes.50percent.chr22.bed.gz  <http://jungle.unige.ch/QTLtools_examples/genes.50percent.chr22.bed.gz>
       genes.50percent.chr22.bed.gz.tbi   <http://jungle.unige.ch/QTLtools_examples/genes.50percent.chr22.bed.gz.tbi>
       genes.covariates.pc50.txt.gz  <http://jungle.unige.ch/QTLtools_examples/genes.covariates.pc50.txt.gz>
       genes.simulated.chr22.bed.gz  <http://jungle.unige.ch/QTLtools_examples/genes.simulated.chr22.bed.gz>
       genes.simulated.chr22.bed.gz.tbi   <http://jungle.unige.ch/QTLtools_examples/genes.simulated.chr22.bed.gz.tbi>
       genotypes.chr22.vcf.gz   <http://jungle.unige.ch/QTLtools_examples/genotypes.chr22.vcf.gz>
       genotypes.chr22.vcf.gz.tbi    <http://jungle.unige.ch/QTLtools_examples/genotypes.chr22.vcf.gz.tbi>
       GWAS.b37.txt   <http://jungle.unige.ch/QTLtools_examples/GWAS.b37.txt>
       HG00381.chr22.bam   <http://jungle.unige.ch/QTLtools_examples/HG00381.chr22.bam>
       HG00381.chr22.bam.bai    <http://jungle.unige.ch/QTLtools_examples/HG00381.chr22.bam.bai>
       hotspots_b37_hg19.bed    <http://jungle.unige.ch/QTLtools_examples/hotspots_b37_hg19.bed>
       results.genes.full.txt.gz     <http://jungle.unige.ch/QTLtools_examples/results.genes.full.txt.gz>
       TFs.encode.bed.gz   <http://jungle.unige.ch/QTLtools_examples/TFs.encode.bed.gz>

SEE ALSO

       QTLtools-bamstat(1),  QTLtools-mbv(1),  QTLtools-pca(1),  QTLtools-correct(1), QTLtools-cis(1), QTLtools-
       trans(1), QTLtools-fenrich(1), QTLtools-fdensity(1),  QTLtools-rtc(1),  QTLtools-rtc-union(1),  QTLtools-
       extract(1), QTLtools-quan(1), QTLtools-ase(1), QTLtools-rep(1), QTLtools-gwas(1)

       QTLtools website: <https://qtltools.github.io/qtltools>

BUGS

       o Versions  up  to  and  including  1.2 suffer from a bug in reading missing genotypes in VCF/BCF files.
         This bug affects variants that have a DS field in their genotype's FORMAT and a missing genotype  (DS
         field  is  .)  in  one  of the samples, in which case genotypes for all the samples are set to missing,
         effectively removing this variant from the analyses.  Affected modes: cis,  correct,  gwas,  pca,  rep,
         trans, rtc-union
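
       To see whether a dataset contains variants matching the condition described above,  an  uncompressed
       VCF  can  be  scanned with a short Python sketch such as the following.  The function name is
       hypothetical, and the check covers only an explicit . in the DS field; samples that drop trailing
       FORMAT fields are not considered.

```python
def has_missing_dosage(vcf_line):
    """True if a VCF data line lists DS in FORMAT and DS is '.' in any sample."""
    if vcf_line.startswith("#"):
        return False  # meta-information and header lines carry no genotypes
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT column or no sample columns
    keys = fields[8].split(":")  # 9th column is FORMAT
    if "DS" not in keys:
        return False
    idx = keys.index("DS")
    for sample in fields[9:]:
        parts = sample.split(":")
        if idx < len(parts) and parts[idx] == ".":
            return True
    return False
```

       Counting the lines for which this returns True gives a rough estimate of how many variants would have
       been dropped by the affected versions.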

       Please submit bugs to <https://github.com/qtltools/qtltools>

CITATIONS

       Delaneau  O., Ongen H., Brown A. A., et al. A complete tool set for molecular QTL discovery and analysis.
       Nat Commun 8, 15452 (2017).  <https://doi.org/10.1038/ncomms15452>

       Ongen H., Brown A. A., Delaneau O., et al. Estimating the causal tissues for complex traits and diseases.
       Nat Genet. 2017;49(12):1676-1683. doi:10.1038/ng.3981 <https://doi.org/10.1038/ng.3981>

       Fort  A.,  Panousis  N.  I.,  Garieri  M.,  et  al.  MBV: a method to solve sample mislabeling and detect
       technical bias in large combined genotype and sequencing  assay  datasets.  Bioinformatics  33(12),  1895
       (2017).  <https://doi.org/10.1093/bioinformatics/btx074>

AUTHORS

       Olivier Delaneau (olivier.delaneau@gmail.com), Halit Ongen (halitongen@gmail.com)

QTLtools-v1.3                                      06 May 2020                                       QTLtools(1)