Provided by: qtltools_1.3.1+dfsg-4build3_amd64

NAME

       QTLtools pca - Conducts PCA

SYNOPSIS

       QTLtools pca --vcf [in.vcf|in.vcf.gz|in.bcf] | --bed in.bed.gz --out output_prefix [OPTIONS]

DESCRIPTION

       This mode performs a Principal Component Analysis (PCA) on either molecular phenotype quantifications
       or genotype data.  It is typically used (i) to detect outliers in the data, (ii) to detect
       stratification in the data, or (iii) to build a covariate matrix before QTL mapping.  QTLtools' PCA
       implementation uses singular value decomposition (SVD).  When building a covariate matrix to account
       for technical covariates, we recommend using --center and --scale.
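
       As an illustration of what centering, scaling and SVD do together, the following is a minimal generic
       sketch using NumPy.  It is not QTLtools' own code.  The function name pca_svd, the samples-by-variables
       layout and the use of the sample standard deviation (ddof=1) are assumptions made for this example.

```python
import numpy as np


def pca_svd(data, center=True, scale=True):
    """Generic PCA via SVD (illustrative; not QTLtools' implementation).

    data: 2-D array-like, rows = samples, columns = variables
          (genotypes or phenotypes). This layout is an assumption.
    Returns (scores, sdev): per-sample PC scores and the standard
    deviation of each principal component.
    """
    x = np.asarray(data, dtype=float)
    if center:
        # Subtract each variable's mean from its values.
        x = x - x.mean(axis=0)
    if scale:
        # Divide each variable by its standard deviation (ddof=1 assumed).
        sd = x.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0  # leave constant variables unscaled
        x = x / sd
    # Thin SVD: x = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    scores = u * s                       # coordinates of samples on each PC
    sdev = s / np.sqrt(x.shape[0] - 1)   # standard deviation of each PC
    return scores, sdev


# Toy matrix: 4 samples, 3 variables.
data = [
    [2.0, 1.0, 0.5],
    [0.0, 3.0, 1.5],
    [1.0, 1.0, 4.0],
    [3.0, 0.0, 2.0],
]
scores, sdev = pca_svd(data)
```

       With both options enabled every variable has unit variance, so the squared standard deviations of the
       components sum to the number of variables.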

OPTIONS

       --vcf [in.vcf|in.bcf|in.vcf.gz|in.bed.gz]
              Genotypes in VCF/BCF/BED format.  REQUIRED unless --bed.

       --bed quantifications.bed.gz
              Quantifications in BED format.  REQUIRED unless --vcf.

       --out output_prefix
              Output file prefix.  REQUIRED.

       --center
              Center the variables (genotypes or phenotypes) by subtracting the mean from each value.

       --scale
              Scale the variables (genotypes or phenotypes) by dividing each value by the standard deviation.

       --region chr:start-end
              Genomic region to be processed.  E.g. chr4:12334456-16334456, or chr5

       --exclude-chrs string
              The chromosomes to exclude, given as a space-separated list.  Only applies to --vcf.
              DEFAULT="X Y M MT XY chrX chrY chrM chrMT chrXY"

       --maf float
              Exclude sites with minor allele frequency less than this.  Only applies to --vcf.  DEFAULT=0.0

       --distance integer
              Only include sites separated by at least this many base pairs.  Only applies to --vcf.  DEFAULT=0
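
       The combined effect of --exclude-chrs, --maf and --distance can be pictured with the sketch below.  It
       is a hypothetical illustration, not QTLtools' code: the function filter_sites, the tuple layout, and the
       reading of --distance as "at least this many base pairs from the previously kept site on the same
       chromosome" are all assumptions.

```python
# Default excluded chromosomes, as listed under --exclude-chrs.
DEFAULT_EXCLUDED = {
    "X", "Y", "M", "MT", "XY",
    "chrX", "chrY", "chrM", "chrMT", "chrXY",
}


def filter_sites(sites, maf=0.0, distance=0, exclude_chrs=DEFAULT_EXCLUDED):
    """Filter variant sites (hypothetical helper, for illustration only).

    sites: iterable of (chrom, pos, minor_allele_freq) tuples,
           sorted by position within each chromosome.
    """
    kept = []
    last_pos = {}  # chrom -> position of the last kept site
    for chrom, pos, freq in sites:
        if chrom in exclude_chrs:
            continue  # chromosome is on the exclusion list
        if freq < maf:
            continue  # minor allele frequency below the threshold
        prev = last_pos.get(chrom)
        if prev is not None and pos - prev < distance:
            continue  # too close to the previously kept site (assumed rule)
        kept.append((chrom, pos, freq))
        last_pos[chrom] = pos
    return kept


sites = [
    ("chr22", 1000, 0.20),
    ("chr22", 3000, 0.30),  # only 2000 bp after the last kept site
    ("chr22", 7000, 0.01),  # below the MAF threshold
    ("chr22", 9000, 0.10),
    ("chrX", 500, 0.40),    # excluded chromosome
]
kept = filter_sites(sites, maf=0.05, distance=5000)
```

       Under these assumptions only the sites at positions 1000 and 9000 on chr22 remain.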

OUTPUT FILES

       .pca
        This file contains the principal components that were calculated.  The name of each principal
        component, given in the first column, is composed of the output file prefix, whether the data
        was centered, whether the data was scaled, and the principal component number.

       .pca_stats
        This  file  contains  the  standard  deviation  of  each  principal  component, and the variance and the
        cumulative variance explained by each PC.
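
       The three quantities reported per component are related as in the sketch below, which derives the
       proportion of variance and the cumulative proportion from a list of standard deviations.  The function
       pc_stats and the input values are hypothetical, and the sketch does not read or reproduce the actual
       .pca_stats file layout.

```python
def pc_stats(sdevs):
    """Return (sdev, variance_explained, cumulative) for each PC.

    sdevs: standard deviations of the principal components,
           ordered from the first PC to the last.
    """
    variances = [s * s for s in sdevs]  # variance = sdev squared
    total = sum(variances)
    rows = []
    cumulative = 0.0
    for s, v in zip(sdevs, variances):
        proportion = v / total          # share of total variance for this PC
        cumulative += proportion        # running total across PCs
        rows.append((s, proportion, cumulative))
    return rows


# Made-up standard deviations for three components.
rows = pc_stats([2.0, 1.0, 1.0])
```

       Here the first component accounts for 4/6 of the total variance, and the cumulative proportion reaches
       1 at the last component.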

EXAMPLES

       o Running pca on RNAseq quantifications to calculate technical covariates:

         QTLtools pca --bed genes.50percent.chr22.bed.gz --out genes.50percent.chr22 --center --scale

       o Running pca on genotypes to detect population stratification:

         QTLtools pca --vcf genotypes.chr22.vcf.gz --out genotypes.chr22 --center --scale --maf 0.05 --distance 5000

SEE ALSO

       QTLtools(1)

       QTLtools website: <https://qtltools.github.io/qtltools>

BUGS

       o Versions up to and including 1.2 suffer from a bug in reading missing genotypes in VCF/BCF files.
         This bug affects variants that have a DS field in their genotype's FORMAT and a missing genotype (DS
         field is .) in one of the samples, in which case genotypes for all the samples are set to missing,
         effectively removing this variant from the analyses.

       Please submit bugs to <https://github.com/qtltools/qtltools>

AUTHORS

       Halit Ongen (halitongen@gmail.com), Olivier Delaneau (olivier.delaneau@gmail.com)

QTLtools-v1.3                                      06 May 2020                                   QTLtools-pca(1)