Ubuntu Manpage: PLINK - whole genome SNP analysis

Provided by: plink2_2.00~a5.8-231123+dfsg-1_amd64

NAME

       PLINK - whole genome SNP analysis

DESCRIPTION

       PLINK v2.00a3 64-bit (9 Apr 2020)              www.cog-genomics.org/plink/2.0/
       (C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3

       In the command line flag definitions that follow,

              * <angle brackets> denote a required parameter, where the text between
                the angle brackets describes its nature.

              * ['square brackets + single-quotes'] denotes an optional modifier.  Use
                the EXACT text in the quotes.

              * [{bar|separated|braced|bracketed|values}] denotes a collection of
                mutually exclusive optional modifiers (again, the exact text must be
                used).  When there are no outer square brackets, one of the choices
                must be selected.

              * ['quoted_text='<description of value>] denotes an optional modifier
                that must begin with the quoted text, and be followed by a value with
                no whitespace in between.  '|' may also be used here to indicate
                mutually exclusive options.

              * [square brackets without quotes or braces] denote an optional
                parameter, where the text between the brackets describes its nature.

              * An ellipsis (...) indicates that you may enter multiple arguments of
                the specified type.

              * A "column set descriptor" is either

                1. a comma-separated sequence of column set names; this is
                   interpreted as the full list of column sets to include.

                2. a comma-separated sequence of column set names, all preceded by
                   '+' or '-'; this is interpreted as a list of changes to the
                   default.

              plink2 <input flag(s)...> [command flag(s)...] [other flag(s)...]
              plink2 --help [flag name(s)...]
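
              The two forms of a "column set descriptor" can be illustrated with a
              small Python sketch.  The helper name and the error handling for mixed
              signed/unsigned names are assumptions made for illustration; this is
              not PLINK's own parser.  It returns a set, since the sketch makes no
              claim about the order in which columns are written.

```python
def apply_column_set_descriptor(descriptor, default_sets):
    """Return the set of column sets selected by a descriptor string.

    Hypothetical helper for illustration only; not PLINK's parser.
    """
    tokens = [t for t in descriptor.split(",") if t]
    if not tokens:
        raise ValueError("empty column set descriptor")

    signed = [t[0] in "+-" for t in tokens]
    if all(signed):
        # Form 2: every name carries '+' or '-', so edit the defaults.
        selected = set(default_sets)
        for token in tokens:
            sign, name = token[0], token[1:]
            if sign == "+":
                selected.add(name)
            else:
                selected.discard(name)
        return selected
    if any(signed):
        # Assumption: mixing signed and unsigned names is rejected.
        raise ValueError("mix of signed and unsigned column set names")
    # Form 1: the names are the complete list of column sets.
    return set(tokens)
```

              For example, with the --freq default of chrom,ref,alt,altfreq,nobs, the
              descriptor '+pos,-nobs' would select chrom, pos, ref, alt, and altfreq
              under this reading, while 'chrom,pos' would select only those two.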

       Most  PLINK  runs require exactly one main input fileset.  The following flags are available for defining
       its form and location:

       --pfile <prefix> ['vzs']
              : Specify .pgen + .pvar[.zst] + .psam prefix.

       --pgen <filename>
              : Specify full name of .pgen/.bed file.

       --pvar <filename>
              : Specify full name of .pvar/.bim file.

       --psam <filename>
              : Specify full name of .psam/.fam file.

       --bfile <prefix> ['vzs']
              : Specify .bed + .bim[.zst] + .fam prefix.

       --bpfile <prefix> ['vzs']
              : Specify .pgen + .bim[.zst] + .fam prefix.

       --keep-autoconv ['vzs']
              : When importing non-PLINK-binary data, don't delete autogenerated fileset at end of run.

       --no-fid
              : .fam file does not contain column 1 (family ID).

       --no-parents
              : .fam file does not contain columns 3-4 (parents).

       --no-sex
              : .fam file does not contain column 5 (sex).

       --vcf <filename> ['dosage='<field>]

       --bcf <filename> ['dosage='<field>] :

              Specify full name of .vcf{|.gz|.zst} or BCF2 file to import.

              * These can be used with --psam/--fam.

              * By default, dosage information is not imported.  To import the GP
                field (must be VCFv4.3-style 0..1, one probability per possible
                genotype), add 'dosage=GP' (or 'dosage=GP-force', see below).  To
                import Minimac3-style DS+HDS phased dosage, add 'dosage=HDS'.
                'dosage=DS' (or anything else for now) causes the named field to be
                interpreted as a Minimac3-style dosage.

                Note that, in the dosage=GP case, PLINK 2 collapses the probabilities
                down to dosages; you cannot use PLINK 2 to losslessly convert VCF
                FORMAT:GP data to e.g. BGEN format.  To make this more obvious,
                PLINK 2 now errors out when dosage=GP is used on a file with a
                FORMAT:DS header line and --import-dosage-certainty wasn't specified,
                since dosage=DS extracts the same information more quickly in this
                situation.  You can suppress this error with 'dosage=GP-force'.

                In all of these cases, hardcalls are regenerated from scratch from the
                dosages.  As a consequence, variants with no GT field can now be
                imported; they will be assumed to contain only diploid calls when HDS
                is also absent.
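
              To see why collapsing GP to a dosage is lossy, consider the following
              Python sketch for a biallelic diploid call.  It uses the usual
              definition of a dosage as the expected ALT allele count; the function
              name is hypothetical, and PLINK's own rounding and certainty handling
              are not reproduced here.

```python
def gp_to_alt_dosage(gp):
    """Expected ALT allele count from (P(hom REF), P(het), P(hom ALT)).

    Illustrative only: biallelic, diploid, no rounding or certainty filter.
    """
    p_hom_ref, p_het, p_hom_alt = gp
    for p in (p_hom_ref, p_het, p_hom_alt):
        if not 0.0 <= p <= 1.0:
            raise ValueError("GP values must lie in 0..1")
    # A het call carries one ALT copy, a hom-ALT call carries two.
    return p_het + 2.0 * p_hom_alt
```

              A confident heterozygous call (0, 1, 0) and an uncertain call
              (0.25, 0.5, 0.25) both map to a dosage of 1.0, so the original
              probabilities cannot be recovered from the dosage alone.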

       --data <filename prefix> <REF/ALT mode> ['gzs']

       --bgen <filename> <REF/ALT mode> ['snpid-chr']

       --gen <filename> <REF/ALT mode>

       --sample <filename> :

              Specify an Oxford-format dataset to import.  --data specifies a
              .gen[.zst] + .sample pair, while --bgen specifies a BGEN v1.1+ file.

              * If a BGEN v1.2+ file contains sample IDs, it may be imported without a
                companion .sample file.

              * With 'snpid-chr', chromosome codes are read from the 'SNP ID' field
                instead of the usual chromosome field.

              * The following REF/ALT modes are supported:

                'ref-first': The first allele for each variant is REF.
                'ref-last': The last allele for each variant is REF.
                'ref-unknown': The last allele for each variant is treated as
                               provisional-REF.

       --haps <filename> [{ref-first | ref-last}]

       --legend <filename> <chr code> :

              Specify .haps [+ .legend] file(s) to import.

              * When --legend is specified, it's assumed that the --haps file doesn't
                contain header columns.

              * On chrX, the second male column may contain dummy '-' entries.
                (However, PLINK 2 currently cannot handle omitted male columns.)

              * If not used with --sample, new sample IDs are of the form 'per#/per#'.

       --map <filename>
              : Specify full name of .map file.

       --import-dosage <allele dosage file> ['noheader'] ['id-delim='<char>]

              ['skip0='<i>] ['skip1='<j>] ['skip2='<k>] ['dose1'] ['format='<m>]
              [{ref-first | ref-last}] ['single-chr='<code>] ['chr-col-num='<#>]
              ['pos-col-num='<#>] :

              Specify PLINK 1.x-style dosage file to import.

              * You must also specify a companion .psam/.fam file.

              * By default, PLINK assumes that the file contains a header line, which
                has 'SNP' in (1-based) column i+1, 'A1' in column i+j+2, 'A2' in
                column i+j+3, and sample FID/IIDs starting from column i+j+k+4.
                (i/j/k are normally zero, but can be changed with 'skip0', 'skip1',
                and 'skip2' respectively.  FID/IID are normally assumed to be separate
                tokens, but if they're merged into a single token you can specify the
                delimiter with 'id-delim='.)  If such a header line is not present,
                use the 'noheader' modifier; samples will then be assumed to appear in
                the same order as they do in the .psam/.fam file.

              * You may specify a companion .map file.  If you do not,

                * 'single-chr=' can be used to specify that all variants are on the
                  named chromosome.  Otherwise, you can use 'chr-col-num=' to read
                  chromosome codes from the given (1-based) column number.

                * 'pos-col-num=' causes bp coordinates to be read from the given
                  column number.

              * The 'format=' modifier lets you specify the number of values used to
                represent each dosage.  'format=1' normally indicates a single 0..2 A1
                expected count; 'dose1' modifies this to a 0..1 frequency.  'format=2'
                indicates a 0..1 homozygous A1 likelihood followed by a 0..1 het
                likelihood.  'format=3' indicates 0..1 hom A1, 0..1 het, 0..1 hom A2.
                'format=infer' (the default) infers the format from the number of
                columns in the first nonheader line.
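
              The header layout and the 'format=' conventions above can be worked
              through in a short Python sketch.  Both function names are
              hypothetical.  Doubling a 'dose1' frequency to obtain an expected
              count, and deriving a count from likelihoods as 2*hom + het, are
              assumptions about how the values relate, not a description of PLINK's
              code.

```python
def dosage_header_columns(skip0=0, skip1=0, skip2=0):
    """1-based header column numbers for the default --import-dosage layout."""
    i, j, k = skip0, skip1, skip2
    return {
        "SNP": i + 1,
        "A1": i + j + 2,
        "A2": i + j + 3,
        "first_sample": i + j + k + 4,
    }


def a1_expected_count(values, dose1=False):
    """Expected A1 count (0..2) from one sample's dosage values.

    Assumed interpretation, for illustration only.
    """
    if len(values) == 1:
        # format=1: a 0..2 expected count, or a 0..1 frequency with 'dose1'.
        return 2.0 * values[0] if dose1 else float(values[0])
    if len(values) in (2, 3):
        # format=2/3: hom-A1 and het likelihoods come first; a trailing
        # hom-A2 likelihood does not contribute any A1 copies.
        hom_a1, het = values[0], values[1]
        return 2.0 * hom_a1 + het
    raise ValueError("expected 1, 2, or 3 values per dosage")
```

              With no skips, the columns are SNP=1, A1=2, A2=3, and the first sample
              in column 4; with 'skip0=1 skip1=2' they become 2, 5, 6, and 7.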

       --dummy <sample ct> <SNP ct> [missing dosage freq] [missing pheno freq]

              [{acgt | 1234 | 12}] ['pheno-ct='<count>] ['scalar-pheno']
              ['dosage-freq='<rate>]

              This generates a fake input dataset with the specified number of samples
              and SNPs.

              * By default, the missing dosage and phenotype frequencies are zero.
                These can be changed by providing 3rd and 4th numeric arguments.

              * By default, allele codes are As and Bs; this can be changed with the
                'acgt', '1234', or '12' modifier.

              * By default, one binary phenotype is generated.  'pheno-ct=' can be used
                to change the number of phenotypes, and 'scalar-pheno' causes these
                phenotypes to be normally distributed scalars.

              * By default, all (nonmissing) dosages are in {0,1,2}.  To make some of
                them take on decimal values, use 'dosage-freq='.  (These dosages are
                affected by --hard-call-threshold and --dosage-erase-threshold.)

       --fa <filename>
              : Specify full name of reference FASTA file.

       Output  files have names of the form 'plink2.<extension>' by default.  You can change the 'plink2' prefix
       with

       --out <prefix>
              : Specify prefix for output files.

       Most runs also require at least one of the following commands:

       --rm-dup [mode] ['list']

              Remove all but one instance of each duplicate-ID variant (ignoring the
              missing ID), and (with the 'list' modifier) write a list of duplicated
              IDs to <output prefix>.rmdup.list.  The following modes of operation are
              supported:

              * 'error' (default) causes this to error out when there's a genotype
                data or other mismatch between the records.  A list of affected IDs is
                written to <output prefix>.rmdup.mismatch.

              * 'retain-mismatch' causes all instances of a duplicate-ID variant to be
                retained when there's a genotype data or variant info mismatch;
                otherwise one instance is kept.  The .rmdup.mismatch file is also
                written.

              * 'exclude-mismatch' removes all instances of duplicate-ID mismatched
                variants instead.

              * 'exclude-all' causes all instances of duplicate-ID variants to be
                removed, even when the actual records are identical.

              * 'force-first' causes only the first instance of duplicate-ID variants
                to be kept, under all circumstances.
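
              The difference between the modes is easiest to see on toy data.  The
              Python sketch below models each variant as an (ID, payload) pair and
              treats '.' as the missing ID; both choices, and the function name, are
              assumptions for illustration.  It does not write the .rmdup.list or
              .rmdup.mismatch files and is not PLINK's implementation.

```python
MISSING_ID = "."  # assumption: '.' stands for the missing variant ID
MODES = ("error", "retain-mismatch", "exclude-mismatch", "exclude-all", "force-first")


def rm_dup(records, mode="error"):
    """Return the (variant_id, payload) records kept under the given mode."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)

    # Group record positions by ID, ignoring the missing ID.
    groups = {}
    for idx, (variant_id, _payload) in enumerate(records):
        if variant_id != MISSING_ID:
            groups.setdefault(variant_id, []).append(idx)

    drop = set()
    for variant_id, idxs in groups.items():
        if len(idxs) < 2:
            continue  # not a duplicate
        first_payload = records[idxs[0]][1]
        mismatch = any(records[i][1] != first_payload for i in idxs[1:])
        if mode == "force-first":
            drop.update(idxs[1:])
        elif mode == "exclude-all":
            drop.update(idxs)
        elif not mismatch:
            drop.update(idxs[1:])  # identical records: keep one instance
        elif mode == "error":
            raise ValueError("mismatched duplicate ID: " + variant_id)
        elif mode == "exclude-mismatch":
            drop.update(idxs)
        # 'retain-mismatch': keep every instance of a mismatched duplicate.
    return [rec for idx, rec in enumerate(records) if idx not in drop]
```

              Given two identical 'rs1' records and two differing 'rs2' records,
              'retain-mismatch' keeps one rs1 and both rs2 records, while
              'exclude-mismatch' keeps one rs1 and drops both rs2 records.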

       --make-pgen ['vzs'] ['format='<code>] ['trim-alts'] ['erase-phase']

              ['erase-dosage'] ['pvar-cols='<col set descriptor>]
              ['psam-cols='<col set descriptor>]

       --make-bpgen ['vzs'] ['format='<code>] ['trim-alts'] ['erase-phase']

              ['erase-dosage']

       --make-bed ['vzs'] ['trim-alts']

              Create a new PLINK 2 binary fileset (--make-pgen = .pgen + .pvar[.zst] +
              .psam, --make-bpgen = .pgen + .bim[.zst] + .fam).

              * Unlike the automatic text-to-binary converters (which only heed
                chromosome filters), this supports all of PLINK's filtering flags.

              * The 'vzs' modifier causes the variant file (.pvar/.bim) to be
                Zstd-compressed.

              * The 'format' modifier requests an uncompressed fixed-variant-width
                .pgen file.  (These do not directly support multiallelic variants.)
                The following format code is currently supported:

                2: just like .bed, except with an extended (12-byte instead of 3-byte)
                   header containing variant/sample counts, and rotated genotype codes
                   (00 = hom ref, 01 = het, 10 = hom alt, 11 = missing).

              * The 'erase-phase' and 'erase-dosage' modifiers prevent phase and dosage
                information from being written to the new .pgen.

              * The first five columns of a .pvar file are always
                #CHROM/POS/ID/REF/ALT.  Supported optional .pvar column sets are:

                xheader: All ## header lines (yeah, this is technically not a column),
                         except for possibly FILTER/INFO definitions when those
                         column(s) have been removed.  Without this, only the #CHROM
                         header line is kept.
                vcfheader: xheader, with additions to make it a valid VCF header.
                maybequal: QUAL.  Omitted if all remaining values are missing.
                qual: Force QUAL column to be written even when empty.
                maybefilter: FILTER.  Omitted if all remaining values are missing.
                filter: Force FILTER column to be written even when empty.
                maybeinfo: INFO.  Omitted if all remaining values are missing, or if
                           INFO:PR is the only subfield.
                info: Force INFO column to be written.
                maybecm: Centimorgan coordinate.  Omitted if all remaining values = 0.
                cm: Force CM column to be written even when empty.

                The default is xheader,maybequal,maybefilter,maybeinfo,maybecm.

              * Supported column sets for the .psam file are:

                maybefid: Family ID, '0' = missing.  Omitted if all values missing.
                fid: Force FID column to be written even when empty.
                maybesid: Source ID, '0' = missing.  Omitted if all values missing.
                sid: Force SID column to be written even when empty.
                maybeparents: Father and mother IIDs.  Omitted if all values missing.
                parents: Force PAT and MAT columns to be written even when empty.
                sex: '1' = male, '2' = female, 'NA' = missing.
                pheno1: First active phenotype.  If none, all column entries are set
                        to the --output-missing-phenotype string.
                phenos: All active phenotypes, if any.  (Can be combined with pheno1
                        to force at least one phenotype column to be written.)

                The default is maybefid,maybesid,maybeparents,sex,phenos.
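
              The rotated genotype codes listed for 'format=2' can be decoded with a
              few lines of Python.  This sketch assumes four 2-bit codes per byte
              with the first sample in the lowest-order bits, as in a PLINK 1 .bed
              file; that packing is an assumption drawn from the "just like .bed"
              wording, and the 12-byte header is not parsed here.

```python
# 2-bit codes as listed for --make-pgen 'format=2'.
GENOTYPE_NAMES = {0b00: "hom ref", 0b01: "het", 0b10: "hom alt", 0b11: "missing"}


def decode_genotype_byte(byte, sample_count=4):
    """Unpack up to four 2-bit genotype codes from one byte.

    Assumes the first sample occupies the two lowest-order bits.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    if not 0 <= sample_count <= 4:
        raise ValueError("a byte holds at most four genotypes")
    return [GENOTYPE_NAMES[(byte >> (2 * i)) & 0b11] for i in range(sample_count)]
```

              Under these assumptions the byte 0b11100100 decodes to hom ref, het,
              hom alt, and missing for four consecutive samples.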

       --make-just-pvar ['zs'] ['cols='<column set descriptor>]

       --make-just-psam ['cols='<column set descriptor>]

       --make-just-bim ['zs']

       --make-just-fam

              Variants  of  --make-pgen/--make-bed  which  only write a new variant or sample file.  These don't
              always require an input genotype file.  USE THESE CAUTIOUSLY.  It is very  easy  to  desynchronize
              your  binary  genotype  data and your sample/variant indexes if you use these commands improperly.
              If you have any doubt, stick with --make-[b]pgen/--make-bed.

       --export <output format(s)...> [{01 | 12}] ['bgz'] ['id-delim='<char>]

              ['id-paste='<column set descriptor>] ['include-alt'] ['omit-nonmale-y']
              ['spaces'] ['vcf-dosage='<field>] ['ref-first'] ['bits='<#>]

              Create a new fileset with all filters applied.  The following output
              formats are supported:

              (actually, only A, AD, A-transpose, bcf, bgen-1.x, haps, hapslegend,
              ind-major-bed, oxford, and vcf are implemented for now)

              * '23': 23andMe 4-column format.  This can only be used on a single
                sample's data (--keep may be handy), and does not support
                multicharacter allele codes.

              * 'A': Sample-major additive (0/1/2) coding, suitable for loading from R.
                If you need uncounted alleles to be named in the header line, add the
                'include-alt' modifier.

              * 'AD': Sample-major additive (0/1/2) + dominant (het=1/hom=0) coding.
                Also supports 'include-alt'.

              * 'A-transpose': Variant-major 0/1/2.

              * 'beagle': Unphased per-autosome .dat and .map files, readable by early
                BEAGLE versions.

              * 'beagle-nomap': Single .beagle.dat file.

              * 'bgen-1.x': Oxford-format .bgen + .sample.  For v1.2/v1.3, sample
                identifiers are stored in the .bgen (with id-delim and id-paste
                settings applied), and default precision is 16-bit (use the 'bits'
                modifier to reduce this).

              * 'bimbam': Regular BIMBAM format.

              * 'bimbam-1chr': BIMBAM format, with a two-column .pos.txt file.  Does
                not support multiple chromosomes.

              * 'fastphase': Per-chromosome fastPHASE files, with
                .chr-<chr #>.phase.inp filename extensions.

              * 'fastphase-1chr': Single .phase.inp file.  Does not support multiple
                chromosomes.

              * 'haps', 'hapslegend': Oxford-format .haps + .sample[ + .legend].  All
                data must be biallelic and phased.  When the 'bgz' modifier is
                present, the .haps file is block-gzipped.

              * 'HV': Per-chromosome Haploview files, with .chr-<chr #>{.ped,.info}
                filename extensions.

              * 'HV-1chr': Single Haploview .ped + .info file pair.  Does not support
                multiple chromosomes.

              * 'ind-major-bed': PLINK 1 sample-major .bed (+ .bim + .fam).

              * 'lgen': PLINK 1 long-format (.lgen + .fam + .map), loadable with
                --lfile.

              * 'lgen-ref': .lgen + .fam + .map + .ref, loadable with --lfile +
                --reference.

              * 'list': Single genotype-based list, up to 4 lines per variant.  To omit
                nonmale genotypes on the Y chromosome, add the 'omit-nonmale-y'
                modifier.

              * 'rlist': .rlist + .fam + .map fileset, where the .rlist file is a
                genotype-based list which omits the most common genotype for each
                variant.  Also supports 'omit-nonmale-y'.

              * 'oxford': Oxford-format .gen + .sample.  When the 'bgz' modifier is
                present, the .gen file is block-gzipped.

              * 'ped': PLINK 1 sample-major (.ped + .map), loadable with --file.

              * 'compound-genotypes': Same as 'ped', except that the space between each
                pair of same-variant allele codes is removed.

              * 'structure': Structure-format.

              * 'transpose': PLINK 1 variant-major (.tped + .tfam), loadable with
                --tfile.

              * 'vcf', 'vcf-4.2', 'bcf', 'bcf-4.2': VCF (default version 4.3).  If PAR1
                and PAR2 are present, they are automatically merged with chrX, with
                proper handling of chromosome codes and male ploidy.

                When the 'bgz' modifier is present, the VCF file is block-gzipped.
                (This always happens with BCF output.)

                The 'id-paste' modifier controls which .psam columns are used to
                construct sample IDs (choices are maybefid, fid, iid, maybesid, and
                sid; default is maybefid,iid,maybesid), while the 'id-delim' modifier
                sets the character between the ID pieces (default '_').

                Genotypes are always exported.  If you want to export a sites-only VCF
                instead, see --make-pgen/--make-just-pvar's 'vcfheader' column set.

                Dosages are not exported unless the 'vcf-dosage=' modifier is present.
                The following five dosage export modes are supported:

                'GP': genotype posterior probabilities (v4.3 only).
                'DS': Minimac3-style dosages, omitted for hardcalls.
                'DS-force': Minimac3-style dosages, never omit.
                'HDS': Minimac3-style phased dosages, omitted for hardcalls and
                       unphased calls.  Also includes 'DS' output.
                'HDS-force': Always report DS and HDS.

              In addition,

              * The '12' modifier causes alt1 alleles to be coded as '1' and ref
                alleles to be coded as '2', while '01' maps alt1 -> 0 and ref -> 1.

              * The 'spaces' modifier makes the output space-delimited instead of
                tab-delimited, whenever both are permitted.

              * For biallelic formats where it's unspecified whether the
                reference/major allele should appear first or second, --export
                defaults to second for compatibility with PLINK 1.9.  Use 'ref-first'
                to change this.  (Note that this doesn't apply to the 'A', 'AD', and
                'A-transpose' formats; use --export-allele to control which alleles
                are counted there.)
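
              As an illustration of the 'A' and 'AD' codings described above, the
              following Python sketch derives the additive and dominant values from
              a hardcall count of the counted allele.  The function names are
              hypothetical, and printing 'NA' for a missing call is an assumption
              about the text output rather than something stated here.

```python
def additive_dominant(counted_allele_ct):
    """Return (additive, dominant) codes for one hardcall.

    additive: copies of the counted allele (0/1/2).
    dominant: 1 for a heterozygous call, 0 for either homozygote.
    """
    if counted_allele_ct not in (0, 1, 2):
        raise ValueError("hardcall allele count must be 0, 1, or 2")
    return counted_allele_ct, 1 if counted_allele_ct == 1 else 0


def ad_fields(calls):
    """Format one sample's calls as 'AD'-style text fields.

    'calls' holds 0/1/2 counts, or None for a missing call (shown as 'NA',
    which is an assumption of this sketch).
    """
    fields = []
    for call in calls:
        if call is None:
            fields.extend(["NA", "NA"])
        else:
            additive, dominant = additive_dominant(call)
            fields.extend([str(additive), str(dominant)])
    return fields
```

              For the calls 0, 1, 2 this yields the pairs (0,0), (1,1), and (2,0):
              only the heterozygote has a nonzero dominant term.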

       --freq ['zs'] ['counts'] ['cols='<column set descriptor>] ['bins-only']

              ['refbins='<comma-separated bin boundaries> | 'refbins-file='<file>]
              ['alt1bins='<comma-separated bin boundaries> | 'alt1bins-file='<file>]

              Empirical allele frequency report.  By default, only founders are
              considered.  Dosages are taken into account (e.g. heterozygous haploid
              calls count as 0.5).  Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                reffreq: Reference allele frequency/dosage.
                alt1freq: Alt1 frequency/dosage.
                altfreq: Comma-separated frequencies/dosages for all alternate alleles.
                freq: Similar to altfreq, except ref is also included at the start.
                eq: Comma-separated <allele>=<freq> for all present alleles.  (If no
                    alleles are present, the column contains a single '.'.)
                eqz: Same as eq, except zero-counts are included.
                alteq/alteqz: Same as eq/eqz, except reference allele is omitted.
                numeq: 0=<freq>,1=<freq>, etc.  Zero-counts are omitted.
                altnumeq: Same as numeq, except reference allele is omitted.
                machr2: Unphased MaCH imputation quality metric.
                minimac3r2: Phased Minimac3 imputation quality.
                nobs: Number of allele observations.

              The default is chrom,ref,alt,altfreq,nobs.

              Additional .afreq.{ref,alt1}.bins (or .acount.{ref,alt1}.bins with
              'counts') file(s) are generated when 'refbins='/'refbins-file=' or
              'alt1bins='/'alt1bins-file=' is present; these report the total number
              of frequencies or counts in each left-closed, right-open interval.  (If
              you only want these histogram(s), and not the main report, add
              'bins-only'.)
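
              The "left-closed, right-open" convention for the bin files can be
              sketched in Python with the standard bisect module.  The function name
              is hypothetical, and placing values below the first boundary in a
              leading bin (and values at or above the last boundary in a trailing
              bin) is an assumption of this sketch rather than documented behavior.

```python
from bisect import bisect_right


def bin_counts(values, boundaries):
    """Count values per left-closed, right-open interval.

    With boundaries b1 < b2 < ... < bn, bin 0 holds values below b1 (an
    assumed leading bin), bin i holds [b_i, b_{i+1}), and the last bin holds
    values at or above bn.
    """
    edges = sorted(boundaries)
    counts = [0] * (len(edges) + 1)
    for value in values:
        # bisect_right sends a value equal to an edge into the bin that
        # starts at that edge, which makes each interval left-closed.
        counts[bisect_right(edges, value)] += 1
    return counts
```

              With boundaries 0.1 and 0.5, a frequency of exactly 0.1 falls in the
              bin [0.1, 0.5) and a frequency of exactly 0.5 falls in the bin that
              starts at 0.5.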

       --geno-counts ['zs'] ['cols='<column set descriptor>]

              Variant-based hardcall genotype count report (considering both alleles
              simultaneously in the diploid case).  Nonfounders are now included; use
              --keep-founders if this is a problem.  Heterozygous haploid calls are
              treated as missing.  Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                homref: Homozygous-ref count.
                refalt1: Heterozygous ref-alt1 count.
                refalt: Comma-separated het ref-altx counts.
                homalt1: Homozygous-alt1 count.
                altxy: Comma-separated altx-alty counts, in
                       (1/1)-(1/2)-(2/2)-(1/3)-... order.
                xy: Similar to altxy, except the reference allele is treated as alt0,
                    and the sequence starts (0/0)-(0/1)-(1/1)-(0/2)-...
                hapref: Haploid-ref count.
                hapalt1: Haploid-alt1 count.
                hapalt: Comma-separated haploid-altx counts.
                hap: Similar to hapalt, except ref is also included at the start.
                numeq: 0/0=<hom ref ct>,0/1=<het ref-alt1>,1/1=<hom alt1>,...,0=<hap ref>
                       etc.  Zero-counts are omitted.  (If all genotypes are missing,
                       the column contains a single '.'.)
                missing: Number of missing genotypes.
                nobs: Number of (nonmissing) genotype observations.

              The default is chrom,ref,alt,homref,refalt,altxy,hapref,hapalt,missing.
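
              The genotype orderings quoted for 'altxy' and 'xy' can be generated
              with the Python sketch below.  It reproduces the listed prefixes by
              grouping pairs on the larger allele index; continuing that pattern
              beyond the entries shown is an assumption, and the function names are
              hypothetical.

```python
def xy_order(allele_ct):
    """Genotype pairs (a, b) with a <= b, REF counted as allele 0.

    Starts (0/0)-(0/1)-(1/1)-(0/2)-..., grouped by the larger index.
    """
    return [(a, b) for b in range(allele_ct) for a in range(b + 1)]


def altxy_order(alt_ct):
    """Same ordering restricted to ALT alleles 1..alt_ct.

    Starts (1/1)-(1/2)-(2/2)-(1/3)-...
    """
    return [(a, b) for b in range(1, alt_ct + 1) for a in range(1, b + 1)]
```

              For a variant with three ALT alleles, altxy_order(3) lists six pairs,
              so the 'altxy' column would hold six comma-separated counts under
              this reading.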

       --sample-counts ['zs'] ['cols='<column set descriptor>]

              Sample-based hardcall genotype count report.

              * Unknown-sex samples are treated as female.

              * Heterozygous haploid calls (MT included) are treated as missing.

              * As with other PLINK 2 commands, SNPs that have not been
                left-normalized are counted as non-SNP non-symbolic.  (Use e.g.
                --normalize when that's a problem.)

              * Supported column sets are:

                maybefid: FID, if that column was present in the input.
                fid: Force FID column to be written even when absent in the input.
                (IID is always present, and positioned here.)
                maybesid: SID, if that column was present in the input.
                sid: Force SID column to be written even when absent in the input.
                sex: '1' = male, '2' = female, 'NA' = missing.
                hom: Homozygous genotype count.
                homref: Homozygous-ref genotype count.
                homalt: Homozygous-alt genotype count.
                homaltsnp: Homozygous-alt SNP count.
                het: Heterozygous genotype count.
                refalt: Heterozygous ref-altx genotype count.
                het2alt: Heterozygous altx-alty genotype count.
                hetsnp: Heterozygous SNP count.
                dipts: Diploid SNP transition count.
                ts: SNP transition count (excluding chrY for females).
                diptv: Diploid SNP transversion count.
                tv: SNP transversion count.
                dipnonsnpsymb: Diploid non-SNP, non-symbolic count.
                nonsnpsymb: Non-SNP, non-symbolic count.
                symbolic: Symbolic variant count.
                nonsnp: Non-SNP count.
                dipsingle: Number of singletons relative to this dataset, across just
                           diploid calls.  (Note that if the ALT allele in a chrX
                           biallelic variant appears in exactly one female and one
                           male, that counts as a singleton for just the female.)
                single: Number of singletons relative to this dataset.
                haprefwfemaley: Haploid-ref count, counting chrY for everyone.
                hapref: Haploid-ref count, excluding chrY for females.
                hapaltwfemaley: Haploid-alt count, counting chrY for everyone.
                hapalt: Haploid-alt count, excluding chrY for females.
                missingwfemaley: Missing call count, counting chrY for everyone.
                missing: Missing call count, excluding chrY for females.

                The default is maybefid,maybesid,homref,homaltsnp,hetsnp,dipts,diptv,
                dipnonsnpsymb,dipsingle,haprefwfemaley,hapaltwfemaley,missingwfemaley.

              * The 'hetsnp', 'dipts'/'ts'/'diptv'/'tv', 'dipnonsnpsymb'/'nonsnpsymb',
                'symbolic', and 'nonsnp' columns count each ALT allele in a
                heterozygous altx-alty call separately, since they can be of different
                subtypes.  (I.e. if they are of the same subtype, the corresponding
                count is incremented by 2.)  As a consequence, these columns are
                unaffected by variant split/join.
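
              The per-allele counting rule for heterozygous altx-alty calls can be
              illustrated in Python.  The sketch uses the standard definition of a
              transition (A<->G or C<->T) and classifies each ALT allele against
              REF; that comparison, the function names, and the restriction to
              single-base alleles are assumptions for illustration.  It covers
              heterozygous calls only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_class(ref, alt):
    """Classify a single-base substitution as 'ts', 'tv', or 'other'."""
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        return "other"
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "ts"  # purine<->purine or pyrimidine<->pyrimidine
    return "tv"


def het_call_counts(ref, alts, allele_a, allele_b):
    """Count ts/tv/other for one heterozygous call.

    allele_a/allele_b are allele indexes (0 = REF, 1.. = ALT).  Each ALT
    allele in the call is counted separately.
    """
    if allele_a == allele_b:
        raise ValueError("this sketch only covers heterozygous calls")
    counts = {"ts": 0, "tv": 0, "other": 0}
    for idx in (allele_a, allele_b):
        if idx > 0:
            counts[snp_class(ref, alts[idx - 1])] += 1
    return counts
```

              With REF 'A' and ALT alleles 'C' and 'T', a 1/2 call adds 2 to the
              transversion count; with ALT alleles 'G' and 'C', the same call adds
              one transition and one transversion.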

       --missing ['zs'] [{sample-only | variant-only}]

              ['scols='<column set descriptor>] ['vcols='<column set descriptor>]

              Generate sample- and variant-based missing data reports (or just one
              report if 'sample-only'/'variant-only' is specified).  As of alpha 2,
              mixed MT hardcalls appear in the heterozygous haploid stats.  Supported
              column sets in the sample-based report are:

                maybefid: FID, if that column was present in the input.
                fid: Force FID column to be written even when absent in the input.
                (IID is always present, and positioned here.)
                maybesid: SID, if that column was present in the input.
                sid: Force SID column to be written even when absent in the input.
                misspheno1: First active phenotype missing (Y/N)?  Always 'Y' if no
                            phenotypes are loaded.
                missphenos: A Y/N column for each loaded phenotype.  (Can be combined
                            with misspheno1 to force at least one such column.)
                nmissdosage: Number of missing dosages.
                nmiss: Number of missing hardcalls, not counting het haploids.
                nmisshh: Number of missing hardcalls, counting het haploids.
                hethap: Number of heterozygous haploid hardcalls.
                nobs: Denominator (male count on chrY, otherwise total sample count).
                fmissdosage: Missing dosage rate.
                fmiss: Missing hardcall rate, not counting het haploids.
                fmisshh: Missing hardcall rate, counting het haploids.

              The default is maybefid,maybesid,missphenos,nmiss,nobs,fmiss.

              Supported column sets in the variant-based report are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                nmissdosage: Number of missing dosages.
                nmiss: Number of missing hardcalls, not counting het haploids.
                nmisshh: Number of missing hardcalls, counting het haploids.
                hethap: Number of heterozygous haploid calls.
                nobs: Number of potentially valid calls.
                fmissdosage: Missing dosage rate.
                fmiss: Missing hardcall rate, not counting het haploids.
                fmisshh: Missing hardcall rate, counting het haploids.
                fhethap: Heterozygous haploid rate.

              The default is chrom,nmiss,nobs,fmiss.

       --hardy ['zs'] ['midp'] ['redundant'] ['cols='<column set descriptor>]

              Hardy-Weinberg exact test p-value report(s).

              * By default, only founders are considered; change this with --nonfounders.

              * chrX is now omitted from the main <output prefix>.hardy report.  Instead, (if present) it
                gets its own <output prefix>.hardy.x report based on the method described in Graffelman J,
                Weir BS (2016) Hardy-Weinberg equilibrium and the X chromosome.

              * For variants with k alleles where k>2, k separate 'biallelic' tests are performed, each
                reported on its own line.  However, biallelic variants are normally reported on a single
                line, since the counts/frequencies would be mirror-images and the p-values would be the
                same.  You can add the 'redundant' modifier to force biallelic variant results to be
                reported on two lines for parsing convenience.

              * There is currently no special handling of case/control phenotypes.

              Supported column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                (A1 is always present, and positioned here.)
                ax: Non-A1 allele(s), comma-separated.
                gcounts: Hom-A1 count, total number of het-A1 calls, and total number of nonmissing
                         calls with no copies of A1.  On chrX, these are followed by male A1 and male
                         non-A1 counts.
                gcount1col: gcounts values in a single comma-separated column.
                hetfreq: Observed and expected het-A1 frequencies.
                sexaf: Female and male A1 observed allele frequencies (chrX only).
                femalep: Female-only p/midp-value (chrX only).
                p: Hardy-Weinberg equilibrium exact test p/midp-value.

              The default is chrom,ax,gcounts,hetfreq,sexaf,p.
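
              As an illustration of what an exact test on one gcounts triple involves, the sketch
              below enumerates every heterozygote count compatible with the observed allele counts and
              sums the probabilities of the configurations that are no more likely than the observed
              one.  This is the textbook conditional test, not PLINK's own code; the function name
              hwe_exact_p, the tie tolerance, and the mid-p definition used (subtract half of the
              observed configuration's probability) are assumptions made for this example.

```python
import math


def hwe_exact_p(hom_a1: int, het: int, hom_ax: int, midp: bool = False) -> float:
    """Two-sided HWE exact test p-value for one autosomal biallelic count triple."""
    n = hom_a1 + het + hom_ax          # nonmissing samples
    n_a1 = 2 * hom_a1 + het            # A1 allele count
    n_ax = 2 * hom_ax + het            # non-A1 allele count
    if n == 0:
        return 1.0

    def log_prob(h: int) -> float:
        # log P(h heterozygotes | allele counts) under Hardy-Weinberg equilibrium
        aa = (n_a1 - h) // 2
        bb = (n_ax - h) // 2
        return (h * math.log(2.0)
                + math.lgamma(n + 1) - math.lgamma(aa + 1)
                - math.lgamma(h + 1) - math.lgamma(bb + 1)
                + math.lgamma(n_a1 + 1) + math.lgamma(n_ax + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must have the same parity as the allele counts.
    candidates = range(n_a1 % 2, min(n_a1, n_ax) + 1, 2)
    probs = {h: math.exp(log_prob(h)) for h in candidates}
    p_obs = probs[het]
    # Sum every configuration that is no more likely than the observed one
    # (small relative tolerance so that exact ties are included).
    p = sum(q for q in probs.values() if q <= p_obs * (1 + 1e-9))
    if midp:
        p -= 0.5 * p_obs               # assumed mid-p definition
    return min(1.0, p)
```

              For example, hwe_exact_p(25, 50, 25) evaluates the test for 25 hom-A1, 50 het, and 25
              hom-non-A1 calls.  chrX needs the separate treatment cited above and is not covered.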

       --indep-pairwise <window size>['kb'] [step size (variant ct)]

              <unphased-hardcall-r^2 threshold>

              Generate a list of variants in approximate linkage equilibrium.

              * For multiallelic variants, major allele counts are used in the r^2 computation.

              * With the 'kb' modifier, the window size is in kilobase instead of variant count units.
                (Pre-'kb' space is optional, i.e. "--indep-pairwise 500 kb 0.5" and
                "--indep-pairwise 500kb 0.5" have the same effect.)

              * The step size now defaults to 1 if it's unspecified, and *must* be 1 if the window is in
                kilobase units.

              * Note that you need to rerun PLINK using --extract or --exclude on the .prune.in/.prune.out
                file to apply the list to another computation... and as with other applications of
                --extract/--exclude, duplicate variant IDs are a problem.  --indep-pairwise still runs to
                completion for now when duplicate variant IDs are present, but that will become an error
                in alpha 3.
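
              To make the thresholded quantity concrete, here is a minimal sketch of an unphased
              hardcall r^2, taken as the squared Pearson correlation between two vectors of 0/1/2
              allele counts.  It assumes that samples missing at either variant are skipped; the
              function name and the use of None for a missing call are choices made for this example,
              and PLINK's window and removal logic is not reproduced.

```python
from typing import Optional, Sequence


def unphased_r2(x: Sequence[Optional[int]], y: Sequence[Optional[int]]) -> float:
    """Squared Pearson correlation of two 0/1/2 allele-count vectors.

    None marks a missing hardcall; samples missing at either variant are skipped.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return float("nan")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a monomorphic variant has no defined correlation
    return cov * cov / (var_x * var_y)
```

              A pair of variants whose r^2 exceeds the threshold is the kind of pair that pruning is
              meant to break up.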

       --ld <variant ID> <variant ID> ['dosage'] ['hwe-midp']

              This displays diplotype frequencies, r^2, and D' for a single pair of variants.

              * For multiallelic variants, major allele counts/dosages are used.

              * Phase information is used when both variants are on the same chromosome.

              * When there is at least one sample with unphased het calls for both variants, diplotype
                frequencies are estimated using the Hill equation.  If there are multiple biologically
                possible local maxima, all are displayed, along with HWE exact test statistics.

              * By default, only hardcalls are considered.  Add the 'dosage' modifier if you want dosages
                to be taken into account.  (In the diploid case, an unphased dosage of x is interpreted as
                P(0/0) = 1 - x, P(0/1) = x when x is in 0..1.)
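
              The dosage interpretation above can be written as a small function.  Only the 0..1 case
              is stated in the text; the mirror-image rule used here for 1..2 is an assumption, as is
              the function name.

```python
from typing import Tuple


def unphased_dosage_to_genotype_probs(x: float) -> Tuple[float, float, float]:
    """Return (P(0/0), P(0/1), P(1/1)) for a diploid unphased dosage x in [0, 2]."""
    if not 0.0 <= x <= 2.0:
        raise ValueError("diploid dosage must be in [0, 2]")
    if x <= 1.0:
        # Stated in the text: P(0/0) = 1 - x, P(0/1) = x.
        return (1.0 - x, x, 0.0)
    # Assumed mirror image for x in 1..2 (not stated in the text).
    return (0.0, 2.0 - x, x - 1.0)
```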

       --sample-diff ['id-delim='<char>] ['dosage' | 'dosage='<tolerance>]

              ['include-missing'] [{pairwise | counts-only}] ['fname-id-delim='<c>] ['zs']  ['cols='<column  set
              descriptor>]  ['counts-cols='<column  set  descriptor>]  {base=  |  ids=}<sample ID> [other sample
              ID(s)...]

       --sample-diff ['id-delim='<char>] ['dosage' | 'dosage='<tolerance>]

              ['include-missing'] [{pairwise | counts-only}] ['fname-id-delim='<c>] ['zs']  ['cols='<column  set
              descriptor>] ['counts-cols='<column set descriptor>] file=<ID-pair file>

              (alias: --sdiff) Report discordances and discordance-counts between pairs of samples.  If
              chrX or chrY is present, sex must be defined and consistent.

              * There are three ways to specify which sample pairs to compare.  To compare a single
                baseline sample against some others, start the (space-delimited) sample ID list with
                'base='.  To perform an all-vs.-all comparison, start it with 'ids=' instead.  To compare
                sample pairs listed in a file, use 'file='.  Note that 'base='/'ids='/'file=' must be
                positioned after all modifiers.

              * Sample IDs are interpreted as if they were in a VCF header line, with 'id-delim=' having
                the usual effect.

              * By default, comparisons are based on hardcalls.  Use 'dosage' to compare dosages instead;
                you can combine this with a tolerance in [0, 0.5).

              * By default, if one genotype is missing and the other isn't, that doesn't count as a
                difference; this can be changed with 'include-missing'.

              * By default, a single main report is written to <output prefix>[.<base ID>].sdiff.  To
                write separate pairwise <output prefix>.<ID1>.<ID2>.sdiff reports for each compared ID
                pair, add the 'pairwise' modifier.  To omit the main report, add the 'counts-only'
                modifier.  (Note that, if you're only interested in nonmissing autosomal biallelic
                hardcalls, --make-king-table provides a more efficient way to compute just counts.)

              * By default, if an output filename has a multipart sample ID, the parts will be delimited
                by '_'; use 'fname-id-delim=' to change this.

              Supported main-report column sets are:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (Variant ID is always present, and positioned here.)
                ref: Reference allele.
                alt: All alternate alleles, comma-separated.
                maybefid: FID1/FID2, if that column was in the input.  Requires 'id'.
                fid: Force FID1/FID2 even when FID was absent in the input.
                id: IID1/IID2.
                maybesid: SID1/SID2, if that column was in the input.  Requires 'id'.
                sid: Force SID1/SID2 even when SID was absent in the input.
                geno: Unphased GT or DS for the two samples.

              The default is usually chrom,pos,ref,alt,maybefid,id,maybesid,geno; the sample IDs are
              removed from the default in 'pairwise' mode.

              Supported discordance-count-summary column sets are:

                maybefid: FID1/FID2, if that column was in the input.
                fid: Force FID1/FID2 even when FID was absent in the input.
                (IID1/IID2 are always present.)
                maybesid: SID1/SID2, if that column was in the input.
                sid: Force SID1/SID2 even when SID was absent in the input.
                nobs: Number of variants considered.  This includes variants where one or both
                      genotypes are missing iff 'include-missing' was specified.
                nobsibs: ibs0+ibs1+ibs2.
                ibs0: Number of diploid variants with no common hardcall alleles.
                ibs1: Number of diploid variants with exactly 1 common hardcall allele.
                ibs2: Number of diploid variants with both hardcall alleles matching.
                halfmiss: Number of variants with exactly 1 missing genotype/dosage.  Ignored without
                          'include-missing'.
                diff: Total number of differences.

              The default is maybefid,maybesid,nobs,halfmiss,diff.
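
              The count columns can be illustrated with a short sketch over two samples' diploid
              hardcalls.  It follows the column descriptions above but is not PLINK's implementation:
              the data layout (a pair of allele strings per variant, None for a missing call), the
              function name, and the decision to count a half-missing pair as one difference under
              'include-missing' are assumptions of this example.

```python
from typing import Dict, Optional, Sequence, Tuple

Genotype = Optional[Tuple[str, str]]  # unordered diploid hardcall, None = missing


def discordance_counts(g1: Sequence[Genotype], g2: Sequence[Genotype],
                       include_missing: bool = False) -> Dict[str, int]:
    counts = {"nobs": 0, "ibs0": 0, "ibs1": 0, "ibs2": 0, "halfmiss": 0, "diff": 0}
    for a, b in zip(g1, g2):
        if a is None or b is None:
            if include_missing:
                counts["nobs"] += 1
                if (a is None) != (b is None):
                    counts["halfmiss"] += 1
                    counts["diff"] += 1
            continue
        counts["nobs"] += 1
        # Count shared alleles as a multiset intersection.
        remaining = list(b)
        shared = 0
        for allele in a:
            if allele in remaining:
                remaining.remove(allele)
                shared += 1
        counts["ibs%d" % shared] += 1
        if shared < 2:
            counts["diff"] += 1
    counts["nobsibs"] = counts["ibs0"] + counts["ibs1"] + counts["ibs2"]
    return counts
```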

       --make-king [{square | square0 | triangle}] [{zs | bin | bin4}]

              KING-robust kinship estimator, described by Manichaikul A, Mychaleckyj JC, Rich SS, Daly K,
              Sale M, Chen WM (2010) Robust relationship inference in genome-wide association studies.
              By default, this writes a lower-triangular tab-delimited table of kinship coefficients to
              <output prefix>.king, and a list of the corresponding sample IDs to
              <output prefix>.king.id.  The first row of the .king file contains a single
              <genome 1-genome 2> kinship coefficient, the second row has the <genome 1-genome 3> and
              <genome 2-genome 3> kinship values in that order, etc.

              * Only autosomes are currently considered.

              * Pedigree information is currently ignored; the between-family estimator is used for all
                pairs.

              * For multiallelic variants, REF allele counts are used.

              * If the 'square' or 'square0' modifier is present, a square matrix is written instead;
                'square0' fills the upper right triangle with zeroes.

              * If the 'zs' modifier is present, the .king file is Zstd-compressed.

              * If the 'bin' modifier is present, a binary (square) matrix of double-precision floating
                point values, suitable for loading from R, is instead written to
                <output prefix>.king.bin.  ('bin4' specifies single-precision numbers instead.)  This can
                be combined with 'square0' if you still want the upper right zeroed out, or 'triangle' if
                you don't want to pad the upper right at all.

              * The computation can be subdivided with --parallel.
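
              The default lower-triangular layout described above can be expanded into a symmetric
              matrix with a few lines of code.  This sketch takes the file contents as a string and
              assumes an uncompressed, tab-delimited table; the function name is hypothetical, and the
              diagonal, which this layout does not store, is left as None.

```python
from typing import List, Optional


def triangle_to_square(text: str) -> List[List[Optional[float]]]:
    """Expand a lower-triangular, tab-delimited table into a symmetric matrix.

    Row r of the input (0-based) holds the r+1 values pairing sample r+2 with
    samples 1..r+1, so n samples produce n-1 rows.
    """
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    n = len(rows) + 1
    square: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for r, fields in enumerate(rows):
        if len(fields) != r + 1:
            raise ValueError("row %d has %d values, expected %d"
                             % (r + 1, len(fields), r + 1))
        for c, field in enumerate(fields):
            value = float(field)
            square[r + 1][c] = value   # lower triangle, as stored
            square[c][r + 1] = value   # mirrored into the upper triangle
    return square
```

              Row and column order follow the sample order in the accompanying .king.id file.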

       --make-king-table ['zs'] ['counts'] ['rel-check'] ['cols='<col set descrip.>]

              Similar to --make-king, except results are reported in KING's original .kin0 text table
              format (with minor changes, e.g. row order is more friendly to incremental addition of
              samples), --king-table-filter can be used to restrict the report to high kinship values,
              and the 'rel-check' modifier can be used to restrict to same-FID pairs.

              Supported column sets are:

                maybefid: FID1/FID2, if that column was in the input.  Requires 'id'.
                fid: Force FID1/FID2 even when FID was absent in the input.
                id: IID1/IID2.
                maybesid: SID1/SID2, if that column was in the input.  Requires 'id'.
                sid: Force SID1/SID2 even when SID was absent in the input.
                nsnp: Number of variants considered (autosomal, neither call missing).
                hethet: Proportion/count of considered call pairs which are het-het.
                ibs0: Proportion/count of considered call pairs which are opposite homs.
                ibs1: HET1_HOM2 and HET2_HOM1 proportions/counts.
                kinship: KING-robust between-family kinship estimator.

              The default is maybefid,id,maybesid,nsnp,hethet,ibs0,kinship.  hethet/ibs0/ibs1 values are
              proportions unless the 'counts' modifier is present.  If id is omitted, a .kin0.id file is
              also written.

       --make-rel ['cov'] ['meanimpute'] [{square | square0 | triangle}]

              [{zs | bin | bin4}]

              Write a lower-triangular variance-standardized relationship matrix to <output prefix>.rel,
              and corresponding IDs to <output prefix>.rel.id.

              * This computation assumes that variants do not have very low MAF, or deviate greatly from
                Hardy-Weinberg equilibrium.

              * Also, it's usually best to perform this calculation on a variant set in approximate
                linkage equilibrium.

              * The 'cov' modifier replaces the variance-standardization step with basic mean-centering,
                causing a covariance matrix to be calculated instead.

              * The computation can be subdivided with --parallel.

       --make-grm-list ['cov'] ['meanimpute'] ['zs'] [{id-header | iid-only}]

       --make-grm-bin ['cov'] ['meanimpute'] [{id-header | iid-only}]

              --make-grm-list causes the relationships to be written to GCTA's original list format,
              which describes one pair per line, while --make-grm-bin writes them in GCTA 1.1+'s
              single-precision triangular binary format.  Note that these formats explicitly report the
              number of valid observations (where neither sample has a missing call) for each pair,
              which is useful input for some scripts.

       --pca [count] [{approx | meanimpute}] ['scols='<col set descriptor>]

       --pca [{allele-wts | biallelic-var-wts}] [count] [{approx | meanimpute}]

              ['vzs'] ['scols='<col set descriptor>] ['vcols='<col set descriptor>]

              Extracts top principal components from the variance-standardized relationship matrix.

              * It is usually best to perform this calculation on a variant set in approximate linkage
                equilibrium, with no very-low-MAF variants.

              * By default, 10 PCs are extracted; you can adjust this by passing a numeric argument.
                (Note that 10 is lower than the PLINK 1.9 default of 20; this is due to the randomized
                algorithm's memory footprint growing quadratically w.r.t. the PC count.)

              * The 'approx' modifier causes the standard deterministic computation to be replaced with
                the randomized algorithm originally implemented for Galinsky KJ, Bhatia G, Loh PR,
                Georgiev S, Mukherjee S, Patterson NJ, Price AL (2016) Fast Principal-Component Analysis
                Reveals Convergent Evolution of ADH1B in Europe and East Asia.  This can be a good idea
                when you have >5k samples, and is almost required with >50k.

              * The randomized algorithm always uses mean imputation for missing genotype calls.  For
                comparison purposes, you can use the 'meanimpute' modifier to request this behavior for
                the standard computation.

              * 'scols=' can be used to customize how sample IDs appear in the .eigenvec file.
                (maybefid, fid, maybesid, and sid supported; default is maybefid,maybesid.)

              * The 'allele-wts' modifier requests an additional one-line-per-allele .eigenvec.allele
                file with PCs expressed as allele weights instead of sample weights.  When it's present,
                'vzs' causes the .eigenvec.allele file to be Zstd-compressed.  'vcols=' can be used to
                customize the report columns; supported column sets are:

                  chrom: Chromosome ID.
                  pos: Base-pair coordinate.
                  (ID is always present, and positioned here.)
                  ref: Reference allele.
                  alt1: Alternate allele 1.
                  alt: All alternate alleles, comma-separated.
                  (A1 is always present, and positioned here.)
                  ax: Non-A1 alleles, comma-separated.
                  (PCs are always present, and positioned here.)

                Default is chrom,ref,alt.

              * For datasets with no multiallelic variants, the 'biallelic-var-wts' modifier requests the
                old .eigenvec.var format, which only reports weights for major alleles.  (These weights
                are 2x the corresponding .eigenvec.allele weights.)  Supported column sets are:

                  chrom: Chromosome ID.
                  pos: Base-pair coordinate.
                  (ID is always present, and positioned here.)
                  ref: Reference allele.
                  alt1: Alternate allele 1.
                  alt: All alternate alleles, comma-separated.
                  maj: Major allele.
                  nonmaj: Minor allele.
                  (PCs are always present, and positioned here.  Signs are w.r.t. the major, not
                  necessarily reference, allele.)

                Default is chrom,maj,nonmaj.

       --king-cutoff [.king.bin + .king.id fileset prefix] <threshold>

              Exclude  one  member  of  each  pair  of  samples  with KING-robust kinship greater than the given
              threshold.  Remaining/excluded sample IDs  are  written  to  <output  prefix>.king.cutoff.in.id  +
              .king.cutoff.out.id.   If present, the .king.bin file must be triangular (either precision is ok).

       --write-covar ['cols='<column set descriptor>]

              If covariates are defined, an updated version (with all filters applied) is automatically
              written to <output prefix>.cov whenever --make-pgen, --make-just-psam, --export, or a
              similar command is present.  However, if you do not wish to simultaneously generate a new
              sample file, you can use --write-covar to just produce a pruned covariate file.

              Supported column sets are:

                maybefid: FID, if that column was in the input.
                fid: Force FID column to be written even when absent in the input.
                maybesid: SID, if that column was in the input.
                sid: Force SID column to be written even when absent in the input.
                maybeparents: Father/mother IIDs ('0' = missing), if columns in input.
                parents: Force PAT/MAT columns to be written even when absent in input.
                sex: '1' = male, '2' = female, 'NA' = missing.
                pheno1: First active phenotype.  If none, all column entries are set to the
                        --output-missing-phenotype string.
                phenos: All active phenotypes, if any.  (Can be combined with pheno1 to force at least
                        one phenotype column to be written.)
                (Covariates are always present, and positioned here.)

              The default is maybefid,maybesid.

       --write-samples

              Report IDs of all samples which pass your filters/inclusion thresholds.

       --write-snplist ['zs'] ['allow-dups']

              List all variants which pass your filters/inclusion thresholds.  Unless the 'allow-dups'
              modifier is provided, this now errors out when duplicate variant ID(s) remain.

       --glm ['zs'] ['omit-ref'] [{sex | no-x-sex}] ['log10'] ['pheno-ids']

              [{genotypic | hethom | dominant | recessive}] ['interaction'] ['skip'] ['hide-covar'] [{no-firth |
              firth-fallback   |   firth}]   ['intercept']   ['cols='<col   set   desc>]  ['local-covar='<file>]
              ['local-psam='<file>]  ['local-pos-cols='<key  col  #s>  |   'local-pvar='<file>]   ['local-haps']
              ['local-omit-last' | 'local-cats[0]='<category ct>] ['allow-no-covars']

              Basic association analysis on quantitative and/or case/control phenotypes.  For each
              variant, a linear (for quantitative traits) or logistic (for case/control) regression is
              run with the phenotype as the dependent variable, and nonmajor allele dosage(s) and a
              constant-1 column as predictors.

              * There is usually an additive effect line for every nonmajor allele, and no such line for
                the major allele.  To omit REF alleles instead of major alleles, add the 'omit-ref'
                modifier.  (When performing interaction testing, this tends to cause the
                multicollinearity check to fail for low-ref-frequency variants.)

              * By default, sex (male = 1, female = 2; note that this is a change from PLINK 1.x) is
                automatically added as a predictor for X chromosome variants, and no others.  The 'sex'
                modifier causes it to be added everywhere (except chrY), while 'no-x-sex' excludes it
                entirely.

              * The 'log10' modifier causes p-values to be reported in -log10(p) form.

              * 'pheno-ids' causes the samples used in each set of regressions to be written to an .id
                file.  (When the samples differ on chrX or chrY, .x.id and/or .y.id files are also
                written.)

              * The 'genotypic' modifier adds an additive effect/dominance deviation 2df joint test (0-2
                and 0..1..0 coding), while 'hethom' uses 0..0..1 and 0..1..0 coding instead.

              * 'dominant' and 'recessive' specify a model assuming full dominance or recessiveness,
                respectively, for the ref allele.  I.e. the genotype column is recoded as 0..1..1 or
                0..0..1, respectively.

              * 'interaction' adds genotype x covariate interactions to the model.  Note that this tends
                to produce 'NA' results (due to the multicollinearity check) when the reference allele is
                'wrong'; --maj-ref can be used to enable analysis of those variants.

              * Additional predictors can be added with --covar.  By default, association statistics are
                reported for all nonconstant predictors; 'hide-covar' suppresses covariate-only results,
                while 'intercept' causes intercepts to be reported.  Since running --glm without at least
                e.g. principal component covariates is usually an analytical mistake, the
                'allow-no-covars' modifier is now required when you're intentionally running --glm
                without a covariate file.

              * By default, if the current phenotype and covariates are such that every regression on a
                chromosome will fail, PLINK 2 errors out.  To just skip the phenotype or chromosome
                instead, add the 'skip' modifier.

              * There are now three regression modes for case/control phenotypes:

                * 'no-firth' requests PLINK 1.x's behavior, where a NA result is reported when basic
                  logistic regression fails to converge.

                * 'firth-fallback' requests logistic regression, followed by Firth regression whenever
                  the logistic regression fails to converge.  This is now the default.

                * 'firth' requests Firth regression all the time.

              * To add covariates which are not constant across all variants, add the 'local-covar=' and
                'local-psam=' modifiers, use full filenames for each, and use either 'local-pvar=' or
                'local-pos-cols=' to provide variant ID or position information.

                Normally, the local-covar file should have c * n real-valued columns, where the first c
                columns correspond to the first sample in the local-psam file, columns (c+1) to 2c
                correspond to the second sample, etc.; and the mth line corresponds to the mth nonheader
                line of the local-pvar file when there is one.  (Variants outside of the local-pvar file
                are excluded from the regression.)  The local covariates are assigned the names LOCAL1,
                LOCAL2, etc.; to exclude the last local covariate from the regression (necessary if they
                are e.g. local ancestry coefficients which sum to 1), add 'local-omit-last'.

                Alternatively, with 'local-cats='<k>, the local-covar file is expected to have n columns
                with integer-valued entries in [1, k].  (This range is [0, k-1] with 'local-cats0='.)
                These category assignments are expanded into (k-1) local covariates in the usual manner.

                When position information is in the local-covar file, this should be indicated by
                'local-pos-cols='<number of header rows>,<chrom col #>,<pos start col #>,<first covariate
                col #>.

                'local-haps' indicates that there's one column or column-group per haplotype instead of
                per sample; they are averaged by --glm.

              The main report supports the following column sets:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                (A1 is always present, and positioned here.  For multiallelic variants, this column may
                contain multiple comma-separated alleles when the result doesn't depend on which allele
                is A1.)
                ax: Non-A1 alleles, comma-separated.
                a1count: A1 allele count (can be decimal with dosage data).
                totallele: Allele observation count (can be higher than --freq value, due to inclusion
                           of het haploids and chrX model).
                a1countcc: A1 count in cases, then controls (case/control only).
                totallelecc: Case and control allele observation counts.
                gcountcc: Genotype hardcall counts (neither-A1, het-A1, A1-A1) in cases, then controls
                          (case/control only).
                a1freq: A1 allele frequency.
                a1freqcc: A1 frequency in cases, then controls (case/control only).
                machr2: Unphased MaCH imputation quality (frequently labeled 'INFO').
                firth: Reports whether Firth regression was used (firth-fallback only).
                test: Test identifier.  (Required unless only one test is run.)
                nobs: Number of samples in the regression.
                beta: Regression coefficient (for A1 if additive test).
                orbeta: Odds ratio for case/control, beta for quantitative traits.  (Ignored if 'beta'
                        column set included.)
                se: Standard error of beta.
                ci: Bounds of symmetric approximate confidence interval (requires --ci).
                tz: T-statistic for linear regression, Wald Z-score for logistic/Firth.
                p: Asymptotic p-value (or -log10(p)) for T/Z-statistic.
                err: Error code for NA results.

              The default is chrom,pos,ref,alt,firth,test,nobs,orbeta,se,ci,tz,p,err.
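
              The genotype codings named by the model modifiers can be summarized in a small lookup.
              The sketch below maps a hardcall count of the tested allele (0, 1, or 2 copies) to the
              genotype predictor column(s) each model would use, reading "a..b..c" as the values for
              0, 1, and 2 copies.  The function name and the "additive" label for the default model
              are assumptions of this example, and dosages, covariates, and the regression itself are
              out of scope.

```python
from typing import List


def genotype_predictors(allele_count: int, model: str = "additive") -> List[float]:
    """Map a hardcall allele count (0, 1, or 2) to regression predictor column(s)."""
    if allele_count not in (0, 1, 2):
        raise ValueError("expected a hardcall allele count of 0, 1, or 2")
    codings = {
        "additive": [(0, 1, 2)],
        "dominant": [(0, 1, 1)],
        "recessive": [(0, 0, 1)],
        "genotypic": [(0, 1, 2), (0, 1, 0)],   # additive effect + dominance deviation
        "hethom": [(0, 0, 1), (0, 1, 0)],      # homozygote and heterozygote indicators
    }
    if model not in codings:
        raise ValueError("unknown model: %s" % model)
    return [float(column[allele_count]) for column in codings[model]]
```

              The two-column models are the ones that give rise to a 2df joint test.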

       --score <filename> [i] [j] [k] [{header | header-read}]

              [{center  |  variance-standardize  |  dominant  | recessive}] ['no-mean-imputation'] ['se'] ['zs']
              ['ignore-dup-ids'] [{list-variants | list-variants-zs}] ['cols='<col set descriptor>]

              Apply linear scoring system(s) to each sample.  The input file should have one line per
              scored (variant, allele) pair.  Variant IDs are read from column #i and allele codes are
              read from column #j, where i defaults to 1 and j defaults to i+1.

              * By default, a single column of input coefficients is read from column #k, where k
                defaults to j+1.  (--score-col-nums can be used to specify multiple columns.)

              * 'header-read' causes the first line of the input file to be treated as a header line
                containing score names.  Otherwise, score(s) are assigned the names 'SCORE1', 'SCORE2',
                etc.; and 'header' just causes the first line to be entirely ignored.

              * By default, copies of unnamed alleles contribute zero to score, while missing genotypes
                contribute an amount proportional to the loaded (via --read-freq) or imputed allele
                frequency.  To throw out missing observations instead (decreasing the denominator in the
                final average when this happens), use the 'no-mean-imputation' modifier.

              * You can use the 'center' modifier to shift all genotypes to mean zero, or
                'variance-standardize' to linearly transform the genotypes to mean-0, variance-1.

              * The 'dominant' modifier causes dosages greater than 1 to be treated as 1, while
                'recessive' uses max(dosage - 1, 0) on diploid chromosomes.  ('dominant', 'recessive',
                and 'variance-standardize' cannot be used with chrX.)

              * The 'se' modifier causes the input coefficients to be treated as independent standard
                errors; in this case, standard errors for the score average/sum are reported.  (Note that
                this will systematically underestimate standard errors when scored variants are in LD.)

              * By default, --score errors out if a variant ID in the input file appears multiple times
                in the main dataset.  Use the 'ignore-dup-ids' modifier to skip them instead (a warning
                is still printed if such variants are present).

              * The 'list-variants[-zs]' modifier causes variant IDs used for scoring to be written to
                <output prefix>.sscore.vars[.zst].

              The main report supports the following column sets:

                maybefid: FID, if that column was in the input.
                fid: Force FID column to be written even when absent in the input.
                (IID is always present, and positioned here.)
                maybesid: SID, if that column was in the input.
                sid: Force SID column to be written even when absent in the input.
                pheno1: First active phenotype.
                phenos: All active phenotypes, if any.
                nallele: Number of nonmissing alleles.
                denom: Denominator of score average (equal to nallele value when 'no-mean-imputation'
                       specified).
                dosagesum: Sum of named allele dosages.
                scoreavgs: Score averages.
                scoresums: Score sums.

              The default is maybefid,maybesid,phenos,nallele,dosagesum,scoreavgs.

              For more sophisticated polygenic risk scoring, we recommend looking at the LDpred
              (https://github.com/bvilhjal/ldpred) and PRSice-2 (https://www.prsice.info/) software
              packages.
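
              The effect of mean imputation on the score sum and its denominator can be sketched for a
              single diploid sample as follows.  This is an illustration under stated assumptions, not
              PLINK's implementation: a missing genotype is replaced by twice the named allele's
              frequency, each scored variant adds two alleles to the denominator, and the function and
              key names are chosen for this example.

```python
from typing import Dict, Optional, Sequence


def score_sample(dosages: Sequence[Optional[float]], weights: Sequence[float],
                 freqs: Sequence[float],
                 no_mean_imputation: bool = False) -> Dict[str, float]:
    """dosages[i] is the 0..2 dosage of the named allele at variant i (None = missing)."""
    score_sum = 0.0
    nallele = 0
    for dosage, weight, freq in zip(dosages, weights, freqs):
        if dosage is None:
            if no_mean_imputation:
                continue                 # drop the observation entirely
            dosage = 2.0 * freq          # assumed: expected diploid dosage
        else:
            nallele += 2                 # two nonmissing alleles observed
        score_sum += weight * dosage
    # With mean imputation every scored variant counts toward the denominator.
    denom = nallele if no_mean_imputation else 2 * len(weights)
    score_avg = score_sum / denom if denom else float("nan")
    return {"nallele": nallele, "denom": denom,
            "scoresum": score_sum, "scoreavg": score_avg}
```

              Calling it with and without no_mean_imputation on the same input shows how the
              denominator shrinks to the nallele value when missing observations are thrown out.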

       --variant-score <filename> ['zs'] ['bin' | 'cols='<col set descriptor>]

              (alias: --vscore) Apply linear scoring system(s) to each variant.  Each reported variant score  is
              the  dot  product  of a sample-weight vector with the total-ALT-dosage vector, with MAF-based mean
              imputation applied to missing dosages.  Input file format: one line per sample, each starting with
              an ID and followed by scoring weight(s); it can also  have  a  header  line  with  the  sample  ID
              representation and the score name(s).  The usual .vscore text report supports the following column
              sets:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                altfreq: ALT allele frequency used for mean-imputation.
                nmiss: Number of missing (and thus mean-imputed) dosages.
                nobs: Number of (nonmissing) sample observations.
                (Variant scores are always present, and positioned here.)

              Default  is chrom,pos,ref,alt.  If binary output is requested instead, the main .vscore.bin matrix
              contains double-precision floating-point  values,  column  (score)  ID(s)  are  saved  to  <output
              prefix>.vscore.cols, and variant IDs are saved to <output prefix>.vscore.vars[.zst].

       --adjust-file <filename> ['zs'] ['gc'] ['cols='<column set descriptor>]

              ['log10'] ['input-log10'] ['test='<test name, case-sensitive>]

              Given a file with unfiltered association test results, report some basic multiple-testing
              corrections, sorted in increasing-p-value order.

              * 'gc' causes genomic-controlled p-values to be used in the formulas.  (This tends to be
                overly conservative.  We note that LD Score regression usually does a better job of
                calibrating lambda; see Lee JJ, McGue M, Iacono WG, Chow CC (2018) The accuracy of LD
                Score regression as an estimator of confounding and genetic correlations in genome-wide
                association studies.)

              * 'log10' causes negative base 10 logs of p-values to be reported, instead of raw p-values.
                'input-log10' specifies that the input file contains -log10(p) values.

              * If the input file contains multiple tests per variant which are distinguished by a 'TEST'
                column (true for --linear/--logistic/--glm), you must use 'test=' to select the test to
                process.

              The following column sets are supported:

                chrom: Chromosome ID.
                pos: Base-pair coordinate.
                (ID is always present, and positioned here.)
                ref: Reference allele.
                alt1: Alternate allele 1.
                alt: All alternate alleles, comma-separated.
                a1: Tested allele.  (Omitted if missing from input file.)
                unadj: Unadjusted p-value.
                gc: Devlin & Roeder (1999) genomic control corrected p-value (additive models only).
                qq: P-value quantile.
                bonf: Bonferroni correction.
                holm: Holm-Bonferroni (1979) adjusted p-value.
                sidakss: Sidak single-step adjusted p-value.
                sidaksd: Sidak step-down adjusted p-value.
                fdrbh: Benjamini & Hochberg (1995) step-up false discovery control.
                fdrby: Benjamini & Yekutieli (2001) step-up false discovery control.

              Default set is chrom,a1,unadj,gc,bonf,holm,sidakss,sidaksd,fdrbh,fdrby.
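
              Most of the corrections listed above have short closed forms.  The sketch below applies
              the standard textbook definitions to a plain list of p-values and returns the rows in
              increasing-p-value order; it is not PLINK's code, it omits the gc and qq columns, and the
              function name is hypothetical.

```python
from typing import Dict, List, Sequence


def adjust_pvalues(pvalues: Sequence[float]) -> List[Dict[str, float]]:
    """Return one dict of adjusted p-values per test, in increasing-p-value order."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    harmonic = sum(1.0 / k for k in range(1, m + 1))   # Benjamini-Yekutieli constant
    rows = [{"unadj": p,
             "bonf": min(1.0, p * m),
             "sidakss": 1.0 - (1.0 - p) ** m} for p in ordered]
    # Step-down procedures: running maximum from the smallest p-value upward.
    holm = sidaksd = 0.0
    for i, p in enumerate(ordered):
        remaining = m - i
        holm = max(holm, min(1.0, p * remaining))
        sidaksd = max(sidaksd, 1.0 - (1.0 - p) ** remaining)
        rows[i]["holm"] = holm
        rows[i]["sidaksd"] = sidaksd
    # Step-up procedures: running minimum from the largest p-value downward.
    fdrbh = fdrby = 1.0
    for i in range(m - 1, -1, -1):
        rank = i + 1
        fdrbh = min(fdrbh, ordered[i] * m / rank)
        fdrby = min(fdrby, ordered[i] * m * harmonic / rank)
        rows[i]["fdrbh"] = fdrbh
        rows[i]["fdrby"] = fdrby
    return rows
```

              The running maximum and minimum keep the adjusted values monotone in the sorted order,
              which is what distinguishes the step-down and step-up columns from the single-step ones.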

       --genotyping-rate ['dosage']

              Report genotyping rate in log (this was automatic in PLINK 1.x).

       --pgen-info

              Reports basic information about a .pgen file.

       --validate

              Validates all variant records in a .pgen file.

       --zst-decompress <.zst file> [output filename]

              (alias:  --zd) Decompress a Zstd-compressed file.  If no output filename is specified, the file is
              decompressed to standard output.  This cannot be used with any other flags, and does not  cause  a
              log file to be generated.

       The following other flags are supported.

       --script <fname>
              : Include command-line options from file.

       --rerun [log]
              : Rerun commands in log (default 'plink2.log').

       --version
              : Display only version number before exiting.

       --silent
              : Suppress regular output to console.  (Error-output is not suppressed.)

       --double-id
              : Set both FIDs and IIDs to the VCF/.bgen sample ID.

       --const-fid [ID]
              : Set all FIDs to the given constant.  If '0' (the default), no FID column is created.

       --id-delim [d]
              :  Normally  parses  single-delimiter  sample  IDs  as  <FID><d><IID>, and double-delimiter IDs as
              <FID><d><IID><d><SID>;  default  delimiter  is  '_'.   --id-delim  can  no  longer  be  used  with
              --double-id/--const-fid; it will error out if any ID lacks the delimiter.

       --idspace-to <c>
              : Convert spaces in VCF/.bgen sample IDs to the given character.

       --iid-sid
              : Make --id-delim and --sample-diff interpret two-token sample IDs as IID-SID instead of FID-IID.
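
              A minimal sketch of the ID-splitting rule described for --id-delim and --iid-sid.  The
              function name and return shape are hypothetical, and rejecting IDs with more than two
              delimiters is an assumption; the text above only states that an ID lacking the
              delimiter is an error.

```python
def split_sample_id(sample_id, delim="_", iid_sid=False):
    """Split a VCF/.bgen sample ID into (FID, IID, SID); absent parts are None."""
    tokens = sample_id.split(delim)
    if len(tokens) == 2:
        if iid_sid:
            # --iid-sid: two-token IDs are IID-SID instead of FID-IID.
            return (None, tokens[0], tokens[1])
        return (tokens[0], tokens[1], None)
    if len(tokens) == 3:
        return (tokens[0], tokens[1], tokens[2])
    # One token means the delimiter is missing (an error per the text above);
    # more than three tokens is rejected here as an assumption.
    raise ValueError(f"cannot split sample ID {sample_id!r} on {delim!r}")
```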

       --vcf-require-gt
              : Skip variants with no GT field.

       --vcf-min-gq <val>
              : No-call genotypes when GQ is present and below the threshold.

       --vcf-max-dp <val>
       --vcf-min-dp <val>
              : No-call genotypes when DP is present and above/below the threshold.

       --vcf-half-call <m>
              : Specify how '0/.' and similar VCF GT values should be handled.  The following four
              modes are supported:

              * 'error'/'e' (default) errors out and reports line #.

              * 'haploid'/'h' treats them as haploid calls.

              * 'missing'/'m' treats them as missing.

              * 'reference'/'r' treats the missing value as 0.
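
              The four modes can be sketched as follows.  This is an illustration with a hypothetical
              function name; it ignores phasing (treating '|' like '/') and returns a plain list of
              allele codes, which is an assumption made for brevity.

```python
def resolve_half_call(gt, mode="error"):
    """Return the allele codes for a VCF GT string under a --vcf-half-call mode."""
    alleles = gt.replace("|", "/").split("/")
    known = [a for a in alleles if a != "."]
    # Fully called or fully missing genotypes are not half-calls; pass them through.
    if not known or len(known) == len(alleles):
        return alleles
    if mode in ("error", "e"):
        raise ValueError(f"half-call genotype {gt!r}")
    if mode in ("haploid", "h"):
        return known                      # keep only the called allele(s)
    if mode in ("missing", "m"):
        return ["."] * len(alleles)       # whole genotype becomes missing
    if mode in ("reference", "r"):
        return ["0" if a == "." else a for a in alleles]
    raise ValueError(f"unknown mode {mode!r}")
```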

       --oxford-single-chr <chr name>
              : Specify single-chromosome .gen/.bgen file with no useful chromosome info inside.

       --missing-code [string list]
       (alias: --missing_code)
              : Comma-delimited list of missing phenotype values for Oxford-format import (default
              'NA').

       --hard-call-threshold <val>
              : When importing dosage data, a hardcall is normally saved when the distance from the
              nearest hardcall, defined as

                  0.5 * sum_i |x_i - round(x_i)|

              (where the x_i's are 0..2 allele dosages), is not greater than 0.1.  You can adjust
              this threshold by providing a numeric argument to --hard-call-threshold.  You can also
              use this with --make-[b]pgen to alter the saved hardcalls while leaving the dosages
              untouched, or --make-bed to tweak hardcall export.
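
              The distance formula above can be evaluated directly.  The sketch below uses
              hypothetical function names and takes one 0..2 dosage per allele; it only illustrates
              the arithmetic.

```python
def hardcall_distance(allele_dosages):
    """0.5 * sum_i |x_i - round(x_i)| over the per-allele 0..2 dosages."""
    return 0.5 * sum(abs(x - round(x)) for x in allele_dosages)


def keeps_hardcall(allele_dosages, threshold=0.1):
    """A hardcall is saved when the distance is not greater than the threshold."""
    return hardcall_distance(allele_dosages) <= threshold
```

              For a biallelic genotype with REF dosage 0.92 and ALT dosage 1.08, the distance is
              0.5 * (0.08 + 0.08) = 0.08, so a hardcall is saved under the default threshold.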

       --dosage-erase-threshold <val>
              :  --hard-call-threshold normally preserves the original dosages, and several PLINK 2 commands use
              them when they're available.  Use --dosage-erase-threshold to make PLINK 2 erase dosages and  keep
              only hardcalls when distance-from-hardcall <= the given level.

       --import-dosage-certainty <val> : The PLINK 2 file format currently supports
              a  single  dosage  for each allele.  Some other dosage file formats include a separate probability
              for every possible genotype, e.g. {P(0/0)=0.2, P(0/1)=0.52, P(1/1)=0.28}, a highly uncertain  call
              that  is  nevertheless  treated  as a hardcall under '--hard-call-threshold 0.1'.  To make PLINK 2
              treat a dosage as missing  whenever  the  largest  probability  is  less  than  a  threshold,  use
              --import-dosage-certainty.
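
              A sketch of the certainty check for a biallelic genotype-probability triple.  The
              function name is hypothetical, and collapsing the triple to an ALT dosage as
              P(0/1) + 2 * P(1/1) is the usual expected-count convention, assumed here for
              illustration.

```python
def probs_to_alt_dosage(probs, certainty):
    """probs = (P(0/0), P(0/1), P(1/1)).  Return the ALT dosage, or None if missing."""
    if max(probs) < certainty:
        return None                       # largest probability is below the threshold
    return probs[1] + 2.0 * probs[2]      # expected ALT allele count
```

              With the example triple above, probs_to_alt_dosage((0.2, 0.52, 0.28), 0.6) yields None,
              since the largest probability, 0.52, is below 0.6.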

       --input-missing-genotype <c> : '.' is always interpreted as a missing
              genotype  code  in  input files.  By default, '0' also is; you can change this second missing code
              with --input-missing-genotype.

       --allow-extra-chr
              : Permit unrecognized chromosome codes (alias --aec).

       --chr-set <autosome ct> ['no-x'] ['no-y'] ['no-xy'] ['no-mt']
              : Specify a nonhuman chromosome set.  The first parameter sets the number of diploid
              autosome pairs if positive, or haploid chromosomes if negative.  Given diploid
              autosomes, the remaining modifiers indicate the absence of the named non-autosomal
              chromosomes.

       --cow/--dog/--horse/--mouse/--rice/--sheep : Shortcuts for those species.

       --autosome-num <val>
              : Alias for '--chr-set <value> no-y no-xy no-mt'.

       --human
              :  Explicitly  specify  human  chromosome  set, and make output .pvar/VCF files include a ##chrSet
              header line.  (.pvar/VCF output files automatically  include  ##chrSet  when  a  nonhuman  set  is
              specified.)

       --chr-override ['file'] : By default, if --chr-set/--autosome-num/--cow/etc.
              conflicts with an input file ##chrSet header line, PLINK 2 will error out.  --chr-override with no
              argument causes the command line to take precedence; '--chr-override file' defers to the file.

       --var-min-qual <val>
              : Skip variants with low/missing QUAL.

       --var-filter [exception(s)...]
              : Skip variants which have FILTER failures.

       --extract-if-info <key> <op> <val>
       --exclude-if-info <key> <op> <val>
       (aliases: --extract-if, --exclude-if)
              : Exclude variants which don't/do satisfy a comparison predicate on an INFO key, e.g.

                  --extract-if-info "VT == SNP"

              Unless the operator is !=, the predicate always evaluates to false when the key is
              missing.

       --require-info <key(s)...>
       --require-no-info <key(s)...>
              : Exclude variants based on nonexistence or existence of an INFO key.  "<key>=." is
              treated as nonexistence.

       --extract-col-cond <f> [valcol] [IDcol] [skip]
       --extract-col-cond-match <(sub)string(s)...>
       --extract-col-cond-mismatch <(sub)string(s)...>
       --extract-col-cond-substr
       --extract-col-cond-min <min>
       --extract-col-cond-max <max>
              : Exclude all variants without a value-column entry satisfying a condition.

              * By default, values are read from column 2 of the file, and variant IDs are read from
                column 1.

              * Three types of conditions are supported:

                * When --extract-col-cond-match is specified without --extract-col-cond-substr, the
                  value is checked for equality with the given strings, and kept iff one of them
                  matches.  Similarly, --extract-col-cond-mismatch without --extract-col-cond-substr
                  causes the variant to be kept iff the value matches none of the given strings.

                * When --extract-col-cond-match and/or -mismatch are specified with
                  --extract-col-cond-substr, the variant is kept iff none of the
                  --extract-col-cond-mismatch substrings are contained in the value, and either
                  --extract-col-cond-match was unspecified or at least one of its substrings is
                  contained.

                * Otherwise, the value is interpreted as a number, and the variant is kept if the
                  number is in [<min>, <max>] (default min=0, max=DBL_MAX).
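
              The three condition types can be sketched as a single predicate.  The function and
              parameter names are hypothetical.  Requiring both conditions when exact-match and
              exact-mismatch lists are given together is an assumption, since the text above
              describes those two cases separately.

```python
import sys


def keep_variant(value, match=None, mismatch=None, substr=False,
                 vmin=0.0, vmax=sys.float_info.max):
    """Decide whether a variant is kept, given its value-column entry (a string)."""
    match = match or []
    mismatch = mismatch or []
    if match or mismatch:
        if substr:
            # Substring mode: no mismatch substring present, and (if any match
            # substrings were given) at least one of them present.
            if any(s in value for s in mismatch):
                return False
            return not match or any(s in value for s in match)
        # Exact mode: equal to one of the match strings, and to none of the
        # mismatch strings (combining the two is an assumption).
        return (not match or value in match) and value not in mismatch
    # Numeric mode: keep when the number lies in [vmin, vmax].
    return vmin <= float(value) <= vmax
```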

       --pheno ['iid-only'] <f> : Specify additional phenotype/covariate file.
              Comma-delimited files with a header line are now permitted.

       --pheno-name <name...>
              : Only load the designated phenotype(s) from the --pheno (if one was specified) or  .psam  (if  no
              --pheno) file.  Separate multiple names with spaces or commas, and use dashes to designate ranges.

       --pheno-col-nums <#...>
              : Only load the phenotype(s) in the designated column number(s) from the --pheno file.

       --no-psam-pheno
              : Ignore phenotype(s) in .psam/.fam file.

       --strict-sid0
              : By default, if there is no SID column in the .psam/.fam (or --update-ids) file, but there is one
              in another input file (for e.g. --keep/--remove), the latter SID column is ignored; sample IDs are
              considered  matching  as  long as FID and IID are equal (with missing FID treated as '0').  If you
              also want to require SID = '0' for a sample ID match in this situation, add --strict-sid0.

       --input-missing-phenotype <v> : Set nonzero number to treat as a missing
              pheno/covar in input files (default -9).

       --no-input-missing-phenotype
              : Don't treat any nonzero number as a missing  pheno/covar.   ('NA'/'nan'  are  still  treated  as
              missing.)

       --1    :  Expect  case/control phenotypes in input files to be coded as 0 = control, 1 = case, instead of
              the usual 0 = missing, 1 = ctrl, 2 = case.  (Unlike PLINK 1.x, this does not force all  phenotypes
              to be interpreted as case/ctrl.)

       --missing-catname <str>
              : Set missing-categorical-phenotype string (case-sensitive, default 'NONE').

       --covar ['iid-only'] <f> : Specify additional covariate file.
              Comma-delimited files with a header line are now permitted.

       --covar-name <name...>
              :  Only  load  the designated covariate(s) from the --covar (if one was specified), --pheno (if no
              --covar), or .psam (if no --covar or --pheno) file.

       --covar-col-nums <#...>
              : Only load the covariate(s) in the designated column number(s)  from  the  --covar  (if  one  was
              specified) or --pheno (if no --covar) file.

       --within <f> [new pheno name]
              : Import a PLINK 1.x categorical phenotype.  (Phenotype name defaults to 'CATPHENO'.)

              * If any numeric values are present, ALL values must be numeric.  In that case, 'C' is
                added in front of all category names.

              * 'NA' is treated as a missing value.

       --mwithin <n>
              : Load --within categories from column n+2.

       --family [new pheno name]
              : Create a categorical phenotype from FID.  Restrictions on and handling of numeric values are the
              same as for --within.

       --family-missing-catname <nm> : Make --family treat the specified FID as
              missing.

       --keep <fname...>
              : Exclude all samples not named in a file.

       --remove <fname...>
              : Exclude all samples named in a file.

       --keep-fam <fn...>
              : Exclude all families not named in a file.

       --remove-fam <f...>
              : Exclude all families named in a file.

       --extract [{bed0 | bed1}] <f...>
       --exclude [{bed0 | bed1}] <f...>
              : Usually excludes all variants (not) named in the given file(s).  When multiple files
              are named, they are concatenated.  With the 'bed0' or 'bed1' modifier, variants
              outside/inside the positional ranges in the interval-BED file(s) are excluded instead.
              'bed0' tells PLINK 2 to assume the interval bounds follow the UCSC 0-based half-open
              convention, while 'bed1' (equivalent to PLINK 1.9 'range') specifies 1-based
              fully-closed.
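
              The difference between the two conventions comes down to whether the interval start
              itself is included.  A small sketch, with a hypothetical function name and 'pos' taken
              to be an ordinary 1-based bp coordinate:

```python
def in_interval(pos, start, end, mode):
    """Test whether 1-based position pos lies inside a BED interval."""
    if mode == "bed0":
        # 0-based half-open [start, end) covers 1-based positions start+1 .. end.
        return start < pos <= end
    if mode == "bed1":
        # 1-based fully-closed [start, end].
        return start <= pos <= end
    raise ValueError(f"unknown mode {mode!r}")
```

              For the interval line "chr1 100 200", position 100 is inside under 'bed1' but outside
              under 'bed0'; position 200 is inside under both.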

       --extract-intersect [{bed0 | bed1}] <f...> : Just like --extract, except that
              a variant must be in the intersection, rather than just the union, of the files to remain.

       --bed-border-bp <n>
       --bed-border-kb <n>
              : Stretch BED intervals by the given amount on each side.

       --keep-cats <filename>
       --keep-cat-names <nm...>
              : These can be used individually or in combination to define a list of categories to
              keep; all samples not in one of the named categories are excluded.  Use spaces to
              separate category names for --keep-cat-names.  Use the --missing-catname value (default
              'NONE') to refer to the group of uncategorized samples.

       --keep-cat-pheno <pheno> : If more than one categorical phenotype is loaded,
              or  you  wish to filter on a categorical covariate, --keep-cat-pheno must be used to specify which
              phenotype/covariate --keep-cats and --keep-cat-names apply to.

       --remove-cats <filename> : Exclude all categories named in the file.

       --remove-cat-names <...> : Exclude named categories.

       --remove-cat-pheno <phe> : Specify pheno for --remove-cats/--remove-cat-names.

       --split-cat-pheno [{omit-most | omit-last}] ['covar-01']
              [cat. pheno/covar name(s)...] :

              Split n-category phenotype(s) into n (or n-1, with 'omit-most'/'omit-last') binary
              phenotypes, with names of the form <orig. pheno name>=<cat. name>.  (As a consequence,
              affected phenotypes and categories are not permitted to contain the '=' character.)

              * This happens after all sample filters.

              * If no phenotype or covariate names are provided, all categorical phenotypes (but not
                covariates) are processed.

              * By default, generated covariates are coded as 1=false, 2=true.  To code them as
                0=false, 1=true instead, add the 'covar-01' modifier.
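
              A sketch of the naming and coding scheme for a categorical covariate.  The function
              name is hypothetical; sorting the categories, and leaving out missing values and the
              'omit-most'/'omit-last' modifiers, are simplifying assumptions.

```python
def split_categorical(name, values, covar_01=False):
    """Split one categorical column into binary columns named <name>=<category>."""
    if "=" in name or any("=" in v for v in values):
        raise ValueError("'=' is not permitted in phenotype or category names")
    false_code, true_code = (0, 1) if covar_01 else (1, 2)
    return {
        f"{name}={cat}": [true_code if v == cat else false_code for v in values]
        for cat in sorted(set(values))    # category order is an assumption
    }
```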

       --loop-cats <pheno/covar>
              : Run variant filters and subsequent operations on just the samples in the  first  category;  then
              just  the  samples  in the second category; and so on, for all categories in the named categorical
              phenotype.

       --no-id-header ['iid-only'] : Don't include a header line in .id output
              files.  This normally forces two-column FID/IID output; add 'iid-only' to force just single-column
              IID.

       --variance-standardize [pheno/covar name(s)...]

       --covar-variance-standardize [covar name(s)...] :

              Linearly transform named covariates (and quantitative phenotypes,  if  --variance-standardize)  to
              mean-zero,  variance  1.   If  no  arguments  are provided, all possible phenotypes/covariates are
              affected.  This is frequently necessary to prevent multicollinearity when dealing with  covariates
              where abs(mean) is much larger than abs(standard deviation), such as year of birth.

       --quantile-normalize [...]
       --pheno-quantile-normalize [...]
       --covar-quantile-normalize [...]
              : Force named covariates and quantitative phenotypes to a N(0,1) distribution,
              preserving only the original rank orders.

       --chr <chr(s)...>
              : Exclude all variants not on the given chromosome(s).  Valid choices for humans are 0 (unplaced),
              1-22,  X, Y, XY, MT, PAR1, and PAR2.  Separate multiple chromosomes with spaces and/or commas, and
              use a dash (no adjacent spaces permitted) to denote a range, e.g.  '--chr 1-4, 22, par1, x, par2'.

       --not-chr <...>
              : Reverse of --chr (exclude variants on listed chromosomes).

       --autosome
              : Exclude all non-autosomal variants.

       --autosome-par
              : Exclude all non-autosomal variants, except those in a pseudo-autosomal region.

       --snps-only ['just-acgt'] : Exclude non-SNP variants.
              By default, SNP = all allele codes are single-character (so multiallelic variants with  a  mix  of
              SNPs  and  non-SNPs are excluded; split your variants first if that's a problem).  The 'just-acgt'
              modifier restricts SNP codes to {A,C,G,T,a,c,g,t,<missing>}.

       --from <var ID>
       --to <var ID>
              : Use ID(s) to specify a variant range to load.  When used together, both variants
              must be on the same chromosome.  (--snps can be used to specify intervals which cross
              chromosome boundaries.)

       --snp <var ID>
              : Specify a single variant to load.

       --exclude-snp <ID>
              : Specify a single variant to exclude.

       --window <kbs>
              : With --snp/--exclude-snp, loads/excludes all variants within half the specified kb
              distance of the named one.
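
              The --window rule is a half-width test around the named variant.  A sketch with a
              hypothetical function name; treating the boundary as inclusive is an assumption.

```python
def in_window(pos, center_pos, window_kb):
    """True when pos is within half of window_kb (in kilobases) of center_pos."""
    return abs(pos - center_pos) <= window_kb * 1000 / 2
```

              For example, with a 20 kb window around a variant at bp 1000000, positions from 990000
              to 1010000 satisfy the test.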

       --from-bp <pos>
       --to-bp <pos>
       --from-kb <pos>
       --to-kb <pos>
       --from-mb <pos>
       --to-mb <pos>
              : Use base-pair coordinates to define a variant range to load.

              * You must use these with --chr, specifying a single chromosome.

              * Decimals and negative numbers are permitted.

              * The --to-bp(/-kb/-mb) position is no longer permitted to be smaller than the
                --from-bp position.

       --snps <var IDs...>
       --exclude-snps <...>
              : Use IDs to specify variant range(s) to load or exclude.  E.g. '--snps rs1111-rs2222,
              rs3333, rs4444'.

       --force-intersect
              :   PLINK   2   normally   errors   out   when  multiple  variant  inclusion  filters  (--extract,
              --extract-col-cond,  --extract-intersect,  --from/--to,  --from-bp/--to-bp,  --snp,  --snps)   are
              specified.  --force-intersect allows the run to proceed; the set intersection will be taken.

       --thin <p>
              : Randomly remove variants, retaining each with prob. p.

       --thin-count <n>
              : Randomly remove variants until n of them remain.

       --bp-space <bps>
              : Remove variants so that each pair is no closer than the given bp distance.

       --thin-indiv <p>
              : Randomly remove samples, retaining with prob. p.

       --thin-indiv-count <n> : Randomly remove samples until n of them remain.

       --keep-col-match <f> <val(s)...> : Exclude all samples without a 3rd column
              entry  in  the  given  file exactly matching one of the given strings.  (Separate multiple strings
              with spaces.)

       --keep-col-match-name <col name> : Check column with given name instead.

       --keep-col-match-num <n>
              : Check nth column instead.

       --geno [val] [{dosage | hh-missing}]

       --mind [val] [{dosage | hh-missing}] :

              Exclude variants (--geno) and/or samples (--mind) with missing call  frequencies  greater  than  a
              threshold  (default  0.1).   (Note  that the default threshold is only applied if --geno/--mind is
              invoked without an argument; when --geno/--mind is not invoked, no missing call frequency  ceiling
              is enforced at all.  Other inclusion/exclusion default thresholds work the same way.)  By default,
              when  a  dosage  is  present  but  a  hardcall is not, the genotype is treated as missing; add the
              'dosage' modifier to treat this case as nonmissing.  Alternatively, you can  use  'hh-missing'  to
              also treat heterozygous haploid calls as missing.
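
              A sketch of the missing-call-frequency test that --geno applies per variant and --mind
              applies per sample.  The names are hypothetical, None stands for a missing call, and
              the 'dosage'/'hh-missing' modifiers are not modeled.

```python
def missing_call_rate(calls):
    """Fraction of calls that are missing (represented here as None)."""
    return sum(c is None for c in calls) / len(calls)


def passes_missingness(calls, threshold=None):
    """threshold=None models the flag not being invoked: no ceiling is enforced."""
    if threshold is None:
        return True
    # Excluded only when the missing call frequency is greater than the threshold.
    return missing_call_rate(calls) <= threshold
```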

       --require-pheno [name(s)...]
       --require-covar [name(s)...]
              : Remove samples missing any of the named phenotype(s)/covariate(s).  If no arguments
              are provided, all phenotype(s)/covariate(s) must be present.

       --maf [freq] [mode]
       (alias: --min-af)
              : Exclude variants with allele frequency lower than a threshold (default 0.01).  By
              default, the nonmajor allele frequency is used; the other supported modes are 'nref'
              (non-reference), 'alt1', and 'minor' (least frequent).  bcftools freq:mode notation is
              permitted.

       --max-maf <freq> [mode]
       (alias: --max-af)
              : Exclude variants with MAF greater than the threshold.

       --mac <ct> [mode]
       (alias: --min-ac)
              : Exclude variants with allele dosage lower than the given threshold.

       --max-mac <ct> [mode]
       (alias: --max-ac)
              : Exclude variants with allele dosage greater than the given threshold.

       --maf-succ
              : Rule of succession allele frequency estimation (used in EIGENSOFT).  Given j observations of one
              allele and k observations of the other for a biallelic variant, infer allele frequencies of  (j+1)
              /  (j+k+2)  and  (k+1) / (j+k+2), rather than the default j / (j+k) and k / (j+k).  Note that this
              does not affect --freq's output.
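
              The two estimators differ only in the pseudocounts.  A sketch with a hypothetical
              function name, for a biallelic variant with j and k allele observations:

```python
def allele_freqs(j, k, maf_succ=False):
    """Return the two allele frequency estimates for observation counts j and k."""
    if maf_succ:
        # Rule of succession: add one pseudo-observation of each allele.
        return (j + 1) / (j + k + 2), (k + 1) / (j + k + 2)
    return j / (j + k), k / (j + k)
```

              With j=0 and k=10, the default estimate of the first allele's frequency is 0, while
              the rule-of-succession estimate is 1/12.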

       --min-alleles <ct> : Exclude variants with fewer than the given # of alleles.
              (When a  variant  has  exactly  one  ALT  allele,  and  it's  a  missing-code,  it's  excluded  by
              "--min-alleles 2".)

       --max-alleles <ct> : Exclude variants with more than the given # of alleles.

       --read-freq <file> : Load allele frequency estimates from the given --freq or
              --geno-counts  (or PLINK 1.9 --freqx) report, instead of imputing them from the immediate dataset.

       --hwe <p> ['midp'] ['keep-fewhet']
              : Exclude variants with Hardy-Weinberg equilibrium exact test p-values below a
              threshold.

              * By default, only founders are considered.

              * chrX p-values are now computed using Graffelman and Weir's method.

              * For variants with k alleles with k>2, k separate 'biallelic' tests are performed,
                and the variant is filtered out if any of them fail.

              * With 'keep-fewhet', variants which fail the test in the too-few-hets direction are
                not excluded.  On chrX, this uses the ratio between the Graffelman/Weir p-value and
                the female-only p-value.

              * There is currently no special handling of case/control phenotypes.

       --mach-r2-filter [min] [max]
              : Exclude variants with MaCH imputation quality metric less than min or greater than
              max (defaults 0.1 and 2.0).  (Monomorphic variants, with r2 = nan, are not excluded.)

              * This is NOT identical to the R2 metric reported by Minimac3 0.1.13+; see below.

              * If a single argument is provided, it is treated as the minimum.

              * The metric is not computed on chrX and MT.

       --minimac3-r2-filter <min> [max]
              : Compute Minimac3 R2 values from scratch, and exclude variants with R2 less than min
              or (if max is provided) greater than max.

              * Note that this requires phased-dosage data for all samples and variants; otherwise
                this will systematically underestimate imputation quality, since unphased
                hardcalls/dosages are treated as if they were maximally uncertain.  (Use
                --extract-if-info/--exclude-if-info to filter on precomputed Minimac3 R2 in a
                VCF/.pvar INFO column.)

       --keep-females
              : Exclude male and unknown-sex samples.

       --keep-males
              : Exclude female and unknown-sex samples.

       --keep-nosex
              : Exclude all known-sex samples.

       --remove-females
              : Exclude female samples.

       --remove-males
              : Exclude male samples.

       --remove-nosex
              : Exclude unknown-sex samples.

       --keep-founders
              : Exclude nonfounder samples.

       --keep-nonfounders : Exclude founder samples.

       --keep-if <pheno/covar> <op> <val>
       --remove-if <pheno/covar> <op> <v>
              : Exclude samples which don't/do satisfy a comparison predicate, e.g.

                  --keep-if "PHENO1 == case"

              Unless the operator is !=, the predicate always evaluates to false when the
              phenotype/covariate is missing.

       --nonfounders
              : Include nonfounders in allele freq/HWE calculations.

       --bad-freqs
              : When PLINK 2 needs decent allele frequencies, it normally errors out if they aren't
              provided by --read-freq and fewer than 50 founders are available to impute them from.
              Use --bad-freqs to force PLINK 2 to proceed in this case.

       --bad-ld
              : PLINK 2 normally errors out when it needs to estimate LD between variants, but there
              are fewer than 50 founders to estimate from.  Use --bad-ld to force PLINK 2 to proceed.

       --export-allele <file> : With --export A/A-transpose/AD, count alleles named
              in the file, instead of REF alleles.

       --output-chr <MT code> : Set chromosome coding scheme in output files by
              providing the desired human mitochondrial code.  Options  are  '26',  'M',  'MT',  '0M',  'chr26',
              'chrM',  and  'chrMT';  default  is  now  'MT'  (note  that this is a change from PLINK 1.x, which
              defaulted to '26').

       --output-missing-genotype <ch> : Set the code used to represent missing
              genotypes in output files (default '.').

       --output-missing-phenotype <s> : Set the string used to represent missing
              phenotypes in output files (default 'NA').

       --sort-vars [mode]
              : Sort variants by chromosome, then position, then ID.  The following string orders
              are supported:

              * 'natural'/'n': Natural sort (default).

              * 'ascii'/'a': ASCII.

              This must be used with --make-[b]pgen/--make-bed.

       --set-hh-missing ['keep-dosage'] : Make --make-[b]pgen/--make-bed set non-MT
              heterozygous  haploid  hardcalls,  and all female chrY calls, to missing.  (Unlike PLINK 1.x, this
              treats unknown-sex chrY genotypes like males, not females.)  By default,  all  associated  dosages
              are also erased; use 'keep-dosage' to keep them all.

       --set-mixed-mt-missing ['keep-dosage'] : Make --make-[b]pgen/--make-bed set
              mixed MT hardcalls to missing.

       --split-par <bp1> <bp2>
       --split-par <build>
              : Changes chromosome code of all X chromosome variants with bp position <= bp1 to
              PAR1, and those with position >= bp2 to PAR2.  The following build codes are supported
              as shorthand:

              * 'b36'/'hg18' = NCBI 36, 2709521/154584237

              * 'b37'/'hg19' = GRCh37, 2699520/154931044

              * 'b38'/'hg38' = GRCh38, 2781479/155701383
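
              The reassignment rule can be sketched with the build boundaries listed above.  The
              table and function names are hypothetical; the numbers are copied from this entry.

```python
# (bp1, bp2) boundaries per build code, as listed for --split-par.
PAR_BOUNDS = {
    "b36": (2709521, 154584237), "hg18": (2709521, 154584237),
    "b37": (2699520, 154931044), "hg19": (2699520, 154931044),
    "b38": (2781479, 155701383), "hg38": (2781479, 155701383),
}


def split_par_code(chrom, bp, bp1, bp2):
    """Return the chromosome code after --split-par style reassignment."""
    if chrom != "X":
        return chrom                      # only X chromosome variants are affected
    if bp <= bp1:
        return "PAR1"
    if bp >= bp2:
        return "PAR2"
    return "X"
```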

       --merge-par
              : Merge PAR1/PAR2 back with X.  Requires PAR1 to be positioned immediately before X, and  PAR2  to
              be  immediately  after  X.   (Should  *not*  be  used  with  "--export  vcf", since it causes male
              homozygous/missing calls in PAR1/PAR2 to be reported as haploid.)

       --merge-x
              : Merge XY back with X.  This usually has to be combined with --sort-vars.

       --set-missing-var-ids <t>
       --set-all-var-ids <t>
              : Given a template string with a '@' where the chromosome code should go and '#' where
              the bp coordinate belongs, --set-missing-var-ids assigns chromosome-and-bp-based IDs to
              unnamed variants, while --set-all-var-ids resets all IDs.  You may also use '$r'/'$a'
              to refer to the ref and alt1 alleles, or '$1'/'$2' to refer to them in alphabetical
              order.
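
              A sketch of how such a template expands.  The function name is hypothetical;
              "alphabetical order" is modeled with Python's default string sort, which is an
              assumption, and allele-length limits (--new-id-max-allele-len) are not modeled.

```python
import re


def render_var_id(template, chrom, bp, ref, alt1):
    """Expand '@', '#', '$r', '$a', '$1' and '$2' in a variant ID template."""
    first, second = sorted([ref, alt1])
    values = {"@": chrom, "#": str(bp), "$r": ref, "$a": alt1,
              "$1": first, "$2": second}
    # Single pass, so text substituted in is never re-expanded.
    return re.sub(r"@|#|\$[ra12]", lambda m: values[m.group(0)], template)
```

              For example, the template '@:#:$r:$a' applied to a chr1 variant at bp 12345 with ref T
              and alt1 A gives '1:12345:T:A', while '@:#:$1:$2' gives '1:12345:A:T'.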

       --var-id-multi <t>
       --var-id-multi-nonsnp <t>
              : Specify alternative templates for multiallelic variants.  ('$a' and '$1'/'$2' should
              be avoided here, though they're technically still allowed.)

       --new-id-max-allele-len <len> [{error | missing | truncate}] :

              Specify maximum number of leading characters from allele codes to include in new variant IDs,  and
              behavior on longer codes (defaults 23, error).

       --missing-var-code <str>
              :  Change  unnamed  variant  code for --rm-dup, --set-{missing|all}-var-ids, and --recover-var-ids
              (default '.').

       --update-map <f> [bpcol] [IDcol] [skip] : Update variant bp positions.

       --update-name <f> [newcol] [oldcol] [skip] : Update variant IDs.

       --recover-var-ids <file> ['strict-bim-order'] [{rigid | force}] ['partial']
              : Undo --set-all-var-ids, given the original .pvar/VCF/.bim file.  Original IDs are
              looked up by position and allele codes.

              * By default, if the original-ID file is a .bim, allele order is ignored.  Use
                'strict-bim-order' to force A1=ALT, A2=REF.

              * If any variant has multiple matching records in the original-ID file, and the IDs
                conflict, --recover-var-ids writes the affected (current) ID(s) to <output
                prefix>.recoverid.dup, and normally errors out.  If the original-ID file has the same
                number of variants in the same order, you can still recover the old IDs with the
                'rigid' modifier in this case.  Alternatively, to proceed and assign the missing-ID
                code to these variants, add the 'force' modifier.  (The .recoverid.dup file is still
                written when 'rigid' or 'force' is specified.)

              * --recover-var-ids normally expects to replace all variant IDs, and errors out if any
                are left untouched.  Add the 'partial' modifier when you actually want to update just
                a proper subset.

       --update-alleles <fname>
              : Update variant allele codes.

       --update-ids <fname>
              : Update sample IDs.

       --update-parents <fname>
              : Update parental IDs.

       --update-sex <filename> ['col-num='<n>] ['male0']
              : Update sex information.

              * By default, if there is a header line starting with '#FID'/'#IID', sex is loaded
                from the first column titled 'SEX' (any capitalization); otherwise, column 3 is
                assumed.  Use 'col-num=' to force a column number.

              * Only the first character in the sex column is processed.  By default, '1'/'M'/'m' is
                interpreted as male, '2'/'F'/'f' is interpreted as female, and '0'/'N' is interpreted
                as unknown-sex.  To change this to '0'/'M'/'m' = male, '1'/'F'/'f' = female, anything
                else other than '2' = unknown-sex, add 'male0'.
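
              The two coding schemes can be sketched as below.  The function name is hypothetical.
              Raising an error for characters the text above does not assign a meaning to (including
              '2' under 'male0') is an assumption about unspecified behavior.

```python
def parse_sex(field, male0=False):
    """Interpret the first character of a sex-column entry."""
    c = field[:1]                         # only the first character is processed
    if male0:
        if c in ("0", "M", "m"):
            return "male"
        if c in ("1", "F", "f"):
            return "female"
        if c == "2":
            raise ValueError("'2' has no defined meaning under male0")
        return "unknown"
    if c in ("1", "M", "m"):
        return "male"
    if c in ("2", "F", "f"):
        return "female"
    if c in ("0", "N"):
        return "unknown"
    raise ValueError(f"unrecognized sex code {field!r}")
```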

       --real-ref-alleles
              : Treat A2 alleles in a PLINK 1.x fileset as actual  REF  alleles;  otherwise  they're  marked  as
              provisional.

       --maj-ref ['force']
              : Set major alleles to reference, like PLINK 1.x automatically did.  (Note that this
              is now opt-in rather than opt-out; --keep-allele-order is no longer necessary to
              prevent allele-swapping.)

              * This can only be used in runs with --make-bed/--make-[b]pgen/--export and no other
                commands.

              * By default, this only affects variants marked as having 'provisional' reference
                alleles.  Add 'force' to apply this to all variants.

              * All new reference alleles are marked as provisional.

       --ref-allele ['force'] <filename> [refcol] [IDcol] [skip]

       --alt1-allele ['force'] <filename> [alt1col] [IDcol] [skip] :

              These set the alleles specified in the file to ref (--ref-allele) or alt1 (--alt1-allele).
              They can be combined in the same run.

              * These can only be used in runs with --make-bed/--make-[b]pgen/--export and no other
              commands.

              * "--ref-allele <VCF filename> 4 3 '#'", which scrapes reference allele assignments from a
              VCF file, is especially useful.

              * By default, these error out when asked to change a 'known' reference allele.  Add 'force'
              to permit that (when e.g. switching to a new reference genome).

              * When --alt1-allele changes the previous ref allele to alt1, the previous alt1 allele is
              set to reference and marked as provisional.
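
              To show what the arguments in "--ref-allele <VCF filename> 4 3 '#'" select, here is a small
              Python sketch that pulls column 4 (REF) and column 3 (ID) from VCF-style lines while
              skipping lines that start with '#'.  The function name scrape_alleles and the sample records
              are hypothetical, and splitting on any whitespace is a simplifying assumption.

```python
def scrape_alleles(lines, allele_col=4, id_col=3, skip_prefix="#"):
    """Return {variant ID: allele} using 1-based column numbers."""
    alleles = {}
    for line in lines:
        # Skip blank lines and lines beginning with the skip prefix.
        if not line.strip() or line.startswith(skip_prefix):
            continue
        fields = line.split()
        alleles[fields[id_col - 1]] = fields[allele_col - 1]
    return alleles


# Made-up VCF content: columns are CHROM, POS, ID, REF, ALT, ...
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t10177\trs1\tA\tAC\t.\t.\t.",
    "1\t10352\trs2\tT\tTA\t.\t.\t.",
]
print(scrape_alleles(vcf_lines))  # {'rs1': 'A', 'rs2': 'T'}
```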

       --ref-from-fa ['force'] : This sets reference alleles from the --fa file when
              it can be done unambiguously (note that it's never possible for deletions or some insertions).  By
              default, it errors out when asked to change a 'known' reference allele; add the  'force'  modifier
              to permit that.

       --normalize ['list']
       (alias: --norm)
              : Left-normalize all variants, using the --fa file.  (Assumes no differences in
              capitalization.)  The 'list' modifier causes a list of affected variant IDs to be written to
              <output prefix>.normalized.

       --indiv-sort <mode> [f] : Specify sample ID sort order for merge and
              --make-[b]pgen/--make-bed.  The following four modes are supported:

              * 'none'/'0' keeps samples in the order they were loaded.  Default for non-merge.

              * 'natural'/'n' invokes "natural sort", e.g. 'id2' < 'ID3' < 'id10'.  Default when merging.

              * 'ascii'/'a' sorts in ASCII order, e.g. 'ID3' < 'id10' < 'id2'.

              * 'file'/'f' uses the order in the given file (named in the last argument).
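
              The difference between the 'natural' and 'ascii' orders can be reproduced with a short
              Python sketch.  The key function natural_key is hypothetical and only mimics the example
              above (case-insensitive text, digit runs compared as numbers); PLINK's exact tie-breaking
              rules are not stated here and may differ.

```python
import re


def natural_key(sample_id):
    """Split an ID into text and digit runs so digit runs compare numerically."""
    parts = re.split(r"(\d+)", sample_id.lower())
    # re.split with one capture group puts the digit runs at odd positions.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]


ids = ["id10", "ID3", "id2"]
print(sorted(ids, key=natural_key))  # ['id2', 'ID3', 'id10']
print(sorted(ids))                   # ['ID3', 'id10', 'id2']
```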

       --king-table-filter <min>
              : Specify minimum kinship coefficient for inclusion in --make-king-table report.

       --king-table-subset <f> [kmin] : Restrict current --make-king-table run to
              sample pairs listed in the given .kin0 file.  If a second argument is provided, only sample  pairs
              with kinship >= that threshold (in the input .kin0) are processed.

       --condition <variant ID> [{dominant | recessive}] ['multiallelic']

       --condition-list <fname> [{dominant | recessive}] ['multiallelic'] :

              Add  the  given variant, or all variants in the given file, as --glm covariates.  By default, this
              errors out if any of the variants  are  multiallelic;  add  the  'multiallelic'  ('m'  for  short)
              modifier  to  allow  them.   They'll effectively be split against the major allele (unless --glm's
              'omit-ref'  modifier  was  specified),  and  all  induced  covariate  names--even  for   biallelic
              variants--will have an underscore followed by the allele code at the end.

       --parameters <...> : Include only the given covariates/interactions in the
              --glm model, identified by a list of 1-based indices and/or ranges of them.

       --tests <...>
       --tests all
              : Perform a (joint) test on the specified term(s) in the --glm model, identified by 1-based
              indices and/or ranges of them.

              * Note that, when --parameters is also present, the indices refer to the terms remaining
              AFTER pruning by --parameters.

              * You can use '--tests all' to include all terms.
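
              The interaction between --parameters and --tests indices can be illustrated with a small
              Python sketch.  The term names, the helper names, and the comma-separated token syntax are
              assumptions made for illustration; only the 1-based indexing, the '-' range delimiter, and
              the "after pruning" rule come from the text above.

```python
def expand_indices(args, delim="-"):
    """Expand tokens such as '1-3' or '5' into a list of 1-based indices."""
    indices = []
    for arg in args:
        for token in arg.split(","):
            token = token.strip()
            if not token:
                continue
            if delim in token:
                lo, hi = token.split(delim, 1)
                indices.extend(range(int(lo), int(hi) + 1))
            else:
                indices.append(int(token))
    return indices


def select(terms, indices):
    """Pick terms by 1-based index."""
    return [terms[i - 1] for i in indices]


# Hypothetical model terms, in model order.
terms = ["ADD", "AGE", "SEX", "PC1", "PC2"]
kept = select(terms, expand_indices(["1-2", "4"]))  # like --parameters 1-2, 4
tested = select(kept, expand_indices(["3"]))        # like --tests 3
print(kept)    # ['ADD', 'AGE', 'PC1']
print(tested)  # ['PC1']
```

              Index 3 given to --tests selects the third remaining term, not the third term of the
              original model.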

       --vif <max VIF>
              :  Set  VIF  threshold for --glm multicollinearity check (default 50).  (This is no longer skipped
              for case/control phenotypes.)

       --max-corr <val>
              : Skip --glm regression when the absolute value of the correlation between two predictors  exceeds
              this value (default 0.999).

       --xchr-model <m>
              : Set the chrX --glm/--condition[-list]/--[v]score model.

              * '0' = skip chrX.

              * '1' = add sex as a covar on chrX, code males 0..1.

              * '2' (default) = chrX sex covar, code males 0..2.

              (Use the --glm 'interaction' modifier to test for interaction between genotype and sex.)

       --adjust ['zs'] ['gc'] ['log10'] ['cols='<column set descriptor>] :

              For  each  association test in this run, report some basic multiple-testing corrections, sorted in
              increasing-p-value order.  Modifiers work the same way as they do on --adjust-file.

       --lambda
              : Set genomic control lambda for --adjust[-file].

       --adjust-chr-field <n...>
       --adjust-pos-field <n...>
       --adjust-id-field <n...>
       --adjust-ref-field <n...>
       --adjust-alt-field <n...>
       --adjust-a1-field <n...>
       --adjust-test-field <n...>
       --adjust-p-field <n...>
              : Set --adjust-file input field names.  When multiple arguments are given to these flags,
              earlier names take precedence over later ones.
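
              The "earlier names take precedence" rule can be sketched in Python as below.  The helper
              pick_field and the header names are hypothetical; the sketch only shows that the order of
              the candidate names, not the order of the columns, decides which field is used.

```python
def pick_field(header, candidates):
    """Return the 0-based position of the first candidate found in the header, or None."""
    for name in candidates:  # earlier candidates win
        if name in header:
            return header.index(name)
    return None


# Hypothetical header containing two columns that could both hold p-values.
header = ["ID", "UNADJ", "P"]
print(pick_field(header, ["P", "UNADJ"]))  # 2
print(pick_field(header, ["UNADJ", "P"]))  # 1
```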

       --ci <size>
              : Report confidence intervals for odds ratios/betas.

       --pfilter <val>
              : Filter out assoc. test results with higher p-values.

       --score-col-nums <...> : Process all the specified coefficient columns in the
              --score file, identified by 1-based indexes and/or ranges of them.

       --q-score-range <range file> <data file> [i] [j] ['header'] ['min'] :

              Apply --score to subset(s) of variants in the primary score list(s) based on e.g. p-value
              ranges.

              * The first file should have range labels in the first column, p-value lower bounds in the
              second column, and upper bounds in the third column.  Lines with too few entries, or
              nonnumeric values in the second or third column, are ignored.

              * The second file should contain a variant ID and a p-value on each line (except possibly
              the first).  Variant IDs are read from column #i and p-values are read from column #j, where
              i defaults to 1 and j defaults to i+1.  The 'header' modifier causes the first nonempty line
              of this file to be skipped.

              * By default, --q-score-range errors out when a variant ID appears multiple times in the
              data file (and is also present in the main dataset).  To use the minimum p-value in this
              case instead, add the 'min' modifier.
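
              The two input files can be modelled with the Python sketch below.  It is an illustration
              under stated assumptions, not PLINK's implementation: the function names are hypothetical,
              range bounds are treated as inclusive at both ends, and the check against the main dataset
              is omitted, so any duplicate ID triggers the error unless use_min is set.

```python
def read_ranges(lines):
    """Parse 'label lower upper' lines, ignoring malformed ones."""
    ranges = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # too few entries
        try:
            lo, hi = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # nonnumeric bound
        ranges.append((fields[0], lo, hi))
    return ranges


def read_pvalues(lines, i=1, j=None, header=False, use_min=False):
    """Read variant IDs from column i and p-values from column j (1-based)."""
    if j is None:
        j = i + 1
    rows = [ln for ln in lines if ln.strip()]
    if header:
        rows = rows[1:]  # skip the first nonempty line
    pvals = {}
    for ln in rows:
        fields = ln.split()
        vid, p = fields[i - 1], float(fields[j - 1])
        if vid in pvals:
            if not use_min:
                raise ValueError(f"duplicate variant ID: {vid}")
            p = min(p, pvals[vid])
        pvals[vid] = p
    return pvals


def variants_in_range(pvals, lo, hi):
    """Assumption: both bounds are inclusive."""
    return sorted(v for v, p in pvals.items() if lo <= p <= hi)


ranges = read_ranges(["S1 0 0.01", "bad x y", "short 0", "S2 0 0.5"])
pvals = read_pvalues(["SNP P", "rs1 0.001", "rs2 0.2", "rs1 0.03"],
                     header=True, use_min=True)
for label, lo, hi in ranges:
    print(label, variants_in_range(pvals, lo, hi))
```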

       --vscore-col-nums <...> : Process all the specified coefficient columns in
              the --variant-score file, identified by 1-based indexes and/or ranges of them.

       --parallel <k> <n> : Divide the output matrix into n pieces, and only compute
              the  kth  piece.   The  primary  output file will have the piece number included in its name, e.g.
              plink2.king.13 or plink2.king.13.zst if k is 13.  Concatenating these files in  order  will  yield
              the  full  matrix of interest.  (Yes, this can be done before decompression.)  N.B. This generally
              cannot be used to directly write a symmetric square matrix.   Choose  square0  or  triangle  shape
              instead, and postprocess as necessary.
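
              Concatenating the pieces "in order" means numeric order of the piece number, which differs
              from the lexicographic order a shell glob would give (e.g. '.10' sorts before '.2').  A
              minimal Python sketch, assuming pieces are numbered 1 through n and that the hypothetical
              helper concatenate_pieces is given the common file prefix:

```python
import shutil


def concatenate_pieces(prefix, n, out_path):
    """Append <prefix>.1 ... <prefix>.n to out_path, byte for byte, in numeric order."""
    with open(out_path, "wb") as out:
        for k in range(1, n + 1):
            with open(f"{prefix}.{k}", "rb") as piece:
                shutil.copyfileobj(piece, out)
```

              For example, concatenate_pieces("plink2.king", 20, "plink2.king") would join twenty pieces
              of an uncompressed run; since the bytes are copied unchanged, the same approach applies to
              pieces that are still compressed.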

       --memory <val> ['require'] : Set size, in MiB, of initial workspace malloc
              attempt.   To  error  out instead of reducing the request size when the initial attempt fails, add
              the 'require' modifier.

       --threads <val>
              : Set maximum number of compute threads.

       --d <char>
              : Change variant/covariate range delimiter (normally '-').

       --seed <val...>
              : Set random number seed(s).  Each value must be an integer between 0  and  4294967295  inclusive.
              Note that --threads and "--memory require" may also be needed to reproduce some randomized runs.

       --output-min-p <p> : Specify minimum p-value to write to reports.
              (2.23e-308 is useful for preventing underflow in some programs.)

       --debug
              : Use slower, more crash-resistant logging method.

       --randmem
              : Randomize initial workspace memory (helps catch uninitialized-memory bugs).

       --warning-errcode
              : Return a nonzero error code to the OS when a run completes with warning(s).

       --zst-level <lvl>
              : Set the Zstd compression level (1-22, default 3).

       Primary  methods  paper:  Chang  CC,  Chow  CC,  Tellier  LCAM,  Vattikuti  S,  Purcell SM, Lee JJ (2015)
       Second-generation PLINK: rising to the challenge of larger and richer datasets.  GigaScience, 4.

SEE ALSO