Provided by: genometools_1.6.5+ds-2.2_amd64 

NAME
gt-extractfeat - Extract features given in a GFF3 file from a sequence file.
SYNOPSIS
gt extractfeat [option ...] [GFF3_file]
DESCRIPTION
-type [string]
set type of features to extract (default: undefined)
-join [yes|no]
join feature sequences in the same subgraph into a single one (default: no)
-translate [yes|no]
translate the features (of a DNA sequence) into protein (default: no)
-seqid [yes|no]
add sequence ID of extracted features to FASTA descriptions (default: no)
-target [yes|no]
add target ID(s) of extracted features to FASTA descriptions (default: no)
-coords [yes|no]
add location of extracted features to FASTA descriptions (default: no)
-retainids [yes|no]
use ID attributes of extracted features as FASTA descriptions (default: no)
-gcode [value]
specify genetic code to use (default: 1)
-seqfile [filename]
set the sequence file from which to take the sequences (default: undefined)
-encseq [filename]
set the encoded sequence indexname from which to take the sequences (default: undefined)
-seqfiles
set the sequence files from which to extract the features; use -- to terminate the list of sequence
files
-matchdesc [yes|no]
search the sequence descriptions from the input files for the desired sequence IDs (in GFF3),
reporting the first match (default: no)
-matchdescstart [yes|no]
exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3)
from the beginning to the first whitespace (default: no)
-usedesc [yes|no]
use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries. If a
description contains a sequence range (e.g., III:1000001..2000000), the first part is used as
sequence ID (III) and the first range position as offset (1000001) (default: no)
-regionmapping [string]
set file containing sequence-region to sequence file mapping (default: undefined)
-v [yes|no]
be verbose (default: no)
-width [value]
set output width for FASTA sequence printing (0 disables formatting) (default: 0)
-o [filename]
redirect output to specified file (default: undefined)
-gzip [yes|no]
write gzip compressed output file (default: no)
-bzip2 [yes|no]
write bzip2 compressed output file (default: no)
-force [yes|no]
force writing to output file (default: no)
-help
display help and exit
-version
display version information and exit
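A typical invocation combines -type with one of the sequence source options. The following is a minimal sketch, not taken from the GenomeTools documentation: the file names, the toy sequence and the feature coordinates are made up for illustration, and the gt call runs only if the binary is found on PATH. It uses only options listed above.

```shell
#!/bin/sh
# Hypothetical toy inputs; names and coordinates are illustrative only.
workdir=$(mktemp -d)

# One FASTA entry whose description starts with the sequence ID "chr1".
printf '>chr1\nATGGCCTTTAAACCCGGGTGA\n' > "$workdir/seq.fa"

# Minimal GFF3 file (tab-separated): one gene, one mRNA, two exons.
{
  printf '##gff-version 3\n'
  printf '##sequence-region chr1 1 21\n'
  printf 'chr1\t.\tgene\t1\t21\t.\t+\t.\tID=gene1\n'
  printf 'chr1\t.\tmRNA\t1\t21\t.\t+\t.\tID=mrna1;Parent=gene1\n'
  printf 'chr1\t.\texon\t1\t9\t.\t+\t.\tParent=mrna1\n'
  printf 'chr1\t.\texon\t13\t21\t.\t+\t.\tParent=mrna1\n'
} > "$workdir/annotation.gff3"

if command -v gt >/dev/null 2>&1; then
  # Extract every exon, matching GFF3 sequence IDs against the start of
  # the FASTA descriptions, and add ID and location to the FASTA headers.
  gt extractfeat -type exon -matchdescstart yes -seqid yes -coords yes \
     -seqfile "$workdir/seq.fa" "$workdir/annotation.gff3"
else
  echo "gt not found on PATH; skipping extraction" >&2
fi
```

Adding -join yes to the same command would be expected to concatenate the exons that belong to the same subgraph into a single sequence, as described for -join above.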
Genetic code numbers for option -gcode:
1: Standard
2: Vertebrate Mitochondrial
3: Yeast Mitochondrial
4: Mold Mitochondrial; Protozoan Mitochondrial; Coelenterate Mitochondrial; Mycoplasma; Spiroplasma
5: Invertebrate Mitochondrial
6: Ciliate Nuclear; Dasycladacean Nuclear; Hexamita Nuclear
9: Echinoderm Mitochondrial; Flatworm Mitochondrial
10: Euplotid Nuclear
11: Bacterial, Archaeal and Plant Plastid
12: Alternative Yeast Nuclear
13: Ascidian Mitochondrial
14: Alternative Flatworm Mitochondrial
15: Blepharisma Macronuclear
16: Chlorophycean Mitochondrial
21: Trematode Mitochondrial
22: Scenedesmus obliquus Mitochondrial
23: Thraustochytrium Mitochondrial
24: Pterobranchia Mitochondrial
25: Candidate Division SR1 and Gracilibacteria
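The genetic code number only matters when -translate is enabled. As a hedged sketch, the commands below translate a CDS feature with code 11 (Bacterial, Archaeal and Plant Plastid). The contig name, the sequence and the annotation are invented for this example, and the gt call runs only if the binary is available.

```shell
#!/bin/sh
# Hypothetical toy inputs; names and coordinates are illustrative only.
workdir=$(mktemp -d)

# An 18-base open reading frame: ATG AAA GCG TTT CTG TAA.
printf '>contig1\nATGAAAGCGTTTCTGTAA\n' > "$workdir/genome.fa"

# Minimal GFF3 file (tab-separated) with a single CDS in phase 0.
{
  printf '##gff-version 3\n'
  printf '##sequence-region contig1 1 18\n'
  printf 'contig1\t.\tgene\t1\t18\t.\t+\t.\tID=gene1\n'
  printf 'contig1\t.\tmRNA\t1\t18\t.\t+\t.\tID=mrna1;Parent=gene1\n'
  printf 'contig1\t.\tCDS\t1\t18\t.\t+\t0\tID=cds1;Parent=mrna1\n'
} > "$workdir/annotation.gff3"

if command -v gt >/dev/null 2>&1; then
  # Join the CDS parts of each subgraph and translate them with
  # genetic code 11 instead of the default code 1.
  gt extractfeat -type CDS -join yes -translate yes -gcode 11 \
     -matchdescstart yes -seqfile "$workdir/genome.fa" \
     "$workdir/annotation.gff3"
else
  echo "gt not found on PATH; skipping translation" >&2
fi
```

Leaving out -gcode keeps the default of 1 (Standard).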
File format for option -regionmapping:
The file supplied to option -regionmapping defines a “mapping”. A mapping maps the sequence-region
entries given in the GFF3_file to a sequence file containing the corresponding sequence. Mappings can be
defined in one of the following two forms:
mapping = {
  chr1 = "hs_ref_chr1.fa.gz",
  chr2 = "hs_ref_chr2.fa.gz"
}
or
function mapping(sequence_region)
  return "hs_ref_"..sequence_region..".fa.gz"
end
The first form defines a Lua (http://www.lua.org) table named “mapping” which maps each sequence region
to the corresponding sequence file. The second one defines a Lua function “mapping”, which has to return
the sequence file name when it is called with the sequence_region as argument.
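As an illustration of the function form, the sketch below writes a mapping file and passes it to -regionmapping. All file names are hypothetical, the mapping returns an absolute path to an uncompressed toy FASTA file rather than the .fa.gz names used above, and the gt call runs only if the binary is found on PATH.

```shell
#!/bin/sh
# Hypothetical toy inputs; names and coordinates are illustrative only.
workdir=$(mktemp -d)

# Sequence file for the sequence region "chr1".
printf '>chr1\nACGTACGTACGT\n' > "$workdir/hs_ref_chr1.fa"

# Minimal GFF3 file (tab-separated) with one gene on chr1.
{
  printf '##gff-version 3\n'
  printf '##sequence-region chr1 1 12\n'
  printf 'chr1\t.\tgene\t1\t12\t.\t+\t.\tID=gene1\n'
} > "$workdir/annotation.gff3"

# Lua mapping file in the function form. The here-document is unquoted,
# so the shell expands $workdir before the file is written.
cat > "$workdir/mapping.lua" <<EOF
function mapping(sequence_region)
  return "$workdir/hs_ref_" .. sequence_region .. ".fa"
end
EOF

if command -v gt >/dev/null 2>&1; then
  # Let gt look up the sequence file for each sequence region.
  gt extractfeat -type gene -regionmapping "$workdir/mapping.lua" \
     "$workdir/annotation.gff3"
else
  echo "gt not found on PATH; skipping extraction" >&2
fi
```

The function form avoids listing every region by hand when the file names follow a regular pattern; the table form is the better fit when they do not.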
REPORTING BUGS
Report bugs to https://github.com/genometools/genometools/issues.
GenomeTools 1.6.5 04/27/2024 GT-EXTRACTFEAT(1)