Provided by: ncbi-entrez-direct_19.2.20230331+dfsg-3_amd64 
      
    
NAME
       xtract - NCBI Entrez Direct XML conversion and transformation tool
SYNOPSIS
       xtract  [-help]  [-strict]  [-mixed]  [-self]  [-accent]  [-ascii] [-compress] [-stops] [-input filename]
       [-transform filename]  [-aliases filename]  [-pattern expr]  [-group expr]  [-block expr]  [-subset expr]
       [-path path] [-if expr [constraint]] [-unless expr [constraint]] [-and condition] [-or condition] [-else]
       [-position pos]   [-equals str]   [-contains str]   [-includes str]  [-is-within str]  [-starts-with str]
       [-ends-with str]   [-is-not str]   [-is-before str]   [-is-after str]   [-matches str]   [-resembles str]
       [-is-equal-to expr]  [-differs-from expr]  [-gt N]  [-ge N]  [-lt N]  [-le N]  [-eq N] [-ne N] [-ret str]
       [-tab str] [-sep str] [-pfx str] [-sfx str] [-rst] [-clr]  [-pfc str]  [-deq str]  [-def str]  [-lbl str]
       [-set tag]  [-rec tag]  [-wrp tag]  [-enc tag]  [-plg str]  [-elg str]  [-pkg tag]  [-fwd str] [-awd str]
       [-tag tag] [-att key value] [-cls] [-slf] [-end tag] [-element element] [-first element]  [-last element]
       [-backward element]   [-NAME]   [--STATS]  [-num element]  [-len element]  [-sum element]  [-acc element]
       [-min element] [-max element] [-inc element] [-dec element] [-sub element] [-avg element]  [-dev element]
       [-med element]  [-mul element] [-div element] [-mod element] [-bin element] [-oct element] [-hex element]
       [-bit element]  [-pad element]  [-encode element]  [-upper element]   [-lower element]   [-chain element]
       [-title element]  [-mirror element]  [-alnum element] [-basic element] [-plain element] [-simple element]
       [-author element] [-prose element] [-terms element]  [-words element]  [-pairs element]  [-order element]
       [-reverse element] [-letters element] [-clauses element] [-year element] [-month element] [-date element]
       [-page element]   [-auth element]   [-initials element]  [-jour element]  [-trim element]  [-wct element]
       [-doi element]   [-translate element]   [-classify element]   [-replace   -reg target   -exp replacement]
       [-revcomp]  [-nucleic]  [-fasta]  [-ncbi2na]  [-ncbi4na]  [-molwt]  [-0-based element] [-1-based element]
       [-ucsc-based element]     [-insd arg ...]     [-histogram]     [-e2index [extras]]     [-indices element]
       [-article element]  [-abstract element]  [-paragraph element]  [-stemmed element] [-head str] [-tail str]
       [-hd str]  [-tl str]   [-select condition]   [-in filename]   [-sort[-fwd] element]   [-sort-rev element]
       [-format fmt   [-unicode style]]  [-verify]  [-outline]  [-synopsis]  [-contour [delimiter]]  [-examples]
       [-unix] [-version]
DESCRIPTION
       xtract converts an XML document into a table of data values according to user-specified rules.
OPTIONS
   Processing Flags
       -strict
              Remove HTML and MathML tags.
       -mixed Allow mixed content XML.
       -self  Allow detection of empty self-closing tags.
       -accent
              Delete Unicode accents and diacritical marks.
       -ascii Convert Unicode to numeric HTML character entities.
       -compress
              Compress runs of spaces.
       -stops Retain stop words in selected phrases.
   Data Source
       -input filename
              Read XML from file instead of standard input.
       -transform filename
              File of substitutions for -translate.
       -aliases filename
              Mappings file for -classify operation.
   Exploration Argument Hierarchy
       -pattern expr
       -group expr
       -block expr
       -subset expr
              Name of record within set.  Use of different argument names allows command-line control of  nested
              looping.
   Path Navigation
       -path path
              Explore by list of adjacent object names.
   Exploration Constructs
       Object         DateRevised
       Parent/Child   Book/AuthorList
       Path           MedlineCitation/Article/Journal/JournalIssue/PubDate
       Heterogeneous  "PubmedArticleSet/*"
       Exhaustive     "History/**"
       Nested         "*/Taxon"
   Conditional Execution
       -if expr [constraint]
              Element (or @attribute) must exist and satisfy any specified constraint.
       -unless expr [constraint]
              Skip if element matches.
       -and condition
              Preceding and following tests must both pass.
       -or condition
              Any passing test suffices.
       -else  Execute if conditional test failed.
       -position pos
              first/last/outer/inner/even/odd/all.
   String Constraints
       -equals str
              String must match exactly.
       -contains str
              Substring must be present.
       -includes str
              Substring must match at word boundaries.
       -is-within str
              String must be present.
       -starts-with str
              Substring must be at beginning.
       -ends-with str
              Substring must be at end.
       -is-not str
              String must not match.
       -is-before str
              First string < second string.
       -is-after str
              First string > second string.
       -matches str
              Matches without commas or semicolons.
       -resembles str
              Requires all words, but in any order.
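       As a rough illustration of how several of these constraints behave, the Python sketch below mimics
       them with case-insensitive comparisons (see NOTES). The function names are hypothetical and this is
       not xtract's implementation; in particular, treating "word boundaries" as the regex \b anchor is an
       assumption.

```python
import re

def equals(value: str, query: str) -> bool:
    """-equals: the whole string must match."""
    return value.casefold() == query.casefold()

def is_not(value: str, query: str) -> bool:
    """-is-not: the whole string must not match."""
    return not equals(value, query)

def contains(value: str, query: str) -> bool:
    """-contains: query may appear anywhere, even inside a word."""
    return query.casefold() in value.casefold()

def includes(value: str, query: str) -> bool:
    """-includes: query must sit at word boundaries (assumed: regex \\b)."""
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, value, flags=re.IGNORECASE) is not None

def starts_with(value: str, query: str) -> bool:
    """-starts-with: query must be at the beginning."""
    return value.casefold().startswith(query.casefold())

def ends_with(value: str, query: str) -> bool:
    """-ends-with: query must be at the end."""
    return value.casefold().endswith(query.casefold())
```

       Under these assumptions, "lipase" satisfies contains() for "phospholipase A2" but not includes(),
       because it does not start at a word boundary.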
   Object Constraints
       -is-equal-to expr
              Object values must match.
       -differs-from expr
              Object values must differ.
   Numeric Constraints
       -gt N  Greater than.
       -ge N  Greater than or equal to.
       -lt N  Less than.
       -le N  Less than or equal to.
       -eq N  Equal to.
       -ne N  Not equal to.
   Format Customization
       -ret str
              Override line break between patterns.
       -tab str
              Replace tab character between fields.
       -sep str
              Separator between group members.
       -pfx str
              Prefix to print before group.
       -sfx str
              Suffix to print after group.
       -rst   Reset -sep through -elg.
       -clr   Clear queued tab separator.
       -pfc str
              Preface combines -clr and -pfx.
       -deq str
              Delete and replace queued tab separator.
       -def str
              Default placeholder for missing fields.
       -lbl str
              Insert arbitrary text.
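       The interplay of -tab, -sep, -pfx, -sfx, -def, and -ret can be pictured with the Python sketch below.
       It is only a model under stated assumptions: the function name and its arguments are hypothetical,
       the tab and newline defaults are assumed, and whether xtract applies -pfx and -sfx to a -def
       placeholder is not stated in this page.

```python
def format_row(fields, tab="\t", sep="\t", pfx="", sfx="",
               default=None, ret="\n"):
    """Assemble one output line from a list of fields.

    Each field is a list of values (the members of one group).
    Members are joined with `sep`, fields with `tab`, and the line
    ends with `ret`. A field with no values is skipped unless a
    `default` placeholder is supplied.
    """
    cells = []
    for values in fields:
        if not values:
            if default is None:
                continue          # assumed: nothing printed for a missing field
            values = [default]
        cells.append(pfx + sep.join(values) + sfx)
    return tab.join(cells) + ret


# One record: a single identifier, two authors, and a missing field.
row = format_row([["12345"], ["Smith J", "Jones K"], []],
                 sep="; ", default="-")
print(row, end="")
```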
   XML Generation
       -set tag
              XML tag for entire set.
       -rec tag
              XML tag for each record.
       -wrp tag
              Wrap elements in XML object.
       -enc tag
              Encase instance in XML object.
       -plg str
              Prologue to print before instance.
       -elg str
              Epilogue to print after instance.
       -pkg tag
              Package subset in XML object.
       -fwd str
              Foreword to print before subset.
       -awd str
              Afterword to print after subset.
   Tag and Attribute Construction
       -tag tag
              Start with <tag.
       -att key value
              Attribute key and value.
       -cls   Close with >.
       -slf   Self-close with />.
       -end tag
              End contents with </tag>.
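       The pieces above are meant to be chained to build one XML element. A minimal Python sketch of how
       the fragments fit together follows; build_tag is a hypothetical helper, and the exact spacing and
       quoting that xtract emits are assumptions.

```python
def build_tag(tag, attributes=(), content=None):
    """Compose an XML element from the same pieces the options name."""
    parts = ["<" + tag]                        # -tag tag    : start with <tag
    for key, value in attributes:              # -att k v    : attribute key and value
        parts.append(f' {key}="{value}"')
    if content is None:
        parts.append("/>")                     # -slf        : self-close with />
    else:
        parts.append(">")                      # -cls        : close with >
        parts.append(content)                  # element values would be printed here
        parts.append(f"</{tag}>")              # -end tag    : end contents with </tag>
    return "".join(parts)


print(build_tag("Item", [("id", "1")], "text"))
print(build_tag("Item", [("id", "1")]))
```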
   Element Selection
       -element element
              Print all items that match tag name.
       -first element
              Only print value of first item.
       -last element
              Only print value of last item.
       -backward element
              Print values in reverse order.
       -NAME  Record value in named variable.
       --STATS
              Accumulate values into variable.
   -element Constructs
       Tag            Caption
       Group          Initials,LastName
       Parent/Child   MedlineCitation/PMID
       Recursive      "**/Gene-commentary_accession"
       Unrestricted   PubDate/*
       Attribute      DescriptorName@MajorTopicYN
       Range          MedlineDate[1:4]
       Substring      "Title[phospholipase | rattlesnake]"
       Object Count   "#Author"
       Item Length    "%Title"
       Element Depth  "^PMID"
       Variable       "&NAME"
   Special -element Operations
       Parent Index   "+"
       Object Name    "?"
       Object Value   "~"
       XML Subtree    "*"
       Children       "$"
       Attributes     "@"
       ASN.1 Record   "."
       JSON Record    "%"
   Numeric Processing
       -num element
              Count.
       -len element
              Length.
       -sum element
              Sum.
       -acc element
              Accumulator.
       -min element
              Minimum.
       -max element
              Maximum.
       -inc element
              Increment.
       -dec element
              Decrement.
       -sub element
              Difference.
       -avg element
              Average.
       -dev element
              Deviation.
       -med element
              Median.
       -mul element
              Product.
       -div element
              Quotient.
       -mod element
              Remainder.
       -bin element
              Binary.
       -oct element
              Octal.
       -hex element
              Hexadecimal.
       -bit element
              Bit count.
       -pad element
              Zero-pad to eight digits.
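       For orientation, the Python sketch below reproduces a handful of these operations on integer values
       (see NOTES). The function names are hypothetical, and details such as the letter case of hexadecimal
       output are assumptions rather than documented xtract behavior.

```python
def numeric_summary(values):
    """Aggregate operations over all matching element values."""
    nums = [int(v) for v in values]
    return {
        "num": len(nums),   # -num : count
        "sum": sum(nums),   # -sum : sum
        "min": min(nums),   # -min : minimum
        "max": max(nums),   # -max : maximum
    }


def radix_forms(n: int):
    """Per-value conversions of a single non-negative integer."""
    return {
        "bin": format(n, "b"),       # -bin : binary
        "oct": format(n, "o"),       # -oct : octal
        "hex": format(n, "x"),       # -hex : hexadecimal (case assumed)
        "bit": bin(n).count("1"),    # -bit : number of set bits
        "pad": f"{n:08d}",           # -pad : zero-pad to eight digits
    }


print(numeric_summary(["3", "10", "7"]))
print(radix_forms(10))
```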
   Character Processing
       -encode element
              XML-encode <, >, &, ", and ' characters.
       -upper element
              Convert text to uppercase.
       -lower element
              Convert text to lowercase.
       -chain element
              Change spaces to underscores.
       -title element
              Capitalize initial letters of words.
       -mirror element
              Reverse order of letters.
       -alnum element
              Non-alphanumeric characters to space.
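       A short Python sketch of what some of these character-level operations amount to is given below.
       The helper names are hypothetical; the use of the five predefined XML entities for -encode and the
       ASCII-only definition of "alphanumeric" for -alnum are assumptions.

```python
import re

# The five predefined XML entities (assumed to be what -encode emits).
XML_ENTITIES = {"&": "&amp;", "<": "&lt;", ">": "&gt;",
                '"': "&quot;", "'": "&apos;"}

def encode(text: str) -> str:
    """-encode: XML-encode <, >, &, ", and ' characters."""
    return "".join(XML_ENTITIES.get(ch, ch) for ch in text)

def chain(text: str) -> str:
    """-chain: change spaces to underscores."""
    return text.replace(" ", "_")

def mirror(text: str) -> str:
    """-mirror: reverse the order of letters."""
    return text[::-1]

def alnum(text: str) -> str:
    """-alnum: turn each non-alphanumeric character into a space."""
    return re.sub(r"[^0-9A-Za-z]", " ", text)
```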
   String Processing
       -basic element
              Convert superscripts and subscripts.
       -plain element
              Remove embedded mixed-content markup tags.
       -simple element
              Normalize accented letters; spell Greek letters.
       -author element
              Multi-step author cleanup.
       -prose element
              Text conversion to ASCII.
   Text Processing
       -terms element
              Partition text at spaces.
       -words element
              Split at punctuation marks.
       -pairs element
              Adjacent informative words.
       -order element
              Rearrange words in sorted order.
       -reverse element
              Reverse words in string.
       -letters element
              Separate individual letters.
       -clauses element
              Break at phrase separators.
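       The difference between -terms and -words, and the effect of -order and -reverse, can be sketched in
       Python as follows. The function names are hypothetical; lower-casing in words() follows NOTES, while
       the exact set of characters treated as punctuation is an assumption.

```python
import re

def terms(text: str):
    """-terms: partition text at spaces only; punctuation stays attached."""
    return text.split()

def words(text: str):
    """-words: split at punctuation (and spaces), converted to lower case."""
    return [w for w in re.split(r"[^0-9a-z]+", text.lower()) if w]

def order(text: str) -> str:
    """-order: rearrange words in sorted order."""
    return " ".join(sorted(text.split()))

def reverse(text: str) -> str:
    """-reverse: reverse the order of words in the string."""
    return " ".join(reversed(text.split()))


sample = "Beta-2 Microglobulin, human"
print(terms(sample))
print(words(sample))
```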
   Citation Functions
       -year element
              Extract first 4-digit year from string.
       -month element
              Match first month name and return a corresponding integer.
       -date element
              YYYY/MM/DD from -unit "PubDate" -date "*"
       -page element
              Get digits (and letters) of first page number.
       -auth element
              Change GenBank authors to Medline form.
       -initials element
              Parse initials from forename or given name.
       -jour element
              Clean up journal name punctuation.
       -trim element
              Remove extra spaces and leading zeros.
       -wct element
              Count number of -words in a string.
       -doi element
              Add https://doi.org/ prefix, URL encode.
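       Three of these functions are simple enough to approximate in a few lines of Python. The sketch below
       is illustrative only: the helper names are hypothetical, matching month names by their first three
       letters is an assumption, and the set of characters left unescaped by -doi is not stated in this
       page.

```python
import re
from urllib.parse import quote

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
_MONTH_RE = re.compile(r"\b(" + "|".join(MONTHS) + r")", re.IGNORECASE)

def year(text: str):
    """-year: first standalone run of four digits, or None."""
    m = re.search(r"(?<!\d)\d{4}(?!\d)", text)
    return m.group(0) if m else None

def month(text: str):
    """-month: first month name, returned as an integer 1-12, or None."""
    m = _MONTH_RE.search(text)
    return MONTHS.index(m.group(1).lower()) + 1 if m else None

def doi(text: str) -> str:
    """-doi: add the https://doi.org/ prefix and URL-encode the identifier."""
    return "https://doi.org/" + quote(text.strip(), safe="/")


print(year("1998 Dec-1999 Jan"), month("1998 Dec-1999 Jan"))
print(doi("10.1000/xyz<123>"))
```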
   Value Transformation
       -translate element
              Substitute values with -transform table.
       -classify element
              Substring word or phrase matches to -aliases table.
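       Conceptually, -translate is a table lookup driven by the -transform file. The Python sketch below
       assumes a two-column, tab-delimited table and assumes that values without an entry pass through
       unchanged; neither detail is confirmed by this page, and the function names are hypothetical.

```python
def load_table(lines):
    """Build a lookup table from tab-delimited 'key<TAB>value' lines."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue                      # ignore malformed or blank lines
        key, value = line.split("\t", 1)
        table[key] = value
    return table


def translate(value, table):
    """Substitute a value using the table (assumed: unknown values kept)."""
    return table.get(value, value)


table = load_table(["9606\tHomo sapiens\n", "10090\tMus musculus\n"])
print(translate("9606", table))
```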
   Regular Expression
       -replace
              Substitute text using regular expressions.
              -reg target        Target expression.
              -exp replacement   Replacement pattern.
   Sequence Processing
       -revcomp
              Reverse complement nucleotide sequence.
       -nucleic
              Subrange determines forward or revcomp.
       -fasta Split sequence into blocks of 70 uppercase letters.
       -ncbi2na
              Expand ncbi2na to IUPAC.  (May need to truncate result to actual sequence length.)
       -ncbi4na
              Expand ncbi4na to IUPAC.  (May need to truncate result to actual sequence length.)
       -molwt Calculate molecular weight of peptide.
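       The Python sketch below illustrates three of these operations, including why an expanded ncbi2na
       sequence may need truncating: the encoding packs four bases per byte, so the final byte can carry
       padding. The function names are hypothetical, the reverse complement ignores IUPAC ambiguity codes,
       and how xtract receives the packed data is not specified here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def revcomp(seq: str) -> str:
    """-revcomp: complement each base, then reverse (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def fasta_blocks(seq: str, width: int = 70):
    """-fasta: split into blocks of 70 uppercase letters."""
    seq = seq.upper()
    return [seq[i:i + width] for i in range(0, len(seq), width)]

def expand_ncbi2na(data: bytes) -> str:
    """Expand 2-bit packed bases (A=0, C=1, G=2, T=3), high bits first."""
    bases = "ACGT"
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(bases[(byte >> shift) & 3])
    return "".join(out)


# A 5-base sequence occupies 2 bytes, so expansion yields 8 letters;
# the caller truncates to the real length.
packed = bytes([0b00011011, 0b00000000])
print(expand_ncbi2na(packed)[:5])
```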
   Sequence Coordinates
       -0-based element
              Zero-based.
       -1-based element
              One-based.
       -ucsc-based element
              Half-open.
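       The three conventions describe the same stretch of sequence with different numbers. The Python
       sketch below shows the arithmetic, starting from a 1-based closed interval; the function name is
       hypothetical, and the sketch does not say which convention xtract assumes for any particular input
       element.

```python
def from_one_based(start: int, stop: int):
    """Express a 1-based closed interval in each coordinate convention."""
    return {
        "0-based": (start - 1, stop - 1),   # both ends shifted down by one
        "1-based": (start, stop),           # unchanged
        "ucsc-based": (start - 1, stop),    # half-open: 0-based start, exclusive end
    }


# The first ten bases of a sequence.
print(from_one_based(1, 10))
```

       In the half-open form the interval length is simply stop minus start.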
   Command Generator
       -insd arg ...
              Generate  INSDSeq  extraction  commands.  Print them if invoked standalone; run them if invoked as
              part of a pipeline.  Requires one or more arguments, which may appear in the following order:
              Descriptor(s)  INSDSeq_sequence/INSDSeq_definition/INSDSeq_division/... [...]
              Completeness   complete/partial
              Feature(s)     CDS/mRNA/...[,...]
              Qualifier(s)   INSDFeature_key/"#INSDInterval"/gene/product/feat_location/sub_sequence/... [...]
   Frequency Table
       -histogram
              Collects data for sort-uniq-count(1) on entire set of records.
   Entrez Indexing
       -e2index [extras]
              Create Entrez index XML.  extras (true or false; false by default) indicates whether to index
              extra fields.
       -indices element
              Index normalized words.
       -article element
              Title positional index.
       -abstract element
              Abstract positional index.
       -paragraph element
              Index text paragraphs.
       -stemmed element
              Apply Porter2 algorithm.
   Output Organization
       -head str
              Print before everything else.
       -tail str
              Print after everything else.
       -hd str
              Print before each record.
       -tl str
              Print after each record.
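       The nesting of these four strings around the output can be modeled as in the Python sketch below.
       The render function is hypothetical, and it concatenates the pieces without adding line breaks;
       any newlines xtract itself inserts are not modeled.

```python
def render(records, head="", tail="", hd="", tl=""):
    """Wrap the whole output in head/tail and each record in hd/tl."""
    parts = [head]                       # -head : before everything else
    for record in records:
        parts.append(hd + record + tl)   # -hd / -tl : around each record
    parts.append(tail)                   # -tail : after everything else
    return "".join(parts)


print(render(["a", "b"], head="<Set>", tail="</Set>",
             hd="<Rec>", tl="</Rec>"))
```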
   Record Selection
       -select condition
              Select record subset by conditions.
       -in filename
              File of identifiers to use for selection.
   Record Rearrangement
       -sort[-fwd] element
              Element to use as sort key.
       -sort-rev element
              Sort records in reverse order.
   Reformatting
       -format fmt
              copy     Fast block copy (still applies processing flags).
              compact  Compress runs of spaces.
              flush    Suppress line indentation.
              indent   Indent according to nesting depth.
              expand   Place each attribute on a separate line.
   Validation
       -verify
              Report XML data integrity problems.
   Summary
       -outline
              Display outline of XML structure.
       -synopsis
              Display individual XML paths.
       -contour [delimiter]
              Display XML paths to leaf nodes (delimited by / by default).
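       To give a feel for what a contour-style listing contains, the Python sketch below walks an XML
       document and collects the path to every leaf node. It uses the standard xml.etree.ElementTree
       module rather than xtract, the contour function is hypothetical, and reporting each distinct path
       only once is an assumption about the output.

```python
import xml.etree.ElementTree as ET

def contour(xml_text: str, delimiter: str = "/"):
    """Return distinct paths from the root to each leaf element."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + [node.tag]
        children = list(node)
        if not children:                      # leaf: no child elements
            joined = delimiter.join(path)
            if joined not in paths:
                paths.append(joined)
        for child in children:
            walk(child, path)

    walk(root, [])
    return paths


doc = "<Set><Rec><Id>1</Id><Name>a</Name></Rec><Rec><Id>2</Id></Rec></Set>"
for line in contour(doc):
    print(line)
```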
   Full Exploration Command Precedence
       -pattern
       -path
       -division
       -group
       -branch
       -block
       -section
       -subset
       -unit
   Documentation
       -help  Print usage information and some example argument combinations.
       -examples
              Complete usage examples, involving additional Entrez Direct tools.
       -unix  Illustrate common Unix command arguments.
       -version
              Print version number.
NOTES
       String constraints use case-insensitive comparisons.
       Numeric constraints and selection arguments use integer values.
       -num and -len selections are synonyms for Object Count (#) and Item Length (%).
       -words, -pairs, and -indices convert to lower case.
SEE ALSO
       archive-pmc(1),  archive-pubmed(1),  custom-index(1), disambiguate-nucleotides(1), download-ncbi-data(1),
       ds2pme(1), esample(1), fetch-pmc(1), fetch-pubmed(1), find-in-gene(1),  fuse-segments(1),  gene2range(1),
       hgvs2spdi(1),   index-extras(1),   index-pubmed(1),   pma2pme(1),   rchive(1),  snp2hgvs(1),  snp2tbl(1),
       sort-uniq-count(1),  spdi2tbl(1),  tbl2prod(1),  transmute(1),  uniq-table(1),  xml2fsa(1),   xml2tbl(1),
       xy-plot(1).
NCBI                                               2023-03-31                                          XTRACT(1)