Provided by: pftools_3.2.12-1_amd64

NAME

       pfscan - scan a protein or DNA sequence with a profile library

SYNOPSIS

       pfscan    [  -abdfhlLmruksvxyz  ]  [  -C  cut_off  ]  [  -M  mode_nb  ]  [  -W width ] [ sequence | - ] [
                 profile_library | - ] [ parameters ]

DESCRIPTION

       pfscan compares a protein or nucleic acid sequence against a profile library.  The result is an  unsorted
       list  of profile-sequence matches written to the standard output.  A variety of output formats containing
       different information can be specified via the  options  -l, -L, -r, -k, -s, -x, -y  and  -z.   The  file
       'sequence'  contains a sequence in EMBL/SWISS-PROT format (assumed by default) or in Pearson/Fasta format
       (indicated by option -f).  The 'profile_library' file contains a library of profiles in  PROSITE  format.
       If  '-'  is  specified  instead of one of the filenames, the corresponding data is read from the standard
       input.

OPTIONS

       sequence
              Input query sequence.
              This DNA or protein sequence will be used to search for matches to a library of PROSITE profiles.
              The content of the file must be either in EMBL/SWISS-PROT (default)  or  in  Pearson/Fasta  format
              (option  -f).   If  the  filename  is  replaced by a '-', pfscan will read the input sequence from
              stdin.

       profile_library
              Library of PROSITE profiles.
              This file should contain one or several PROSITE profiles, against which the query sequence will be
              matched.  Each entry in this library should be separated from the next by a line  containing  only
              the  '//'  code.   If the filename is replaced by a '-', pfscan will read the profile library from
              stdin.
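
              The '//' entry separator described above can be illustrated with a minimal Python sketch that
              splits library text into individual entries. The function name and the two sample entries are
              made up for illustration; this is not how pfscan itself reads the library.

```python
def split_profile_library(text):
    """Split library text into entries, each terminated by a line containing only '//'."""
    entries, current = [], []
    for line in text.splitlines():
        if line.strip() == "//":
            # A separator line closes the entry collected so far.
            if current:
                entries.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        # Assumption: a trailing entry without a final '//' is still kept.
        entries.append("\n".join(current))
    return entries


# Hypothetical two-entry library; real entries contain many more lines.
sample = "ID   PROFILE_A; MATRIX.\n//\nID   PROFILE_B; MATRIX.\n//\n"
entries = split_profile_library(sample)
print(len(entries))
```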

       -a     Report optimal alignment scores for all profiles regardless of the  cut-off  value.   This  option
              simultaneously forces DISJOINT=UNIQUE.

       -b     Search the complementary strand of the DNA sequence as well.

       -f     Input sequence is in Pearson/Fasta format.

       -h     Display usage help text.

       -l     Indicate the value of the highest cut-off level exceeded by the match score in the output list.

       -L     Indicate  by  character string the highest cut-off level exceeded by the match score in the output
              list.

              Note:  The generalized profile format includes a text string field to specify a name for a cut-off
                     level. The -L option causes the program to display the first two characters  of  this  text
                     string  (usually  something  like  '!', '?', '??',  etc.)  at  the  beginning of each match
                     description.

       -m     Report individual matches for circular profiles.
              If the profile is circular, each match between a sequence and a  profile  can  be  composed  of  a
              stretch  of  individual  matches of the profile. By default, pfscan reports only the total matched
              region. When this option is set, detailed information for each individual match will be output  as
              well.

              Note:  The  scoring  system  for  most circular profiles has been optimized to find total matches,
                     therefore the normalized scores of individual matches of a circular profile to  a  sequence
                     should be considered with caution.

       -r     Use  raw  scores  rather  than normalized scores for match selection.  The normalized score is not
              printed.

       -u     Forces DISJOINT=UNIQUE.

       -C cut_off
              Cut-off level to be used for match selection.
              The value of 'cut_off' should be the numerical identifier  of  a  cut-off  level  defined  in  the
              profile.   The  raw  or  normalized  score  of  this level will then be used to include profile to
              sequence matches in the output list.
              If the specified level does not exist in the profile, the next higher (if cut_off is negative)  or
              next lower (if cut_off is positive) level defined is used instead.
              Type: integer
              Default: 0
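
              The fallback rule for an undefined level can be sketched in Python as follows. The function name
              is hypothetical, and two points are assumptions rather than documented behaviour: an undefined
              level 0 is treated like a positive value, and None is returned when no level exists in the
              required direction.

```python
def resolve_cutoff_level(requested, defined_levels):
    """Pick the cut-off level to use, following the rule described for option -C."""
    levels = sorted(defined_levels)
    if requested in levels:
        return requested
    if requested < 0:
        # Negative request: fall back to the next higher defined level.
        higher = [level for level in levels if level > requested]
        return higher[0] if higher else None
    # Positive request (and, by assumption, zero): next lower defined level.
    lower = [level for level in levels if level < requested]
    return lower[-1] if lower else None


# A profile defining levels -1, 0 and 1 (illustrative values).
print(resolve_cutoff_level(2, [-1, 0, 1]))
print(resolve_cutoff_level(-3, [-1, 0, 1]))
```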

       -M mode_nb
              Normalization mode to use for score computation.
              The  'mode_nb' specifies which normalization mode defined in the profile should be used to compute
              the normalized scores for profile to sequence matches. This option  will  override  the  profile's
              PRIORITY parameter.
              If the specified normalization mode does not exist in the profile, an error message will be output
              to standard error and the search is interrupted.
              Type: integer
              Default: lowest priority mode defined in the profile

   Output modifiers
       -d     Limit profile description length.
              If  this  option  is  set,  the  description  of the profile on the header line will be limited in
              length. If the match information is longer than the output width specified using  option  -W,  the
              profile description will not be printed. Otherwise the description will be truncated to  fit  the
              -W value.
              By  default, the profile description is not truncated. This option cannot be used when option -k
              is set.

       -k     Use xpsa(5) headers for output.
              When  this option is set, all output types (see below) will use an xpsa(5) style header line. This
              format uses keyword=value  pairs  to  output  alignment  parameters.  It  is  useful  to  transfer
              information between different sequence alignment tools.

       -s     List  the  sequences of the matched regions as well.  The output will be a Pearson/Fasta-formatted
              sequence library.
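
              Output produced with -s can be post-processed by any Pearson/Fasta reader. The Python sketch
              below is a generic reader, not part of pftools; the header text in the sample is invented and
              does not reproduce the header layout that pfscan writes.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from Pearson/Fasta text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Invented sample: two matched regions, the first wrapped over two lines.
sample = ">region_1 example\nMKTAYIAK\nQRQISF\n>region_2 example\nGSHMLE\n"
records = parse_fasta(sample)
print(records)
```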

       -v     Suppress sequence/profile parsing warnings.  If this option is set, no warning messages  will  be
              printed on stderr.  Only fatal errors will be reported. This option should be used with caution.

       -x     List  profile-sequence alignments in psa(5) format. Please refer to the corresponding man page for
              more information.

       -y     Display alignments between the profile and  the  matched  sequence  regions  in  a  human-friendly
              pairwise alignment format.

       -z     Indicate  starting  and  ending position of the matched profile range. The latter position will be
              given as a negative offset from the end of the profile. Thus the range [    1,    -1] means entire
              profile.
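
              The negative end offset can be converted into an absolute profile position once the profile
              length is known. The Python sketch below infers the arithmetic from the convention that
              [1, -1] means the entire profile, so an offset of -1 is the last position; the function name is
              hypothetical.

```python
def profile_range_to_absolute(start, end_offset, profile_length):
    """Convert a (start, negative end offset) pair into absolute 1-based positions."""
    # An offset of -1 maps to the last profile position, -2 to the one before it, and so on.
    end = profile_length + end_offset + 1
    return start, end


# For a profile of length 50, [1, -1] covers positions 1 to 50.
print(profile_range_to_absolute(1, -1, 50))
```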

       -W width
              Set alignment output width.
              The value of 'width' specifies how many residues will be output  on  one  line  when  any  of  the
              -s, -x or -y options is set.
              Type: integer
              Default: 60
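
              As an illustration of what the width value means, the following Python sketch breaks a residue
              string into lines of at most 'width' residues. It only mimics the wrapping and is not the
              formatting code of pfscan; the function name is hypothetical.

```python
def wrap_residues(sequence, width=60):
    """Split a residue string into consecutive lines of at most `width` residues."""
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


# A 130-residue dummy sequence wraps into lines of 60, 60 and 10 residues.
lines = wrap_residues("A" * 130)
print([len(line) for line in lines])
```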

PARAMETERS

       Note:  for  backwards  compatibility, release 2.3 of the pftools package will parse the version 2.2 style
              parameters, but these are deprecated and the corresponding option (refer to the  options  section)
              should be used instead.

       L=#    Cut-off level.
              Use option -C instead, not -L.

       W=#    Output width.
              Use option -W instead.

EXAMPLES

       (1)    pfscan -s GTPA_HUMAN prosite13.prf

              Scans  the human GAP protein for matches to profiles in PROSITE release 13.  The file 'GTPA_HUMAN'
              contains the  SWISS-PROT  entry  P20936|GTPA_HUMAN.   The  profile  library  file  'prosite13.prf'
              contains  all  profile  entries  of  PROSITE  release 13.  The output is a Pearson/Fasta-formatted
              sequence library containing all sequence regions of the input sequence matching a profile  in  the
              profile library.

       (2)    pfscan -by -C 2 CVPBR322 ecp.prf

              Scans  both  strands  of  plasmid PBR322 for high-scoring (level 2) E. coli promoter matches.  The
              sequence file 'CVPBR322' contains EMBL entry J01749|CVPBR322.  The profile library file  'ecp.prf'
              contains  a  profile  for E. coli promoters.  The output includes profile-sequence alignments in a
              human-friendly format.

EXIT CODE

       On successful completion of its task, pfscan will return an exit  code  of  0.  If  an  error  occurs,  a
       diagnostic  message  will  be  output  on standard error and the exit code will be different from 0. When
       conflicting options were passed to the program but the task could nevertheless  be  completed,  warnings
       will be issued on standard error.
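
       A calling script can rely on this convention by checking the exit code and keeping standard error
       separate from the match list. The Python sketch below is one way to do so; the helper name is
       hypothetical, and a stand-in command replaces pfscan so that the sketch runs without pftools installed.

```python
import subprocess
import sys


def run_tool(argv):
    """Run a command; return (stdout, stderr) or raise if the exit code is not 0."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode != 0:
        # A non-zero exit code signals an error; stderr carries the diagnostic message.
        raise RuntimeError(
            f"{argv[0]} failed with exit code {proc.returncode}: {proc.stderr.strip()}"
        )
    # Exit code 0: stdout holds the result, stderr may still hold warnings.
    return proc.stdout, proc.stderr


# Stand-in for a successful run that also emits a warning.
# With pftools installed, argv could be e.g. ["pfscan", "-f", "query.fa", "profiles.prf"]
# (both file names are placeholders).
stand_in = [sys.executable, "-c", "import sys; print('match'); sys.stderr.write('warning')"]
out, warnings = run_tool(stand_in)
print(out.strip(), "|", warnings)
```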

BUGS

       If  the  match  selection is based on normalized scores (i.e.  option -r is not set), rounding errors can
       lead to the exclusion of some matches even if the raw score is above or equal to  the  specified  cut-off
       level score.

SEE ALSO

       pfsearch(1), pfmake(1), psa(5), xpsa(5)

AUTHOR

       The pftools package was developed by Philipp Bucher.
       Any comments or suggestions should be addressed to <pftools@sib.swiss>.

pftools 2.3                                         July 2003                                          PFSCAN(1)