Provided by: libfec-dev_1.0-26-gc5d935f-1_amd64

NAME

       create_viterbi27,  set_viterbi27_polynomial,  init_viterbi27,  update_viterbi27_blk, chainback_viterbi27,
       delete_viterbi27,  create_viterbi29,  set_viterbi29_polynomial,   init_viterbi29,   update_viterbi29_blk,
       chainback_viterbi29,   delete_viterbi29,   create_viterbi39,   set_viterbi39_polynomial,  init_viterbi39,
       update_viterbi39_blk,         chainback_viterbi39,          delete_viterbi39,          create_viterbi615,
       set_viterbi615_polynomial,       init_viterbi615,       update_viterbi615_blk,      chainback_viterbi615,
       delete_viterbi615 - IA32 SIMD-assisted Viterbi decoders

SYNOPSIS

       #include "fec.h"
       void *create_viterbi27(int blocklen);
       void set_viterbi27_polynomial(int polys[2]);
       int init_viterbi27(void *vp,int starting_state);
       int update_viterbi27_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi27(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi27(void *vp);

       void *create_viterbi29(int blocklen);
       void set_viterbi29_polynomial(int polys[2]);
       int init_viterbi29(void *vp,int starting_state);
       int update_viterbi29_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi29(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi29(void *vp);

       void *create_viterbi39(int blocklen);
       void set_viterbi39_polynomial(int polys[3]);
       int init_viterbi39(void *vp,int starting_state);
       int update_viterbi39_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi39(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi39(void *vp);

       void *create_viterbi615(int blocklen);
       void set_viterbi615_polynomial(int polys[6]);
       int init_viterbi615(void *vp,int starting_state);
       int update_viterbi615_blk(void *vp,unsigned char syms[],int nbits);
       int chainback_viterbi615(void *vp, unsigned char *data,unsigned int nbits,unsigned int endstate);
       void delete_viterbi615(void *vp);

DESCRIPTION

       These functions implement high performance Viterbi decoders for four  convolutional  codes:  a  rate  1/2
       constraint  length  7  (k=7)  code  ("viterbi27"), a rate 1/2 k=9 code ("viterbi29"), a rate 1/3 k=9 code
       ("viterbi39") and a rate 1/6 k=15 code ("viterbi615").  The decoders use the Intel IA32 or  PowerPC  SIMD
       instruction sets, if available, to improve decoding speed.

       On the IA32 there are three different SIMD instruction sets. The first and most common is MMX, introduced
       on later Intel Pentiums and then on the Intel Pentium II and most Intel clones (AMD K6, Transmeta Crusoe,
       etc).   SSE was introduced on the Pentium III and later implemented in the AMD Athlon 4 (AMD calls it "3D
       Now!  Professional"). Most recently, SSE2 was introduced in the Intel Pentium 4, and has been adopted  by
       more recent AMD CPUs. The presence of SSE2 implies the existence of SSE, which in turn implies MMX.

       Altivec  is the PowerPC SIMD instruction set. It is roughly comparable to SSE2. Altivec was introduced to
       the general public in the Apple Macintosh G4; it is also  present  in  the  G5.  Altivec  is  actually  a
       Motorola trademark; Apple calls it "Velocity Engine" and IBM calls it "VMX". All refer to the same thing.

       When  built  for  the  IA32  or PPC architectures, the functions automatically use the most powerful SIMD
       instruction set available. If no SIMD instructions are available, or if the library is built for  a  non-
       IA32, non-PPC machine, a portable C version is executed instead.

USAGE

       Four  versions  of  each  function  are provided, one for each code.  In the following discussion, change
       "viterbi" to "viterbi27", "viterbi29", "viterbi39" or "viterbi615" as desired.

       Before Viterbi decoding can begin, an  instance  must  first  be  created  with  create_viterbi().   This
       function  creates  and returns a pointer to an internal control structure containing the path metrics and
       the branch decisions. create_viterbi() takes one argument that gives the length  of  the  data  block  in
       bits. You must not attempt to decode a block longer than the length given to create_viterbi().

       Before  decoding  a  new frame, init_viterbi() must be called to reset the decoder state.  It accepts the
       instance pointer returned by create_viterbi() and the initial starting state of the convolutional encoder
       (usually 0). If the initial starting state is unknown or incorrect, the decoder will still  function  but
       the decoded data may be incorrect at the start of the block.

       Blocks  of  received  symbols  are  processed  with  calls  to update_viterbi_blk().  The nbits parameter
       specifies the number of data bits (not channel symbols) represented by the syms  buffer.  (For  rate  1/2
       codes, the number of symbols in syms is twice nbits, and so on.)  Each symbol is expected to range from 0
       through  255,  with  0 corresponding to a "strong 0" and 255 corresponding to a "strong 1". The caller is
       responsible for determining the proper pairing  of  input  symbols  (commonly  known  as  decoder  symbol
       phasing).
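
       As a sketch of the symbol format described above, the hypothetical helper below (not part of libfec)
       quantises a soft demodulator output to the 0 through 255 range. It assumes the demodulator delivers a
       value normalised so that -1.0 is a confident 0 and +1.0 is a confident 1; a real receiver may well
       scale or offset its output differently.

```c
/* Hypothetical helper, not part of fec.h: quantise one soft decision.
 * Assumes x is normalised so that -1.0 means "strong 0" and +1.0 "strong 1". */
unsigned char soft_to_symbol(double x)
{
    if (x != x)                 /* NaN: treat as "no information" */
        return 128;
    if (x <= -1.0)
        return 0;               /* strong 0 */
    if (x >= 1.0)
        return 255;             /* strong 1 */
    /* Map (-1, 1) linearly onto 0..255, rounding to nearest. */
    return (unsigned char)(127.5 + 127.5 * x + 0.5);
}
```

       Values near 128 then carry little confidence either way, which is what lets the soft-decision decoder
       weigh weak symbols less than strong ones.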

       At  the end of the block, the data is recovered with a call to chainback_viterbi(). The arguments are the
       pointer to the decoder instance, a pointer to a user-supplied buffer into which the decoded data is to be
       written, the number of data bits (not bytes) that are to be  decoded,  and  the  terminal  state  of  the
       convolutional encoder at the end of the frame (usually 0). If the terminal state is incorrect or unknown,
       the  decoded  data  bits  at  the end of the frame may be unreliable. The decoded data is written in big-
       endian order, i.e., the first bit in the frame is written into the high order bit of the  first  byte  in
       the  buffer. If the frame is not an integral number of bytes long, the low order bits of the last byte in
       the frame will be unused.

       Note that the decoders assume the use of a tail, i.e., the encoding  and  transmission  of  a  sufficient
       number  of padding bits beyond the end of the user data to force the convolutional encoder into the known
       terminal state given to chainback_viterbi(). The tail is always one bit less than the  constraint  length
       of  the  code,  so  the k=7 code uses 6 tail bits (12 tail symbols), the rate 1/2 k=9 code uses 8 tail
       bits (16 tail symbols), the rate 1/3 k=9 code uses 8 tail bits (24 tail symbols) and the k=15 code uses
       14 tail bits (84 tail symbols).

       The tail bits are not included in the length arguments to create_viterbi() and  chainback_viterbi().  For
       example,  if  the  block  contains  1000  user  bits,  then  this  would be the length parameter given to
       create_viterbi27() and chainback_viterbi27(), and update_viterbi27_blk() would be called with a total  of
       2012 symbols - the last 12 encoded symbols representing the tail bits.
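
       The arithmetic behind that figure can be written as a one-line helper. The function name is
       hypothetical (it is not part of libfec); k is the constraint length and n is the number of symbols
       per bit for a rate 1/n code.

```c
/* Hypothetical helper, not part of fec.h: number of channel symbols that
 * make up one tailed frame of user_bits data bits.
 * k = constraint length, n = symbols per bit (rate 1/n). */
unsigned long frame_symbols(unsigned long user_bits, unsigned int k, unsigned int n)
{
    unsigned long tail_bits = k - 1;    /* tail is one less than k */
    return (user_bits + tail_bits) * n;
}
```

       frame_symbols(1000, 7, 2) gives the 2012 symbols of the example above; the corresponding nbits
       argument to update_viterbi27_blk() would be 1006 if the whole frame is passed in a single call.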

       After the call to chainback_viterbi(), the decoder may be reset with a call to init_viterbi() and another
       block  can  be  decoded.  Alternatively, delete_viterbi() can be called to free all resources used by the
       Viterbi decoder.
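
       The call sequence described in this section can be sketched as follows. To keep the sketch
       independent of which of the four codes is in use, the entry points are passed in through a small
       table of function pointers; struct viterbi_ops and decode_frame() are hypothetical names and not
       part of fec.h. The NULL check on the create call is a defensive assumption, since this page does not
       say what create_viterbi() returns on failure.

```c
#include <stddef.h>

/* Hypothetical table of per-code entry points; not part of fec.h. */
struct viterbi_ops {
    void *(*create)(int blocklen);
    int (*init)(void *vp, int starting_state);
    int (*update_blk)(void *vp, unsigned char syms[], int nbits);
    int (*chainback)(void *vp, unsigned char *data,
                     unsigned int nbits, unsigned int endstate);
    void (*destroy)(void *vp);
    int k;                      /* constraint length, e.g. 7 for viterbi27 */
};

/* Decode one tailed frame of nbits user bits.
 * syms must hold the symbols for nbits + (k - 1) bits;
 * data must have room for (nbits + 7) / 8 bytes.
 * Returns 0 on success, -1 on error. */
int decode_frame(const struct viterbi_ops *ops, unsigned char *syms,
                 int nbits, unsigned char *data)
{
    void *vp;
    int rc = -1;

    if (ops == NULL || ops->create == NULL || ops->init == NULL ||
        ops->update_blk == NULL || ops->chainback == NULL ||
        ops->destroy == NULL || syms == NULL || data == NULL ||
        nbits <= 0 || ops->k < 2)
        return -1;

    vp = ops->create(nbits);            /* length excludes the tail */
    if (vp == NULL)                     /* assumed failure indication */
        return -1;

    if (ops->init(vp, 0) == 0 &&                               /* encoder started in state 0 */
        ops->update_blk(vp, syms, nbits + ops->k - 1) == 0 &&  /* user bits plus tail bits */
        ops->chainback(vp, data, (unsigned int)nbits, 0) == 0) /* encoder ended in state 0 */
        rc = 0;

    ops->destroy(vp);
    return rc;
}
```

       With libfec linked in, the table for the k=7 code would be filled with create_viterbi27,
       init_viterbi27, update_viterbi27_blk, chainback_viterbi27 and delete_viterbi27, with k set to 7. A
       program decoding many frames of the same length would keep the instance and call init_viterbi()
       between frames rather than creating and deleting it each time.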

       The set_viterbi_polynomial() function allows the use of code generator polynomials other than the defaults.
       Although  only  one  set  of  polynomials  is  generally  used  with  each  code,  there  are   different
       conventions as to their order and symbol polarity, and these functions simplify their use.

       The default polynomials for the viterbi27 routines are those of the NASA-JPL convention  without  symbol
       inversion.   The  NASA-JPL  convention normally inverts the first symbol.  The CCSDS/NASA-GSFC convention
       swaps the two symbols and inverts the second.

       To set the NASA-JPL convention with symbol inversion:

       int polys[2] = { -V27POLYA,V27POLYB };
       set_viterbi27_polynomial(polys);

       and to set the CCSDS convention with symbol inversion:

       int polys[2] = { V27POLYB,-V27POLYA };
       set_viterbi27_polynomial(polys);
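
       To make the sign convention concrete, here is a minimal sketch of a matching rate 1/2 k=7 encoder,
       such as might be used to generate test symbols. encode27() and parity() are hypothetical helpers,
       not part of fec.h. The sketch assumes that a negative entry in polys means "invert this symbol",
       as the examples above suggest, and it emits noiseless symbols only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Parity (XOR of all bits) of x. */
static int parity(unsigned int x)
{
    int p = 0;
    while (x) {
        p ^= 1;
        x &= x - 1;             /* clear the lowest set bit */
    }
    return p;
}

/* Hypothetical rate 1/2 k=7 test encoder, not part of fec.h.
 * Encodes nbits bits from data (first bit in the high-order bit of data[0])
 * followed by 6 zero tail bits. A negative polynomial inverts its symbol.
 * Writes noiseless symbols (0 or 255) to syms and returns how many. */
size_t encode27(const unsigned char *data, unsigned int nbits,
                const int polys[2], unsigned char *syms)
{
    unsigned int sr = 0;        /* encoder shift register, starts in state 0 */
    size_t n = 0;

    for (unsigned int i = 0; i < nbits + 6; i++) {
        int bit = (i < nbits) ? (data[i >> 3] >> (7 - (i & 7))) & 1 : 0;
        sr = (sr << 1) | (unsigned int)bit;
        for (int j = 0; j < 2; j++) {
            int out = parity(sr & (unsigned int)abs(polys[j]));
            if (polys[j] < 0)
                out = !out;     /* symbol inversion */
            syms[n++] = out ? 255 : 0;
        }
    }
    return n;
}
```

       Passing the same polys array to encode27() and to set_viterbi27_polynomial() keeps the encoder and
       decoder on the same convention; syms must have room for 2 * (nbits + 6) symbols.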

       The default polynomials for the viterbi615 routines are those used  by  the  Cassini  spacecraft  without
       symbol  inversion.  Mars  Pathfinder  (MPF)  and  STEREO  swap  the  third  and fourth polynomials.  Both
       conventions invert the first, third and fifth  symbols.  Refer  to  fec.h  for  the  polynomial  constant
       definitions.

       To set the Cassini convention with symbol inversion, do the following:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYC,V615POLYD,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       and to set the MPF/STEREO convention with symbol inversion:

       int polys[6] = { -V615POLYA,V615POLYB,-V615POLYD,V615POLYC,-V615POLYE,V615POLYF };
       set_viterbi615_polynomial(polys);

       For  performance  reasons, calling this function changes the code generator polynomials for all instances
       of the corresponding Viterbi decoder, including those already created.

ERROR PERFORMANCE

       These decoders have all been extensively tested and found to provide  performance  consistent  with  that
       expected for soft-decision Viterbi decoding with 8-bit symbols.

       Due  to  internal  differences,  the  implementations vary slightly in error performance. In general, the
       portable C versions exhibit the best error performance because they use full-sized  branch  metrics,  and
       the  MMX  versions  exhibit  the worst because they use 8-bit branch metrics with modulo comparisons. The
       SSE, SSE2 and Altivec implementations of the r=1/2 k=7 and r=1/2 k=9  codes  use  unsigned  8-bit  branch
       metrics,  and  are  almost as good as the C versions.  The r=1/3 k=9 and r=1/6 k=15 codes are implemented
       with 16-bit path metrics in all SIMD versions.

DIRECT ACCESS TO SPECIFIC FUNCTION VERSIONS

       Calling the functions listed above automatically calls the appropriate version of the function  depending
       on  the  CPU  type  and  available SIMD instructions. A particular version can also be called directly by
       appending the appropriate suffix to the  function  name.  The  available  suffixes  are  "_mmx",  "_sse",
       "_sse2",  "_av"  and  "_port",  for  the MMX, SSE, SSE2, Altivec and portable versions, respectively. For
       example,   the   SSE2   version   of   the   update_viterbi27_blk()   function   can   be   invoked    as
       update_viterbi27_blk_sse2().

       Naturally,  the _av functions are only available on the PowerPC and the _mmx, _sse and _sse2 versions are
       only available on IA32. Calling a SIMD-enabled function on a CPU that doesn't  support  the  appropriate
       set of instructions will result in an illegal instruction exception.
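
       A caller that wants to use a suffixed version directly can guard the call with a CPU feature check.
       The sketch below is one way to do that for SSE2; cpu_has_sse2() is a hypothetical name, and the
       check relies on the __builtin_cpu_supports() builtin offered by GCC and Clang on x86 targets, not
       on anything in libfec.

```c
/* Hypothetical guard, not part of fec.h: returns 1 if the running CPU
 * reports SSE2 support, 0 otherwise. Uses a GCC/Clang builtin that
 * exists only on x86 targets. */
int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;   /* unknown compiler or non-x86 target: assume no SSE2 */
#endif
}
```

       A program would then call the _sse2 functions only when cpu_has_sse2() returns 1 and fall back to
       the _port functions otherwise. This page does not say whether an instance created by one version
       may be passed to another, so it seems safest to use the same suffix for every call on a given
       instance.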

RETURN VALUES

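       create_viterbi() returns a pointer to the structure containing the decoder state.  The init_viterbi(),
       update_viterbi_blk() and chainback_viterbi() functions return -1 on error, 0 otherwise.  The
       set_viterbi_polynomial() and delete_viterbi() functions return no value.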

AUTHOR & COPYRIGHT

       Phil Karn, KA9Q (karn@ka9q.net)

LICENSE

       This software may be used under the terms of the GNU Lesser General Public License (LGPL).

                                                                                                 SIMD-VITERBI(3)