Provided by: apertium_3.9.4-1build1_amd64

NAME

       apertium-deshtml - HTML format processor for Apertium

SYNOPSIS

       apertium-deshtml [-hino] [input_file [output_file]]

DESCRIPTION

       This tool is part of the Apertium open-source machine translation toolbox: https://apertium.org/.

       apertium-deshtml is an HTML format processor.  Data should be passed through this processor before being
       piped to lt-proc(1).  The program takes input in the form of an HTML document and produces output
       suitable for processing with lt-proc(1).  HTML tags and other format information are enclosed in square
       brackets so that lt-proc(1) treats them as whitespace between words.
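
       The bracketing described above can be inspected by running the deformatter on its own, before any
       lt-proc(1) stage.  The following is a minimal sketch: the sample HTML string and the variable names are
       illustrative assumptions, and the step is skipped if apertium-deshtml is not on the PATH.

```shell
# Illustrative input; any small HTML fragment would do.
html='<p>Hello <b>world</b></p>'

if command -v apertium-deshtml >/dev/null 2>&1; then
    # Capture what lt-proc would receive, then print it for inspection.
    deformatted=$(printf '%s\n' "$html" | apertium-deshtml)
    printf '%s\n' "$deformatted"
else
    # Tool not installed: leave the result empty and say so.
    deformatted=''
    echo 'apertium-deshtml not found; skipping' >&2
fi
```

       In the printed result the words should appear unchanged, with the markup wrapped in square brackets.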

OPTIONS

       -h, --help
               Display this help.

       -i      Makes the addition of a trailing sentence terminator (‘.’) unconditional, which often produces
               duplicate terminators.

       -n      Suppresses the addition of a trailing sentence terminator.

       -o      Inserts a "❡" (U+2761 CURVED STEM PARAGRAPH SIGN ORNAMENT) at the end of <h1> to <h6> and <title>
               elements.
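
       To see how the terminator options differ, the same input can be run with no flag, with -n and with -i,
       and the three results compared by eye.  This is a sketch only: the sample input and the variable names
       are assumptions, and the loop is skipped if apertium-deshtml is not on the PATH.

```shell
# Illustrative input that already ends with a full stop.
input='<p>A sentence.</p>'
runs=0

if command -v apertium-deshtml >/dev/null 2>&1; then
    for flag in '' -n -i; do
        # Label each result with the flag in use ("default" when none).
        printf '%-8s' "${flag:-default}"
        # $flag is left unquoted on purpose so the empty case passes no argument.
        printf '%s\n' "$input" | apertium-deshtml $flag
        printf '\n'
        runs=$((runs + 1))
    done
else
    echo 'apertium-deshtml not found; skipping' >&2
fi
```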

EXAMPLES

       You could write the following to show how the word “gener” is analysed:
             echo "<b>gener</b>" | apertium-deshtml | lt-proc ca-es.automorf.bin
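
       The SYNOPSIS also allows an input file and an output file to be named instead of using a pipe.  A minimal
       sketch of that form follows; the file names page.html and page.deformatted and the temporary directory
       are hypothetical, and the call is skipped if apertium-deshtml is not on the PATH.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical input document.
printf '%s\n' '<html><head><title>Test</title></head><body><p>gener</p></body></html>' \
    > "$workdir/page.html"

if command -v apertium-deshtml >/dev/null 2>&1; then
    # First argument is the input file, second the output file.
    apertium-deshtml "$workdir/page.html" "$workdir/page.deformatted"
    cat "$workdir/page.deformatted"
else
    echo 'apertium-deshtml not found; skipping' >&2
fi
```

       The directory in "$workdir" can be removed once the output has been inspected.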

SEE ALSO

       apertium(1), apertium-desrtf(1), apertium-destxt(1), lt-proc(1)

COPYRIGHT

       Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante.  This is free software.  You may
       redistribute copies of it under the terms of the GNU General Public License:
       https://www.gnu.org/licenses/gpl.html.

BUGS

       Many... lurking in the dark and waiting for you!

Apertium                                         March 21, 2006                              APERTIUM-DESHTML(1)