Provided by: tcllib_1.21+dfsg-1_all

NAME

       doctools::idx::parse - Parsing text in docidx format

SYNOPSIS

       package require doctools::idx::parse  ?0.1?

       package require Tcl  8.4

       package require doctools::idx::structure

       package require doctools::msgcat

       package require doctools::tcl::parse

       package require fileutil

       package require logger

       package require snit

       package require struct::list

       package require struct::stack

       ::doctools::idx::parse text text

       ::doctools::idx::parse file path

       ::doctools::idx::parse includes

       ::doctools::idx::parse include add path

       ::doctools::idx::parse include remove path

       ::doctools::idx::parse include clear

       ::doctools::idx::parse vars

       ::doctools::idx::parse var set name value

       ::doctools::idx::parse var unset name

       ::doctools::idx::parse var clear ?pattern?

________________________________________________________________________________________________________________

DESCRIPTION

       This  package  provides  commands to parse text written in the docidx markup language and convert it into
       the canonical serialization of the keyword index encoded in the text.   See  the  section  Keyword  index
       serialization format for the specification of that format.

       This is an internal package of doctools, for use by the higher level packages handling docidx documents.

API

       ::doctools::idx::parse text text
              The command takes the string contained in text and parses it under the assumption that it contains
              a  document  written  using  the  docidx markup language. An error is thrown if this assumption is
              found to be false. The format of these errors is described in section Parse errors.

              When successful the command returns the canonical serialization of the  keyword  index  which  was
              encoded in the text.  See the section Keyword index serialization format for the specification  of
              that format.

       ::doctools::idx::parse file path
              The same as text, except that the text to parse is read from the file specified by path.

       ::doctools::idx::parse includes
              This method returns the current list of search paths used when looking for include files.

       ::doctools::idx::parse include add path
              This method adds the path to the list of paths searched when looking for an include file. The call
              is ignored if the path is already in the list of paths. The method returns the empty string as its
              result.

       ::doctools::idx::parse include remove path
              This method removes the path from the list of paths searched when looking for an include file. The
              call  is  ignored  if the path is not contained in the list of paths. The method returns the empty
              string as its result.

       ::doctools::idx::parse include clear
              This method clears the list of search paths for include files.

       ::doctools::idx::parse vars
              This method returns a dictionary containing the current set of predefined variables known  to  the
              vset markup command during processing.

       ::doctools::idx::parse var set name value
              This  method  adds  the  variable name to the set of predefined variables known to the vset markup
              command during processing, and gives it the specified value. The method returns the  empty  string
              as its result.

       ::doctools::idx::parse var unset name
              This  method  removes  the  variable  name  from the set of predefined variables known to the vset
              markup command during processing. The method returns the empty string as its result.

       ::doctools::idx::parse var clear ?pattern?
              This method removes all variables matching the pattern from the set of predefined variables  known
              to the vset markup command during processing. The method returns the empty string as its result.

              Pattern matching is done with string match. When no pattern is specified, the default * is used,
              which matches every variable.
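
       The commands above can be combined as in the following minimal sketch. It assumes Tcl 8.5 or later (dict
       is used to inspect the result) and an installed tcllib. The search directory, the variable project, and
       the contents of the docidx document are made up for illustration.

```tcl
package require Tcl 8.5   ;# assumption: dict is used below to inspect the result
package require doctools::idx::parse

# Start from a known state. The directory and the variable are hypothetical.
set dir [file normalize [pwd]]
::doctools::idx::parse include clear
::doctools::idx::parse include add $dir
::doctools::idx::parse var clear
::doctools::idx::parse var set project demo

set paths [::doctools::idx::parse includes]
set vars  [::doctools::idx::parse vars]

# A small docidx document; label, title, keyword and references are made up.
set text {
[index_begin demo {Demo Keyword Index}]
[key parsing]
[manpage doctools_idx_parse doctools::idx::parse]
[url http://core.tcl.tk/tcllib Tcllib]
[index_end]
}

if {[catch {::doctools::idx::parse text $text} serial]} {
    # On failure the result is the human-readable message.
    puts stderr "parse failed: $serial"
    set serial {}
} else {
    set idx [dict get $serial doctools::idx]
    puts "title: [dict get $idx title]"
    dict for {kw ids} [dict get $idx keywords] {
        puts "keyword: $kw -> $ids"
    }
}

# Undo the configuration made above.
::doctools::idx::parse var unset project
::doctools::idx::parse include remove $dir
```

       The file method could be used in the same way, passing a path instead of the text itself.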

PARSE ERRORS

       The format of the parse error messages thrown when encountering violations of the docidx markup syntax is
       human readable and not intended for processing by machines. As such it is not documented.

       However, the errorCode attached to the message is machine-readable and has the following format:

       [1]    The error code will be a list, each element describing a single error found in the input. The list
              has at least one element, possibly more.

       [2]    Each error element will be a list containing six  strings  describing  an  error  in  detail.  The
              strings will be

              [1]    The path of the file the error occurred in. This may be empty.

              [2]    The  range of the token the error was found at. This range is a two-element list containing
                     the offset of the first and last character in the range, counted from the beginning of  the
                     input (file). Offsets are counted from zero.

              [3]    The line the first character after the error is on.  Lines are counted from one.

              [4]    The column the first character after the error is at.  Columns are counted from zero.

              [5]    The message code of the error. This value can be used as argument to msgcat::mc to obtain a
                     localized   error   message,   assuming  that  the  application  had  a  suitable  call  of
                     doctools::msgcat::init  to  initialize  the  necessary  message   catalogs   (See   package
                     doctools::msgcat).

              [6]    A  list  of details for the error, like the markup command involved. In the case of message
                     code docidx/include/syntax this value is the set of errors  found  in  the  included  file,
                     using the format described here.
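
       The structure described above can be walked as in the following sketch. The procedure name
       describe_parse_errors and the sample data are hypothetical. Only the six-element layout and the message
       code docidx/include/syntax are taken from the description above. Tcl 8.5 or later is assumed for lassign.

```tcl
# Turn an error code of the documented shape into a list of readable lines.
# Nested errors of included files are indented below their parent.
proc describe_parse_errors {errors {indent ""}} {
    set lines {}
    foreach err $errors {
        # The six documented elements of a single error.
        lassign $err path range line column code details
        lassign $range first last
        if {$path eq ""} { set path "(input)" }
        lappend lines "${indent}${path}:${line}:${column}: ${code} (chars ${first}-${last})"
        if {$code eq "docidx/include/syntax"} {
            # For this code the details are themselves a list of errors.
            foreach sub [describe_parse_errors $details "$indent    "] {
                lappend lines $sub
            }
        }
    }
    return $lines
}

# Hand-written sample; "example/code" is a placeholder, not a real message code.
set sample [list \
    [list {} {10 14} 2 0 docidx/include/syntax \
        [list [list inc.idx {0 3} 1 4 example/code {key}]]]]

set report [describe_parse_errors $sample]
puts [join $report \n]
```

       In an application the argument would typically be the value of ::errorCode after a failed call to
       ::doctools::idx::parse text or ::doctools::idx::parse file that was trapped with catch.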

[DOCIDX] NOTATION OF KEYWORD INDICES

       The docidx format for keyword indices, also called the docidx markup language, is too large to be covered
       in a single section.  The interested reader should start with the document

       [1]    docidx language introduction

       and then proceed from there to the formal specifications, i.e. the documents

       [1]    docidx language syntax and

       [2]    docidx language command reference.

       to get a thorough understanding of the language.

KEYWORD INDEX SERIALIZATION FORMAT

       Here  we  specify  the  format used by the doctools v2 packages to serialize keyword indices as immutable
       values for transport, comparison, etc.

       We distinguish between regular and canonical serializations. While a keyword index may have more than one
       regular serialization, exactly one of them is canonical.

       regular serialization

              [1]    An index serialization is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, doctools::idx, and its  value.  This  value  holds  the
                     contents of the index.

              [3]    The contents of the index are a Tcl dictionary holding the title of the index, a label, and
                     the keywords and references. The relevant keys and their values are

                     title  The value is a string containing the title of the index.

                     label  The value is a string containing a label for the index.

                     keywords
                            The  value  is  a Tcl dictionary, using the keywords known to the index as keys. The
                            associated values are lists containing the identifiers of the references  associated
                            with that particular keyword.

                            Any reference identifier used in these lists has to exist as a key in the references
                            dictionary, see the next item for its definition.

                     references
                            The value is a Tcl dictionary, using the identifiers for the references known to the
                            index  as  keys.  The  associated values are 2-element lists containing the type and
                            label of the reference, in this order.

                            Any key here has to be associated with at least one keyword, i.e. occur in at  least
                            one  of  the  reference  lists  which are the values in the keywords dictionary, see
                            previous item for its definition.

              [4]    The type of a reference can be one of two values,

                     manpage
                            The identifier of the reference is interpreted as a symbolic file name, referring to
                            one of the documents the index was made for.

                     url    The identifier of the reference is interpreted as a URL, referring to some  external
                            location, like a website, etc.
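
              The rules for a regular serialization can be checked as in the following sketch. The procedure
              name idx_check and the sample value are hypothetical and not part of the package. Tcl 8.5 or later
              is assumed for dict and the ni operator.

```tcl
# Return a list of problems found in a serialization; empty when it is regular.
proc idx_check {serial} {
    if {[dict size $serial] != 1 || ![dict exists $serial doctools::idx]} {
        return [list "outer dictionary must hold the single key doctools::idx"]
    }
    set idx [dict get $serial doctools::idx]
    set problems {}
    foreach key {title label keywords references} {
        if {![dict exists $idx $key]} {
            lappend problems "missing key '$key'"
        }
    }
    if {[llength $problems]} { return $problems }

    set refs [dict get $idx references]
    set used {}
    # Every identifier listed for a keyword has to be a key of references.
    dict for {kw ids} [dict get $idx keywords] {
        foreach id $ids {
            dict set used $id 1
            if {![dict exists $refs $id]} {
                lappend problems "keyword '$kw' uses unknown reference '$id'"
            }
        }
    }
    # Every reference is a {type label} pair and is used by some keyword.
    dict for {id spec} $refs {
        if {[llength $spec] != 2} {
            lappend problems "reference '$id' is not a {type label} pair"
        } elseif {[lindex $spec 0] ni {manpage url}} {
            lappend problems "reference '$id' has unknown type '[lindex $spec 0]'"
        }
        if {![dict exists $used $id]} {
            lappend problems "reference '$id' is not used by any keyword"
        }
    }
    return $problems
}

# Made-up sample value following the layout described above.
set good {doctools::idx {title T label L keywords {k {a}} references {a {manpage A}}}}
puts "problems: [idx_check $good]"
```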

       canonical serialization
              The canonical serialization of a keyword index has the format specified in the previous item and
              additionally satisfies the constraints below, which make it unique among all the possible
              serializations of the keyword index.

              [1]    The keys found in all the nested Tcl dictionaries are sorted in ascending dictionary order,
                     as generated by Tcl's builtin command lsort -increasing -dict.

              [2]    The references listed for each keyword of the  index,  if  any,  are  listed  in  ascending
                     dictionary  order  of their labels, as generated by Tcl's builtin command lsort -increasing
                     -dict.
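
              The two constraints can be applied to a regular serialization as in the following sketch. The
              procedure name idx_canonical and the sample value are hypothetical. This illustrates the rules as
              stated here and is not the code used by the doctools packages. Tcl 8.5 or later is assumed.

```tcl
# Bring a regular serialization into the canonical order described above.
proc idx_canonical {serial} {
    set idx  [dict get $serial doctools::idx]
    set refs [dict get $idx references]

    # Constraint 1: keys of the references dictionary in ascending order.
    set sortedRefs {}
    foreach id [lsort -increasing -dict [dict keys $refs]] {
        dict set sortedRefs $id [dict get $refs $id]
    }

    # Constraint 1 for the keywords, constraint 2 for their reference lists.
    set sortedKw {}
    foreach kw [lsort -increasing -dict [dict keys [dict get $idx keywords]]] {
        # Pair each identifier with its label, then sort by the label.
        set pairs {}
        foreach id [dict get $idx keywords $kw] {
            lappend pairs [list [lindex [dict get $refs $id] 1] $id]
        }
        set ordered {}
        foreach pair [lsort -increasing -dict -index 0 $pairs] {
            lappend ordered [lindex $pair 1]
        }
        dict set sortedKw $kw $ordered
    }

    # The four top-level keys are emitted in ascending dictionary order.
    return [list doctools::idx [list \
        keywords   $sortedKw \
        label      [dict get $idx label] \
        references $sortedRefs \
        title      [dict get $idx title]]]
}

# Made-up sample: keys out of order, reference list not sorted by label.
set regular {doctools::idx {
    title T label L
    keywords   {zeta {a b} alpha {a}}
    references {b {url Beta} a {manpage Zulu}}
}}
set canon [idx_canonical $regular]
puts $canon
```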

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes, will undoubtedly contain bugs and  other  problems.   Please
       report  such  in  the  category  doctools  of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].
       Please also report any ideas for enhancements you may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

       Note further that attachments are strongly preferred over inlined patches. Attachments  can  be  made  by
       going  to the Edit form of the ticket immediately after its creation, and then using the left-most button
       in the secondary navigation bar.

KEYWORDS

       docidx, doctools, lexer, parser

CATEGORY

       Documentation tools

COPYRIGHT

       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                                                  1                             doctools::idx::parse(3tcl)