Ubuntu Manpage: erl_scan - The Erlang token scanner.

Provided by: erlang-manpages_25.3.2.8+dfsg-1ubuntu4.4_all

NAME

       erl_scan - The Erlang token scanner.

DESCRIPTION

       This module contains functions for tokenizing (scanning) characters into Erlang tokens.

DATA TYPES

       category() = atom()

       error_description() = term()

       error_info() =
           {erl_anno:location(), module(), error_description()}

       option() =
           return | return_white_spaces | return_comments | text |
           {reserved_word_fun, resword_fun()} |
           {text_fun, text_fun()}

       options() = option() | [option()]

       symbol() = atom() | float() | integer() | string()

       resword_fun() = fun((atom()) -> boolean())

       token() =
           {category(), Anno :: erl_anno:anno(), symbol()} |
           {category(), Anno :: erl_anno:anno()}

       tokens() = [token()]

       tokens_result() =
           {ok, Tokens :: tokens(), EndLocation :: erl_anno:location()} |
           {eof, EndLocation :: erl_anno:location()} |
           {error,
            ErrorInfo :: error_info(),
            EndLocation :: erl_anno:location()}

       text_fun() = fun((atom(), string()) -> boolean())

EXPORTS

       category(Token) -> category()

              Types:

                 Token = token()

              Returns the category of Token.

       column(Token) -> erl_anno:column() | undefined

              Types:

                 Token = token()

              Returns the column of Token's collection of annotations. If there is no column, undefined is
              returned.

       end_location(Token) -> erl_anno:location() | undefined

              Types:

                 Token = token()

              Returns  the  end  location of the text of Token's collection of annotations. If there is no text,
              undefined is returned.

       format_error(ErrorDescriptor) -> string()

              Types:

                 ErrorDescriptor = error_description()

              Takes an ErrorDescriptor and returns a string that describes the error or warning. This function
              is usually called implicitly when an ErrorInfo structure is processed (see section Error
              Information).

       line(Token) -> erl_anno:line()

              Types:

                 Token = token()

              Returns the line of Token's collection of annotations.

       location(Token) -> erl_anno:location()

              Types:

                 Token = token()

              Returns the location of Token's collection of annotations.

       reserved_word(Atom :: atom()) -> boolean()

              Returns true if Atom is an Erlang reserved word, otherwise false.
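
              As a small illustration of reserved_word/1, the sketch below splits a list of atoms into
              reserved words and ordinary atoms. The module name reserved_word_demo and the function
              split_reserved/1 are made up for this example.

```erlang
-module(reserved_word_demo).
-export([split_reserved/1]).

%% Partitions Atoms into {ReservedWords, OtherAtoms}, keeping the
%% original order inside each list.
split_reserved(Atoms) ->
    lists:partition(fun erl_scan:reserved_word/1, Atoms).
```

              For example, split_reserved(['case', foo, 'end', bar]) separates 'case' and 'end' from foo
              and bar.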

       string(String) -> Return

       string(String, StartLocation) -> Return

       string(String, StartLocation, Options) -> Return

              Types:

                 String = string()
                 Options = options()
                 Return =
                     {ok, Tokens :: tokens(), EndLocation} |
                     {error, ErrorInfo :: error_info(), ErrorLocation}
                 StartLocation = EndLocation = ErrorLocation = erl_anno:location()

              Takes  the  list  of  characters  String  and  tries  to  scan (tokenize) them. Returns one of the
              following:

                {ok, Tokens, EndLocation}:
                  Tokens are the Erlang tokens from String. EndLocation is the first  location  after  the  last
                  token.

                {error, ErrorInfo, ErrorLocation}:
                  An error occurred. ErrorLocation is the first location after the erroneous token.

              string(String) is equivalent to string(String, 1), and string(String, StartLocation) is equivalent
              to string(String, StartLocation, []).

              StartLocation  indicates  the  initial  location when scanning starts. If StartLocation is a line,
              Anno, EndLocation, and ErrorLocation are lines. If StartLocation is a pair of a line and a column,
              Anno takes the form of an opaque compound data type, and EndLocation and ErrorLocation  are  pairs
              of  a  line  and a column. The token annotations contain information about the column and the line
              where the token begins, as well as the text of the token (if option text  is  specified),  all  of
              which can be accessed by calling column/1, line/1, location/1, and text/1.
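
              The effect of the two kinds of StartLocation can be sketched as follows. The module name
              scan_location_demo and the function demo/0 are hypothetical; the sketch only uses string/2
              and the accessor functions described in this page.

```erlang
-module(scan_location_demo).
-export([demo/0]).

%% Scans the same input with a line-only and a {Line, Column} start
%% location and checks what the accessor functions report.
demo() ->
    Input = "foo(Bar).",

    %% Start location is a line: the annotations carry no column.
    {ok, [First | _], EndLine} = erl_scan:string(Input, 1),
    1 = erl_scan:line(First),
    undefined = erl_scan:column(First),
    1 = erl_scan:location(First),
    true = is_integer(EndLine),

    %% Start location is {Line, Column}: columns are tracked and the
    %% end location is a pair as well.
    {ok, [FirstLC | _], {_EndL, _EndC}} = erl_scan:string(Input, {1, 1}),
    1 = erl_scan:line(FirstLC),
    1 = erl_scan:column(FirstLC),
    {1, 1} = erl_scan:location(FirstLC),

    %% Option text was not given, so no token text is stored.
    undefined = erl_scan:text(FirstLC),
    ok.
```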

              A token is a tuple containing information about the syntactic category, the token annotations,
              and the terminal symbol. For punctuation characters (such as ; and |) and reserved words, the
              category and the symbol coincide, and the token is represented by a two-tuple. Three-tuples
              have one of the following forms:

                * {atom, Anno, atom()}

                * {char, Anno, char()}

                * {comment, Anno, string()}

                * {float, Anno, float()}

                * {integer, Anno, integer()}

                * {string, Anno, string()}

                * {var, Anno, atom()}

                * {white_space, Anno, string()}
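
              To make the two token shapes concrete, the following sketch scans a short input and reads each
              token through category/1 and symbol/1 rather than matching on the tuples directly. The module
              name scan_token_forms_demo and the function demo/0 are invented for the example.

```erlang
-module(scan_token_forms_demo).
-export([demo/0]).

%% Returns ok if the scanned tokens have the expected categories and
%% symbols. For two-tuples the category and the symbol coincide.
demo() ->
    {ok, Tokens, _End} = erl_scan:string("X = foo; 42."),
    Pairs = [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens],
    [{var, 'X'}, {'=', '='}, {atom, foo},
     {';', ';'}, {integer, 42}, {dot, dot}] = Pairs,

    %% Punctuation and the dot are two-tuples, the rest are three-tuples.
    ['=', ';', dot] = [erl_scan:category(T) || T <- Tokens, tuple_size(T) =:= 2],
    [var, atom, integer] = [erl_scan:category(T) || T <- Tokens, tuple_size(T) =:= 3],
    ok.
```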

              Valid options:

                {reserved_word_fun, resword_fun()}:
                  A callback function that is called when the  scanner  has  found  an  unquoted  atom.  If  the
                  function  returns  true,  the  unquoted  atom itself becomes the category of the token. If the
                  function returns false, atom becomes the category of the unquoted atom.

                return_comments:
                  Return comment tokens.

                return_white_spaces:
                  Return white space tokens. By convention, a newline character, if present, is always the first
                  character of the text (there cannot be more than one newline in a white space token).

                return:
                  Short for [return_comments, return_white_spaces].

                text:
                  Include the token  text  in  the  token  annotation.  The  text  is  the  part  of  the  input
                  corresponding to the token. See also text_fun.

                {text_fun, text_fun()}:
                  A callback function that decides whether the full text of the token is included in the
                  token annotation. The arguments of the function are the category of the token and the full
                  token string. This option is used only when text is not present. If neither option is
                  present, the text is not saved in the token annotation.
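
              A minimal sketch of how some of these options change the result of string/3 is shown below.
              The module name scan_options_demo, the helper categories/2, and the extra reserved word unless
              are assumptions made for illustration only.

```erlang
-module(scan_options_demo).
-export([categories/2, demo/0]).

%% Scans String with Options and returns only the token categories.
categories(String, Options) ->
    {ok, Tokens, _End} = erl_scan:string(String, 1, Options),
    [erl_scan:category(T) || T <- Tokens].

demo() ->
    Input = "foo % note",

    %% By default, comments and white space are dropped.
    [atom] = categories(Input, []),
    [atom, comment] = categories(Input, [return_comments]),
    [atom, white_space] = categories(Input, [return_white_spaces]),
    [atom, white_space, comment] = categories(Input, [return]),

    %% Option text stores the token text, which also makes
    %% end_location/1 available.
    {ok, [Tok], _} = erl_scan:string("foo", {1, 1}, [text]),
    "foo" = erl_scan:text(Tok),
    {1, 4} = erl_scan:end_location(Tok),

    %% A reserved_word_fun that treats the (hypothetical) word unless
    %% as reserved, in addition to the standard reserved words.
    IsReserved = fun(unless) -> true;
                    (Atom) -> erl_scan:reserved_word(Atom)
                 end,
    [unless, var] = categories("unless X", [{reserved_word_fun, IsReserved}]),
    [atom, var] = categories("unless X", []),
    ok.
```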

       symbol(Token) -> symbol()

              Types:

                 Token = token()

              Returns the symbol of Token.

       text(Token) -> erl_anno:text() | undefined

              Types:

                 Token = token()

              Returns the text of Token's collection of annotations. If there is no text, undefined is returned.

       tokens(Continuation, CharSpec, StartLocation) -> Return

       tokens(Continuation, CharSpec, StartLocation, Options) -> Return

              Types:

                 Continuation = return_cont() | []
                 CharSpec = char_spec()
                 StartLocation = erl_anno:location()
                 Options = options()
                 Return =
                     {done,
                      Result :: tokens_result(),
                      LeftOverChars :: char_spec()} |
                     {more, Continuation1 :: return_cont()}
                 char_spec() = string() | eof
                 return_cont()
                   An opaque continuation.

              This is the re-entrant scanner, which scans characters until either a dot ('.' followed by a
              whitespace character) or eof is reached. It returns:

                {done, Result, LeftOverChars}:
                  Indicates that there is sufficient input data to get a result. Result is:

                  {ok, Tokens, EndLocation}:
                    The scanning was successful. Tokens is the list of tokens including dot.

                  {eof, EndLocation}:
                    End of file was encountered before any more tokens.

                  {error, ErrorInfo, EndLocation}:
                    An error occurred. LeftOverChars is the remaining characters of  the  input  data,  starting
                    from EndLocation.

                {more, Continuation1}:
                  More  data  is  required  for  building  a term. Continuation1 must be passed in a new call to
                  tokens/3,4 when more data is available.

              The CharSpec eof signals end of file. LeftOverChars then takes the value eof as well.

              tokens(Continuation, CharSpec, StartLocation)  is  equivalent  to  tokens(Continuation,  CharSpec,
              StartLocation, []).

              For a description of the options, see string/3.
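
              The re-entrant protocol can be sketched as a loop that feeds the scanner one chunk at a time,
              starting with the continuation [] and signalling eof when the chunks run out. The module name
              scan_chunks_demo and the function scan_chunks/1 are hypothetical, and the start location is
              fixed to line 1 for simplicity.

```erlang
-module(scan_chunks_demo).
-export([scan_chunks/1]).

%% Feeds Chunks (a list of strings) to erl_scan:tokens/3 until the
%% scanner reports done. Returns {Result, LeftOverChars}.
scan_chunks(Chunks) ->
    scan_chunks([], Chunks).

scan_chunks(Cont, [Chunk | Rest]) ->
    case erl_scan:tokens(Cont, Chunk, 1) of
        {more, Cont1} ->
            %% Not enough input yet: pass the continuation on.
            scan_chunks(Cont1, Rest);
        {done, Result, LeftOver} ->
            %% Any unused chunks in Rest are ignored in this sketch.
            {Result, LeftOver}
    end;
scan_chunks(Cont, []) ->
    %% No more input: signal end of file to flush the scanner.
    {done, Result, LeftOver} = erl_scan:tokens(Cont, eof, 1),
    {Result, LeftOver}.
```

              For example, scan_chunks(["foo(", "bar). rest"]) is expected to return the tokens of
              foo(bar). in an {ok, Tokens, EndLocation} result, with the unscanned tail of the second chunk
              as LeftOverChars.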

ERROR INFORMATION

       ErrorInfo  is  the  standard  ErrorInfo structure that is returned from all I/O modules. The format is as
       follows:

       {ErrorLocation, Module, ErrorDescriptor}

       A string describing the error is obtained with the following call:

       Module:format_error(ErrorDescriptor)
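
       As an illustration, the sketch below scans a string and, on failure, turns the ErrorInfo structure into a
       readable message. The module name scan_error_demo and the function describe/1 are made up for this
       example; the result is flattened because the message is only assumed to be a (possibly deep) character
       list.

```erlang
-module(scan_error_demo).
-export([describe/1]).

%% Scans String. Returns {ok, Tokens} on success, or
%% {error, ErrorLocation, Message} with a human-readable Message.
describe(String) ->
    case erl_scan:string(String) of
        {ok, Tokens, _EndLocation} ->
            {ok, Tokens};
        {error, {ErrorLocation, Module, ErrorDescriptor}, _EndLocation} ->
            Message = lists:flatten(Module:format_error(ErrorDescriptor)),
            {error, ErrorLocation, Message}
    end.
```

       Calling describe("\"unterminated") exercises the error branch, because the string literal is never
       closed.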

NOTES

       The continuation of the first call to  the  re-entrant  input  functions  must  be  [].  For  a  complete
       description  of  how  the re-entrant input scheme works, see Armstrong, Virding and Williams: 'Concurrent
       Programming in Erlang', Chapter 13.
