Provided by: libbiblio-citation-parser-perl_1.10+dfsg-4_all

NAME

       Biblio::Citation::Parser::Standard - citation parsing functionality

SYNOPSIS

         use Biblio::Citation::Parser::Standard;

         # Parse a simple reference
         my $parser = Biblio::Citation::Parser::Standard->new();
         my $metadata = $parser->parse("M. Jewell (2004) Citation Parsing for Beginners. Journal of Madeup References 4(3).");

         # parse() returns a hash reference keyed by field name
         print "The title of this article is ".$metadata->{atitle}."\n";

DESCRIPTION

       Biblio::Citation::Parser::Standard uses a relatively simple template matching technique to extract
       metadata from citations.

       The Templates.pm module currently provides almost 400 templates, with more being added regularly, and the
       parser returns the metadata in a form that is easily massaged into OpenURLs (see the Biblio::OpenURL
       module for an even easier way).

METHODS

       $parser = Biblio::Citation::Parser::Standard->new()
           The new() method creates a new parser.

       $reliability = Biblio::Citation::Parser::Standard::get_reliability($template)
           The get_reliability function returns a value indicating how likely a template is to match
           correctly. Fields such as page ranges and URLs have high likelihoods (as they follow rigorous
           patterns), whereas titles, publications, etc. have lower likelihoods.

           The function takes a template as a parameter. You should rarely need to call it directly.

       $concreteness = Biblio::Citation::Parser::Standard::get_concreteness($template)
           As with the get_reliability() method, get_concreteness() takes a template as a parameter, and returns
           a  numeric  indicator.  In  this case, it is the number of non-field characters in the template.  The
           more 'concrete' a template, the higher  the  probability  that  it  will  match  well.  For  example,
           '_PUBLICATION_  Vol.  _VOLUME_'  is a better match than '_PUBLICATION_ _VOLUME_', as _PUBLICATION_ is
           likely to subsume 'Vol.' in the second case.
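
           The counting rule described above can be sketched as follows. This is a minimal illustration, not
           the module's own code: the name sketch_concreteness is hypothetical, and it assumes that
           placeholders look like _NAME_ and that whitespace counts as a non-field character.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Biblio::Citation::Parser::Standard.
# Counts the characters left in a template once every _FIELD_
# placeholder has been removed.
# Assumption: whitespace and punctuation both count as non-field characters.
sub sketch_concreteness {
    my ($template) = @_;
    (my $literal = $template) =~ s/_[A-Z]+_//g;   # strip placeholders
    return length $literal;
}

for my $template ('_PUBLICATION_ Vol. _VOLUME_', '_PUBLICATION_ _VOLUME_') {
    printf "%-30s %d\n", $template, sketch_concreteness($template);
}
```

           Under these assumptions the first template scores higher than the second, which matches the
           ordering described above.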

       $string = Biblio::Citation::Parser::Standard::strip_spaces(@strings)
           This is a helper function to remove spaces from all elements of an array.

       $templates = Biblio::Citation::Parser::Standard::get_templates()
           Returns the current template list from the  Biblio::Citation::Parser::Templates  module.  Useful  for
           giving status lists.

       @authors = Biblio::Citation::Parser::Standard::handle_authors($string)
           This (rather large) function handles the author fields of a reference. It is not all-inclusive yet,
           but it is usably accurate. It can handle author lists that are separated by semicolons, commas, and a
           few other delimiters, as well as '&', 'and', and 'et al'.

           The method takes an author string as a parameter, and returns an array of  extracted  information  in
           the format '{family => $family, given => $given}'.
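
           To illustrate the shape of the returned data, here is a much-simplified splitter. It is a sketch
           only and is not the algorithm used by handle_authors: the name sketch_authors is hypothetical, and
           it assumes authors are written as "Family, Given" and separated by ';', '&' or 'and'.

```perl
use strict;
use warnings;

# Hypothetical, simplified author splitter; NOT the module's algorithm.
# Assumption: each author is "Family, Given", and authors are separated
# by ';', '&' or the word 'and'. A trailing 'et al' is discarded.
sub sketch_authors {
    my ($string) = @_;
    $string =~ s/,?\s*et al\.?\s*\z//i;           # drop a trailing 'et al'
    my @authors;
    for my $author (split /\s*(?:;|&|\band\b)\s*/, $string) {
        next unless length $author;
        my ($family, $given) = split /\s*,\s*/, $author, 2;
        push @authors, { family => $family, given => $given // '' };
    }
    return @authors;
}

# Sample input made up for illustration
for my $author (sketch_authors('Jewell, M.; Smith, J. and Jones, A. et al')) {
    print "family=$author->{family} given=$author->{given}\n";
}
```

           Each element is a hash reference with 'family' and 'given' keys, so the real function's result can
           presumably be walked in the same way.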

       %metadata = $parser->extract_metadata($reference)
           This  is  the key method in the Standard module, although it is not actually called directly by users
           (the 'parse' method provides a wrapper). It takes a reference, and returns a  hashtable  representing
           extracted metadata.

           A  regular  expression  map  is  present in this method to transform '_AUFIRST_', '_ISSN_', etc, into
           expressions that should match them. The method  then  finds  the  template  which  best  matches  the
           reference,  picking  the  result  that  has the highest concreteness and reliability (see above), and
           returns the fields in the hashtable. It also creates a marked-up version of the reference, which is
           useful for further formatting.
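
           The template-to-regular-expression step can be sketched as below. This is an illustration under
           stated assumptions, not the module's implementation: the map %field_regex, its patterns, and the
           name sketch_match are all hypothetical, and scoring by reliability is left out.

```perl
use strict;
use warnings;

# Hypothetical field-to-regex map; the module's real patterns are not
# documented on this page.
my %field_regex = (
    _PUBLICATION_ => '(.+?)',
    _VOLUME_      => '(\d+)',
);

# Turn a template into an anchored regex and match it against a reference.
# Returns a hash reference of captured fields, or nothing if no match.
sub sketch_match {
    my ($template, $reference) = @_;
    my ($pattern, @fields) = ('');
    for my $piece (grep { length } split /(_[A-Z]+_)/, $template) {
        if (exists $field_regex{$piece}) {
            push @fields, $piece;                 # remember capture order
            $pattern .= $field_regex{$piece};
        } else {
            $pattern .= quotemeta $piece;         # literal template text
        }
    }
    my @values = $reference =~ /\A$pattern\z/ or return;
    my %metadata;
    @metadata{ map { lc substr($_, 1, -1) } @fields } = @values;
    return \%metadata;
}

my $reference = 'Journal of Madeup References Vol. 4';
for my $template ('_PUBLICATION_ Vol. _VOLUME_', '_PUBLICATION_ _VOLUME_') {
    my $metadata = sketch_match($template, $reference) or next;
    print "$template => publication='$metadata->{publication}'\n";
}
```

           With the less concrete template, the publication field absorbs 'Vol.', which is why a parser of
           this kind prefers the more concrete match.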

       $metadata = $parser->parse($reference)
           This method provides a wrapper around the extract_metadata method. Simply pass a reference string,
           and a metadata hash is returned.

NOTES

       The parser provided should not be seen as exhaustive. As new techniques are implemented, further  modules
       will be released.

AUTHOR

       Mike Jewell <moj@ecs.soton.ac.uk>

perl v5.36.0                                       2022-11-19             Biblio::Citatio...arser::Standard(3pm)