Provided by: libcatmandu-oai-perl_0.20-1_all

NAME

       Catmandu::Importer::OAI - Package that imports OAI-PMH feeds

SYNOPSIS

           # From the command line

           # Harvest records
           $ catmandu convert OAI --url http://myrepo.org/oai
           $ catmandu convert OAI --url http://myrepo.org/oai --metadataPrefix didl --handler raw

           # Harvest repository description
           $ catmandu convert OAI --url http://myrepo.org/oai --identify 1

           # Harvest identifiers
           $ catmandu convert OAI --url http://myrepo.org/oai --listIdentifiers 1

           # Harvest sets
           $ catmandu convert OAI --url http://myrepo.org/oai --listSets 1

           # Harvest metadataFormats
           $ catmandu convert OAI --url http://myrepo.org/oai --listMetadataFormats 1

           # Harvest one record
           $ catmandu convert OAI --url http://myrepo.org/oai --getRecord 1 --identifier oai:myrepo:1234

DESCRIPTION

       Catmandu::Importer::OAI is a Catmandu importer to harvest metadata records from an OAI-PMH endpoint.

CONFIGURATION

       url OAI-PMH Base URL.

       metadataPrefix
           Metadata prefix to specify the metadata format. Set to "oai_dc" by default.

       handler( sub {} | $object | 'NAME' | '+NAME' )
           Handler to transform each record from XML DOM (XML::LibXML::Element) into a Perl hash.

           A handler can be provided as a function reference, as an instance of a Perl package that
           implements 'parse', or as a package NAME. A plain NAME is prefixed with
           "Catmandu::Importer::OAI::Parser", so "foobar" creates a
           "Catmandu::Importer::OAI::Parser::foobar" instance. Prepend "+" to give a full package name
           instead.

           By default the handler Catmandu::Importer::OAI::Parser::oai_dc is used for  metadataPrefix  "oai_dc",
           Catmandu::Importer::OAI::Parser::marcxml  for  "marcxml",  Catmandu::Importer::OAI::Parser::mods  for
           "mods",  and  Catmandu::Importer::OAI::Parser::struct  for  other  formats.   In  addition  there  is
           Catmandu::Importer::OAI::Parser::raw to return the XML as it is.

       identifier
           An optional identifier. Return only results for this particular identifier.

       set An optional set for selective harvesting.

       from
           An  optional  datetime  value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as lower bound for datestamp-based
           selective harvesting.

       until
           An optional datetime value (YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ) as upper  bound  for  datestamp-based
           selective harvesting.
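
           As a rough illustration of the two accepted datestamp forms, the shell sketch below checks a
           value before it is passed to --from or --until. The helper name is_oai_datestamp, the example
           dates, and the myrepo.org URL are assumptions made for this example. The check covers the shape
           of the value only, and the catmandu call runs only if the command is installed (with --dry 1,
           so that URLs are printed rather than fetched).

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 looks like YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ.
# It checks the shape only, not whether the date exists on the calendar.
is_oai_datestamp() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}(T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)?$'
}

from_date="2023-01-01"    # assumed example value
until_date="2023-06-30"   # assumed example value

if is_oai_datestamp "$from_date" && is_oai_datestamp "$until_date"; then
    # Only attempt the call when catmandu is available on this machine.
    if command -v catmandu >/dev/null 2>&1; then
        catmandu convert OAI --url http://myrepo.org/oai \
            --from "$from_date" --until "$until_date" --dry 1
    fi
else
    echo "from/until must be YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ" >&2
fi
```

           Dropping --dry 1 would turn the same command into an actual selective harvest.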

       identify
           Harvest the repository description instead of all records.

       getRecord
           Harvest one record instead of all records.

       listIdentifiers
           Harvest identifiers instead of full records.

       listRecords
           Harvest full records. Default operation.

       listSets
           Harvest sets instead of records.

       listMetadataFormats
           Harvest metadata formats instead of records.

       resumptionToken
           An optional resumptionToken to start harvesting from.

       dry Don't do any HTTP requests but return URLs that data would be queried from.

       strict
           Optionally validate all parameters against the OAI-PMH 2.0 specification before sending them to
           an OAI server. Default: undef.

       xslt
           Preprocess XML records with XSLT script(s) given as comma separated list or array reference. Requires
           Catmandu::XML.

       max_retries
           When an OAI request fails, the importer will retry this number of times. Set to '0' by default.

           Internally, exponential backoff is used for this: after every failed request the importer
           chooses a random number between 0 (included) and 2^collision (excluded), where collision is
           the number of failed attempts so far, and waits that number of seconds. The actual amount of
           time before the importer gives up can therefore differ:

            first retry:
               wait [ 0..2^1 [ seconds
            second retry:
               wait [ 0..2^2 [ seconds
            third retry:
               wait [ 0..2^3 [ seconds

            ..
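
           The growth of these wait windows can be sketched in shell. This is only an illustration of the
           arithmetic described above, not the importer's own code; the helper names backoff_window and
           backoff_total_bound and the value max_retries=3 are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the exclusive upper bound, in seconds, of the
# wait window before retry number $1 (counted from 1), that is 2^$1.
backoff_window() {
    n=$1
    bound=1
    while [ "$n" -gt 0 ]; do
        bound=$((bound * 2))
        n=$((n - 1))
    done
    echo "$bound"
}

# Hypothetical helper: print the sum of the window bounds for retries 1..$1.
# The waits alone always add up to less than this many seconds.
backoff_total_bound() {
    k=1
    total=0
    while [ "$k" -le "$1" ]; do
        w=$(backoff_window "$k")
        total=$((total + w))
        k=$((k + 1))
    done
    echo "$total"
}

max_retries=3   # assumed example value; the importer's default is 0
i=1
while [ "$i" -le "$max_retries" ]; do
    upper=$(backoff_window "$i")
    echo "retry $i: wait in [0, $upper) seconds"
    i=$((i + 1))
done
echo "waits add up to less than $(backoff_total_bound "$max_retries") seconds"
```

           With max_retries set to 3 the windows are 2, 4 and 8 seconds, so the waits alone add up to less
           than 14 seconds; the time spent on the requests themselves comes on top of that.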

       sleep
           Sleep a number of seconds between OAI-PMH calls to the endpoint (default 0).

       realm
           An optional realm value. This value is used when the importer harvests from  a  repository  which  is
           secured with basic authentication through Integrated Windows Authentication (NTLM or Kerberos).

       username
           An  optional username value. This value is used when the importer harvests from a repository which is
           secured with basic authentication.

       password
           An optional password value. This value is used when the importer harvests from a repository which  is
           secured with basic authentication.

METHOD

       Every Catmandu::Importer is a Catmandu::Iterable; all its methods are inherited. The
       Catmandu::Importer::OAI methods are not idempotent: OAI-PMH feeds can only be read once.

       In addition to methods inherited from Catmandu::Iterable,  this  module  provides  the  following  public
       methods:

   handle_record( $dom )
       Process an XML DOM with xslt and handler as configured and return the result.

ENVIRONMENT

       If you are connected to the internet via a proxy server, you need to set the address of this proxy in
       your environment:

           export http_proxy="http://localhost:8080"

       If you are connecting to an HTTPS server and don't want to verify the validity of the peer's
       certificates, you can set PERL_LWP_SSL_VERIFY_HOSTNAME to 0 in your environment. This may be required
       to connect to broken SSL servers:

           export PERL_LWP_SSL_VERIFY_HOSTNAME=0

SEE ALSO

       Catmandu, Catmandu::Importer

AUTHOR

       Nicolas Steenlant, "<nicolas.steenlant at ugent.be>"

CONTRIBUTOR

       Patrick Hochstenbach, "<patrick.hochstenbach at ugent.be>"

       Jakob Voss, "<nichtich at cpan.org>"

       Nicolas Franck, "<nicolas.franck at ugent.be>"

LICENSE AND COPYRIGHT

       Copyright 2016 Ghent University Library

       This program is free software; you can redistribute it and/or modify it under the terms  of  either:  the
       GNU General Public License as published by the Free Software Foundation; or the Artistic License.

       See http://dev.perl.org/licenses/ for more information.

perl v5.36.0                                       2023-10-26                       Catmandu::Importer::OAI(3pm)