Provided by: liburi-fetch-perl_0.15-1_all

NAME

       URI::Fetch - Smart URI fetching/caching

SYNOPSIS

           use URI::Fetch;

           ## Simple fetch.
           my $res = URI::Fetch->fetch('http://example.com/atom.xml')
               or die URI::Fetch->errstr;
           do_something($res->content) if $res->is_success;

           ## Fetch using specified ETag and Last-Modified headers.
           $res = URI::Fetch->fetch('http://example.com/atom.xml',
                   ETag => '123-ABC',
                   LastModified => time - 3600,
           )
               or die URI::Fetch->errstr;

           ## Fetch using an on-disk cache that URI::Fetch manages for you.
           use Cache::File;
           my $cache = Cache::File->new( cache_root => '/tmp/cache' );
           $res = URI::Fetch->fetch('http://example.com/atom.xml',
                   Cache => $cache
           )
               or die URI::Fetch->errstr;

DESCRIPTION

       URI::Fetch is a smart client for fetching HTTP pages, notably syndication feeds (RSS, Atom, and others),
       in an intelligent, bandwidth- and time-saving way. That means:

       •   GZIP support

           If you have Compress::Zlib installed, URI::Fetch will automatically try to download a compressed
           version of the content, saving bandwidth (and time).

       •   Last-Modified and ETag support

           If you use a local cache (see the Cache parameter to fetch), URI::Fetch will keep track of the
           Last-Modified and ETag headers from the server, allowing you to only download pages that have been
           modified since the last time you checked.

       •   Proper understanding of HTTP status codes

           Certain HTTP status codes are special, particularly when fetching syndication feeds, and well-written
           clients should pay special attention to them. URI::Fetch can only do so much for you in this regard,
           but it gives you the tools to be a well-written client.

           The response from fetch gives you the raw HTTP response code, along with special handling of four codes:

           •   200 (OK)

               Signals that the content of a page/feed was retrieved successfully.

           •   301 (Moved Permanently)

               Signals that a page/feed has moved permanently, and that your database of feeds should be updated
               to reflect the new URI.

           •   304 (Not Modified)

               Signals that a page/feed has not changed since it was last fetched.

           •   410 (Gone)

               Signals that a page/feed is gone and will never be coming back, so you should stop trying to
               fetch it.
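
       The four codes above map naturally onto actions in a feed client. The following is a minimal sketch,
       not part of URI::Fetch: the helper name action_for_status and the action labels are hypothetical, and
       it assumes you already have the raw numeric HTTP code from the response.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by URI::Fetch): map a raw HTTP
# status code to the action a feed client might take.
sub action_for_status {
    my ($code) = @_;
    my %action = (
        200 => 'process',      # content retrieved successfully
        301 => 'update_uri',   # moved permanently; store the new URI
        304 => 'skip',         # unchanged since the last fetch
        410 => 'unsubscribe',  # gone for good; stop fetching
    );
    # Any other code gets a conservative default (an assumption of this sketch).
    return $action{$code} // 'retry_later';
}

print "$_: ", action_for_status($_), "\n" for 200, 301, 304, 410, 500;
```

       Keeping the mapping in one place makes it easier to apply the same policy to every feed you poll.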

   Change from 0.09
       If you make a request using a cache and get back a 304 (Not Modified) response, and the content was
       returned from the cache, then "is_success()" will return true and "$response->content" will contain
       the cached content.

       I think this is the right behaviour, given the philosophy of "URI::Fetch", but please let me (NEILB) know
       if you disagree.

USAGE

   URI::Fetch->fetch($uri, %param)
       Fetches a page identified by the URI $uri.

       On success, returns a URI::Fetch::Response object; on failure, returns "undef".

       %param can contain:

       •   LastModified

       •   ETag

           LastModified and ETag can be supplied to ask the server to return the full page only if it has
           changed since the last request. If you're writing your own feed client, this is recommended practice,
           because it limits both your bandwidth use and the server's.

           If you'd rather not have to store the LastModified time and ETag yourself, see the Cache parameter
           below (and the SYNOPSIS above).

       •   Cache

           If you'd like URI::Fetch to cache responses between requests, provide the Cache parameter with an
           object supporting the Cache API (e.g. Cache::File, Cache::Memory). Specifically, an object that
           supports "$cache->get($key)" and "$cache->set($key, $value, $expires)".

           If supplied, URI::Fetch will store the page content, ETag, and last-modified time of the response in
           the cache, and will pull the content from the cache on subsequent requests if the server returns a
           Not Modified (304) response.

       •   UserAgent

           Optional. You may provide your own LWP::UserAgent instance. Look into LWPx::ParanoidUserAgent if
           you're fetching URLs given to you by possibly malicious parties.

       •   NoNetwork

           Optional. Controls the interaction between the cache and HTTP requests with
           If-Modified-Since/If-None-Match headers. Possible behaviors are:

           false (default)
               If a page is in the cache, the origin HTTP server is always checked for a fresher copy with an
               If-Modified-Since and/or If-None-Match header.

           1   If set to 1, the origin HTTP server is never contacted, regardless of whether the page is in
               the cache. If the page is missing from the cache, the fetch method will return undef. If the
               page is in the cache, that page will be returned, no matter how old it is. Note that setting
               this option means the URI::Fetch::Response object will never have the http_response member set.

           "N", where N > 1
               The origin HTTP server is not contacted if the page is in cache and the cached page was inserted
               in the last N seconds. If the cached copy is older than N seconds, a normal HTTP request (full
               or cache check) is done.

       •   ContentAlterHook

           Optional. A subref that gets called with a scalar reference to your content so you can modify the
           content before it's returned and before it's put in cache.

           For instance, you may want to only cache the <head> section of an HTML document, or you may want to
           take a feed URL and cache only a pre-parsed version of it. If your hook replaces the value behind
           the scalarref with a hashref, scalarref, or some blessed object, that same value will be returned
           to you later on not-modified responses.

       •   CacheEntryGrep

           Optional. A subref that gets called with the URI::Fetch::Response object about to be cached (with
           the contents already possibly transformed by your "ContentAlterHook"). If your subref returns true,
           the page goes into the cache. If false, it doesn't.

       •   Freeze

       •   Thaw

           Optional. Subrefs that get called to serialize and deserialize, respectively, the data that will be
           cached. The cached data should be assumed to be an arbitrary Perl data structure, containing
           (potentially) references to arrays, hashes, etc.

           Freeze should serialize the structure into a scalar; Thaw should deserialize the scalar into a data
           structure.

           By default, Storable will be used for freezing and thawing the cached data structure.

       •   ForceResponse

           Optional. A boolean that indicates a URI::Fetch::Response should be returned regardless of the HTTP
           status. By default "undef" is returned when a response is not a "success" (200 codes) or one of the
           recognized HTTP status codes listed above. The HTTP status message can then be retrieved using the
           "errstr" method on the class.
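
       As an illustration of the Cache, Freeze, Thaw, and ContentAlterHook parameters described above, here
       is a minimal sketch that runs without any network access. Everything in it is an assumption made for
       illustration: the class name My::TinyCache, the treatment of $expires as an absolute epoch time, the
       use of JSON::PP instead of the default Storable, and the keys in the sample entry (the layout of the
       structure URI::Fetch actually caches is not documented on this page).

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical in-memory cache exposing only get() and set().
package My::TinyCache {
    sub new { return bless { store => {} }, shift }

    # Return the stored value, or undef if it is missing or expired.
    sub get {
        my ($self, $key) = @_;
        my $entry = $self->{store}{$key} or return;
        if (defined $entry->{expires} && $entry->{expires} <= time) {
            delete $self->{store}{$key};
            return;
        }
        return $entry->{value};
    }

    # Assumption: $expires, when given, is an absolute epoch time.
    sub set {
        my ($self, $key, $value, $expires) = @_;
        $self->{store}{$key} = { value => $value, expires => $expires };
        return 1;
    }
}

# Freeze turns a data structure into a scalar; Thaw reverses it.
# JSON cannot encode blessed objects, so this pair only suits plain data.
my $json   = JSON::PP->new->canonical;
my $freeze = sub { $json->encode($_[0]) };
my $thaw   = sub { $json->decode($_[0]) };

# ContentAlterHook-style subref: keep only the <head> section, in place.
my $head_only = sub {
    my ($content_ref) = @_;
    if ($$content_ref =~ m{(<head\b.*?</head>)}is) {
        $$content_ref = $1;
    }
    return;
};

# Round trip with a made-up entry to show the pieces working together.
my $cache = My::TinyCache->new;
my $html  = '<html><head><title>T</title></head><body>big</body></html>';
$head_only->(\$html);
my $entry = { ETag => '123-ABC', LastModified => 1_600_000_000, Content => $html };
$cache->set('http://example.com/atom.xml', $freeze->($entry));
my $copy = $thaw->($cache->get('http://example.com/atom.xml'));
print "$copy->{ETag} $copy->{Content}\n";
```

       These pieces would then be passed to fetch as Cache => $cache, Freeze => $freeze, Thaw => $thaw, and
       ContentAlterHook => $head_only in the %param list.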

REPOSITORY

       <https://github.com/neilbowers/URI-Fetch>

LICENSE

       URI::Fetch is free software; you may redistribute it and/or modify it under the same terms as Perl
       itself.

AUTHOR & COPYRIGHT

       Except where otherwise noted, URI::Fetch is Copyright 2004 Benjamin Trott, ben+cpan@stupidfool.org. All
       rights reserved.

       Currently maintained by Neil Bowers.

CONTRIBUTORS

       •   Tim Appnel

       •   Mario Domgoergen

       •   Karen Etheridge

       •   Brad Fitzpatrick

       •   Jason Hall

       •   Naoya Ito

       •   Tatsuhiko Miyagawa

perl v5.32.1                                       2021-09-16                                    URI::Fetch(3pm)