Provided by: w3c-linkchecker_5.0.0-2_all

NAME

       checklink - check the validity of links in an HTML or XHTML document

SYNOPSIS

       checklink  [ options ] uri ...

DESCRIPTION

       This manual page briefly documents the checklink command, a.k.a. the W3C® Link Checker.

       checklink is a program that reads an HTML or XHTML document, extracts a list of anchors and links, and
       checks that no anchor is defined twice and that all the links are dereferenceable, including the
       fragments. It warns about HTTP redirects, including directory redirects, and can recursively check part
       of a web site.

       The program can be used either as a command line tool or as a CGI script.

OPTIONS

       This program follows the usual GNU command line syntax, with long options starting with two dashes (`--').
       A summary of options is included below.

       -?, -h, --help
            Show summary of options.

       -V, --version
            Output version information.

       -s, --summary
            Show result summary only.

       -b, --broken
            Show only the broken links, not the redirects.

       -e, --dir-redirects
            Hide directory redirects - e.g. <http://www.w3.org/TR> -> <http://www.w3.org/TR/>.

       -r, --recursive
            Check the documents linked from the first one.

       -D, --depth n
            Check the documents linked from the first one to depth n (implies --recursive).

       -l, --location uri
            Scope  of  the documents checked (implies --recursive).  Can be specified multiple times in order to
            specify multiple recursion bases.  If the URI of a candidate document is downwards relative  to  any
            of  the  bases,  it is considered to be within the scope.  If not specified, the default is the base
            URI of the initial document, for example for <http://www.w3.org/TR/html4/Overview.html> it would  be
            <http://www.w3.org/TR/html4/>.
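
            The scope rule above can be sketched in Python as follows.  This is an illustration only, not the
            checker's own code: the function names default_base and in_scope are hypothetical, and "downwards
            relative" is approximated here by a plain string-prefix test.

```python
from urllib.parse import urlsplit, urlunsplit


def default_base(uri):
    """Derive the default recursion base: the URI with its last path segment dropped."""
    parts = urlsplit(uri)
    # Keep everything up to and including the final slash of the path.
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return urlunsplit((parts.scheme, parts.netloc, directory, "", ""))


def in_scope(candidate, bases):
    """Return True if the candidate URI sits at or below any of the bases."""
    # Assumption: a simple prefix comparison stands in for "downwards relative".
    return any(candidate.startswith(base) for base in bases)


base = default_base("http://www.w3.org/TR/html4/Overview.html")
print(base)
print(in_scope("http://www.w3.org/TR/html4/struct/links.html", [base]))
print(in_scope("http://www.w3.org/TR/", [base]))
```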

       -X, --exclude regexp
            Do  not check links whose full, canonical URIs match regexp.  Note that this option limits recursion
            the same way as --exclude-docs with the same regular expression would.

       --exclude-docs regexp
            In recursive mode, do not check links in documents whose full, canonical URIs  match  regexp.   This
            option may be specified multiple times.
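
            As a rough sketch of how --exclude and --exclude-docs differ, the hypothetical helper below decides,
            for one canonical URI, whether the link itself would be checked and whether the linked document would
            be parsed for further links.  It uses Python's re module as a stand-in for Perl regular expressions
            and ignores the separate --location scope test.

```python
import re


def link_plan(uri, exclude=None, exclude_docs=()):
    """Return (check_link, recurse_into) for one canonical URI."""
    excluded = exclude is not None and re.search(exclude, uri) is not None
    doc_excluded = any(re.search(pattern, uri) for pattern in exclude_docs)
    # --exclude skips the link entirely and also stops recursion into it.
    check_link = not excluded
    # --exclude-docs still checks the link but does not read its contents.
    recurse_into = not excluded and not doc_excluded
    return check_link, recurse_into


uri = "http://example.org/archive/1999.html"
print(link_plan(uri, exclude=r"/archive/"))
print(link_plan(uri, exclude_docs=[r"/archive/"]))
```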

       --suppress-redirect URI->URI
            Do  not  report a redirect from the first to the second URI.  The "->" is literal text.  This option
            may be specified multiple times.  Whitespace may be used instead of "->" to separate the URIs.

       --suppress-redirect-prefix URI->URI
            Do not report a redirect from a child of the first URI to the same child of  the  second  URI.   The
            "->"  is  literal  text.   This  option  may  be specified multiple times.  Whitespace may be used
            instead of "->" to separate the URIs.
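
            The difference between --suppress-redirect and --suppress-redirect-prefix can be sketched as below.
            The helper names parse_pair and is_suppressed are hypothetical, and the matching logic is an
            assumption based on the descriptions above rather than the checker's actual implementation.

```python
import re


def parse_pair(value):
    """Split 'FROM->TO' (or 'FROM TO') into a (from, to) tuple."""
    parts = [part for part in re.split(r"->|\s+", value.strip()) if part]
    if len(parts) != 2:
        raise ValueError(f"expected two URIs, got: {value!r}")
    return parts[0], parts[1]


def is_suppressed(src, dst, exact=(), prefixes=()):
    """Return True if the redirect src -> dst should not be reported."""
    # --suppress-redirect: both URIs must match exactly.
    if (src, dst) in exact:
        return True
    # --suppress-redirect-prefix: the same child must follow both prefixes.
    for from_prefix, to_prefix in prefixes:
        if src.startswith(from_prefix):
            child = src[len(from_prefix):]
            if dst == to_prefix + child:
                return True
    return False


pair = parse_pair("http://example.org/old/->http://example.org/new/")
print(is_suppressed("http://example.org/old/a.html",
                    "http://example.org/new/a.html", prefixes=[pair]))
```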

       --suppress-temp-redirects
            Do not report warnings about temporary redirects.

       --suppress-broken CODE:URI
            Do not report a broken link with the given CODE.  CODE is  the  HTTP  response,  or  -1  for  robots
            exclusion.   The  ":" is literal text.  This option may be specified multiple times.  Whitespace may
            be used instead of ":" to separate the CODE and the URI.
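
            Because the URI itself contains colons, the CODE:URI value has to be split at the first separator
            only.  A minimal sketch of such a parser follows; parse_suppress_broken is a hypothetical name and
            the accepted syntax is inferred from the description above.

```python
import re


def parse_suppress_broken(value):
    """Split 'CODE:URI' (or 'CODE URI') into an (int code, uri) tuple."""
    # CODE is an HTTP status or -1; the separator is one ':' or whitespace.
    match = re.fullmatch(r"(-1|\d+)(?::|\s+)(\S+)", value.strip())
    if match is None:
        raise ValueError(f"expected CODE:URI, got: {value!r}")
    return int(match.group(1)), match.group(2)


print(parse_suppress_broken("404:http://example.org/gone.html"))
print(parse_suppress_broken("-1 http://example.org/private/"))
```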

       --suppress-fragment URI
            Do not report the given broken fragment URI.  A fragment URI  contains  "#".   This  option  may  be
            specified multiple times.

       -L, --languages accept-language
            The  "Accept-Language"  HTTP  header  to  send.   In  command  line mode, this header is not sent by
            default.  The special value "auto" causes a  value  to  be  detected  from  the  "LANG"  environment
            variable, and sent if found.  In CGI mode, the default is to send the value received from the client
            as is.
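
            One plausible way to turn a "LANG" value into an "Accept-Language" value is sketched below.  This is
            an assumption for illustration: the real detection performed for "auto" may differ, in particular
            for the "C" and "POSIX" locales, which this sketch treats as "nothing found".

```python
import re


def accept_language_from_lang(lang):
    """Map a LANG value such as 'en_US.UTF-8' to a tag such as 'en-US'."""
    if not lang:
        return None
    # Drop the encoding ('.UTF-8') and modifier ('@euro') suffixes.
    tag = re.split(r"[.@]", lang, maxsplit=1)[0]
    if tag in ("C", "POSIX"):
        return None  # assumption: no language can be derived from these
    match = re.fullmatch(r"([A-Za-z]+)(?:[-_]([A-Za-z]+))?", tag)
    if match is None:
        return None
    language, region = match.group(1).lower(), match.group(2)
    return f"{language}-{region.upper()}" if region else language


print(accept_language_from_lang("en_US.UTF-8"))
print(accept_language_from_lang("fi_FI@euro"))
```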

       -c, --cookies cookie-file
            Use  cookies,  load/save  them in cookie-file.  The special value "tmp" causes non-persistent use of
            cookies, i.e. they are used but only stored in memory for the duration of this link checker run.

       -R, --no-referer
            Do not send the "Referer" HTTP header.

       -q, --quiet
            No output if no errors are found.  Implies --summary.

       -v, --verbose
            Verbose mode.

       -i, --indicator
            Show progress while parsing as percentage of lines processed.  No indicator is shown  for  documents
            containing no linefeeds.

       -u, --user username
            Specify a username for authentication.

       -p, --password password
            Specify a password for authentication.

       --hide-same-realm
            Hide 401s that are in the same realm as the document checked.

       -S, --sleep secs
            Sleep  the specified number of seconds between requests to each server.  Defaults to 1 second, which
            is also the minimum allowed.

       -t, --timeout secs
            Timeout for requests, in seconds.  The default is 30.

       -C, --connection-cache number
            Maximum number of cached connections.   Using  this  option  overrides  the  "Connection_Cache_Size"
            configuration  file  parameter,  see  its  documentation  below  for  the  default  value  and  more
            information.

       -d, --domain domain
            Perl regular expression describing the domain to which the authentication information  (if  present)
            will  be  sent.   The  default  value can be specified in the configuration file.  See the "Trusted"
            entry in the configuration file description below for more information.

       --masquerade "real-prefix surrogate-prefix"
            Perform a simple string substitution: URIs which begin with the string "real-prefix"  are  rewritten
            using  the  "surrogate-prefix"  before  being  dereferenced.   Useful  for  making a local directory
            masquerade as a remote one. For example:

              --masquerade "http://example.com/x/y/z/ file:///my/local/dir/"

            If the document being checked contains a link to http://example.com/x/y/z/foo.html, then  the  local
            file system will be checked for file:///my/local/dir/foo.html.

            --masquerade  takes  a  single  argument consisting of two URIs, separated by whitespace.  The quote
            marks are not part of the argument, but one usual way of providing a value with embedded  whitespace
            is to enclose it in quotes.
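
            The substitution described above amounts to a prefix rewrite, sketched here in Python.  The helper
            names parse_masquerade and apply_masquerade are hypothetical and not part of the checker.

```python
def parse_masquerade(value):
    """Split the single --masquerade argument into its two prefixes."""
    parts = value.split()
    if len(parts) != 2:
        raise ValueError(f"expected two whitespace-separated URIs, got: {value!r}")
    return parts[0], parts[1]


def apply_masquerade(uri, real_prefix, surrogate_prefix):
    """Rewrite uri if it begins with real_prefix; otherwise return it unchanged."""
    if uri.startswith(real_prefix):
        return surrogate_prefix + uri[len(real_prefix):]
    return uri


real, surrogate = parse_masquerade("http://example.com/x/y/z/ file:///my/local/dir/")
print(apply_masquerade("http://example.com/x/y/z/foo.html", real, surrogate))
```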

       -H, --html
            HTML output.
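
       When checklink is driven from a script, the options above can be assembled into an argument list.  The
       sketch below uses only flags documented in this section; the function build_checklink_argv is
       hypothetical, and it assumes the program is installed as "checklink" on the search path.

```python
def build_checklink_argv(uri, broken_only=False, summary=False,
                         depth=None, exclude=None, timeout=None):
    """Build an argument list for checklink from a few documented options."""
    argv = ["checklink"]
    if broken_only:
        argv.append("--broken")
    if summary:
        argv.append("--summary")
    if depth is not None:
        argv += ["--depth", str(depth)]  # implies --recursive
    if exclude is not None:
        argv += ["--exclude", exclude]
    if timeout is not None:
        argv += ["--timeout", str(timeout)]
    argv.append(uri)
    return argv


print(build_checklink_argv("http://www.w3.org/TR/html4/", broken_only=True, depth=2))
```

       The resulting list can be handed to subprocess.run() on a system where checklink is available.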

FILES

       /etc/w3c/checklink.conf
            The main configuration file.  You can use the W3C_CHECKLINK_CFG environment variable to override the
            default location.

            "Trusted"  specifies a regular expression for matching trusted domains (i.e. domains where HTTP basic
            authentication, if any, will be sent).  The regular expression will be  matched  case  insensitively
            against  host  names.   The  default  behavior  (when  unset, that is) is to send the authentication
            information only to the host which requests it; usually you don't want to change this.  For example,
            the following configures only the w3.org domain as trusted:

                Trusted = \.w3\.org$
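
            The effect of such a pattern can be explored with a short Python sketch.  Python's re module is used
            here as a stand-in for Perl regular expressions, and is_trusted is a hypothetical helper.

```python
import re


def is_trusted(host, trusted_pattern):
    """Match the pattern case-insensitively against a host name."""
    return re.search(trusted_pattern, host, re.IGNORECASE) is not None


pattern = r"\.w3\.org$"
for host in ("validator.w3.org", "Validator.W3.ORG", "w3.org", "w3.org.example.com"):
    print(host, is_trusted(host, pattern))
```

            Note that, with the leading "\." in the example pattern, the bare host name w3.org does not match;
            only its subdomains do.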

            "Allow_Private_IPs" is a boolean flag indicating whether checking links on non-public  IP  addresses
            is  allowed.   The  default  is  true  in command line mode and false when run as a CGI script.  For
            example, to disallow checking non-public IP addresses, regardless of the mode, use:

               Allow_Private_IPs = 0
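
            What counts as a non-public address is decided by the checker itself.  As a rough approximation
            only, Python's ipaddress module can classify addresses as shown below; the helper names are
            hypothetical and the exact address ranges the checker rejects are an assumption.

```python
import ipaddress


def is_non_public(address):
    """Approximate 'non-public': private, loopback, or link-local addresses."""
    ip = ipaddress.ip_address(address)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def may_check(address, allow_private_ips):
    """Mirror the flag: non-public addresses are checked only when allowed."""
    return allow_private_ips or not is_non_public(address)


for address in ("10.0.0.1", "192.168.1.10", "127.0.0.1", "8.8.8.8"):
    print(address, may_check(address, allow_private_ips=False))
```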

            "Forbidden_Protocols" is a comma separated list of additional protocols/URI schemes  that  the  link
            checker  is  not allowed to use.  The "javascript" and "mailto" schemes are always forbidden, and so
            is the "file" scheme when running as a CGI script.

               Forbidden_Protocols = javascript,mailto

            "Markup_Validator_URI" and "CSS_Validator_URI" are formatted URIs to the respective validators.  The
            %s in these will be replaced with the full "URI encoded" URI to  the  document  being  checked,  and
            shown in the link checker results view in the online/CGI version.  The defaults are:

               Markup_Validator_URI =
                 http://validator.w3.org/check?uri=%s
               CSS_Validator_URI =
                 http://jigsaw.w3.org/css-validator/validator?uri=%s
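
            The %s substitution can be sketched as follows.  This is an illustration under the assumption that
            every reserved character is percent-encoded; the checker's exact encoding may differ, and
            validator_link is a hypothetical name.

```python
from urllib.parse import quote


def validator_link(template, document_uri):
    """Replace %s in the template with the fully URI-encoded document URI."""
    # safe="" also encodes '/' and ':', so the URI fits in a query parameter.
    return template.replace("%s", quote(document_uri, safe=""))


print(validator_link("http://validator.w3.org/check?uri=%s",
                     "http://www.w3.org/TR/html4/"))
```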

            "Doc_URI"  is  a  URI  used  for  linking  to the documentation, and CSS and JavaScript files in the
            dynamically generated content of the link checker.  The default is:

               Doc_URI = http://validator.w3.org/docs/checklink.html

            "Connection_Cache_Size" is an integer denoting the maximum number of connections  the  link  checker
            will keep open at any given time.  The default is:

               Connection_Cache_Size = 2

ENVIRONMENT

       checklink  uses  the  libwww-perl  library  which  has  a  number  of environment variables affecting its
       behaviour.  See "SEE ALSO" for some pointers.

       W3C_CHECKLINK_CFG
            If set, overrides the path to the configuration file.

SEE ALSO

       The documentation for this program is available on the web at
       <http://validator.w3.org/docs/checklink.html>.

       LWP, Net::FTP, Net::NNTP, Net::IP, perlre.

AUTHOR

       This  program was originally written by Hugo Haas <hugo@w3.org>, based on Renaud Bruyeron's checklink.pl.
       It has been enhanced by Ville Skyttä and many other volunteers  since.   Use  the  <www-validator@w3.org>
       mailing   list   for   feedback,   and  see  <http://validator.w3.org/docs/checklink.html#csb>  for  more
       information.

       This manual page was originally written by Frédéric Schütz <schutz@mathgen.ch> for the  Debian  GNU/Linux
       system (but may be used by others).

COPYRIGHT

       This program is licensed under the W3C® Software License,
       <http://www.w3.org/Consortium/Legal/copyright-software>.

perl v5.36.0                                       2022-10-02                                      CHECKLINK(1p)