Provided by: linklint_2.3.5-7_all

NAME

       Linklint - fast link checker and website maintenance tool

SYNOPSIS

       linklint [-cache directory] [-case] [-checksum] [-concise_url] [-db1..9] [-delay d] [-doc] [-docbase
       base] [-dont_output xxxx] [-error] [-flush] [-forward] [-help] [-help_all] [-host hostname:port] [-host
       hostname] [-htmlonly] [-http] [-http_header name:value] [-ignore ignoreset] [-index file] [-language zz]
       [-limit n] [-list] [-local linkset] [-map /a=[/b]] [-net] [-netmod] [-netset] [-no_anchors]
       [-no_query_string] [-no_warn_index] [-orphan] [-out file] [-output_frames] [-output_index filename]
       [-password realm user:password] [-proxy hostname[:port]] [-quiet] [-redirect] [-retry] [-root dir]
       [-silent] [-skip skipset] [-textonly] [-timeout t] [-url_doc_prefix url/] [-version] [-warn] [-xref]
       linkset

VERSION

       2.3.5 August 13, 2001

DESCRIPTION

       This manual page briefly documents the Linklint program, an Open Source Perl program that checks local
       and remote HTML links.

       This manual page was written for the Debian distribution because the original program does not have a
       manual page.  Instead, it has documentation in the HTML format; see below.

OPTIONS

   Input File Selection
       Whether you are doing a local site check or an HTTP site check, you specify which directories (presumably
       containing HTML files) to check with one or more linksets. A linkset uses two wildcard characters, @ and
       #.  Each linkset specifies one or more directories, much like the standard * and ? wildcard characters
       are used to specify the names of files in one directory.

       The @ character matches any string of characters (much like "*"), and the # character (somewhat like
       "?") matches any string of characters except "/". The best way to understand how @ and # work is to look
       at a few examples:

                                 the entire site /@
                     the homepage only (default) /
                files in the root directory only /#
                    . . . and one directory down /#/#
                 files in the sub directory only /sub/#
            files in the sub directory and below /sub/@
                                  specific files /file1 /file2 ...
                         specific subdirectories /sub1/@ /sub2/@ ...

       If you specify more than one linkset, files matching any of the linksets will be checked. HTML files that
       don't match any of the linksets will be skipped. Linklint will see if they exist but won't check any of
       their links.
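
       The wildcard rules above can be sketched in a few lines of Python. This is only an illustration of the
       documented behaviour of @ and #, not Linklint's own (Perl) code. It assumes that @ and # are the only
       special characters and that a linkset must match the whole path. The names linkset_to_regex and
       matches_any are made up for this sketch.

```python
import re


def linkset_to_regex(linkset):
    """Translate a linkset into a compiled regular expression.

    Assumption: "@" and "#" are the only special characters; every
    other character is matched literally.
    """
    parts = []
    for ch in linkset:
        if ch == "@":
            parts.append(".*")       # any string, "/" included
        elif ch == "#":
            parts.append("[^/]*")    # any string that contains no "/"
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches_any(path, linksets):
    """True if the whole path matches at least one of the linksets."""
    return any(linkset_to_regex(ls).fullmatch(path) for ls in linksets)


print(matches_any("/sub/a/b.html", ["/sub/@"]))   # True: @ crosses "/"
print(matches_any("/sub/a/b.html", ["/sub/#"]))   # False: # stops at "/"
print(matches_any("/index.html", ["/#"]))         # True: file in the root directory
```

       With several linksets, a file is checked as soon as any one of them matches, which is why matches_any
       uses any().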

   Other File Selection Options
       -skip skipset
           Skips  HTML  files that match skipset.  "Linklint" will make sure these files exist but won't add any
           of their links to the list of files to check.  Multiple  skipsets  are  allowed,  but  each  must  be
           preceded with -skip on the command line. Skipsets use the same wildcard characters as linksets.

       -ignore ignoreset
           Ignores  files  matching  ignoreset.   "Linklint"  doesn't  even  check  to see if these files exist.
           Multiple ignoresets are allowed, but each  must  be  preceded  with  -ignore  on  the  command  line.
           Ignoresets use the same wildcard characters as linksets.

       -limit n
           Limits checking to n HTML files (default 500).  All HTML files after the first n are skipped.
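
       Because every skipset and ignoreset needs its own flag, a wrapper script has to repeat -skip and -ignore
       once per expression. The Python helper below is a hypothetical sketch that only builds the argument
       list and runs nothing. The function name build_args and the example paths are assumptions; the flags
       are the ones described above.

```python
def build_args(linksets, skipsets=(), ignoresets=(), limit=None):
    """Build an argument list for a linklint run (nothing is executed)."""
    args = ["linklint"]
    for skipset in skipsets:
        args += ["-skip", skipset]          # one -skip per skipset
    for ignoreset in ignoresets:
        args += ["-ignore", ignoreset]      # one -ignore per ignoreset
    if limit is not None:
        args += ["-limit", str(limit)]      # otherwise the default of 500 applies
    args += list(linksets)                  # linksets come last
    return args


# Hypothetical site layout: skip two trees, ignore one, raise the limit.
print(build_args(["/@"],
                 skipsets=["/old/@", "/tmp/@"],
                 ignoresets=["/cgi-bin/@"],
                 limit=1000))
```

       The resulting list can be handed to subprocess.run() as is, which avoids shell quoting problems with
       the @ and # characters.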

   Local Site Checking
       If  you  are  developing  HTML  pages  on  a  computer  that  does not have an http server, or if you are
       developing a simple site that does not use Server Redirection or extensive CGI, you should use local site
       checking.

            linklint /@

       Checks all HTML files in the current directory and below. Assumes  that  the  current  directory  is  the
       server  root directory so links starting with "/" default to this directory. You must specify /@ to check
       the entire site. See Input File Selection above for details.

            linklint -root dir /@

       Checks all HTML files in dir and below. This is useful if you want to check several  sites  on  the  same
       machine or if you don't want to run Linklint in your public HTML directory.

   Other Local Site Options
       -host hostname
           By  default  "Linklint"  assumes all links on your site that start with "http://" are remote links to
           other sites.  If you have absolute links to your own site, give "Linklint" your  hostname  and  links
           starting  with "http://hostname" will be treated as local files.  If you specify -host hostname:port,
           only http links to this hostname and port will be treated as local files.

       -case
           Makes sure that the (upper/lower) case of filenames used in links inside HTML tags matches the case
           used by the file system.  This is for Windows only and is very handy if you are porting a site to a
           Unix host.

       -orphan
           Checks all directories that contain files used on the site for unused (orphan) files.

       -index file
           Uses file as the default index file instead of the default list used by "Linklint". You  can  specify
           more  than  one file but each one must be preceded by -index on the command line.  If a default index
           file is not found, "Linklint" uses a listing of the entire directory. See the Default File section
           of the HTML documentation for details.

       -map /a=[/b]
           Substitutes leading /a with /b.  Useful for server-side image maps or to simulate Server Redirection.

       -no_warn_index
           Turns off the "index file not found" warning.  Applies to local site checking only.

       -no_anchors
           Tells  "Linklint"  to  ignore  named  anchors.  This could ease memory problems for people with large
           sites who are primarily interested in missing pages and not missing named anchors.  This option works
           for both HTTP and local site checks.
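
       The prefix substitution done by -map /a=[/b] can be pictured with the following Python sketch. It
       assumes a plain string-prefix replacement and that an empty right-hand side simply strips the prefix;
       Linklint's actual matching rules may differ. The function names and paths are invented for the example.

```python
def parse_map(spec):
    """Split a -map argument of the form "/a=[/b]" into (old, new)."""
    old, sep, new = spec.partition("=")
    if not sep or not old:
        raise ValueError("expected /a=[/b], got %r" % spec)
    return old, new


def apply_map(link, spec):
    """Replace a leading old prefix in link with the new prefix."""
    old, new = parse_map(spec)
    if link.startswith(old):
        return new + link[len(old):]
    return link                      # links without the prefix are untouched


print(apply_map("/cgi-bin/img/nav.map", "/cgi-bin=/maps"))  # /maps/img/nav.map
print(apply_map("/old/page.html", "/old="))                 # /page.html
print(apply_map("/other/page.html", "/old=/new"))           # /other/page.html
```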

   HTTP Site Checking
       If you have a complicated site that uses lots of CGI or Server Redirection,  you  should  use  HTTP  site
       checking.  Even  though  an  HTTP  site  check  reads  pages  via your HTTP server, you will get the best
       performance if you do your checking on a machine that has a high speed connection to your server.

            linklint -http -host www.site.com /@

       The -http flag tells "Linklint" to  check  HTML  files  on  the  site  www.site.com  via  a  remote  http
       connection. You must specify a -host whenever you do an HTTP site check (otherwise Linklint won't know
       where to get your pages). You can specify /@ to check the entire site. See Input File Selection above for
       details.

   HTTP Site Check Options
       -http
           This flag tells Linklint to perform an HTTP site check instead of a  local  site  check.   All  files
           (except server side image maps) will be read via the HTTP protocol from your web server.

       -host hostname:port
           If you include :port at the end of your hostname, Linklint uses this port for the HTTP site check.

       -password realm user:password
           Uses  user and password as authorization to enter password protected realm. Realms are named areas of
           a site that share a common set of usernames and passwords.  If passwords are  needed  to  check  your
           site,  Linklint  will tell you which realms need passwords in warning messages.  Enclose the realm in
           double quotes if it contains spaces.  If no password is given for a specific realm, Linklint will try
           using the password for the "DEFAULT" realm if it was provided.

       -timeout t
           Times out after t seconds (default 15) when getting files  via  http.   Once  data  is  received,  an
           additional  t seconds is allowed.  The timeout is disabled on Windows machines since the Windows port
           of Perl does not support the "alarm()" function.

       -delay d
           Delays d seconds between requests to the same host (default 0).  This  is  a  friendly  thing  to  do
           especially if you are checking many links on the same host.

       -local linkset
           Gets  files  that match linkset locally.  The default -local linkset is @.map (which matches any link
           ending in .map).  This allows Linklint to follow links through server-side image maps.   The  default
           is  ignored  if you specify your own -local expressions.  You need to specify the -root directory for
           this option to work properly.

       -map /a=[/b]
           Substitutes leading /a with /b.  Useful for server-side image maps or to simulate Server Redirection.

       -no_anchors
           Tells "Linklint" to ignore named anchors.

       -no_query_string
           Up until version 2.3.4, Linklint did not use query strings  while  doing  HTTP  site  checks.   Query
           strings were removed before making HTTP requests.  As of 2.3.4 query strings in links are used in the
           requests.  Use the -no_query_string flag to get back the "old" behavior.

       -http_header Name:value
           Adds  the  HTTP  header Name: value to all HTTP requests generated by Linklint.  You will need to use
           quotation marks to hide spaces in the header line from the command line  interpreter.  Linklint  will
           automatically add a space after the first colon if there is not one there already.  Multiple (unique)
           header lines are allowed.

       -language zz
           This  option  is  only useful if you are checking a site that uses content negotiation to present the
           same URL in different languages.

           Creates an HTTP Request header of the form Accept-Language: zz that is included as part of  all  HTTP
           requests  generated by Linklint.  Multiple -language specifications are allowed.  This will result in
           a single Accept-Language: header that lists all of the languages you have specified  in  alphabetical
           order.  Some web sites can use this information to return pages to you in a specific language.

           If  you  need  to get more complicated than this, use the more general purpose -http_header to create
           your own header.  There is a partial list of language abbreviations (taken from Debian)  included  as
           part of the Linklint documentation.
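
       The header handling described for -http_header and -language can be sketched as follows. This Python
       code only mirrors the two documented rules (a space is added after the first colon, and all languages
       end up sorted in one Accept-Language header). The ", " separator and the removal of duplicates are
       assumptions, and the function names are hypothetical.

```python
def normalize_header(header):
    """Ensure there is a space after the first colon of "Name:value"."""
    name, sep, value = header.partition(":")
    if not sep:
        raise ValueError("header must contain a colon: %r" % header)
    if not value.startswith(" "):
        value = " " + value
    return name + ":" + value


def accept_language(languages):
    """Combine all -language values into one header, alphabetically."""
    # Assumption: duplicates are dropped and entries are joined with ", ".
    return "Accept-Language: " + ", ".join(sorted(set(languages)))


print(normalize_header("X-Test:abc"))        # X-Test: abc
print(normalize_header("X-Test: abc"))       # X-Test: abc
print(accept_language(["fr", "de", "en"]))   # Accept-Language: de, en, fr
```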

   Remote URL Checking
       A  remote  URL  check is used to see if a remote URL exists (or has been recently modified). Links in the
       remote pages are not checked nor does Linklint look for named anchors in remote URLs.

       Remote URL checking can be used to check all of the "remote" links on your site (those that link to pages
       on other sites) or it can check a list of URLs. There are several ways to specify which  remote  URLs  to
       check:

            linklint http://somehost/file.html

       Checks  to  see if /file.html exists on somehost. Multiple URLs can be entered on the command line, in an
       @commandfile, or in an @@httpfile.  Every URL to be checked must begin with "http://". This will  disable
       site checking.

            linklint @@httpfile

       Checks  all  the  remote  http  URLs  found  in httpfile. Anything in the file starting with "http://" is
       considered to be a URL. If the file looks like a remoteX.txt file generated by Linklint then  all  failed
       URLs will be cross referenced.
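
       The rule that anything starting with "http://" counts as a URL can be approximated in Python as shown
       below. The sketch assumes that a URL ends at the next whitespace character and that repeated URLs are
       kept only once; neither detail is stated above, and the sample text is invented.

```python
import re

URL_RE = re.compile(r"http://\S+")


def extract_urls(text):
    """Return every http:// URL in text, in order, without repeats."""
    urls = []
    for url in URL_RE.findall(text):
        if url not in urls:
            urls.append(url)
    return urls


sample = """http://somehost/file.html
    used in /index.html
see http://example.org/a and https://example.org/b
http://somehost/file.html
"""
print(extract_urls(sample))
# ['http://somehost/file.html', 'http://example.org/a']
```

       Note that the https:// link in the sample is not picked up, in line with the requirement that every
       URL to be checked must begin with "http://".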

            linklint @@ -doc linkdoc

       Assuming  you have already done a site check and used -doc linkdoc to put all of your output files in the
       linkdoc directory, Linklint will check all the remote links that  were  found  on  your  site  and  cross
       reference  all failed URLs without doing a site check. You can use the -netmod or -netset flags to enable
       the status-cache.

            linklint -net [site check options]

       The -net flag tells Linklint to check all remote links after doing either a local or an HTTP site check.
       If you are having memory problems, don't use the -net option; instead, use one of the @@ options above.

   Other Remote URL Options
       -timeout t
           Times out after t seconds (default 15) when getting files  via  http.   Once  data  is  received,  an
           additional  t seconds is allowed.  The timeout is disabled on Windows machines since the Windows port
           of Perl does not support the "alarm()" function.

       -delay d
           Delays d seconds between requests to the same host (default 0).  This  is  a  friendly  thing  to  do
           especially if you are checking many links on the same host.

       -redirect
           Checks  for  <meta>  redirects  in the headers of remote  URLs that are html files.  If a redirect is
           found it is followed.  This feature is disabled if the status cache is used.

       -proxy hostname[:port]
           Sends all remote HTTP requests through the proxy server hostname and the optional port.  This  allows
           you to check remote URLs or (new with version 2.3.1) your entire site from within a firewall that has
           an  http  proxy server.  Some error messages (relating to host errors) may not be available through a
           proxy server.

       -concise_url
           Turns off printing successful URLs to STDOUT during remote link checking.

   Status Cache Options
       The Status Cache is a very powerful feature. It allows you to keep track of recent changes in all of  the
       remote  (off-site) pages you link to. You can then use the Linklint output files to quickly check changed
       pages to see if they still meet your needs.

       The flags below make use of the status cache file linklint.url (kept in your HOME or LINKLINT directory).
       This file keeps track of the modification dates of all the remote URLs that you check.

       -netmod
           Operates just like -net but makes use of the status cache.  Newly checked URLs will be entered in the
           cache.  Linklint will tell you which (previously cached) URLs  have  been  modified  since  the  last
           -netset.

       -netset
           Like  -netmod but also resets the last modified status in the cache for all URLs that checked ok.  If
           you always use -netset, modified URLs will be reported just once.

       -retry
           Only checks URLs that have a host fail status in the cache.  Sometimes a URL fails because  its  host
           is  temporarily down.  This flag enables you to recheck just those links.  An easy way to recheck all
           the cached URLs with host failures is "linklint  @@  -retry".   Use  "linklint  @@linkdoc/remoteX.txt
           -retry" if you want failed URLs to be cross referenced.

       -flush
           Removes  all URLs from the cache that are not currently being checked.  The -retry flag has no effect
           on which URLs are flushed.

       -checksum
           Ensures that every URL that has been modified is reported as such.  This flag  can  make  the  remote
           checking  take  longer.  Many of the pages that require a checksum are dynamically generated and will
           always be reported as modified.

       -cache directory
           Reads and writes the linklint.url cache file in this directory.  The default directory is set by your
           LINKLINT or HOME environment variables.
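
       The difference between -netmod and -netset is easiest to see in a small model. The Python sketch below
       keeps the cache as a plain dictionary from URL to the last-modified value recorded at the last reset.
       This is an assumption made for illustration: the real layout of linklint.url is not described here,
       and the name check_remote is hypothetical.

```python
def check_remote(cache, url, last_modified, reset=False):
    """Classify one URL that checked ok as "new", "modified" or "unchanged".

    reset=False models -netmod; reset=True models -netset.
    """
    if url not in cache:
        cache[url] = last_modified       # newly checked URLs enter the cache
        return "new"
    status = "modified" if cache[url] != last_modified else "unchanged"
    if reset:
        cache[url] = last_modified       # -netset resets the recorded status
    return status


cache = {}
url = "http://somehost/file.html"
print(check_remote(cache, url, "Mon"))               # new
print(check_remote(cache, url, "Tue"))               # modified
print(check_remote(cache, url, "Tue"))               # modified (no reset yet)
print(check_remote(cache, url, "Tue", reset=True))   # modified, then reset
print(check_remote(cache, url, "Tue", reset=True))   # unchanged
```

       In this model a change keeps being reported under -netmod until a -netset run resets it, which matches
       the statement that modified URLs are reported just once if you always use -netset.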

   Output Options
       No output files are generated by default; only progress and a brief summary of the results are printed to
       the screen. You can produce complete documentation (split up into separate files) in a -doc directory, or
       put selected output in a single file, either with -out file or by redirecting the standard output. See the
       Output File Specification section for a detailed description of all output files.

   Multi File Output
       -doc linkdoc
           Sends all output to the linkdoc directory.  The output is divided into separate .txt and .html files.
           Complete documentation is always produced regardless of the single file flags.

           The  file  index.txt  contains  an index to all the other files; index.html is an HTML version of the
           index.  The index files for remote URL checking are url_index.txt and url_index.html.

       -textonly
           Prevents any HTML files from being created in the -doc directory.

       -htmlonly
           Erases redundant text files in the -doc directory after they have been used to create the HTML output
           files.  The files remote.txt and remoteX.txt are not erased since they can be  used  by  Linklint  to
           recheck remote URLs.

       -docbase base
           Overrides  the  default  base  expression used for directing a browser to the resources listed in the
           output HTML files.  The base is prepended to local links in the output HTML files.  This only affects
           the links in HTML output files; it has no effect on what is displayed in these files.  Ordinarily
           this flag would only be used during a local site check to set the base to "http://host".

       -output_frames
           All HTML output data files are linked to from index.html.  If you use this flag, the data files will
           be opened in a new frame (window), which can be handy since it always leaves the index.html file
           open in its own window.

       -output_index filename
           The  output  index  files  were previously named linklint.txt and linklint.html.  These have now been
           changed to index.txt and index.html.  You can use the -output_index option to change this  name  back
           to "linklint" or to something else.

       -url_doc_prefix url/
           By default, the output files associated with remote URL checking all start with "url".  You can change
           this  with  the  -url_doc_prefix  option.   If  the  url_doc_prefix contains a "/" character then the
           appropriate directory will be created (as a subdirectory of the -doc directory).

       -dont_output xxxx
           Don't create output files that contain "xxxx".  Can be repeated.  Example:

                   -dont_output "X$"

           will suppress the output of all cross reference files.
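
       The "X$" example suggests that the argument is treated as a regular expression. The Python sketch below
       shows that reading; it assumes the pattern is applied to the output name without its .txt or .html
       extension, which is a guess made so that "X$" can match remoteX. The function name kept_outputs is
       hypothetical.

```python
import re


def kept_outputs(names, patterns):
    """Return the output names that match none of the -dont_output patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if not any(c.search(n) for c in compiled)]


# Output names taken from this page, written without their extension.
names = ["index", "remote", "remoteX"]
print(kept_outputs(names, ["X$"]))            # ['index', 'remote']
print(kept_outputs(names, ["X$", "^index"]))  # ['remote']
```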

   Single File Output
       -error
           Lists missing files and other errors.

       -out file
           Sends list output and summary information to file.

       -list
           Lists all found files, links, directories etc.

       -warn
           Lists all warnings.

       -xref
           Adds cross references to the lists.

       -forward
           Sorts lists by referring file.

   Debug and other Flags
       -db1
           Debugs command line input and linkset expressions.

       -db2
           Prints the name of every file that gets checked (not just HTML files).

       -db3
           Debugs HTML parser, prints out tags and resulting links.

       -db4
           Debugs socket connection (kind of).

       -db5
           Not used.

       -db6
           Details last-modified status for remote URLs (requires -netset or -netmod).

       -db7
           Prints brief debug information while checking remote URLs.

       -db8
           Prints all http headers while checking remote URLs.

       -db9
           Generates random http errors.

       -version
           Gives version information.

       -help
           Lists a few simple examples of how to use Linklint.

       -help_all
           Lists all help (contained in program) including every input option.

       -quiet
           Disables printing progress to the screen.

       -silent
           Disables printing summaries to the screen.

AUTHOR

       Linklint is written by James B. Bowlin <jbowlin@linklint.org>.  This manual page  was  written  by  Denis
       Barbier  <barbier@debian.org>  for  the  Debian  system  (but  may be used by others) by cut'n'paste from
       original documentation written in HTML.

perl v5.36.0                                       2023-12-12                                        LINKLINT(1)