Provided by: inn2_2.7.2~20240212-1build3_amd64

NAME

       pullnews - Pull news from multiple news servers and feed it to another

SYNOPSIS

       pullnews [-BhnOqRx] [-a hashfeed] [-b fraction] [-c config] [-C width] [-d level] [-f fraction] [-F
       fakehop] [-g groups] [-G newsgroups] [-H headers] [-k checkpt] [-l logfile] [-L size] [-m header_pats]
       [-M num] [-N timeout] [-p port] [-P hop_limit] [-Q level] [-r file] [-s to-server[:port][_tlsmode]] [-S
       max-run] [-t retries] [-T connect-pause] [-w num] [-z article-pause] [-Z group-pause] [from-server ...]

REQUIREMENTS

       The "Net::NNTP" module must be installed.  This module is available as part of the libnet distribution
       and comes with recent versions of Perl.  For older versions of Perl, you can download it from
       <http://www.cpan.org/>.

DESCRIPTION

       pullnews reads a config file named pullnews.marks, and connects to the upstream servers given there as a
       reader client.  It looks for this file in pathdb when pullnews is run as the user set in runasuser in
       inn.conf (which is by default the "news" user); otherwise, it looks for this file in the running user's
       home directory.

       By default, pullnews connects to all servers listed in the configuration file, but you can limit pullnews
       to specific servers by listing them on the command line as a whitespace-separated list of server names
       (the from-server arguments in the synopsis).  For each server it connects to, it pulls over articles
       and feeds them to the destination server via the IHAVE or POST commands.  This means that the system
       pullnews is run on must have feeding access to the destination news server.

       pullnews is designed for very small sites that do not want to bother setting up traditional peering and
       is not meant for handling large feeds.

       If you have peers of your own and do not want to propagate to them the articles you pull from
       upstream servers, add a fake hop with the -F flag to all the pulled articles, and add that same
       fake hop to the exclusion sub-field of every site in your newsfeeds file which should
       not receive these articles.  For example, using "pullnews -F myserverimported", change
       "sitename:*:Tm:innfeed!"  to "sitename/myserverimported:*:Tm:innfeed!" for every sitename in newsfeeds
       you don't want to feed the pulled articles to (like your outgoing peers and a possible "inpaths!" entry).
       Entries like "ME", "controlchan!", "innfeed!" or "nocem!" do not need that exclusion.

OPTIONS

       -a hashfeed
           This option is a deterministic way to control the flow of articles and to split a feed.  The hashfeed
           parameter must be in the form "value/mod" or "start-end/mod".  The Message-ID of each article is
           hashed using MD5, which results in a 128-bit hash.  The lowest 32 bits are then taken by default as
           the hashfeed value (which is an integer).  If the hashfeed value modulo "mod", plus one, equals
           "value" or lies between "start" and "end" inclusive, pullnews will feed the article.  All these
           numbers must be integers.

           For instance:

               pullnews -a 1/2      Feeds about 50% of all articles.
               pullnews -a 2/2      Feeds the other 50% of all articles.

           Another example:

               pullnews -a 1-3/10   Feeds about 30% of all articles.
               pullnews -a 4-5/10   Feeds about 20% of all articles.
               pullnews -a 6-10/10  Feeds about 50% of all articles.

           You  can  use  an  extended syntax of the form "value/mod:offset" or "start-end/mod:offset" (using an
           underscore "_" instead of a colon ":" is also recognized).  As MD5 generates a 128-bit return  value,
           it  is  possible  to  specify from which byte-offset the 32-bit integer used by hashfeed starts.  The
           default value for "offset" is ":0" and thirteen overlapping values from ":0" to ":12"  can  be  used.
           Only up to four totally independent values exist: ":0", ":4", ":8" and ":12".

           This allows a second level of deterministic distribution.  For instance, if pullnews
           feeds "1/2", that half can be split further with "1-3/9:4".  Up to four levels of
           deterministic distribution can be used.

           The algorithm is compatible with the one used by Diablo 5.1 and up.
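
           The matching rule described above can be sketched in Python.  pullnews itself is a Perl script,
           so this is only an illustration and not its actual code.  The function names are hypothetical,
           and the way the 32-bit value is cut out of the MD5 digest (big-endian, with the offset counted
           in bytes from the low-order end) is an assumption; only the bucket arithmetic is taken from the
           description above.

```python
import hashlib
import re

# value/mod or start-end/mod, optionally followed by :offset or _offset
SPEC_RE = re.compile(r"^(\d+)(?:-(\d+))?/(\d+)(?:[:_](\d+))?$")


def parse_hashfeed(spec):
    """Split a hashfeed parameter into (start, end, mod, offset)."""
    m = SPEC_RE.match(spec)
    if m is None:
        raise ValueError(f"bad hashfeed parameter: {spec!r}")
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) else start
    mod = int(m.group(3))
    offset = int(m.group(4)) if m.group(4) else 0
    if mod == 0 or not 0 <= offset <= 12:
        raise ValueError(f"bad hashfeed parameter: {spec!r}")
    return start, end, mod, offset


def hashfeed_value(message_id, offset=0):
    """Take 32 bits of the MD5 digest of the Message-ID.

    Assumption: the digest is read as a big-endian 128-bit number and
    offset counts bytes from its low-order end.
    """
    digest = hashlib.md5(message_id.encode("utf-8")).digest()
    low = 16 - offset
    return int.from_bytes(digest[low - 4:low], "big")


def would_feed(message_id, spec):
    """Apply the rule: (hashfeed value modulo mod) + 1 within start..end."""
    start, end, mod, offset = parse_hashfeed(spec)
    bucket = hashfeed_value(message_id, offset) % mod + 1
    return start <= bucket <= end
```

           Whatever the digest layout, each Message-ID falls into exactly one bucket from 1 to "mod", which
           is why "1/2" and "2/2" never both accept the same article.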

       -b fraction
           Backtrack on server numbering reset.  Specify the proportion ("0.0" to "1.0") of a group's articles
           to pull when the server's article number is less than the high water mark recorded for that group
           in the config file.  When fraction is "1.0", pull all the articles on a renumbered server.  The
           default is to do nothing.

       -B  Feed  is  header-only, that is to say pullnews only feeds the headers of the articles, plus one blank
           line.  It adds the Bytes header field if the article does not already have one, and  keeps  the  body
           only if the article is a control article.

       -c config
           Normally,  the  config  file  is  stored in pullnews.marks in pathdb when pullnews is run as the news
           user, or otherwise in the running user's home directory.  If -c is given, config will be used as  the
           config  file  instead.   This  is  useful if you're running pullnews as a system user on an automated
           basis out of cron or as an individual user, rather than the news user.

           See "CONFIG FILE" below for the format of this file.

       -C width
           Use width characters per line for the progress table.  The default value is "50".

       -d level
           Set the debugging level to the integer level (up to "4"); more debugging output  will  be  logged  as
           this increases.  The default value is "0".

       -f fraction
           This changes the proportion of articles to get from each group to fraction and should be in the range
           "0.0" to "1.0" ("1.0" being the default).

       -F fakehop
           Prepend fakehop as a host to the Path header field body of articles fed.

       -g groups
           Specify  a  collection  of  groups  to get.  groups is a list of newsgroups separated by commas (only
           commas, no spaces).  Each group must be defined in the config file, and only the  remote  hosts  that
           carry  those  groups  will  be  contacted.   Note that this is a simple list of groups, not a wildmat
           expression, and wildcards are not supported.

       -G newsgroups
           Add the comma-separated list of groups newsgroups to each server in the configuration file (see  also
           -g and -w).

       -h  Print a usage message and exit.

       -H headers
           Remove these named header fields (colon-separated list) from fed articles.

       -k checkpt
           Checkpoint  (save)  the config file every checkpt articles (default is "0", that is to say at the end
           of the session).

       -l logfile
           Log progress/stats to logfile (default is "stdout").

       -L size
           Specify the largest wanted article size in bytes.  The default is to download all articles, whatever
           their size.  When this option is used, pullnews will first retrieve overview data (if available) of
           each newsgroup to process so as to obtain article sizes, before deciding which articles to actually
           download.

       -m header_pats
           Feed an article based on header field body matching.  The argument is a number of
           whitespace-separated tuples (each tuple being a colon-separated header field name and regular
           expression).  For instance:

               -m "Hdr1:regexp1 !Hdr2:regexp2 #Hdr3:regexp3 !#Hdr4:regexp4"

           specifies that the article will be passed only if the "Hdr1" header field body matches "regexp1" and
           the "Hdr2" header field body does not match "regexp2".  In addition, if the "Hdr3" header field body
           matches "regexp3", that header field is removed; and if the "Hdr4" header field body does not match
           "regexp4", that header field is removed.

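           The four tuple forms accepted by -m can be illustrated with a small Python sketch.  It is not
           the code of pullnews: the function names are hypothetical, Python regular expressions stand in
           for Perl ones, and treating a missing header field as an empty body is an assumption.

```python
import re


def parse_header_pats(arg):
    """Parse '-m' style tuples into (name, regex, negate, strip) rules."""
    rules = []
    for item in arg.split():
        negate = item.startswith("!")      # "!" inverts the match
        if negate:
            item = item[1:]
        strip = item.startswith("#")       # "#" removes the field instead of filtering
        if strip:
            item = item[1:]
        name, _, pattern = item.partition(":")
        rules.append((name, re.compile(pattern), negate, strip))
    return rules


def apply_header_pats(headers, rules):
    """Return (feed, kept_headers) for a dict of header field bodies."""
    kept = dict(headers)
    feed = True
    for name, regex, negate, strip in rules:
        # Assumption: a missing header field is matched as an empty body.
        matched = regex.search(headers.get(name, "")) is not None
        hit = matched != negate
        if strip:
            if hit:
                kept.pop(name, None)
        elif not hit:
            feed = False
    return feed, kept
```

           For example, the rules parsed from "Subject:test !From:spammer" accept an article only when its
           Subject matches "test" and its From does not match "spammer".
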
       -M num
           Specify the maximum number of articles (per group) to process.  The default is  to  process  all  new
           articles.  See also -f.

       -n  Do nothing but read articles: do not feed articles downstream, write no rnews file, and do not
           update the config file.

       -N timeout
           Specify the timeout length, as timeout seconds, when establishing an NNTP connection.

       -O  Use an optimized mode: pullnews checks whether the article already exists on the  downstream  server,
           before downloading it.  It may help for huge articles or a slow link to upstream hosts.

       -p port
           Connect  to  the destination news server on a port other than the default of "119".  This option does
           not change the port used to connect to the source news servers.

       -P hop_limit
           Restrict feeding an article based on the number of hops it has already made.  Count the hops  in  the
           Path  header  field body (hop_count), feeding the article only when hop_limit is "+num" and hop_count
           is more than num; or hop_limit is "-num" and hop_count is less than num.
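
           The comparison can be sketched in Python as follows.  The function names are hypothetical, and
           counting every non-empty "!"-separated entry of the Path header field body as one hop is an
           assumption about how hop_count is obtained.

```python
def count_hops(path):
    """Count the '!'-separated entries in a Path header field body."""
    return len([hop for hop in path.split("!") if hop])


def passes_hop_limit(path, hop_limit):
    """Apply a '+num' or '-num' hop_limit to a Path header field body."""
    sign, num = hop_limit[0], int(hop_limit[1:])
    hops = count_hops(path)
    if sign == "+":
        return hops > num      # feed only articles with more than num hops
    if sign == "-":
        return hops < num      # feed only articles with fewer than num hops
    raise ValueError(f"hop_limit must start with + or -: {hop_limit!r}")
```

           Under that assumption, a path of "a!b!c" has three hops, so it passes "+2" and "-4" but not "-3".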

       -q  Print out less status information while running.

       -Q level
           Set the quietness level ("-Q 2" is equivalent to "-q").  The higher this value, the less gets logged.
           The default is "0".

       -r file
           Rather than feeding the downloaded articles to a destination server, instead create a batch file that
           can later be fed to a server using rnews.  See rnews(1) for more information  about  the  batch  file
           format.

       -R  Be a reader (use MODE READER and POST commands) to the downstream server.  Some posts will then be
           rejected because of unexpected injection header fields, obsolete or incorrectly formatted header
           fields, or a date too far in the past.  You may then want to set artcutoff to "0" in inn.conf,
           and use the -H flag to strip unwanted header fields.  Even with that, a few articles may still be
           rejected.

           The default is to behave like a feeder and use the IHAVE command.  (You'll have to allow
           connections from pullnews in incoming.conf so that it is recognized as a feeder.)

       -s to-server[:port][_tlsmode]
           Normally, pullnews will feed the articles it retrieves to the news server running on localhost.  To
           connect to a different host, specify a server with the -s flag.  You can also specify the port with
           this same flag or use -p.  The default port is "119".

           The connection is by default unencrypted.  To negotiate a TLS encryption layer, you can  set  tlsmode
           to  "TLS" for implicit TLS (negotiated immediately upon connection on a dedicated port) or "STARTTLS"
           for explicit TLS (the appropriate command will be sent before authenticating  or  feeding  messages).
           Examples of use are:

               pullnews -s news.server.com
               pullnews -s news.server.com_STARTTLS
               pullnews -s news.server.com:433_TLS

           Note that not all NNTP servers implement TLS for feeding articles.

       -S max-run
           Specify the maximum time max-run in seconds for pullnews to run.

       -t retries
           The  maximum  number  (retries)  of  attempts  to connect to a server or reconnect to a server if the
           socket is unexpectedly closed (see also -T).  The default is "0".

       -T connect-pause
           Pause connect-pause seconds between connection retries (see also -t).  The default is "1".

       -w num
           Set each group's high water mark (last  received  article  number)  to  num.   If  num  is  negative,
           calculate  Current+num instead (i.e. get the last num articles).  Therefore, a num of "0" will re-get
           all articles on the server; whereas a num of "-0" will get no old articles, setting the water mark to
           Current (the most recent article on the server).
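
           The arithmetic can be sketched in Python.  The function name is hypothetical, and the argument
           is deliberately kept as a string, since "0" and "-0" mean different things here even though they
           are the same integer.

```python
def new_high_water_mark(num, current):
    """Compute the high water mark for '-w num'.

    num is the option argument as a string; current is the most recent
    article number on the server.
    """
    if num.startswith("-"):
        # Negative (including "-0"): count back from the current article.
        return current + int(num)
    return int(num)
```

           With a current article number of 783, "-10" gives 773, "-0" gives 783 and "0" gives 0.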

       -x  If the -x flag is used, an Xref header field is added to any article  that  lacks  one.   It  can  be
           useful for instance if articles are fed to a news server which has xrefslave set in inn.conf.

       -z article-pause
           Sleep article-pause seconds between articles.  The default is "0".

       -Z group-pause
           Sleep group-pause seconds between groups.  The default is "0".

CONFIG FILE

       The  config  file for pullnews is divided into blocks, one block for each remote server to connect to.  A
       block begins with the host line (which must have no leading whitespace) and contains just the hostname of
       the remote server with optional port and TLS mode (with the same semantics as the  -s  flag),  optionally
       followed  by  authentication  details  (username and password for that server).  Note that authentication
       details can also be provided for the downstream server (a host  line  for  "localhost"  or  the  hostname
       specified with the -s flag could be added for it in the configuration file, with no newsgroup to fetch).

       Following the host line should be one or more newsgroup lines which start with whitespace followed by the
       name of a newsgroup to retrieve.  Only one newsgroup should be listed on each line.

       pullnews  will  update  the  config  file  to include the time the group was last checked and the highest
       numbered article successfully retrieved and transferred to the destination server.  It uses this data  to
       avoid doing duplicate work the next time it runs.

       The full syntax is:

           <host>[:<port>][_<tlsmode>] [<username> <password>]
               <group> [<time> <high>]
               <group> [<time> <high>]

       where the <host> line must not have leading whitespace and the <group> lines must.

       A typical configuration file would be:

           # Format: group date high
           data.pa.vix.com
               rec.bicycles.racing 908086612 783
               rec.humor.funny 908086613 18
               comp.programming.threads
           nnrp.vix.com pull sekret
               comp.std.lisp
           news.server.com:563_TLS joe password
               news.software.nntp

       Note that an earlier run of pullnews has filled in details about the last articles downloaded from the two
       rec.* groups.  The two comp.* groups and the news.* group were just added by the user and have not yet
       been checked.

       The  nnrp.vix.com  server  requires  authentication,  and  pullnews  will use the username "pull" and the
       password "sekret" (without any encryption layer).

       The connection to news.server.com will be encrypted with implicit TLS on port 563.  Joe's password  won't
       be sent in plaintext.
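
       To make the syntax above concrete, here is a Python sketch that reads a file in this format.  It is
       not the parser used by pullnews: the names HOST_RE and parse_marks are hypothetical, and skipping
       blank lines and lines starting with "#" is an assumption based on the comment line in the example.

```python
import re

# <host>[:<port>][_<tlsmode>]
HOST_RE = re.compile(
    r"^(?P<host>[^\s:]+?)(?::(?P<port>\d+))?(?:_(?P<tls>TLS|STARTTLS))?$"
)


def parse_marks(text):
    """Parse pullnews.marks-style text into a list of server blocks."""
    servers = []
    for raw in text.splitlines():
        if not raw.strip() or raw.lstrip().startswith("#"):
            continue  # assumption: blank lines and comments are ignored
        fields = raw.split()
        if not raw[0].isspace():
            # Host line: no leading whitespace, optional username and password.
            m = HOST_RE.match(fields[0])
            if m is None:
                raise ValueError(f"bad host line: {raw!r}")
            has_auth = len(fields) >= 3
            servers.append({
                "host": m.group("host"),
                "port": int(m.group("port")) if m.group("port") else None,
                "tls": m.group("tls"),
                "username": fields[1] if has_auth else None,
                "password": fields[2] if has_auth else None,
                "groups": {},
            })
        else:
            # Group line: leading whitespace, optional <time> and <high>.
            if not servers:
                raise ValueError("group line before any host line")
            if len(fields) >= 3:
                marks = (int(fields[1]), int(fields[2]))
            else:
                marks = (None, None)
            servers[-1]["groups"][fields[0]] = marks
    return servers
```

       Applied to the typical configuration file shown above, this yields three server blocks; the third
       has port 563, TLS mode "TLS", and one group with no time or high water mark yet.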

FILES

       pathbin/pullnews
           The Perl script itself used to pull news from upstream servers and feed it to another news server.

       pathdb/pullnews.marks or ~/pullnews.marks
           The  default  config file.  It is stored in pullnews.marks in pathdb when pullnews is run as the news
           user, or otherwise in the running user's home directory.

HISTORY

       pullnews was written by James Brister for INN.  The documentation was rewritten in POD  by  Russ  Allbery
       <eagle@eyrie.org>.

       Geraint A. Edwards greatly improved pullnews, adding no fewer than 16 new recognized flags, fixing some
       bugs and integrating the backupfeed contrib script by Kai Henningsen, which added 6 more flags.

SEE ALSO

       incoming.conf(5), rnews(1).

INN 2.7.2                                          2024-03-31                                        PULLNEWS(1)