Ubuntu Manpage: pdsh - issue commands to groups of hosts in parallel

NAME

       pdsh - issue commands to groups of hosts in parallel

SYNOPSIS

       pdsh [options]... command

DESCRIPTION

pdsh is a variant of the rsh(1) command. Unlike rsh(1), which runs commands on a single remote host, pdsh
can run multiple remote commands in parallel. pdsh uses a "sliding window" (or fanout) of threads to
conserve resources on the initiating host while allowing some connections to time out.
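
The fanout idea can be illustrated with a short Python sketch. This is not pdsh's implementation: it assumes a thread pool is an acceptable stand-in for the sliding window, and `run_with_fanout` and `fake_remote` are hypothetical names introduced only for this example. The default of 32 mirrors the documented default of the -f option.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_with_fanout(hosts, run_one, fanout=32):
    """Run run_one(host) for every host with at most `fanout` in flight."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=fanout) as pool:
        # A worker picks up the next pending host as soon as it is free,
        # so the window "slides" across the host list.
        return dict(zip(hosts, pool.map(run_one, hosts)))

# Hypothetical stand-in for a remote command; it records peak concurrency.
_lock = threading.Lock()
_active = 0
peak = 0

def fake_remote(host):
    global _active, peak
    with _lock:
        _active += 1
        peak = max(peak, _active)
    time.sleep(0.01)  # pretend to wait on the remote host
    with _lock:
        _active -= 1
    return 0  # pretend exit status

results = run_with_fanout([f"foo{i}" for i in range(10)], fake_remote, fanout=3)
```

With a fanout of 3, no more than three of the ten simulated hosts are contacted at any one time, which is the resource-limiting behavior described above.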

When pdsh receives SIGINT (ctrl-C), it lists the status of current threads. A second SIGINT within one
second terminates the program. Pending threads may be canceled by issuing ctrl-Z within one second of
ctrl-C. Pending threads are those that have not yet been initiated, or are still in the process of
connecting to the remote host.

If a remote command is not specified on the command line, pdsh runs interactively, prompting for commands
and executing them when terminated with a carriage return. In interactive mode, target nodes that time
out on the first command are not contacted for subsequent commands, and commands prefixed with an
exclamation point will be executed on the local system.

The core functionality of pdsh may be supplemented by dynamically loadable modules. The modules may
provide a new connection protocol (replacing the standard rcmd(3) protocol used by rsh(1)), filtering
options (e.g. removing hosts that are "down" from the target list), and/or host selection options (e.g.,
-a selects all hosts from a configuration file). By default, pdsh must have at least one "rcmd" module
loaded. See the RCMD MODULES section for more information.

RCMD MODULES

The method by which pdsh runs commands on remote hosts may be selected at runtime using the -R option
(See OPTIONS below). This functionality is ultimately implemented via dynamically loadable modules, and
so the list of available options may be different from installation to installation. A list of currently
available rcmd modules is printed when using any of the -h, -V, or -L options. The default rcmd module
will also be displayed with the -h and -V options.

A list of rcmd modules currently distributed with pdsh follows.

rsh Uses an internal, thread-safe implementation of BSD rcmd(3) to run commands using the standard
rsh(1) protocol.

exec Executes an arbitrary command for each target host. The first of the pdsh remote arguments is the
local command to execute, followed by any further arguments. Some simple parameters are
substituted on the command line, including %h for the target hostname, %u for the remote username,
and %n for the remote rank [0-n] (to get a literal %, use %%). For example, the following would
have the same effect as using the ssh module to run hostname(1) across the hosts foo[0-10]:

pdsh -R exec -w foo[0-10] ssh -x -l %u %h hostname

and this command line would run grep(1) in parallel across the files console.foo[0-10]:

pdsh -R exec -w foo[0-10] grep BUG console.%h

ssh Uses a variant of popen(3) to run multiple copies of the ssh(1) command.

mrsh This module uses the mrsh(1) protocol to execute jobs on remote hosts. The mrsh protocol uses a
credential based authentication, forgoing the need to allocate reserved ports. In other aspects,
it acts just like rsh. Remote nodes must be running mrshd(8) in order for the mrsh module to
work.

krb4 The krb4 module allows users to execute remote commands after authenticating with Kerberos. Of
course, the remote rshd daemons must be kerberized.

xcpu The xcpu module uses the xcpu service to execute remote commands.
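
The %h, %u, %n, and %% substitutions described for the exec module can be sketched in Python as follows. This is an illustration under stated assumptions, not pdsh's own code: `substitute` and `build_command` are hypothetical helpers, and leaving unrecognized sequences untouched is an assumption of the sketch.

```python
import re

def substitute(arg, host, user, rank):
    """Expand %h, %u, %n and %% in one command-line argument."""
    table = {"h": host, "u": user, "n": str(rank), "%": "%"}
    # Only the four documented sequences are replaced; anything else
    # (for example %z) is left as-is, which is an assumption here.
    return re.sub(r"%([hun%])", lambda m: table[m.group(1)], arg)

def build_command(template, host, user, rank):
    """Return the argument list to run locally for one target host."""
    return [substitute(a, host, user, rank) for a in template]

template = ["ssh", "-x", "-l", "%u", "%h", "hostname"]
cmd = build_command(template, "foo3", "alice", 3)
grep_arg = substitute("console.%h", "foo0", "alice", 0)
```

Here `cmd` is the per-host command for the first exec example above, and `grep_arg` is the file name that the second example would pass to grep(1) for host foo0.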

OPTIONS

       The list of available options is determined at runtime by supplementing the list of standard pdsh options
       with any options provided by loaded rcmd and misc modules.  In some cases, options  provided  by  modules
       may  conflict  with  each other. In these cases, the modules are incompatible and the first module loaded
       wins.

Standard target nodelist options

-w TARGETS,...
Target and/or filter the specified list of hosts. Do not use with any other node selection options
(e.g. -a, -g, if they are available). No spaces are allowed in the comma-separated list.
Arguments in the TARGETS list may include normal host names, a range of hosts in hostlist format
(See HOSTLIST EXPRESSIONS), or a single `-' character to read the list of hosts on stdin.

If a host or hostlist is preceded by a `-' character, this causes those hosts to be explicitly
excluded. If the argument is preceded by a single `^' character, it is taken to be the path to a
file containing a list of hosts, one per line. If the item begins with a `/' character, it is
taken as a regular expression on which to filter the list of hosts (a regex argument may also be
optionally trailed by another '/', e.g. /node.*/). A regex or file name argument may also be
preceded by a minus `-' to exclude instead of include those hosts.

A list of hosts may also be preceded by "user@" to specify a remote username other than the
default, or "rcmd_type:" to specify an alternate rcmd connection type for these hosts. When used
together, the rcmd type must be specified first, e.g. "ssh:user1@host0" would use ssh to connect
to host0 as user "user1."

-x host,host,...
Exclude the specified hosts. May be specified in conjunction with other target node list options
such as -a and -g (when available). Hostlists may also be specified to the -x option (see the
HOSTLIST EXPRESSIONS section below). Arguments to -x may also be preceded by the filename (`^')
and regex ('/') characters as described above, in which case the resulting hosts are excluded as
if they had been given to -w and preceded with the minus `-' character.
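
The "rcmd_type:user@hosts" prefix syntax accepted by -w can be sketched as a small Python parser. This is a simplified illustration, not pdsh's parser: `parse_target` is a hypothetical helper, and it assumes the host part contains neither ':' nor '@'.

```python
def parse_target(spec):
    """Split 'rcmd_type:user@hosts' into its parts (None when absent)."""
    rcmd_type = user = None
    # The rcmd type, when present, comes first and ends at the first ':'.
    if ":" in spec:
        rcmd_type, spec = spec.split(":", 1)
    # The remote username, when present, ends at the first '@'.
    if "@" in spec:
        user, spec = spec.split("@", 1)
    return rcmd_type, user, spec

full = parse_target("ssh:user1@host0")
user_only = parse_target("user1@foo[0-3]")
bare = parse_target("host0")
```

For "ssh:user1@host0" the sketch yields the rcmd type ssh, the user user1, and the host host0, matching the example given for -w.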

Standard pdsh options

-S Return the largest of the remote command return values.

-h Output usage menu and quit. A list of available rcmd modules will also be printed at the end of
the usage message.

-s Only on AIX, separate remote command stderr and stdout into two sockets.

-q List option values and the target nodelist and exit without action.

-b Disable the ctrl-C status feature so that a single ctrl-C kills the parallel job (batch mode).

-l user
This option may be used to run remote commands as another user, subject to authorization. For BSD
rcmd, this means the invoking user and system must be listed in the user's .rhosts file (even for
root).

-t seconds
Set the connect timeout. Default is 10 seconds. This option may also be set via the
PDSH_CONNECT_TIMEOUT environment variable.

-u seconds
Set a limit on the amount of time a remote command is allowed to execute. Default is no limit.
See note in LIMITATIONS if using -u with ssh. This option may also be set via the
PDSH_COMMAND_TIMEOUT environment variable.

-f number
Set the maximum number of simultaneous remote commands to number. The default is 32.

-R name
Set rcmd module to name. This option may also be set via the PDSH_RCMD_TYPE environment variable.
A list of available rcmd modules may be obtained via the -h, -V, or -L options. The default will
be listed with -h or -V.

-M name,...
When multiple misc modules provide the same options to pdsh, the first module initialized "wins"
and subsequent modules are not loaded. The -M option allows a list of modules to be specified
that will be force-initialized before all others, in effect ensuring that they load without
conflict (unless they conflict with each other). This option may also be set via the
PDSH_MISC_MODULES environment variable.

-L List info on all loaded pdsh modules and quit.

-N Disable hostname: prefix on lines of output.

-d Include more complete thread status when SIGINT is received, and display connect and command time
statistics on stderr when done.

-V Output pdsh version information, along with list of currently loaded modules, and exit.

machines module options

       -a     Target all nodes from machines file.

genders module options

In addition to the genders options presented below, the genders attribute pdsh_rcmd_type may also be used
in the genders database to specify an rcmd connect type other than the pdsh default for hosts with
this attribute. For example, the following line in the genders file

host0 pdsh_rcmd_type=ssh

would cause pdsh to use ssh to connect to host0, even if rsh were the default. This can be overridden on
the commandline with the "rcmd_type:host0" syntax.

-A Target all nodes in genders database. The -A option will target every host listed in genders -- if
you want to omit some hosts by default, see the -a option below.

-a Target all nodes in genders database except those with the "pdsh_all_skip" attribute. This is
shorthand for running "pdsh -A -X pdsh_all_skip ..."

-g attr[=val][,attr[=val],...]
Target nodes that match any of the specified genders attributes (with optional values). Conflicts
with the -a option. If used in combination with other node selection options like -w, the -g
option will select from the supplied node list, instead of from the genders file as a whole.
Otherwise, this option targets the alternate hostnames in the genders database by default. The -i
option provided by the genders module may be used to translate these to the canonical genders
hostnames. If the installed version of genders supports it, attributes supplied to -g may also
take the form of genders queries. Genders queries will query the genders database for the union,
intersection, difference, or complement of genders attributes and values. The set operation union
is represented by two pipe symbols ('||'), intersection by two ampersand symbols ('&&'),
difference by two minus symbols ('--'), and complement by a tilde ('~'). Parentheses may be used
to change the order of operations. See the nodeattr(1) manpage for examples of genders queries.

-X attr[=val][,attr[=val],...]
Exclude nodes that match any of the specified genders attributes (optionally with values). This
option may be used in combination with any other of the node selection options (e.g. -w, -g, -a).
Arguments to -X may also take the form of genders queries. Please see documentation for the genders -g option
for more information about genders queries.

-i Request translation between canonical and alternate hostnames.

-F filename
Read genders information from filename instead of the system default genders file. If filename
doesn't specify an absolute path then it is taken to be relative to the directory specified by the
PDSH_GENDERS_DIR environment variable (/etc by default). An alternate genders file may also be
specified via the PDSH_GENDERS_FILE environment variable.
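
The set operations behind genders queries (see -g above) can be illustrated with Python sets. The attribute names and host names below are invented for the example, and the sketch does not parse query strings; it only shows what each operator selects.

```python
# Hypothetical genders data: attribute -> set of hosts that carry it.
attrs = {
    "compute": {"foo1", "foo2", "foo3"},
    "login": {"foo0"},
    "gpu": {"foo2", "foo3"},
}
all_nodes = set().union(*attrs.values())

union = attrs["compute"] | attrs["login"]        # compute||login
intersection = attrs["compute"] & attrs["gpu"]   # compute&&gpu
difference = attrs["compute"] - attrs["gpu"]     # compute--gpu
complement = all_nodes - attrs["login"]          # ~login
```

In this sketch the complement is taken relative to every node in the (hypothetical) database, so `~login` selects all hosts that lack the login attribute.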

nodeupdown module options

       -v     Eliminate target nodes that are considered "down" by libnodeupdown.

slurm module options

       The  slurm  module allows pdsh to target nodes based on currently running SLURM jobs. The slurm module is
       typically called after all other node selection options have been processed, and if no  nodes  have  been
       selected,  the  module  will  attempt  to  read a running jobid from the SLURM_JOBID environment variable
       (which is set when running under a SLURM allocation). If SLURM_JOBID references an invalid job,  it  will
       be silently ignored.

       -j jobid[,jobid,...]
              Target  list  of nodes allocated to the SLURM job jobid. This option may be used multiple times to
              target multiple SLURM jobs. The special argument "all" can be used to  target  all  nodes  running
              SLURM jobs, e.g.  -j all.

       -P partition[,partition,...]
              Target list of nodes contained in the SLURM partition partition. This option may be used
              multiple times to target multiple SLURM partitions and/or partitions may  be  given  in  a  comma-
              delimited list.

torque module options

       The  torque module allows pdsh to target nodes based on currently running Torque/PBS jobs. Similar to the
       slurm module, the torque module is typically called after all other  node  selection  options  have  been
       processed,  and  if no nodes have been selected, the module will attempt to read a running jobid from the
       PBS_JOBID environment variable (which is set when running under a Torque allocation).

       -j jobid[,jobid,...]
              Target list of nodes allocated to the Torque job jobid. This option may be used multiple times  to
              target multiple Torque jobs.

dshgroup module options

       The  dshgroup module allows pdsh to use dsh (or Dancer's shell) style group files from /etc/dsh/group/ or
       ~/.dsh/group/. The default search path may be overridden with the DSHGROUP_PATH environment  variable,  a
       colon-separated list of directories to search. The default value for DSHGROUP_PATH is /etc/dsh/group.

       -g groupname,...
              Target   nodes   in   dsh  group  file  "groupname"  found  in  either  ~/.dsh/group/groupname  or
              /etc/dsh/group/groupname.

       -X groupname,...
              Exclude nodes in dsh group file "groupname."

       As an enhancement in pdsh, dshgroup files may optionally include  other  dshgroup  files  via  a  special
       #include  STRING  syntax.   The argument to #include may be either a file path, or a group name, in which
       case the path used to search for the group file is the same as if the group had been specified to -g.

netgroup module options

       The netgroup module allows pdsh to use standard netgroup entries (from /etc/netgroup or NIS) to build
       lists of target hosts.

       -g groupname,...
              Target nodes in netgroup "groupname."

       -X groupname,...
              Exclude nodes in netgroup "groupname."

ENVIRONMENT VARIABLES

PDSH_RCMD_TYPE
Equivalent to the -R option, the value of this environment variable will be used to set the
default rcmd module for pdsh to use (e.g. ssh, rsh).

PDSH_SSH_ARGS
Override the standard arguments that pdsh passes to the ssh(1) command ("-2 -a -x -l%u %h"). The
use of the parameters %u, %h, and %n (as documented in the rcmd/exec section above) is optional.
If these parameters are missing, pdsh will append them to the ssh commandline because it is
assumed they are mandatory.

PDSH_SSH_ARGS_APPEND
Append additional options to the ssh(1) command invoked by pdsh. For example,
PDSH_SSH_ARGS_APPEND="-q" would run ssh in quiet mode, or "-v" would increase the verbosity of
ssh. (Note: these arguments are actually prepended to the ssh commandline to ensure they appear
before any target hostname argument to ssh.)

WCOLL If no other node selection option is used, the WCOLL environment variable may be set to a filename
from which a list of target hosts will be read. The file should contain a list of hosts, one per
line, though each line may contain a hostlist expression (see the HOSTLIST EXPRESSIONS section
below).

DSHPATH
If set, the path in DSHPATH will be used as the PATH for the remote processes.

FANOUT Set the pdsh fanout (See description of -f above).

HOSTLIST EXPRESSIONS

       As noted in sections above, pdsh accepts lists of hosts in the general form: prefix[n-m,l-k,...], where n < m
       and l < k, etc., as an alternative to explicit lists of hosts. This form  should  not  be  confused  with
       regular expression character classes (also denoted by ``[]''). For example, foo[19] does not represent an
       expression matching foo1 or foo9, but rather represents the degenerate hostlist: foo19.

       The  hostlist  syntax is meant only as a convenience on clusters with a "prefixNNN" naming convention and
       specification of ranges should not be considered necessary -- the list foo1,foo9 could  be  specified  as
       such, or by the hostlist foo[1,9].

       Some examples of usage follow:

       Run command on foo01,foo02,...,foo05
           pdsh -w foo[01-05] command

       Run command on foo7,foo9,foo10
            pdsh -w foo[7,9-10] command

       Run command on foo0,foo4,foo5
            pdsh -w foo[0-5] -x foo[1-3] command

       A suffix on the hostname is also supported:

       Run command on foo0-eth0,foo1-eth0,foo2-eth0,foo3-eth0
          pdsh -w foo[0-3]-eth0 command

       As  a  reminder  to  the  reader, some shells will interpret brackets ('[' and ']') for pattern matching.
       Depending on your shell, it may be necessary to enclose ranged lists  within  quotes.   For  example,  in
       tcsh, the first example above should be executed as:

            pdsh -w "foo[01-05]" command
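
The expansion rules above can be sketched in Python. This is a minimal illustration, not the hostlist library that pdsh uses: `expand_hostlist` is a hypothetical helper that handles a single expression with one bracketed range list, and it assumes zero padding follows the width of the lower bound.

```python
import re

def expand_hostlist(expr):
    """Expand one 'prefix[n-m,l-k,...]suffix' expression into host names."""
    m = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", expr)
    if m is None:
        return [expr]  # plain host name, nothing to expand
    prefix, ranges, suffix = m.groups()
    hosts = []
    for part in ranges.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo    # a single number such as "7"
        width = len(lo)  # keep zero padding: 01 -> 01, 02, ...
        for i in range(int(lo), int(hi) + 1):
            hosts.append(f"{prefix}{i:0{width}d}{suffix}")
    return hosts

padded = expand_hostlist("foo[01-05]")
mixed = expand_hostlist("foo[7,9-10]")
degenerate = expand_hostlist("foo[19]")
suffixed = expand_hostlist("foo[0-3]-eth0")

# Equivalent of "-w foo[0-5] -x foo[1-3]": expand both, then subtract.
excluded = set(expand_hostlist("foo[1-3]"))
remaining = [h for h in expand_hostlist("foo[0-5]") if h not in excluded]
```

Note that `foo[19]` expands to the single host foo19, not to foo1 and foo9, which is the distinction from regular expression character classes drawn above.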

ORIGIN

       Originally  a  rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov> on LLNL's ASCI Blue-Pacific IBM SP
       system. It is now used on Linux clusters at LLNL.

LIMITATIONS

       When using ssh for remote execution, expect the stderr of ssh to be folded in with that of the remote
       command. When invoked by pdsh, ssh cannot prompt for passwords, for example when RSA/DSA keys are not
       configured properly. For ssh implementations that support a connect timeout option, pdsh attempts
       to use that option to enforce the timeout (e.g. -oConnectTimeout=T for OpenSSH); otherwise, connect
       timeouts are not supported when using ssh. Finally, there is no reliable way for pdsh to ensure that
       remote commands are actually terminated when using a command timeout. Thus, if -u is used with ssh,
       commands may be left running on remote hosts even after the timeout has killed the local ssh processes.

       The  number of nodes that pdsh can simultaneously execute remote jobs on is limited by the maximum number
       of threads that can be created concurrently, as well as the availability of reserved  ports  in  the  rsh
       module. On systems that implement POSIX threads, the limit is typically defined by the constant
       PTHREAD_THREADS_MAX.