Provided by: nvidia-compute-utils-575-server_575.57.08-0ubuntu0.24.04.2_amd64 bug

NAME

       nvidia-cuda-mps-control - NVIDIA CUDA Multi Process Service management program

SYNOPSIS

       nvidia-cuda-mps-control [-d | -f]

DESCRIPTION

       MPS  is  a runtime service designed to let multiple MPI processes using CUDA to run concurrently in a way
       that's transparent to the MPI program.  A CUDA program runs in MPS mode if  the  MPS  control  daemon  is
       running on the system.

       When  CUDA  is  first  initialized  in  a program, the CUDA driver attempts to connect to the MPS control
       daemon. If the connection attempt fails, the program continues to run as it normally would  without  MPS.
       If  however,  the  connection  attempt  to the control daemon succeeds, the CUDA driver then requests the
       daemon to start an MPS server on its behalf. If there's an MPS server already running, and the user id of
       that server process matches that of the requesting client process, the control daemon simply notifies the
       client process of it, which then proceeds to connect to the server. If  there's  no  MPS  server  already
       running  on  the system, the control daemon launches an MPS server with the same user id (UID) as that of
       the requesting client process. If there's an MPS server already running, but with  a  different  user  id
       than  that  of the client process, the control daemon requests the existing server to shutdown as soon as
       all its clients are done. Once the existing server has terminated, the  control  daemon  launches  a  new
       server with the user id same as that of the queued client process.

       The  MPS  server  creates  the  shared GPU context, and manages its clients.  An MPS server can support a
       finite amount of CUDA contexts determined by the hardware architecture it  is  running  on.  For  compute
       capability SM 3.5 through SM 6.0 the limit is 16 clients per GPU at a time. Compute capability SM 7.0 has
       a  limit of 48. MPS is transparent to CUDA programs, with all the complexity of communication between the
       client process, the server and the control daemon hidden within the driver binaries.

       Currently, CUDA MPS is available on 64-bit Linux only, requires a device that  supports  Unified  Virtual
       Address  (UVA) and has compute capability SM 3.5 or higher.  Applications requiring pre-CUDA 4.0 APIs are
       not supported under CUDA MPS. Certain capabilities are only available starting with compute capability SM
       7.0.

       Refer to the MPS documentation on NVIDiA Docs for more details.

OPTIONS

   -d
       Start the MPS control daemon in background mode, assuming the user  has  enough  privilege  (e.g.  root).
       Parent process exits when control daemon started listening for client connections.

   -f
       Start  the MPS control daemon in foreground mode, assuming the user has enough privilege (e.g. root). The
       debug messages are sent to standard output.

   -multiuser-server
       Relax the single UID user requirement, allowing any UID to connect and share an  MPS  server  started  as
       root.

   -h, --help
       Print a help message.

   <no arguments>
       Start the front-end management user interface to the MPS control daemon, which needs to be started first.
       The  front-end  UI  keeps  reading  commands from stdin until EOF.  Commands are separated by the newline
       character. If an invalid command is issued and rejected, an error message will be printed to stdout.  The
       exit  status of the front-end UI is zero if communication with the daemon is successful. A non-zero value
       is returned if the daemon is not found or connection to the daemon is broken unexpectedly. See the "quit"
       command below for more information about the exit status.

       Commands supported by the MPS control daemon:

       get_server_list
              Print out a list of PIDs of all MPS servers.

       get_server_status PID
              Print out the status of the server with given (PID).

       start_server -uid UID
              Start a new MPS server for the specified user (UID).

       shutdown_server PID [-f]
              Shutdown the MPS server with given PID. MPS server only exits after all clients disconnect and the
              MPS server may accept new clients while there is a  connected  client.   -f  is  forced  immediate
              shutdown.  If  a  client  launches a faulty kernel that runs forever, a forced shutdown of the MPS
              server may be required, since the MPS server creates and issues GPU work on behalf of its clients.

       get_client_list PID
              Print out a list of PIDs of all clients connected to the MPS server with given PID.

       quit [-t TIMEOUT]
              Shutdown the MPS control daemon process  and  all  MPS  servers.  The  MPS  control  daemon  stops
              accepting  new clients while waiting for current MPS servers and MPS clients to finish. If TIMEOUT
              is specified (in seconds), the daemon will force MPS servers to shutdown if they are still running
              after TIMEOUT seconds.

              This command is synchronous. The front-end UI waits for the daemon to shutdown, then  returns  the
              daemon's exit status. The exit status is zero iff all MPS servers have exited gracefully.

       Commands available to Volta MPS control daemon:

       get_device_client_list PID
              List  the devices and PIDs of client applications that enumerated this device. It optionally takes
              the server instance PID.

       set_default_active_thread_percentage percentage
              Set the default active thread percentage for MPS servers. If there is already  a  server  spawned,
              this  command  will  only  affect  the  next  server.  The  set value is lost if a quit command is
              executed. The default is 100.

       get_default_active_thread_percentage
              Query the current default available thread percentage.

       set_active_thread_percentage PID percentage
              Set the active thread percentage for the MPS server instance of the given PID. All clients created
              with that server afterwards will observe the new limit. Existing clients are not affected.

       get_active_thread_percentage PID
              Query the current available thread percentage of the MPS server instance of the given PID.

       set_default_device_pinned_mem_limit dev value
              Sets the default device pinned memory limit for for the MPS servers. If there is already a  server
              spawned,  this  command  will  only  affect  the  next server. The value must be in the form of an
              integer followed by a qualifier, either “G” or “M”  that  specifies  the  value  in  Gigabytes  or
              Megabytes respectively.

       get_default_device_pinned_mem_limit dev
              Query the current default pinned memory limit for the device.

       set_device_pinned_mem_limit PID dev value
              Overrides  the  device  pinned  memory  limit for the MPS server of the given PID. All the clients
              created with that server afterwards will observe the new limit. Existing clients are not affected.

       get_default_device_pinned_mem_limit PID dev
              Query the current device pinned memory limit of the MPS server instance of the given PID  for  the
              device dev.

       terminate_client server PID client PID
              Terminates  all the outstanding GPU work of the MPS client process of the given client PID running
              on the MPS server denoted by <server PID>.

       ps [-p PID]
              Reports a snapshot of the current client processes.

       set_default_client_priority priority
              Set the client priority level that will  be  used  for  new  clients.  Priority  values  are  only
              considered  as  hints  to  the  CUDA driver and can be ignored or overriden depending on platform.
              priority follows a convention where  smaller  numbers  are  higher  priorities,  and  the  default
              priority  value  is 0. The only other supported value for priority is 1, which represents a below-
              normal priority level.

       get_default_client_priority
              Query the current priority value that will be used for new clients.

ENVIRONMENT

       CUDA_MPS_PIPE_DIRECTORY
              Specify the directory that contains the named pipes and UNIX domain sockets used for communication
              among the MPS control, MPS server, and MPS clients. The value of this environment variable  should
              be  consistent  in  the  MPS  control  daemon  and  all MPS client processes. Default directory is
              /tmp/nvidia-mps

       CUDA_MPS_LOG_DIRECTORY
              Specify the directory that contains the MPS log files. This variable is used by  the  MPS  control
              daemon only. Default directory is /var/log/nvidia-mps

       CUDA_VISIBLE_DEVICES
              Specify which CUDA devices are visible to the control daemon. Takes either an index value or UUID.

       CUDA_DEVICE_MAX_CONNECTIONS
              Specify the preferred number of connections between the host and device.

       CUDA_MPS_ACTIVE_THREAD_PERCENTAGE
              Specify the portion of available threads that clients can use on Volta+ hardware. Setting this for
              the  control  daemon  will set the default active thread percentage for all servers spawned, while
              setting this for a client or client context will constrain the active thread percentage  for  that
              unit and cannot be higher than the active thread percentage value set for the control daemon.

       CUDA_MPS_ENABLE_PER_CTX_DEVICE_MULTIPROCESSOR_PARTITIONING
              Specify   whether   individual   client   contexts  are  allowed  to  have  different  values  for
              CUDA_MPS_ACTIVE_THREAD_PERCENTAGE.

       CUDA_MPS_PINNED_DEVICE_MEM_LIMIT
              Specify the amount of GPU memory available to MPS client processes.

       CUDA_MPS_CLIENT_PRIORITY
              Specify the default client priority value at initialization.

FILES

       Log files created by the MPS control daemon in the specified directory

       control.log
              Record startup and shutdown of MPS control daemon, user commands issued with  their  results,  and
              status of MPS servers.

       server.log
              Record startup and shutdown of MPS servers, and status of MPS clients.

nvidia-cuda-mps-control                            2013-02-26                         nvidia-cuda-mps-control(1)