Ubuntu Manpage: libpipeline — pipeline manipulation library

Provided by: libpipeline-dev_1.5.7-2_amd64

NAME

       libpipeline — pipeline manipulation library

SYNOPSIS

       #include <pipeline.h>

DESCRIPTION

       libpipeline is a C library for setting up and running pipelines of processes, without needing to involve
       shell command-line parsing, which is often error-prone and insecure.  This relieves programmers of the
       need to laboriously construct pipelines using lower-level primitives such as fork and execve.

       The  general  way  to  use  libpipeline involves constructing a pipeline structure and adding one or more
       pipecmd structures to it.  A pipecmd represents a subprocess (or “command”), while a pipeline  represents
       a  sequence of subprocesses each of whose outputs is connected to the next one's input, as in the example
       ls | grep  pattern  |  less.   The  calling  program  may  adjust  certain  properties  of  each  command
       independently, such as its environment and nice(3) priority, as well as properties of the entire pipeline
       such as its input and output and the way signals are handled while executing it.  The calling program may
       then start the pipeline, read output from it, wait for it to complete, and gather its exit status.

       Strings passed as const char * function arguments will be copied by the library.

   Functions to build individual commands
       pipecmd *pipecmd_new(const char *name)

             Construct a new command representing execution of a program called name.

       pipecmd *pipecmd_new_argv(const char *name, va_list argv)
       pipecmd *pipecmd_new_args(const char *name, ...)

             Convenience  constructors  wrapping  pipecmd_new()  and  pipecmd_arg().   Construct  a  new command
             representing execution of a program called name with arguments.  Terminate arguments with NULL.

       pipecmd *pipecmd_new_argstr(const char *argstr)

             Split argstr on whitespace to construct a command  and  arguments,  honouring  shell-style  single-
             quoting,  double-quoting, and backslashes, but not other shell evilness like wildcards, semicolons,
             or backquotes.  This is included only to support situations where  command  arguments  are  encoded
             into  configuration  files  and  the  like.   While  it  is safer than system(3), it still involves
             significant string parsing which is inherently riskier than avoiding it altogether.  Please try  to
             avoid using it in new code.

       typedef void pipecmd_function_type (void *);
       typedef void pipecmd_function_free_type (void *);
       pipecmd                *pipecmd_new_function(const char *name,               pipecmd_function_type *func,
             pipecmd_function_free_type *free_func, void *data)

             Construct a new command that calls a given function rather than executing a process.

             The data argument is passed as the function's only argument, and will  be  freed  before  returning
             using free_func (if non-NULL).

             pipecmd_*  functions  that  deal  with  arguments  cannot be used with the command returned by this
             function.

       pipecmd *pipecmd_new_sequencev(const char *name, va_list cmdv)
       pipecmd *pipecmd_new_sequence(const char *name, ...)

             Construct a new command that itself runs a sequence of commands, supplied as pipecmd * arguments
             following name and terminated by NULL.  The commands will be executed in forked children; if any
             exits non-zero then it will terminate the sequence, as with "&&" in shell.

             pipecmd_* functions that deal with arguments cannot be used  with  the  command  returned  by  this
             function.

       pipecmd *pipecmd_new_passthrough(void)

             Return a new command that just passes data from its input to its output.

       pipecmd *pipecmd_dup(pipecmd *cmd)

             Return a duplicate of a command.

       void pipecmd_arg(pipecmd *cmd, const char *arg)

             Add an argument to a command.

       void pipecmd_argf(pipecmd *cmd, const char *format, ...)

             Convenience function to add an argument with printf substitutions.

       void pipecmd_argv(pipecmd *cmd, va_list argv)
       void pipecmd_args(pipecmd *cmd, ...)

             Convenience  functions  wrapping  pipecmd_arg()  to  add  multiple  arguments  at  once.  Terminate
             arguments with NULL.

       void pipecmd_argstr(pipecmd *cmd, const char *argstr)

             Split argstr on whitespace to add  a  list  of  arguments,  honouring  shell-style  single-quoting,
             double-quoting,  and  backslashes,  but  not  other  shell  evilness like wildcards, semicolons, or
             backquotes.  This is included only to support situations where command arguments are  encoded  into
             configuration  files and the like.  While it is safer than system(3), it still involves significant
             string parsing which is inherently riskier than avoiding it altogether.  Please try to avoid  using
             it in new code.

       int pipecmd_get_nargs(pipecmd *cmd)

             Return  the  number  of arguments to this command.  Note that this includes the command name as the
             first argument, so the command ‘echo foo bar’ is counted as having three arguments.

             Added in libpipeline 1.1.0.

       void pipecmd_nice(pipecmd *cmd, int value)

             Set the nice(3) value for this command.  Defaults to 0.  Errors while attempting to  set  the  nice
             value are ignored, aside from emitting a debug message.

       void pipecmd_discard_err(pipecmd *cmd, int discard_err)

             If discard_err is non-zero, redirect this command's standard error to /dev/null.  Otherwise, and by
             default, pass it through.  Discarding standard error is usually a bad idea.

       void pipecmd_chdir(pipecmd *cmd, const char *directory)

             Change the working directory to directory while running this command.

             Added in libpipeline 1.3.0.

       void pipecmd_fchdir(pipecmd *cmd, int directory_fd)

             Change  the working directory to the directory given by the open file descriptor directory_fd while
             running this command.

             Added in libpipeline 1.4.0.

       void pipecmd_setenv(pipecmd *cmd, const char *name, const char *value)

             Set environment variable name to value while running this command.

       void pipecmd_unsetenv(pipecmd *cmd, const char *name)

             Unset environment variable name while running this command.

       void pipecmd_clearenv(pipecmd *cmd)

             Clear the environment while running this  command.   (Note  that  environment  operations  work  in
             sequence;  pipecmd_clearenv  followed  by  pipecmd_setenv  causes the command to have just a single
             environment variable set.)  Beware that this may cause unexpected failures, for example if some  of
             the contents of the environment are necessary to execute programs at all (say, PATH).

             Added in libpipeline 1.1.0.

       void  pipecmd_pre_exec(pipecmd *cmd,  pipecmd_function_type *func, pipecmd_function_free_type *free_func,
             void *data)

             Install a pre-exec handler.  This will be run immediately before executing  the  command's  payload
             (process  or  function).   Pass  NULL to clear any existing pre-exec handler.  The data argument is
             passed as the function's only argument, and will be freed before returning using free_func (if non-
             NULL).

             This is similar to pipeline_install_post_fork, except that it is specific to a single command rather
             than installing a global handler, and it runs slightly later (immediately before exec rather than
             immediately after fork).

             Added in libpipeline 1.5.0.

       void pipecmd_sequence_command(pipecmd *cmd, pipecmd *child)

             Add a command to a sequence created using pipecmd_new_sequence().

       void pipecmd_dump(pipecmd *cmd, FILE *stream)

             Dump a string representation of a command to stream.

       char *pipecmd_tostring(pipecmd *cmd)

             Return a string representation of a command.  The caller should free the result.

       void pipecmd_exec(pipecmd *cmd)

             Execute a single command, replacing the current process.  Never returns, instead  exiting  non-zero
             on failure.

             Added in libpipeline 1.1.0.

       void pipecmd_free(pipecmd *cmd)

             Destroy a command.  Safely does nothing if cmd is NULL.

   Functions to build pipelines
       pipeline *pipeline_new(void)

             Construct a new pipeline.

       pipeline *pipeline_new_commandv(pipecmd *cmd1, va_list cmdv)
       pipeline *pipeline_new_commands(pipecmd *cmd1, ...)

             Convenience  constructors wrapping pipeline_new() and pipeline_command().  Construct a new pipeline
             consisting of the given list of commands.  Terminate commands with NULL.

       pipeline *pipeline_new_command_argv(const char *name, va_list argv)
       pipeline *pipeline_new_command_args(const char *name, ...)

             Construct a new pipeline and add a single command to it.

       pipeline *pipeline_join(pipeline *p1, pipeline *p2)

             Joins two pipelines, neither of which may have been started.  Discards want_out, want_outfile,
             and outfd from p1, and want_in, want_infile, and infd from p2.

       void pipeline_connect(pipeline *source, pipeline *sink, ...)

             Connect  the  input  of  one or more sink pipelines to the output of a source pipeline.  The source
             pipeline may be started, but in that case pipeline_want_out() must have been called with a negative
             fd; otherwise, calls pipeline_want_out(source, -1).  In any event, calls pipeline_want_in(sink, -1)
             on all sinks, none of which are allowed to be started.  Terminate arguments with NULL.

             This is an application-level connection; data may be  intercepted  between  the  pipelines  by  the
             program  before  calling pipeline_pump(), which sets data flowing from the source to the sinks.  It
             is primarily useful when more than one sink pipeline is  involved,  in  which  case  the  pipelines
             cannot simply be concatenated into one.

             The  result  is  similar  to tee(1), except that output can be sent to more than two places and can
             easily be sent to multiple processes.

       void pipeline_command(pipeline *p, pipecmd *cmd)

             Add a command to a pipeline.

       void pipeline_command_argv(pipeline *p, const char *name, va_list argv)
       void pipeline_command_args(pipeline *p, const char *name, ...)

             Construct a new command and add it to a pipeline in one go.

       void pipeline_command_argstr(pipeline *p, const char *argstr)

             Construct a new command from a shell-quoted string and add it to a pipeline in  one  go.   See  the
             comment against pipecmd_new_argstr() above if you're tempted to use this function.

       void pipeline_commandv(pipeline *p, va_list cmdv)
       void pipeline_commands(pipeline *p, ...)

             Convenience  functions  wrapping  pipeline_command()  to  add multiple commands at once.  Terminate
             arguments with NULL.

       void pipeline_want_in(pipeline *p, int fd)
       void pipeline_want_out(pipeline *p, int fd)

             Set file descriptors to use as the input and output of the whole pipeline.  If non-negative, fd  is
             used  directly as a file descriptor.  If negative, pipeline_start() will create pipes and store the
             input writing half and  the  output  reading  half  in  the  pipeline's  infd  or  outfd  field  as
             appropriate.    The   default   is   to   leave  input  and  output  as  stdin  and  stdout  unless
             pipeline_want_infile() or pipeline_want_outfile() respectively has been called.

             Calling   these   functions   supersedes   any   previous   call   to   pipeline_want_infile()   or
             pipeline_want_outfile() respectively.

       void pipeline_want_infile(pipeline *p, const char *file)
       void pipeline_want_outfile(pipeline *p, const char *file)

             Set  file  names  to  open and use as the input and output of the whole pipeline.  This may be more
             convenient than supplying file descriptors, and guarantees that the files are opened with the  same
             privileges under which the pipeline is run.

             Calling these functions (even with NULL, which returns to the default of leaving input and output
             as stdin and stdout) supersedes any previous call to pipeline_want_in() or pipeline_want_out()
             respectively.

             The  given  files  will be opened when the pipeline is started.  If an output file does not already
             exist, it is created (with mode 0666 modified in the usual way by umask); if it does exist, then it
             is truncated.

       void pipeline_ignore_signals(pipeline *p, int ignore_signals)

             If ignore_signals is non-zero, ignore SIGINT and SIGQUIT in the calling process while the  pipeline
             is running, like system(3).  Otherwise, and by default, leave their dispositions unchanged.

       int pipeline_get_ncommands(pipeline *p)

             Return the number of commands in this pipeline.

       pipecmd *pipeline_get_command(pipeline *p, int n)

             Return command number n from this pipeline, counting from zero, or NULL if n is out of range.

       pipecmd *pipeline_set_command(pipeline *p, int n, pipecmd *cmd)

             Set  command number n in this pipeline, counting from zero, to cmd, and return the previous command
             in that position.  Do nothing and return NULL if n is out of range.

       pid_t pipeline_get_pid(pipeline *p, int n)

             Return the process ID of command number n from this pipeline, counting  from  zero.   The  pipeline
             must  be  started.   Return  -1  if n is out of range or if the command has already exited and been
             reaped.

             Added in libpipeline 1.2.0.

       FILE *pipeline_get_infile(pipeline *p)
       FILE *pipeline_get_outfile(pipeline *p)

             Get streams corresponding to infd and outfd respectively.  The pipeline must be started.

       void pipeline_dump(pipeline *p, FILE *stream)

             Dump a string representation of p to stream.

       char *pipeline_tostring(pipeline *p)

             Return a string representation of p.  The caller should free the result.

       void pipeline_free(pipeline *p)

             Destroy a pipeline and all its commands.  Safely does nothing if p  is  NULL.   May  wait  for  the
             pipeline to complete if it has not already done so.

   Functions to run pipelines and handle signals
       typedef void pipeline_post_fork_fn (void);
       void pipeline_install_post_fork(pipeline_post_fork_fn *fn)

             Install a post-fork handler.  This will be run in any child process immediately after it is forked.
             For  instance, this may be used for cleaning up application-specific signal handlers.  Pass NULL to
             clear any existing post-fork handler.

             See pipecmd_pre_exec for a similar facility limited to a single command rather than global  to  the
             calling process.

       void pipeline_start(pipeline *p)

             Start  the  processes  in  a  pipeline.   Installs  this  library's  SIGCHLD handler if not already
             installed.  Calls error (FATAL) on error.

             The standard file descriptors (0, 1, and 2) must be open before calling this function.

       int pipeline_wait_all(pipeline *p, int **statuses, int *n_statuses)

             Wait for a pipeline to complete.  Set *statuses to a newly-allocated array  of  wait  statuses,  as
             returned  by  waitpid(2), and *n_statuses to the length of that array.  The return value is similar
             to the exit status that a shell would return, with some modifications.  If the last  command  exits
             with a signal (other than SIGPIPE, which is considered equivalent to exiting zero), then the return
             value  is  128  plus  the  signal number; if the last command exits normally but non-zero, then the
             return value is its exit status; if any other command exits non-zero, then the return value is 127;
             otherwise, the return value is 0.  This means that the return value is only 0 if  all  commands  in
             the pipeline exit successfully.
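
             The rule above can be written out as a small stand-alone function.  This is only a sketch of
             the rule as documented here, not the library's implementation; the names failed and
             combined_status are hypothetical, and treating a non-final command killed by a signal other
             than SIGPIPE as "exits non-zero" is an assumption.

```c
#include <signal.h>
#include <sys/wait.h>

/* Did this wait status indicate failure?  Death by SIGPIPE is treated
 * as equivalent to exiting zero, as described in the text. */
static int failed(int status)
{
    if (WIFSIGNALED(status))
        return WTERMSIG(status) != SIGPIPE;
    return WIFEXITED(status) && WEXITSTATUS(status) != 0;
}

/* Combine n wait statuses, ordered first command to last, into a
 * single shell-like exit status. */
int combined_status(const int *statuses, int n)
{
    int i, last;

    if (n <= 0)
        return 0;

    /* The last command takes priority over all the others. */
    last = statuses[n - 1];
    if (WIFSIGNALED(last) && WTERMSIG(last) != SIGPIPE)
        return 128 + WTERMSIG(last);
    if (WIFEXITED(last) && WEXITSTATUS(last) != 0)
        return WEXITSTATUS(last);

    /* Any earlier failure is reported as 127. */
    for (i = 0; i < n - 1; ++i)
        if (failed(statuses[i]))
            return 127;

    return 0;
}
```

             The array filled in by pipeline_wait_all() could be passed to such a function to reproduce the
             return value by hand, for example when logging which command caused a failure.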

       int pipeline_wait(pipeline *p)

             Wait  for  a  pipeline  to  complete  and  return  its  combined  exit  status,  calculated  as for
             pipeline_wait_all().

       int pipeline_run(pipeline *p)

             Start a pipeline, wait for it to complete, and free it, all in one go.

       void pipeline_pump(pipeline *p, ...)

             Pump data among one or more pipelines connected using pipeline_connect() until all source pipelines
             have reached end-of-file and all data has been written to all  sinks  (or  failed).   All  relevant
             pipelines  must  be supplied: that is, no pipeline that has been connected to a source pipeline may
             be supplied unless that source pipeline is also supplied.  Automatically starts  all  pipelines  if
             they are not already started, but does not wait for them.  Terminate arguments with NULL.

   Functions to read output from pipelines
       In  general,  output is returned as a pointer into a buffer owned by the pipeline, which is automatically
       freed when pipeline_free() is called.  This saves the caller from having to  explicitly  free  individual
       blocks of output data.

       const char *pipeline_read(pipeline *p, size_t *len)

             Read *len bytes of data from the pipeline, returning the data block.  *len is updated with the
             number of bytes actually read.

       const char *pipeline_peek(pipeline *p, size_t *len)

             Look ahead in the pipeline's output for *len bytes of data, returning the data block.  *len is
             updated with the number of bytes actually read.  The starting position of the next read or peek
             is not affected by this call.

       size_t pipeline_peek_size(pipeline *p)

             Return the number of bytes of data that can be read using pipeline_read() or pipeline_peek() solely
             from the peek cache, without having to read from the pipeline itself (and thus potentially block).

       void pipeline_peek_skip(pipeline *p, size_t len)

             Skip over and discard len bytes of data from the peek cache.  Asserts that enough data is available
             to skip, so you may want to check using pipeline_peek_size() first.

       const char *pipeline_readline(pipeline *p)

             Read a line of data from the pipeline, returning it.

       const char *pipeline_peekline(pipeline *p)

             Look ahead in the pipeline's output for a line of data, returning it.  The starting position of the
             next read or peek is not affected by this call.

   Signal handling
       libpipeline  installs  a  signal  handler for SIGCHLD, and collects the exit status of child processes in
       pipeline_wait().  Applications using this library must either refrain from changing  the  disposition  of
       SIGCHLD  (in other words, must rely on libpipeline for all child process handling) or else must make sure
       to restore libpipeline's SIGCHLD handler before calling any of its functions.

       If the ignore_signals flag is set in a pipeline using pipeline_ignore_signals() (it is not set by
       default), then the SIGINT and SIGQUIT signals will be ignored in the parent process while child
       processes are running.  This mirrors the behaviour of system(3).

       libpipeline leaves child processes with the default disposition  of  SIGPIPE,  namely  to  terminate  the
       process.  It ignores SIGPIPE in the parent process while running pipeline_pump().

   Reaping of child processes
       libpipeline installs a SIGCHLD handler that will attempt to reap child processes which have exited.  This
       calls  waitpid(2)  with  -1,  so  it will reap any child process, not merely those created by way of this
       library.  At present, this means that if the calling program forks other child processes which  may  exit
       while  a  pipeline is running, the program is not guaranteed to be able to collect exit statuses of those
       processes.

       You should not rely on this behaviour, and in future it  may  be  modified  either  to  reap  only  child
       processes  created  by  this  library  or to provide a way to return foreign statuses to the application.
       Please contact the author if you have an example application and  would  like  to  help  design  such  an
       interface.

ENVIRONMENT

       If  the  PIPELINE_DEBUG environment variable is set to “1”, then libpipeline will emit debugging messages
       on standard error.

       If the PIPELINE_QUIET environment variable is set to  any  value,  then  libpipeline  will  refrain  from
       printing an error message when a subprocess is terminated by a signal.  Added in libpipeline 1.4.0.

EXAMPLES

       In  the  following  examples,  function  names  starting  with pipecmd_ or pipeline_ are real libpipeline
       functions, while any other function names are pseudocode.

       The simplest case is simple.  To run a single command, such as mv source dest:

             pipeline *p = pipeline_new_command_args ("mv", source, dest, NULL);
             int status = pipeline_run (p);

       libpipeline is often used to mimic shell pipelines, such as the following example:

             zsoelim < input-file | tbl | nroff -mandoc -Tutf8

       The code to construct this would be:

             pipeline *p;
             int status;

             p = pipeline_new ();
             pipeline_want_infile (p, "input-file");
             pipeline_command_args (p, "zsoelim", NULL);
             pipeline_command_args (p, "tbl", NULL);
             pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
             status = pipeline_run (p);

       You might want to construct a command more dynamically:

             pipecmd *manconv = pipecmd_new_args ("manconv", "-f", from_code,
                                                  "-t", "UTF-8", NULL);
             if (quiet)
                     pipecmd_arg (manconv, "-q");
             pipeline_command (p, manconv);

       Perhaps you want an environment variable set only while running a certain command:

             pipecmd *less = pipecmd_new ("less");
             pipecmd_setenv (less, "LESSCHARSET", lesscharset);

       You might find yourself needing to pass the output of one pipeline to several other pipelines, in a “tee”
       arrangement:

             pipeline *source, *sink1, *sink2;

             source = make_source ();
             sink1 = make_sink1 ();
             sink2 = make_sink2 ();
             pipeline_connect (source, sink1, sink2, NULL);
             /* Pump data among these pipelines until there's nothing left. */
             pipeline_pump (source, sink1, sink2, NULL);
             pipeline_free (sink2);
             pipeline_free (sink1);
             pipeline_free (source);

       Maybe one of your commands is actually an in-process function, rather than an external program:

             pipecmd *inproc = pipecmd_new_function ("in-process", &func,
                                                     NULL, NULL);
             pipeline_command (p, inproc);

       Sometimes your program needs to consume the output of a pipeline, rather than  sending  it  all  to  some
       other subprocess:

             pipeline *p = make_pipeline ();
             const char *line;

             pipeline_want_out (p, -1);
             pipeline_start (p);
             line = pipeline_peekline (p);
             if (line && !strstr (line, "coding: UTF-8"))
                     printf ("Unicode text follows:\n");
             while ((line = pipeline_readline (p)) != NULL)
                     printf ("  %s", line);
             pipeline_free (p);

AUTHORS

       Most of libpipeline was written by Colin Watson <cjwatson@debian.org>, originally for use in man-db.  The
       initial  version  was  based  very  loosely on the run_pipeline() function in GNU groff, written by James
       Clark <jjc@jclark.com>.  It also contains library code by Markus Armbruster, and by various  contributors
       to Gnulib.

       libpipeline  is  licensed  under the GNU General Public License, version 3 or later.  See the README file
       for full details.

BUGS

       Using this library in a program which runs any other child processes  and/or  installs  its  own  SIGCHLD
       handler is unlikely to work.

GNU                                              April 28, 2022                                   LIBPIPELINE(3)
