Provided by: manpages-posix_2017a-2_all bug

PROLOG

       This  manual  page  is part of the POSIX Programmer's Manual.  The Linux implementation of this interface
       may differ (consult the corresponding Linux manual page for details of Linux behavior), or the  interface
       may not be implemented on Linux.

NAME

       comm — select or reject lines common to two files

SYNOPSIS

       comm [-123] file1 file2

DESCRIPTION

       The  comm  utility shall read file1 and file2, which should be ordered in the current collating sequence,
       and produce three text columns as output: lines only in file1, lines only in file2,  and  lines  in  both
       files.

       If the lines in both files are not ordered according to the collating sequence of the current locale, the
       results are unspecified.

       If the collating sequence of the current locale does not have a total ordering of all characters (see the
       Base  Definitions  volume  of POSIX.1‐2017, Section 7.3.2, LC_COLLATE) and any lines from the input files
       collate equally but are not identical, comm should treat them as different lines but may  treat  them  as
       being  the  same.  If  it  treats them as different, comm should expect them to be ordered according to a
       further byte-by-byte comparison using the collating sequence for the POSIX locale and  if  they  are  not
       ordered  in this way, the output of comm can identify such lines as being both unique to file1 and unique
       to file2 instead of being in both files.

OPTIONS

       The comm utility shall conform to the Base Definitions volume  of  POSIX.1‐2017,  Section  12.2,  Utility
       Syntax Guidelines.

       The following options shall be supported:

       -1        Suppress the output column of lines unique to file1.

       -2        Suppress the output column of lines unique to file2.

       -3        Suppress the output column of lines duplicated in file1 and file2.

OPERANDS

       The following operands shall be supported:

       file1     A pathname of the first file to be compared. If file1 is '-', the standard input shall be used.

       file2     A  pathname  of  the  second  file to be compared. If file2 is '-', the standard input shall be
                 used.

       If both file1 and file2 refer to standard input or to the same FIFO special, block special, or  character
       special file, the results are undefined.

STDIN

       The standard input shall be used only if one of the file1 or file2 operands refers to standard input. See
       the INPUT FILES section.

INPUT FILES

       The input files shall be text files.

ENVIRONMENT VARIABLES

       The following environment variables shall affect the execution of comm:

       LANG      Provide a default value for the internationalization variables that are unset or null. (See the
                 Base  Definitions  volume  of POSIX.1‐2017, Section 8.2, Internationalization Variables for the
                 precedence  of  internationalization  variables  used  to  determine  the  values   of   locale
                 categories.)

       LC_ALL    If  set  to a non-empty string value, override the values of all the other internationalization
                 variables.

       LC_COLLATE
                 Determine the locale for the collating sequence comm expects to have been used when  the  input
                 files were sorted.

       LC_CTYPE  Determine  the  locale  for the interpretation of sequences of bytes of text data as characters
                 (for example, single-byte as opposed to multi-byte characters in arguments and input files).

       LC_MESSAGES
                 Determine the locale that should be used to  affect  the  format  and  contents  of  diagnostic
                 messages written to standard error.

       NLSPATH   Determine the location of message catalogs for the processing of LC_MESSAGES.

ASYNCHRONOUS EVENTS

       Default.

STDOUT

       The  comm  utility  shall produce output depending on the options selected. If the -1, -2, and -3 options
       are all selected, comm shall write nothing to standard output.

       If the -1 option is not selected, lines contained only in file1 shall be written using the format:

           "%s\n", <line in file1>

       If the -2 option is not selected, lines contained only in file2 are written using the format:

           "%s%s\n", <lead>, <line in file2>

       where the string <lead> is as follows:

       <tab>     The -1 option is not selected.

       null string
                 The -1 option is selected.

       If the -3 option is not selected, lines contained in both files shall be written using the format:

           "%s%s\n", <lead>, <line in both>

       where the string <lead> is as follows:

       <tab><tab>
                 Neither the -1 nor the -2 option is selected.

       <tab>     Exactly one of the -1 and -2 options is selected.

       null string
                 Both the -1 and -2 options are selected.

       If the input files were ordered according to the collating sequence of  the  current  locale,  the  lines
       written  shall be in the collating sequence of the current locale. If the input files contained any lines
       that collated equally but were not identical and within each file those lines were ordered according to a
       further byte-by-byte comparison using the collating sequence for the POSIX locale, and comm treated  them
       as  different  lines,  then  lines  written  that collate equally but are not identical should be ordered
       according to a further byte-by-byte comparison using the collating sequence for the POSIX locale.

STDERR

       The standard error shall be used only for diagnostic messages.

OUTPUT FILES

       None.

EXTENDED DESCRIPTION

       None.

EXIT STATUS

       The following exit values shall be returned:

        0    All input files were successfully output as specified.

       >0    An error occurred.

CONSEQUENCES OF ERRORS

       Default.

       The following sections are informative.

APPLICATION USAGE

       If the input files are not properly presorted, the output of comm might not be useful.

       When using comm to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE,
       are set to POSIX or C in the environment, since pathnames can contain byte sequences  that  do  not  form
       valid  characters  in some locales, in which case the utility's behavior would be undefined. In the POSIX
       locale each byte is a valid single-byte character, and therefore this problem is avoided.

       If the collating sequence of the current locale does not have a total ordering of  all  characters,  this
       can affect the behavior of comm in the following ways:

        *  If  comm  treats  lines  as being the same only if they are identical, some lines can be misleadingly
           identified as being both unique to file1 and unique to file2.

        *  If comm treats lines as being the same if they collate equally and a line from file1 collates equally
           with a line from file2 but is not identical to it, one of the lines  is  misleadingly  identified  as
           being in both files and the other is not written to the output at all.

       Such  problems  can  be  avoided  by  forcing  the  use  of  the POSIX locale; for example, the following
       identifies lines in both file1 and file2:

           LC_ALL=POSIX sort file1 > file1.posix
           LC_ALL=POSIX sort file2 > file2.posix
           LC_ALL=POSIX comm -12 file1.posix file2.posix | sort

       The final sort re-sorts the output of comm according to the collating sequence of  the  original  locale.
       Doing this might be difficult if more than one column is output and leading <blank>s cannot be ignored.

EXAMPLES

       If  a file named xcu contains a sorted list of the utilities in this volume of POSIX.1‐2017, a file named
       xpg3 contains a sorted list of the utilities specified in the X/Open Portability Guide, Issue  3,  and  a
       file  named  svid89  contains  a  sorted list of the utilities in the System V Interface Definition Third
       Edition:

           comm -23 xcu xpg3 | comm -23 - svid89

       would print a list of utilities in this volume of POSIX.1‐2017 not  specified  by  either  of  the  other
       documents:

           comm -12 xcu xpg3 | comm -12 - svid89

       would print a list of utilities specified by all three documents, and:

           comm -12 xpg3 svid89 | comm -23 - xcu

       would  print a list of utilities specified by both XPG3 and the SVID, but not specified in this volume of
       POSIX.1‐2017.

RATIONALE

       None.

FUTURE DIRECTIONS

       A future version of this standard may require that if any lines from the input files collate equally  but
       are not identical, then comm treats them as different lines and expects them to be ordered according to a
       further byte-by-byte comparison using the collating sequence for the POSIX locale.

       A  future  version of this standard may require that if the input files contained any lines that collated
       equally but were not identical and within each file those lines were ordered according to a further byte-
       by-byte comparison using the collating sequence for the POSIX locale, then  lines  written  that  collate
       equally  but  are  not  identical  are  ordered  according to a further byte-by-byte comparison using the
       collating sequence for the POSIX locale.

SEE ALSO

       cmp, diff, sort, uniq

       The Base Definitions volume of POSIX.1‐2017, Section 7.3.2, LC_COLLATE, Chapter 8, Environment Variables,
       Section 12.2, Utility Syntax Guidelines

COPYRIGHT

       Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1-2017, Standard
       for Information  Technology  --  Portable  Operating  System  Interface  (POSIX),  The  Open  Group  Base
       Specifications  Issue  7, 2018 Edition, Copyright (C) 2018 by the Institute of Electrical and Electronics
       Engineers, Inc and The Open Group.  In the event of any discrepancy between this version and the original
       IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee  document.
       The original Standard can be obtained online at http://www.opengroup.org/unix/online.html .

       Any  typographical  or formatting errors that appear in this page are most likely to have been introduced
       during  the  conversion  of  the  source  files  to  man  page  format.  To  report  such   errors,   see
       https://www.kernel.org/doc/man-pages/reporting_bugs.html .

IEEE/The Open Group                                   2017                                          COMM(1POSIX)