Provided by: makepp_2.0.98.5-2.1_all

NAME

       makepp_signatures -- How makepp knows when files have changed

DESCRIPTION

       C: "C", "c_compilation_md5",  M: "md5",  P: "plain",  S: "shared_object",  X: "xml", "xml_space"

       Each file is associated with a signature, which is a string that changes if the file has changed.  Makepp
       compares signatures to see whether it needs to rebuild anything.  The default signature for files is a
       concatenation of the file's modification time and its size, unless you're executing a C/C++ compilation
       command, in which case the default signature is a cryptographic checksum on the file's contents, ignoring
       comments and whitespace.  If you want, you can switch to a different method, or you can define your own
       signature functions.

       How the signature is actually used is controlled by the build check method (see makepp_build_check).
       Normally, if a file's signature changes, the file itself is considered to have changed, and makepp forces
       a rebuild.

       If makepp is building a file, and you don't think it should be, you might want to check the build log
       (see makepplog).  Makepp writes an explanation of what it thought each file depended on, and why it chose
       to rebuild.

       There are several signature methods included in makepp.  Makepp usually picks the most appropriate
       standard one automatically.  However, you can change the signature method for an individual rule by using
       the ":signature" modifier on the rule which depends on the files you want to check, or for all rules in a
       makefile by using the "signature" statement, or for all makefiles at once using the "-m" or
       "--signature-method" command line option.

   Mpp::Signature methods included in the distribution
       plain (actually nameless)
           The  plain signature method is the file's modification time and the file's size, concatenated.  These
           values are quickly obtainable from the operating system  and  almost  always  change  when  the  file
           changes.  For symlinks it uses the values of the linkee.  If there is no linkee, i.e. it's a dangling
           symlink, then it uses its own values, but prepends a 0 to mark the fact.

           Makepp  used to look only at the file's modification time, but if you run makepp several times within
           a second (e.g., in a script that's building several small things), sometimes modification times won't
           change.  Then, hopefully the file's size will change.

           If running makepp several times a second is a problem for you, you may find that using the "md5"
           method is somewhat more reliable.  If makepp builds a file, it flushes its cached MD5 signatures
           even if the file's date hasn't changed.

           For  efficiency's  sake,  makepp  won't reread the file and recompute the complex signatures below if
           this plain signature hasn't changed since the last time it computed it.  This can theoretically cause
           a problem, since it's possible to change the file's contents without changing its date and size.  In
           practice, this is quite hard to do, so it's not a serious danger.  In the future, as more filesystems
           switch to sub-second timestamps, hopefully Perl will give us access to this information, making this
           failsafe.
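
           As an illustration only, the plain signature described above can be sketched in Python.  The
           function name "plain_signature" and the "mtime,size" string layout are assumptions made for this
           sketch; makepp itself is written in Perl and its stored format may differ.

```python
import os


def plain_signature(path):
    """Sketch of a 'plain' signature: modification time and size, concatenated.

    The "mtime,size" layout is an assumption for illustration only.
    """
    try:
        st = os.stat(path)      # follows symlinks, so this uses the linkee's values
        prefix = ""
    except FileNotFoundError:
        st = os.lstat(path)     # dangling symlink: fall back to the link's own values
        prefix = "0"            # prepend a 0 to mark the fact
    return f"{prefix}{int(st.st_mtime)},{st.st_size}"
```

           Because the size is part of the string, a file rewritten within the same second still gets a new
           signature as long as its length changes.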

       C
       c_compilation_md5
           This is the method for input files to C-like compilers.  It checks whether a file's name looks like
           C or C++ source code, including things like Corba IDL.  If it does, this method applies.  If it
           doesn't, it falls back to plain signatures for binary files (determined by name or else by content)
           and otherwise to "md5".

           The idea is to be independent of formatting changes.  This is done by pulling everything up as far as
           possible, and by eliminating insignificant spaces.  Words are exempt  from  pulling  up,  since  they
           might be macros containing "__LINE__", so they remain on the line where they were.

               // ignored comment

               #ifdef XYZ
                   #include <xyz.h>
               #endif

               int a = 1;

               #line 20
               void f
               (
                   int b
               )
               {
                   a += b + ++c;
               }

                   /* more ignored comment */

           is treated as though it were

               #ifdef XYZ
               #include<xyz.h>
               #endif

               int a=1;
               #line 20
               void f(

               int b){

               a+=b+ ++c;}

           That way you can reindent your code or add or change comments without triggering a rebuild, so long
           as you don't change the line numbers.  (This signature method recompiles if line numbers have changed
           because that causes uses of "__LINE__" and most debugging information to change.)  It also ignores
           whitespace and comments after the last token.  This is useful for preventing a useless rebuild if
           your version control system adds lines at a "$Log$" tag when checking in.

           This method is particularly useful for the following situations:

           •   You want to make changes to the comments in a commonly included  header  file,  or  you  want  to
               reformat  or reindent part of it.  For one project that I worked on a long time ago, we were very
               unwilling to correct inaccurate comments in a common header file, even when they  were  seriously
               misleading,  because  doing  so  would  trigger  several  hours of rebuilds.  With this signature
               method, this is no longer a problem.

           •   You like to save your files often, and your editor (unlike emacs) will happily write a  new  copy
               out even if nothing has changed.

           •   You have C/C++ source files which are generated automatically by other build commands (e.g., yacc
               or  some  other  preprocessor).   For  one system I work with, we have a preprocessor which (like
               yacc) produces two output files, a ".cxx" and a ".h" file:

                   %.h %.cxx: %.qtdlg $(HLIB)/Qt/qt_dialog_generator
                       $(HLIB)/Qt/qt_dialog_generator $(input)

               Every time the input file changed, the resulting .h file also was rewritten, and ordinarily  this
               would  trigger  a rebuild of everything that included it.  However, most of the time the contents
               of the .h file didn't actually change (except for a comment about the build time written  by  the
               preprocessor), so a recompilation was not actually necessary.

           In practice this saves fewer recompiles than you'd hope for, because mere comment changes often
           add lines.  For logging with "__LINE__" or the debugger to match your source, this requires
           recompilation.  So this signature is especially useless for the "tangle" family of tools from
           literate programming, where your code resides in some bigger file and even changes to a documentation
           section irrelevant to code will be reflected in the extracted source via a "#line" directive.

           If   you   can   live  with  wrong  line  numbers  during  development,  you  can  set  the  variable
           "makepp_signature_C_flat" (with an uppercase C) to some true  value  (like  1).   Then,  whereas  the
           compiler still sees the real file, the above example will be flattened for signing as:

               #ifdef XYZ
               #include<xyz.h>
               #endif
               int a=1;void f(int b){a+=b+ ++c;}

           Note  that  signatures  are  only recalculated when files change.  So you can build for everyone in a
           repository without this option, and those who want the option can  set  it  when  building  in  their
           sandbox.   When  they  first  locally  change  a  file,  even  only  trivially,  that  will  cause  a
           recompilation, because with this option a totally different signature is calculated.  But  then  they
           can reformat the file as much as they want without further recompilation.

           The  opposite  is  also true: Just omitting this option after it was set and recompiling will not fix
           your line numbers.  So, if line numbers matter, don't do a  production  build  in  the  same  sandbox
           without cleaning first.
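
           The flattening idea can be approximated with a small tokenizer.  This is a simplified Python
           sketch, not makepp's Perl implementation: the names "flatten" and "flat_signature" are
           hypothetical, and it does not model "#line" handling or the line-preserving default mode.  It
           drops comments, keeps preprocessor directives on their own lines, and keeps a space only between
           words or in the problem cases "- -", "+ +", "/ *" and "< <".

```python
import hashlib
import re

# Token classes: string/char literals, comments, whitespace, words, any other character.
TOKEN = re.compile(r"""
    (?P<lit>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<space>\s+)
  | (?P<word>\w+)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)

# Character pairs that would fuse into a different token if the space were dropped.
KEEP_APART = {"--", "++", "/*", "<<"}


def is_word(ch):
    return ch.isalnum() or ch == "_"


def flatten(source):
    """Return the source without comments and insignificant whitespace."""
    out = []                # emitted tokens
    gap = False             # whitespace or comment seen since the last token
    at_line_start = True
    in_directive = False
    for m in TOKEN.finditer(source):
        kind, tok = m.lastgroup, m.group()
        if kind in ("space", "comment"):
            gap = True
            if kind == "space" and "\n" in tok:
                at_line_start = True
                if in_directive:        # a directive ends at the end of its line
                    out.append("\n")
                    in_directive = False
                    gap = False
            continue
        if tok == "#" and at_line_start:
            if out and out[-1] != "\n":  # a directive starts on a fresh line
                out.append("\n")
            in_directive = True
        elif gap and out and out[-1] != "\n":
            prev, nxt = out[-1][-1], tok[0]
            if (is_word(prev) and is_word(nxt)) or prev + nxt in KEEP_APART:
                out.append(" ")
        out.append(tok)
        gap = False
        at_line_start = False
    return "".join(out).strip()


def flat_signature(source):
    """MD5 over the flattened text (hex encoding is a choice made for this sketch)."""
    return hashlib.md5(flatten(source).encode("utf-8")).hexdigest()
```

           With this sketch, reindenting the code or editing comments leaves "flat_signature" unchanged,
           while changing any token, such as "1" to "2", produces a different digest.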

       md5 This  is the default method, for files not recognized by the "C" method.  Computes an MD5 checksum of
           the file's contents, rather than looking at the file's date or size.  This means that if  you  change
           the date on the file but don't change its contents, makepp won't try to rebuild anything that depends
           on it.

           This is particularly useful if you have some file which is often regenerated during the build process
           that  other  files  depend  on,  but  which  usually  doesn't  actually change.  If you use the "md5"
           signature checking method, makepp will realize that the file's contents haven't changed even  if  the
           file's date has changed.  (Of course, this won't help if the files have a timestamp written inside of
           them, as archive files do for example.)
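
           A content checksum of this kind can be sketched in a few lines of Python.  The function name
           "md5_signature" and the hexadecimal encoding are assumptions for illustration; makepp's own
           encoding of the digest may differ.

```python
import hashlib


def md5_signature(path):
    """Sketch of a content-based signature: MD5 over the file's bytes."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

           Changing only the file's date leaves the result unchanged, which is why dependents are not
           rebuilt.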

       shared_object
           This  method  only works if you have the utility "nm" in your path, and it accepts the "-P" option to
           output Posix format.  In that case only  the  names  and  types  of  symbols  in  dynamically  loaded
           libraries  become part of their signature.  The result is that you can change the coding of functions
           without having to relink the programs that use them.

           In the following command the parser will detect an implicit dependency on $(LIBDIR)/libmylib.so,  and
           build  it  if  necessary.   However  the  link  command will only be reperformed whenever the library
           exports a different set of symbols:

               myprog: $(OBJECTS) :signature shared_object
                   $(LD) -L$(LIBDIR) -lmylib $(inputs) -o $(output)

           This works as long as the functions' interfaces don't change.  If an interface does change, you
           would change its declaration too, so you'd also need to change the callers.

           Note  that  this  method only applies to files whose name looks like a shared library.  For all other
           files it falls back to "c_compilation_md5", which may in turn fall back to others.
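
           To illustrate signing only names and types, here is a Python sketch that takes the text printed
           by "nm -P" (one symbol per line: name, type, then optionally value and size) and hashes just the
           first two fields.  The function name "symbol_signature" is hypothetical, and makepp's actual
           treatment of the "nm" output may differ in detail.

```python
import hashlib


def symbol_signature(nm_output):
    """Sketch: sign only symbol names and types from `nm -P` output text."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # Keep name and type; ignore the address and size columns.
            symbols.add((fields[0], fields[1]))
    digest = hashlib.md5()
    for name, kind in sorted(symbols):
        digest.update(f"{name} {kind}\n".encode("utf-8"))
    return digest.hexdigest()
```

           Recompiling a function body typically changes only the value and size columns, so the result
           stays the same; adding or removing an exported symbol changes it.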

       xml
       xml_space
           These are two similar methods which treat xml canonically  and  differ  only  in  their  handling  of
           whitespace.   The  first  completely  ignores  it  around  tags  and considers it like a single space
           elsewhere, making the signature immune to formatting changes.  The second respects any whitespace  in
           the  xml,  which  is  necessary even if just a small part requires that, like a "<pre>" section in an
           xhtml document.

           Common to both methods is that they sign the essence of each xml document.  Presence or not of a  BOM
           or  "<?xml?>" header is ignored.  Comments are ignored, as is whether text is protected as "CDATA" or
           with entities.  Order and quoting style of attributes don't matter, nor does how you render empty
           tags.

           For any file which is not valid xml, or if neither the Expat based "XML::Parser" nor the
           "XML::LibXML" parser is installed, this falls back to method md5.  If you switch your Perl
           installation from one of the parsers to the other, makepp will think the files are different as
           soon as their timestamp changes.  This is because the results of the two parsers are logically
           equivalent, but they produce different signatures.  In the unlikely case that this is a problem,
           you can force use of only "XML::LibXML" by setting in Perl:

               $Mpp::Signature::xml::libxml = 1;
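
           The notion of signing the essence of an xml document can be illustrated with Python's standard
           library (3.8 or later), whose C14N canonicalization ignores comments, the "<?xml?>" header,
           attribute order and quoting, and the spelling of empty tags.  This is an analogy, not makepp's
           implementation: the name "xml_signature" is hypothetical, and "strip_text" trims whitespace
           around text rather than collapsing inner runs to a single space.

```python
import hashlib
import xml.etree.ElementTree as ET


def xml_signature(xml_text, ignore_space=True):
    """Sketch: MD5 over a canonical form of the document.

    ignore_space=True loosely mirrors "xml", False mirrors "xml_space".
    """
    try:
        payload = ET.canonicalize(xml_text, strip_text=ignore_space)
    except ET.ParseError:
        payload = xml_text      # not well-formed: sign the raw content instead
    return hashlib.md5(payload.encode("utf-8")).hexdigest()
```

           Two documents that differ only in formatting, comments, attribute order or "CDATA" versus
           entities then get the same digest when "ignore_space" is true.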

   Extending applicability
       The "C" or "c_compilation_md5" method has a built in list of suffixes it recognizes as being C or C-like.
       If it gets applied to other files it falls back to simpler signature methods.  But many file types are
       syntactically close enough to C++ for this method to be useful.  Close enough means that they use C++
       comment and string syntax, and that whitespace is meaningless except for one space between words (and
       in C++'s problem cases "- -", "+ +", "/ *" and "< <").

       It (and its subclasses) can easily be extended to other suffixes.  Anyplace you can specify a
       signature you can tack on one of these syntaxes to make the method accept additional filenames:

       C.suffix1,suffix2,suffix3
           One or more comma-separated suffixes can be added to the method after a dot.  For example "C.ipp,tpp"
           means that besides the built in suffixes it will also apply to files ending in .ipp or .tpp, which
           you might be using for the inline and template part of C++ headers.

       C.(suffix-regexp)
           This is like the previous, but instead of enumerating suffixes, you give a Perl regular expression to
           match the ones you want.  The previous example would be "C.(ipp|tpp)" or "C.([it]pp)" in this syntax.

       C(regexp)
           Without a dot the Perl regular expression can match anywhere in the file  name.   If  it  includes  a
           slash,  it  will be tried against the fully qualified filename, otherwise only against the last part,
           without any directory.  So if you have C++ style suffixless headers in a directory named "include", use
           "C(include/)"  as  your signature method.  However the above suffix example would be quite nasty this
           way, "C(\.(?:ipp|tpp)$$)" or "C(\.[it]pp$$)" because "$" is the expansion character in makefiles.
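
       The three syntaxes above can be summarized by a small matcher.  This Python sketch only restates
       the rules in executable form; the function "method_applies" is hypothetical, it uses Python rather
       than Perl regular expressions, it takes the path as given rather than fully qualifying it, and it
       expects the expression as it looks after makefile expansion (a single "$").

```python
import os
import re


def method_applies(spec, path):
    """Sketch: does the extension `spec` (the text after "C") accept `path`?

    Examples of spec: ".ipp,tpp", ".([it]pp)", "(include/)".
    """
    name = os.path.basename(path)
    if spec.startswith(".(") and spec.endswith(")"):
        # C.(suffix-regexp): the expression must match the whole suffix.
        return re.search(r"\.(?:%s)$" % spec[2:-1], name) is not None
    if spec.startswith("."):
        # C.suffix1,suffix2: a plain comma-separated list of suffixes.
        return any(name.endswith("." + s) for s in spec[1:].split(","))
    if spec.startswith("(") and spec.endswith(")"):
        # C(regexp): match anywhere; use the whole path only if a slash is present.
        regexp = spec[1:-1]
        subject = path if "/" in regexp else name
        return re.search(regexp, subject) is not None
    raise ValueError("unrecognized extension syntax: %r" % spec)
```

       For instance, "method_applies('(include/)', '/proj/include/vector')" is true, while the same
       expression does not accept "/proj/src/vector".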

   Shortcomings
       Signature methods apply to all files of a rule.  So if you have a compiler that takes C-like source
       code and an XML configuration file, you either need a combined signature method that smartly handles
       both file types, or you must choose an existing method which will not know whether a change in the other
       file is significant.

       In the future, signature methods may instead be configured per filename pattern, optionally per command.

   Custom methods
       You  can,  if  you want, define your own methods for calculating file signatures and comparing them.  You
       will need to write a Perl module to do this.  Have a look at the comments in  "Mpp/Signature.pm"  in  the
       distribution, and also at the existing signature algorithms in "Mpp/Signature/*.pm" for details.

       Here are some cases where you might want a custom signature method:

       •   When  you  want  all  changes  in  a  file  to  be  ignored.  Say you always want dateStamp.o to be a
           dependency (to force a rebuild), but you don't want to rebuild if only dateStamp.o has changed.   You
           could  define  a  signature  method  that  inherits  from  "c_compilation_md5"  that  recognizes  the
           dateStamp.o file by its name, and always returns a constant value for that file.

       •   When you want to ignore part of a file.  Suppose that you have a program that generates a  file  that
           has  a  date  stamp  in it, but you don't want to recompile if only the date stamp has changed.  Just
           define a signature method similar to "c_compilation_md5" that understands your file format and  skips
           the parts you don't want to take into account.
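
       A real custom method has to be a Perl module as described above.  Purely to illustrate the second
       case, the following Python sketch computes a checksum that skips date stamp lines.  The function
       name "signature_ignoring_stamp" and the "// Generated on " marker are hypothetical; substitute
       whatever your generator actually writes.

```python
import hashlib


def signature_ignoring_stamp(data, marker=b"// Generated on "):
    """Sketch: MD5 over the content with date stamp lines left out.

    `marker` is a hypothetical prefix identifying the lines to ignore.
    """
    kept = [line for line in data.splitlines() if not line.startswith(marker)]
    return hashlib.md5(b"\n".join(kept)).hexdigest()
```

       Two generated files that differ only in the stamp line then compare equal, so nothing that depends
       on them would need to be rebuilt.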

perl v5.32.0                                       2021-01-06                               MAKEPP_SIGNATURES(1)