Provided by: makepp_2.0.98.5-2.1_all

NAME

       makepp_tutorial_compilation -- Unix compilation commands

DESCRIPTION

       Skip this manual page if you have a good grasp of what the compilation commands do.

       I find that distressingly few people seem to be taught in their programming classes how to go about
       compiling programs once they've written them.  Novices rely either on a single memorized command, or else
       on the builtin rules in make.  I have been surprised by extremely computer literate people who learned to
       compile without optimization because they simply never were told how important it is.  Rudimentary
       knowledge of how compilation commands work may make your programs run twice as fast or more, so it's
       worth at least five minutes.  This page describes just about everything you'll need to know to compile C
       or C++ programs on just about any variant of Unix.

       The examples will be mostly for C, since C++ compilation is identical except that the name of the
       compiler is different.  Suppose you're compiling source code in a file called "xyz.c" and you want to
       build a program called "xyz".  What must happen?

       You may know that you can build your program in one step, using a command like this:

           cc -g xyz.c -o xyz

       This will work, but it conceals a two-step process that you must understand if you are writing makefiles.
       (Actually, there are more than two steps, but you only have to understand two of them.)  For a program of
       more than one module, the two steps are usually explicitly separated.

   Compilation
       The first step is the translation of your C or C++ source code into a binary file called an object file.
       Object files usually have an extension of ".o". (For some more recent projects, ".lo" is also used for a
       slightly different kind of object file.)

       The command to produce an object file on Unix looks something like this:

           cc -g -c xyz.c -o xyz.o

       "cc" is the C compiler.  Sometimes alternate C compilers are used; a very common one is called "gcc".  A
       common C++ compiler is the GNU compiler, usually called "g++".  Virtually all C and C++ compilers on Unix
       have the same syntax for the rest of the command (at least for basic operations), so the only difference
       would be the first word.

       We'll explain what the "-g" option does later.

       The "-c" option tells the C compiler to produce a ".o" file as output.  (If you don't specify "-c", then
       it performs the second step, linking, automatically.)

       The "-o xyz.o" option tells the compiler what the name of the object file is.  You can omit this, as long
       as the name of the object file is the same as the name of the source file except for the ".o" extension.

       For the most part, the order of the options and the file names does not matter.  One important exception
       is that the output file must immediately follow "-o".
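
       As a rough sketch of the compilation step, the commands below create a throwaway "xyz.c" in a scratch
       directory and compile it to an object file, first with "-o" and then without.  The scratch directory
       and the contents of "xyz.c" are made up for illustration; only the "cc" commands come from the text
       above.

```shell
# Hypothetical scratch directory and file contents, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"

# A tiny stand-in for xyz.c.
cat > xyz.c <<'EOF'
#include <stdio.h>

int main(void)
{
    printf("hello from xyz\n");
    return 0;
}
EOF

# Compile only: "-c" stops before linking, "-o" names the object file.
cc -g -c xyz.c -o xyz.o

# Omitting "-o" produces the same default name, xyz.o.
rm xyz.o
cc -g -c xyz.c

ls -l xyz.o
```

       Either way you end up with "xyz.o"; no executable has been produced yet.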

   Linking
       The second step of building a program is called linking.  An object file cannot be run directly; it's an
       intermediate form that must be linked to other components in order to produce a program.  Other
       components might include:

       •   Libraries.   A  library,  roughly  speaking,  is  a collection of object modules that are included as
           necessary.  For example, if your program calls the "printf" function,  then  the  definition  of  the
           "printf"  function  must  be  included  from  the system C library.  Some libraries are automatically
           linked into your program (e.g., the one containing "printf") so you never need to worry about them.

       •   Object files derived from other source files in your program.  If you write your program so  that  it
           actually  has  several source files, normally you would compile each source file to a separate object
           file and then link them all together.

       The linker is the program responsible for taking a collection of object files and libraries  and  linking
       them together to produce an executable file.  The executable file is the program you actually run.

       The command to link the program looks something like this:

           cc -g xyz.o -o xyz

       It  may  seem odd, but we usually run the same program ("cc") to perform the linking.  What happens under
       the surface is that the "cc" program immediately passes off control to a different program  (the  linker,
       sometimes  called  the  loader,  or  "ld")  after adding a number of complex pieces of information to the
       command line.  For example, "cc" tells "ld" where the system library is that includes the  definition  of
       functions  like  "printf".   Until  you  start  writing shared libraries, you usually do not need to deal
       directly with "ld".

       If you do not specify "-o xyz", then the output file will be called "a.out", which seems to me  to  be  a
       completely useless and confusing convention.  So always specify "-o" on the linking step.

       If  your  program  has  more  than  one  object file, you should specify all the object files on the link
       command.

   Why you need to separate the steps
       Why not just use the simple, one-step command, like this:

           cc -g xyz.c -o xyz

       instead of the more complicated two-stage compilation

           cc -g -c xyz.c -o xyz.o
           cc -g xyz.o -o xyz

       if internally the first is converted into the second?  The difference is important only if there is  more
       than  one  module  in  your program.  Suppose we have an additional module, "abc.c".  Now our compilation
       looks like this:

           # One-stage command.
           cc -g xyz.c abc.c -o xyz

       or

           # Two-stage command.
           cc -g -c xyz.c -o xyz.o
           cc -g -c abc.c -o abc.o
           cc -g xyz.o abc.o -o xyz

       The first method, of course, is converted internally into  the  second  method.   This  means  that  both
       "xyz.c"  and  "abc.c"  are  recompiled  each  time  the command is run.  But if you only changed "xyz.c",
       there's no need to recompile "abc.c", so the second line of the two-stage commands does not  need  to  be
       done.   This  can  make  a huge difference in compilation time, especially if you have many modules.  For
       this reason, virtually all makefiles keep the two compilation steps separate.

       That's pretty much the basics, but there are a few more little details you really should know about.

   Debugging vs. optimization
       Usually programmers compile a program either for debugging or for speed.  Compilation for speed is
       called optimization; compiling with optimization can make your code run 5 times faster or more,
       depending on your code, your processor, and your compiler.

       With such dramatic gains possible, why would you ever not want to optimize?  The most important answer is
       that optimization makes use of a debugger much more difficult (sometimes impossible).  (If you don't know
       anything about a debugger, it's time to learn.  The half hour or hour you'll spend  learning  the  basics
       will be repaid many many times over in the time you'll save later when debugging.  I'd recommend starting
       with  a  GUI  debugger  like "kdbg", "ddd", or "gdb" run from within emacs (see the info pages on gdb for
       instructions on how to do this).)  Optimization reorders and  combines  statements,  removes  unnecessary
       temporary  variables,  and  generally  rearranges  your  code  so that it's very tough to follow inside a
       debugger.  The usual procedure is to write your code, compile it without optimization, debug it, and then
       turn on optimization.

       In order for the debugger to work, the compiler has to cooperate not only by not optimizing, but also  by
       putting information about the names of the symbols into the object file so the debugger knows what things
       are called.  This is what the "-g" compilation option does.

       If  you're  done  debugging, and you want to optimize your code, simply replace "-g" with "-O".  For many
       compilers, you can specify increasing levels of optimization by appending a number after "-O".   You  may
       also  be able to specify other options that increase the speed under some circumstances (possibly trading
       off with increased memory usage).  See your compiler's man page for details.  For  example,  here  is  an
       optimizing compile command that I use frequently with the "gcc" compiler:

           gcc -O6 -malign-double -c xyz.c -o xyz.o

       You may have to experiment with different optimization options for the absolute best performance.  You
       may need different options for different pieces of code.  Generally speaking, a simple optimization flag
       like "-O6" works with many compilers and usually produces pretty good results.  (With "gcc", any level
       above "-O3" is currently treated the same as "-O3".)

       Warning:  on  rare occasions, your program doesn't actually do exactly the same thing when it is compiled
       with optimization.  This may be due to (1) an invalid assumption you made in your code that was  harmless
       without  optimization,  but  causes problems because the compiler takes the liberty of rearranging things
       when you optimize; or (2) sadly, compilers have bugs too, including bugs in their optimizers.  For a
       stable compiler like "gcc" on a common platform like a Pentium, optimization bugs are seldom a problem
       (as of the year 2000--there were problems a few years ago).

       If you don't specify either "-g" or "-O" in your  compilation  command,  the  resulting  object  file  is
       suitable  neither  for  debugging nor for running fast.  For some reason, this is the default.  So always
       specify either "-g" or "-O".

       On some systems, you must supply "-g" on both the compilation and linking steps; on others (e.g.  Linux),
       it  needs  to  be  supplied  only on the compilation step.  On some systems, "-O" actually does something
       different in the linking phase, while on others, it has no effect.  In any case, it's always harmless  to
       supply "-g" or "-O" for both commands.

   Warnings
       Most  compilers are capable of catching a number of common programming errors (e.g., forgetting to return
       a value from a function that's supposed to return a value).  Usually, you'll want to  turn  on  warnings.
       How  you  do this depends on your compiler (see the man page), but with the "gcc" compiler, I usually use
       something like this:

           gcc -g -Wall -c xyz.c -o xyz.o

       (Sometimes I also add "-Wno-uninitialized" after "-Wall" to silence an uninitialized-variable warning
       that crops up when optimizing and is usually wrong.)

       These warnings have saved me many many hours of debugging.

   Other useful compilation options
       Often,  necessary  include  files  are  stored  in some directory other than the current directory or the
       system include directory (/usr/include).  This frequently happens when you are using a library that comes
       with include files to define the functions or classes.

       Suppose, for example, you are writing an application that uses the  Qt  libraries.   You've  installed  a
       local  copy of the Qt library in /home/users/joe/qt, which means that the include files are stored in the
       directory /home/users/joe/qt/include.  In your code, you want to be able to do things like this:

           #include <qwidget.h>

       instead of

           #include "/home/users/joe/qt/include/qwidget.h"

       You can tell the compiler to look  for  include  files  in  a  different  directory  by  using  the  "-I"
       compilation option:

           g++ -I/home/users/joe/qt/include -g -c mywidget.cpp -o mywidget.o

       There is usually no space between the "-I" and the directory name.

       When  the  C++  compiler  is  looking  for the file qwidget.h, it will look in /home/users/joe/qt/include
       before looking in the system include directory.  You can specify as many "-I" options as you want.

   Using libraries
       You will often have to tell the linker to link with specific external libraries, if you are  calling  any
       functions  that aren't part of the standard C library.  The "-l" (lowercase L) option says to link with a
       specific library:

           cc -g xyz.o -o xyz -lm

       "-lm" says to link with the system math library, which you will need if  you  are  using  functions  like
       "sqrt".

       Beware: if you specify more than one "-l" option, the order can make a difference on some systems.  If
       you are getting undefined symbol errors when you know you have included the library that defines them,
       you might try moving that library to the end of the command line, or even including it a second time at
       the end of the command line.

       Sometimes the libraries you will need are not stored in the default place for system libraries.   "-labc"
       searches for a file called libabc.a or libabc.so or libabc.sa in the system library directories (/usr/lib
       and  usually  a  few  other  places too, depending on what kind of Unix you're running).  The "-L" option
       specifies an additional directory to search for libraries.  To take  the  above  example  again,  suppose
       you've  installed  the  Qt  libraries  in  /home/users/joe/qt,  which means that the library files are in
       /home/users/joe/qt/lib.  Your link step for your program might look something like this:

           g++ -g test_mywidget.o mywidget.o -o test_mywidget -L/home/users/joe/qt/lib -lqt

       (On some systems, if you link in Qt you will need to add other libraries as well, e.g.,
       "-L/usr/X11R6/lib -lX11 -lXext".  What you need to do will depend on your system.)

       Note that there is no space between "-L" and the directory name.  The "-L" option usually goes before any
       "-l" options it's supposed to affect.

       How  do  you know which libraries you need?  In general, this is a hard question, and varies depending on
       what kind of Unix you are running.  The documentation for the functions or classes you are  using  should
       say  what  libraries  you  need  to  link  with.   If you are using functions or classes from an external
       package, there is usually a library you need to link with; the library will  usually  be  a  file  called
       "libabc.a" or "libabc.so" or "libabc.sa" if you need to add a "-labc" option.

   Some other confusing things
       You  may  have  noticed that it is possible to specify options which normally apply to compilation on the
       linking step, and options which normally apply to linking on the  compilation  step.   For  example,  the
       following commands are valid:

           cc -g -L/usr/X11R6/lib -c xyz.c -o xyz.o
           cc -g -I/somewhere/include xyz.o -o xyz

       The irrelevant options are ignored; the above commands are exactly equivalent to this:

           cc -g -c xyz.c -o xyz.o
           cc -g xyz.o -o xyz

perl v5.32.0                                       2021-01-06                     MAKEPP_TUTORIAL_COMPILATION(1)