Provided by: lmbench_3.0-a9+debian.1-6build3_amd64

NAME

       lmbench - benchmarking toolbox

SYNOPSIS

       #include "lmbench.h"

       typedef u_long iter_t

       typedef void (*benchmp_f)(iter_t iterations, void* cookie)

       void benchmp(benchmp_f  initialize, benchmp_f benchmark, benchmp_f cleanup, int enough, int parallel, int
       warmup, int repetitions, void* cookie)

       uint64    get_n()

       void milli(char *s, uint64 n)

       void micro(char *s, uint64 n)

       void nano(char *s, uint64 n)

       void mb(uint64 bytes)

       void kb(uint64 bytes)

DESCRIPTION

       Creating benchmarks using the lmbench timing harness is easy.  Since it is so easy to measure performance
       using lmbench, it is possible to quickly answer questions that arise during system design, development,
       or tuning.  For example, image processing

       There are two attributes that are critical for performance: latency and bandwidth.  lmbench's timing
       harness makes it easy to measure and report results for both.  Latency is usually important for
       frequently executed operations, and bandwidth is usually important when moving large chunks of data.

       There are a number of factors to consider when building benchmarks.

       The  timing  harness  requires  that  the  benchmarked operation be idempotent so that it can be repeated
       indefinitely.

       The timing subsystem, benchmp, is passed up to three function pointers.  Some benchmarks may need as  few
       as one function pointer (for benchmark).

       void benchmp(initialize, benchmark, cleanup, enough, parallel, warmup, repetitions, cookie)
              measures the performance of benchmark repeatedly and reports the median result.  benchmp creates
              parallel sub-processes which run benchmark in parallel.  This allows lmbench to measure the
              system's ability to scale as the number of client processes increases.  Each sub-process first
              calls initialize with iterations set to 0 before starting the benchmarking cycle.  It then calls
              initialize, benchmark, and cleanup, each with iterations set to the number of iterations in the
              timing loop, several times in order to collect repetitions results.  The calls to benchmark are
              surrounded by start and stop calls that time how long it takes to perform the benchmarked
              operation iterations times.  After all the benchmark results have been collected, cleanup is
              called with iterations set to 0 to clean up any resources which may have been allocated by
              initialize or benchmark.  cookie is a void pointer to a hunk of memory that can be used to store
              any parameters or state needed by the benchmark.
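
       The call order described above can be sketched without lmbench itself.  The block below is a
       minimal, single-process illustration only: toy_benchmp, struct call_counts, and the count_*
       callbacks are hypothetical names that are not part of lmbench, the sketch creates no
       sub-processes, it times with the ISO C clock() function (CPU time), and it takes the iteration
       count as a parameter instead of choosing one itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

typedef unsigned long iter_t;
typedef void (*benchmp_f)(iter_t iterations, void *cookie);

/* qsort comparator for doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* CPU time consumed so far, in microseconds (ISO C clock()). */
static double now_usecs(void)
{
    return (double)clock() / CLOCKS_PER_SEC * 1e6;
}

/*
 * Hypothetical stand-in for benchmp(): follows the documented call order
 * and returns the median time (microseconds) of one timing loop.
 * NULL initialize/cleanup callbacks are skipped.
 */
double toy_benchmp(benchmp_f initialize, benchmp_f benchmark,
                   benchmp_f cleanup, iter_t iterations,
                   int repetitions, void *cookie)
{
    double *results, median;
    int i;

    assert(benchmark != NULL && repetitions > 0);
    results = malloc(repetitions * sizeof(*results));
    if (results == NULL)
        exit(1);

    if (initialize) initialize(0, cookie);       /* before the cycle */
    for (i = 0; i < repetitions; i++) {
        double start;

        if (initialize) initialize(iterations, cookie);
        start = now_usecs();                     /* "start" */
        benchmark(iterations, cookie);
        results[i] = now_usecs() - start;        /* "stop" */
        if (cleanup) cleanup(iterations, cookie);
    }
    if (cleanup) cleanup(0, cookie);             /* after all results */

    qsort(results, repetitions, sizeof(*results), cmp_double);
    median = results[repetitions / 2];           /* middle element */
    free(results);
    return median;
}

/* Illustrative callbacks that only record how they were called. */
struct call_counts {
    int init_zero, init_loop, bench, clean_loop, clean_zero;
};

void count_init(iter_t iterations, void *cookie)
{
    struct call_counts *c = cookie;
    if (iterations == 0) c->init_zero++; else c->init_loop++;
}

void count_bench(iter_t iterations, void *cookie)
{
    struct call_counts *c = cookie;
    (void)iterations;
    c->bench++;
}

void count_cleanup(iter_t iterations, void *cookie)
{
    struct call_counts *c = cookie;
    if (iterations == 0) c->clean_zero++; else c->clean_loop++;
}
```

       Calling toy_benchmp(count_init, count_bench, count_cleanup, 1000, 5, &counts) with a zeroed
       struct call_counts leaves one zero-iteration call each for initialize and cleanup, and five
       calls each for the per-repetition initialize, benchmark, and cleanup.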

       void* benchmp_getstate()
              returns a void pointer to the lmbench-internal state used during benchmarking.  The state is not
              to be used or accessed directly by clients; it should only be passed to benchmp_interval.

       iter_t    benchmp_interval(void* state)
              returns the number of times the benchmark should execute its benchmark  loop  during  this  timing
              interval.   This  is used only for weird benchmarks which cannot implement the benchmark body in a
              function which can return, such as the page fault handler.  Please see lat_sig.c for sample usage.

       uint64    get_n()
              returns the number of times the benchmark loop body was executed during the timing interval.

       void milli(char *s, uint64 n)
              print the time per operation in milliseconds.  n is the number of operations during the
              timing interval, which is passed as a parameter because each pass through the loop body can
              contain several operations.

       void micro(char *s, uint64 n)
              print the time per operation in microseconds.

       void nano(char *s, uint64 n)
              print the time per operation in nanoseconds.

       void mb(uint64 bytes)
              print the bandwidth in megabytes per second.

       void kb(uint64 bytes)
              print the bandwidth in kilobytes per second.
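
       The arithmetic behind these reporting routines can be illustrated with a few standalone
       helpers.  This is a sketch under stated assumptions, not lmbench's code: the function names are
       hypothetical, the elapsed time is assumed to be available in microseconds, and a megabyte is
       assumed to be 10^6 bytes (a kilobyte 10^3), which this page does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Time per operation, given the elapsed microseconds and operation count. */
double ms_per_op(double elapsed_usecs, uint64_t n)
{
    assert(n > 0);
    return elapsed_usecs / 1000.0 / (double)n;
}

double us_per_op(double elapsed_usecs, uint64_t n)
{
    assert(n > 0);
    return elapsed_usecs / (double)n;
}

double ns_per_op(double elapsed_usecs, uint64_t n)
{
    assert(n > 0);
    return elapsed_usecs * 1000.0 / (double)n;
}

/* Bandwidth, given the elapsed microseconds and total bytes moved.
 * Assumption: decimal units (1 MB = 10^6 bytes, 1 KB = 10^3 bytes). */
double mb_per_sec(double elapsed_usecs, uint64_t bytes)
{
    assert(elapsed_usecs > 0.0);
    /* bytes per microsecond equals 10^6 bytes per second */
    return (double)bytes / elapsed_usecs;
}

double kb_per_sec(double elapsed_usecs, uint64_t bytes)
{
    assert(elapsed_usecs > 0.0);
    return (double)bytes / elapsed_usecs * 1000.0;
}
```

       For instance, 1000 operations that take 2000 microseconds in total cost 2 microseconds each, and
       moving 8388608 bytes in one second corresponds to roughly 8.39 MB per second in decimal units.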

USING lmbench

       Here is an example of a simple benchmark that  measures  the  latency  of  the  random  number  generator
       lrand48():

              #include "lmbench.h"

              void
              benchmark_lrand48(iter_t iterations, void* cookie) {
                   while(iterations-- > 0)
                        lrand48();
              }

              int
              main(int argc, char *argv[])
              {
                   benchmp(NULL, benchmark_lrand48, NULL, 0, 1, 0, TRIES, NULL);
                   micro("lrand48()", get_n());
                   exit(0);
              }

       Here is a simple benchmark that measures and reports the bandwidth of bcopy:

              #include "lmbench.h"

              #define MB (1024 * 1024)
              #define SIZE (8 * MB)

              struct _state {
                   int size;
                   char* a;
                   char* b;
              };

              void
              initialize_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   if (!iterations) return;
                   state->a = malloc(state->size);
                   state->b = malloc(state->size);
                   if (state->a == NULL || state->b == NULL)
                        exit(1);
              }

              void
              benchmark_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   while(iterations-- > 0)
                        bcopy(state->a, state->b, state->size);
              }

              void
              cleanup_bcopy(iter_t iterations, void* cookie) {
                   struct _state* state = (struct _state*)cookie;

                   if (!iterations) return;
                   free(state->a);
                   free(state->b);
              }

              int
              main(int argc, char *argv[])
              {
                   struct _state state;

                   state.size = SIZE;
                   benchmp(initialize_bcopy, benchmark_bcopy, cleanup_bcopy,
                        0, 1, 0, TRIES, &state);
                   mb(get_n() * state.size);
                   exit(0);
              }

       A  slightly  more  complex version of the bcopy benchmark might measure bandwidth as a function of memory
       size and parallelism.  The main procedure in this case might look something like this:

              int
              main(int argc, char *argv[])
              {
                   int  size, par;
                   struct _state state;

                   for (size = 64; size <= SIZE; size <<= 1) {
                        for (par = 1; par < 32; par <<= 1) {
                             state.size = size;
                             benchmp(initialize_bcopy, benchmark_bcopy,
                                  cleanup_bcopy, 0, par, 0, TRIES, &state);
                             fprintf(stderr, "%d\t%d\t", size, par);
                             mb(par * get_n() * state.size);
                        }
                   }
                   exit(0);
              }

VARIABLES

       There are three environment variables that can be used to modify the lmbench  timing  subsystem:  ENOUGH,
       TIMING_O, and LOOP_O.
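
       This page does not say how lmbench parses these variables or what units they use.  As a purely
       illustrative sketch, a program could read a numeric setting from the environment with a fallback
       as shown below; parse_setting and env_setting are hypothetical helpers, not lmbench functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse text as a non-negative base-10 integer; return fallback when text
 * is NULL, empty, malformed, negative, or out of range. */
long parse_setting(const char *text, long fallback)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return fallback;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 0)
        return fallback;
    return value;
}

/* Look up an environment variable such as ENOUGH and parse it. */
long env_setting(const char *name, long fallback)
{
    assert(name != NULL);
    return parse_setting(getenv(name), fallback);
}
```

       With this helper, env_setting("ENOUGH", 5000) yields 5000 when ENOUGH is unset or not a clean
       number, and the parsed value otherwise.  The default of 5000 is an arbitrary placeholder.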

FUTURES

       Development of lmbench is continuing.

SEE ALSO

       lmbench(8), timing(3), reporting(3), results(3).

AUTHOR

       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1998-2000 Larry McVoy and Carl Staelin            $Date:$                                          LMBENCH(3)