Ubuntu Manpage: PDL::Indexing - Introduction to indexing and slicing ndarrays.

NAME

       PDL::Indexing - Introduction to indexing and slicing ndarrays.

OVERVIEW

This man page should serve as a first tutorial on the indexing and broadcasting features of PDL.

Like all vectorized languages, PDL automates looping over multi-dimensional data structures ("ndarrays")
using a variant of mathematical vector notation. The automatic looping is called "broadcasting", similar
to NumPy and Julia. PDL also automatically runs broadcasting computation in parallel - see
PDL::ParallelCPU.

A lot of the flexibility and power of PDL relies on the indexing and broadcasting features of the Perl
extension. Indexing allows access to the data of an ndarray in a very flexible way. Broadcasting
provides efficient vectorization of simple operations.

The values of an ndarray are stored compactly as typed values in a single block of memory, not (as in a
normal Perl list-of-lists) as individual Perl scalars.

In the sections that follow many "methods" are called out -- these are Perl operators that apply to
ndarrays. From the perldl shell, you can find out more about each method by typing "?" followed by the
method name.

Dimension lists
A ndarray (PDL variable), in general, is an N-dimensional array where N can be 0 (for a scalar), 1 (e.g.
for a sound sample), or higher values for images and more complex structures. Each dimension of the
ndarray has a positive integer size. The "perl" interpreter treats each ndarray as a special type of
Perl scalar (a blessed Perl object, actually -- but you don't have to know that to use them) that can be
used anywhere you can put a normal scalar.

You can access the dimensions of an ndarray as a Perl list and otherwise determine the size of an ndarray
with several methods. The important ones are:

nelem - the total number of elements in an ndarray
ndims - returns the number of dimensions in an ndarray
dims - returns the dimension list of an ndarray as a Perl list
dim - returns the size of a particular dimension of an ndarray

Indexing and Dataflow
PDL maintains a notion of "dataflow" between an ndarray and indexed subfields of that ndarray. When you
produce an indexed subfield or single element of a parent ndarray, the child and parent remain attached
until you manually disconnect them. This lets you represent the same data different ways within your
code -- for example, you can consider an RGB image simultaneously as a collection of (R,G,B) values in a
3 x 1000 x 1000 image, and as three separate 1000 x 1000 color planes stored in different variables.
Modifying any of the variables changes the underlying memory, and the changes are reflected in all
representations of the data.

There are two important methods that let you control dataflow connections between a child and parent
ndarray:

copy - forces an explicit copy of an ndarray
sever - breaks the dataflow connection between an ndarray and its parents (if any)

Broadcasting and Dimension Order
Most PDL operations act on the first few dimensions of their ndarray arguments. For example, "sumover"
sums all elements along the first dimension in the list (dimension 0). If you feed in a three-
dimensional ndarray, then the first dimension is considered the "active" dimension and the later
dimensions are "broadcast" dimensions because they are simply looped over. There are several ways to
transpose or re-order the dimension list of an ndarray. Those techniques are very fast since they don't
touch the underlying data, only change the way that PDL accesses the data. The main dimension ordering
functions are:

mv - moves a particular dimension somewhere else in the dimension list
xchg - exchanges two dimensions in the dimension list, leaving the rest alone
reorder - allows wholesale mixing of the dimensions
clump - clumps together two or more small dimensions into one larger one
squeeze - eliminates any dimensions of size 1

Physical and Dummy Dimensions
• document Perl level broadcasting (see "broadcast_define" in PDL::Core)

• broadcastids (see "Explicit broadcasting")

• "slice" in PDL::Slices is crucial, useful in niches are "lags" in PDL::Slices and "splitdim" in
PDL::Slices

Indexing and broadcasting with PDL

       A  lot  of  the  flexibility  and  power  of  PDL relies on the indexing and looping features of the Perl
       extension. Indexing allows access to the data of an ndarray in a very flexible way. Broadcasting provides
       efficient implicit looping functionality (since the loops are implemented as optimized C code).

       ndarrays are Perl objects that represent multidimensional arrays and operations on those. In contrast  to
       simple  Perl @x style lists the array data is compactly stored in a single block of memory thus taking up
       a lot less memory and enabling use of fast C  code  to  implement  operations  (e.g.  addition,  etc)  on
       ndarrays.

   ndarrays can have children
       Central  to  many  of  the  indexing capabilities of PDL are the relation of "parent" and "child" between
       ndarrays. Many of the indexing commands create a new ndarray from an existing ndarray. The new ndarray is
       the "child" and the old one is the "parent". The data of the new ndarray is defined by  a  transformation
       that specifies how to generate (compute) its data from the parent's data.

       Another  way  to  interpret  the  ndarrays  created by our indexing commands is to view them as a kind of
       intelligent pointer that points back to some portion or all of its parent's data. Therefore,  it  is  not
       surprising  that  the parent's data (or a portion of it) changes when manipulated through this "pointer".
       We will further illustrate the pointer/dataflow analogies in the context of some of  the  examples  later
       on.

       There are two different implementations of this ``smart pointer'' relationship: the first one, which is a
       little  slower but works for any transformation is simply to do the transformation forwards and backwards
       as necessary. The other is to consider the child ndarray a  ``virtual''  ndarray,  which  only  stores  a
       pointer  to  the  parent  and  access  information  so that routines which use the child ndarray actually
       directly access the data in the parent.  If the virtual ndarray is given to a routine  which  cannot  use
       it, PDL transparently physicalizes the virtual ndarray before letting the routine use it.

       Currently (still true as of 2.086) all transformations which are ``affine'', i.e. the indices of the data
       item in the parent ndarray are determined by a linear transformation (+ constant) from the indices of the
       child  ndarray  result  in  virtual  affine,  or  "vaffine"  ndarrays.  All other indexing routines (e.g.
       "->index(...)") result in physical ndarrays.  All routines compiled by PP  can  accept  affine  ndarrays,
       except  those  routines  that  pass  pointers  to  external  library  functions, and those will simply be
       "physicalised" (memory allocated, data copied in).  There will then be dataflow  back  and  forth,  in  a
       lazy-evaluation way.  See PDL::Dataflow for more.

       Note  that  whether  something  is affine or not does not affect the semantics of what you do in any way:
       both

        $x->index(...) .= 5;
        $x->slice(...) .= 5;

       change the data in $x. The affinity does,  however,  have  a  significant  impact  on  memory  usage  and
       performance.

   Slicing ndarrays
       Probably  the most important application of the concept of parent/child ndarrays is the representation of
       rectangular slices of a physical ndarray by a virtual ndarray. Having talked long enough  about  concepts
       let's  get  more  specific.  Suppose  we  are  working  with  a 2D ndarray representing a 5x5 image (it's
       unusually small so that we can print it without filling several screens full of digits).

        pdl> $im = sequence(5,5)
        pdl> p $im

        [
         [ 0  1  2  3  4]
         [ 5  6  7  8  9]
         [10 11 12 13 14]
         [15 16 17 18 19]
         [20 21 22 23 24]
        ]

        pdl> help vars
        PDL variables in package main::

        Name         Type   Dimension       Flow  State          Mem
        ----------------------------------------------------------------
        $im          Double D [5,5]                P            0.20Kb

       The interactive "perldl" shells that come with PDL have a "help vars" command that  provides  information
       about ndarrays. Use "help help" for more.

       Now suppose we want to create a 1-D ndarray that just references one line of the image, say line 2; or an
       ndarray  that represents all even lines of the image (imagine we have to deal with even and odd frames of
       an interlaced image due to some peculiar behaviour of our frame grabber). As another frequent application
       of slices we might want to create an ndarray that represents a rectangular region of the image  with  top
       and  bottom  reversed.  All  these effects (and many more) can be easily achieved with the powerful slice
       function:

        pdl> $line = $im->slice(':,(2)')
        pdl> $even = $im->slice(':,1:-1:2')
        pdl> $area = $im->slice('3:4,3:1')
        pdl> help vars  # or just PDL->vars
        PDL variables in package main::

        Name         Type   Dimension       Flow  State          Mem
        ----------------------------------------------------------------
        $even        Double D [5,2]                -C           0.00Kb
        $im          Double D [5,5]                P            0.20Kb
        $line        Double D [5]                  -C           0.00Kb
        $area        Double D [2,3]                -C           0.00Kb

       All three "child" ndarrays are children of $im  or  in  the  other  (largely  equivalent)  interpretation
       pointers  to data of $im.  Operations on those virtual ndarrays access only those portions of the data as
       specified by the argument to slice. So we can just print line 2:

        pdl> p $line
        [10 11 12 13 14]

       Also note the difference in the "Flow State" of $area above and below:

        pdl> p $area
        pdl> help $area
        This variable is Double D [2,3]                VC           0.00Kb

       The following demonstrates that $im and $line really behave as  you  would  expect  from  a  pointer-like
       object (or in the dataflow picture: the changes in $line's data are propagated back to $im):

        pdl> $im++
        pdl> p $line
        [11 12 13 14 15]
        pdl> $line += 2
        pdl> p $im

        [
         [ 1  2  3  4  5]
         [ 6  7  8  9 10]
         [13 14 15 16 17]
         [16 17 18 19 20]
         [21 22 23 24 25]
        ]

       Note  how assignment operations on the child virtual ndarrays change the parent physical ndarray and vice
       versa (however, the basic "=" assignment doesn't, use ".=" to obtain  that  effect.  See  below  for  the
       reasons).   The  virtual child ndarrays are something like "live links" to the "original" parent ndarray.
       As previously said, they can be thought of to work similar to a C-pointer. But in contrast to a C-pointer
       they carry a lot more information. Firstly, they specify the structure of the data  they  represent  (the
       dimensionality  of  the  new ndarray) and secondly, specify how to create this structure from its parents
       data (the way this works is buried in the internals of PDL and not  important  for  you  to  know  anyway
       (unless  you  want  to  hack  the core in the future or would like to become a PDL guru in general (for a
       definition of this strange creature see PDL::Internals)).

       The previous examples  have  demonstrated  typical  usage  of  the  slice  function.  Since  the  slicing
       functionality is so important here is an explanation of the syntax for the string argument to slice:

        $vpdl = $x->slice('ind0,ind1...')

       where  "ind0"  specifies  what  to  do  with index No 0 of the ndarray $x, etc. Each element of the comma
       separated list can have one of the following forms:

       ':'   Use the whole dimension

       'n'   Use only index "n". The dimension of this index in the resulting virtual ndarray is 1.  An  example
             involving those first two index formats:

              pdl> $column = $im->slice('2,:')
              pdl> $row = $im->slice(':,0')
              pdl> p $column

              [
               [ 3]
               [ 8]
               [15]
               [18]
               [23]
              ]

              pdl> p $row

              [
               [1 2 3 4 5]
              ]

              pdl> help $column
              This variable is Double D [1,5]                VC           0.00Kb

              pdl> help $row
              This variable is Double D [5,1]                VC           0.00Kb

       '(n)' Use  only index "n". This dimension is removed from the resulting ndarray (relying on the fact that
             a dimension of size 1 can always be removed). The distinction between this case  and  the  previous
             one  becomes  important  in  assignments  where  left  and right hand side have to have appropriate
             dimensions.

              pdl> $line = $im->slice(':,(0)')
              pdl> help $line
              This variable is Double D [5]                  -C           0.00Kb

              pdl> p $line
              [1 2 3 4 5]

             Spot the difference to the previous example?

       'n1:n2' or 'n1:n2:n3'
             Take the range of indices from "n1" to "n2" or (second form) take the range of indices from "n1" to
             "n2" with step "n3". An example for the use of this format is the previous definition of  the  sub-
             image composed of even lines.

              pdl> $even = $im->slice(':,1:-1:2')

             This example also demonstrates that negative indices work like they do for normal Perl style arrays
             by  counting  backwards from the end of the dimension. If "n2" is smaller than "n1" (in the example
             -1 is equivalent to index 4) the elements in the virtual  ndarray  are  effectively  reverted  with
             respect to its parent.

       '*[n]'
             Add  a  dummy  dimension.  The  size  of this dimension will be 1 by default or equal to "n" if the
             optional numerical argument is given.

             Now, this is really something a bit strange on first sight. What is  a  dummy  dimension?  A  dummy
             dimension  inserts  a dimension where there wasn't one before. How is that done ? Well, in the case
             of the new dimension having size 1 it can be easily explained by the way in which you can  identify
             a vector (with "m" elements) with an "(1,m)" or "(m,1)" matrix. The same holds obviously for higher
             dimensional  objects.  More  interesting is the case of a dummy dimensions of size greater than one
             (e.g. "slice('*5,:')"). This works in the same way as a call to the dummy function  creates  a  new
             dummy dimension.  So read on and check its explanation below.

       Generalised diagonals

       With  "diagonal"  in  PDL::Slices  you  can  make  generalised  diagonals.  The best way to explain it is
       probably to give an example, here we make an ndarray that refers to the elements along the space diagonal
       of its parent ndarray (a cube):

        $cube = zeroes(5,5,5);
        $sdiag = $cube->diagonal(1..3);

       The above command creates a virtual ndarray that represents the diagonal along the parents' dimension no.
       0, 1 and 2 and makes its dimension 0 (the only dimension) of it. You  use  the  extended  syntax  if  the
       dimension  sizes of the parent dimensions you want to build the diagonal from have different sizes or you
       want to reverse the sequence of elements in the diagonal, e.g.

        $rect = zeroes(12,3,5,6,2);
        $vpdl = $rect->slice('2:7,0:1,(4),5:4'); # now 6,2,2,2
        $vpdl = $vpdl->diagonal(1..3); # now 6,2

       So the elements of $vpdl will then be related to those of its parent in way we can express as:

         vpdl(i,j) = rect(i+2,j,4,5-j,j)       0<=i<6, 0<=j<2

   There are different kinds of assignments in PDL
       The previous examples have already shown that virtual ndarrays can  be  used  to  operate  on  or  access
       portions of data of a parent ndarray. They can also be used as lvalues in assignments (as the use of "++"
       in some of the examples above has already demonstrated). For explicit assignments to the data represented
       by  a  virtual  ndarray  you  have  to  use  the  overloaded ".=" operator (which in this context we call
       propagated assignment). Why can't you use the normal assignment operator "="?

       Well, you definitely still can use the '=' operator but it wouldn't do what you want.  In  recent  Perls,
       the  '='  operator  can  be  overloaded  in  the  same  way  as other assignment operators, but for back-
       compatibility reasons PDL uses '=' for a reference copy, so the variable representing the virtual ndarray
       (a reference to a blessed thingy) would after the  assignment  just  contain  the  reference  to  another
       blessed thingy which would behave to future assignments as a "physical" copy of the original rvalue.

       E.g.

        pdl> $line = $im->slice(':,(2)')
        pdl> $line = zeroes(5);
        pdl> $line++;
        pdl> p $im

        [
         [ 1  2  3  4  5]
         [ 6  7  8  9 10]
         [13 14 15 16 17]
         [16 17 18 19 20]
         [21 22 23 24 25]
        ]

        pdl> p $line
        [1 1 1 1 1]

       But using ".="

        pdl> $line = $im->slice(':,(2)')
        pdl> $line .= zeroes(5)
        pdl> $line++
        pdl> p $im

        [
         [ 1  2  3  4  5]
         [ 6  7  8  9 10]
         [ 1  1  1  1  1]
         [16 17 18 19 20]
         [21 22 23 24 25]
        ]

        pdl> print $line
        [1 1 1 1 1]

       Also, you can substitute

        pdl> $line .= 0;

       for  the  assignment  above  (the  zero is converted to a scalar ndarray, with no dimensions so it can be
       assigned to any ndarray).

       A nice feature in recent perl versions is lvalue subroutines (i.e., versions 5.6.x and  higher  including
       all  perls  currently  supported by PDL).  That allows one to use the slicing syntax on both sides of the
       assignment:

        pdl> $im->slice(':,(2)') .= xvals(float, 5)

       Note that there can be a problem with assignments like this when lvalue  and  rvalue  ndarrays  refer  to
       overlapping portions of data in the parent ndarray:

        # revert the elements of the first line of $x
        $x->slice(':,(1)') .= $x->slice('-1:0,(1)');

       Currently,  the  parent  data  on  the  right side of the assignments is not copied before the (internal)
       assignment loop proceeds. Therefore, the outcome of this assignment will depend on the sequence in  which
       elements  are  assigned  and  almost  certainly  not  do what you wanted.  So the semantics are currently
       undefined for now and liable to change anytime. To obtain the desired behaviour, use

        $x->slice(':,(1)') .= $x->slice('-1:0,(1)')->copy;

       which makes a physical copy of the slice or

        $x->slice(':,(1)') .= $x->slice('-1:0,(1)')->sever;

       which returns the same slice but first severs the connection of the slice to its parent.

   Other functions that manipulate dimensions
       Having talked extensively about the slice function it should be noted that  this  is  not  the  only  PDL
       indexing  function.  There  are  additional  indexing  functions which are also useful (especially in the
       context of broadcasting which we will talk about later). Here are a list and some  examples  how  to  use
       them.

       "dummy"
           inserts  a dummy dimension of the size you specify (default 1) at the chosen location. You can't wait
           to hear how that is achieved?  Well, all elements  with  index  "(X,x,Y)"  ("0<=x<size_of_dummy_dim")
           just  map  to  the  element  with index "(X,Y)" of the parent ndarray (where "X" and "Y" refer to the
           group of indices before and after the location where the dummy dimension was inserted.)

           This example calculates the x coordinate of the centroid of an image (later we  will  learn  that  we
           didn't  actually  need  the  dummy  dimension thanks to the magic of implicit broadcasting; but using
           dummy dimensions the code would also work in a broadcast-less world; though once you have worked with
           PDL broadcasting you wouldn't want to live without them again).

            # centroid
            ($xd,$yd) = $im->dims;
            $xc = sum($im*xvals($xd)->dummy(1,$yd))/sum($im);

           Let's explain how that works in a little more detail. First, the product:

            $xvs = xvals($xd);
            print $xvs->dummy(1,$yd);      # repeat the line $yd times
            $prod = $im*xvs->dummy(1,$yd); # form the pixel-wise product with
                                           # the repeated line of x-values

           The rest is then summing the results of the pixel-wise product together and normalizing with the  sum
           of  all  pixel  values  in  the original image thereby calculating the x-coordinate of the "center of
           mass" of the image (interpreting pixel values as local mass) which is known as  the  centroid  of  an
           image.

           Next  is  a  (from  the point of view of memory consumption) very cheap conversion from grey-scale to
           RGB, i.e. every pixel holds now a triple of values instead of a  scalar.  The  three  values  in  the
           triple  are, fortunately, all the same for a grey image, so that our trick works well in that it maps
           all the three members of the triple to the same source element:

            # a cheap grey-scale to RGB conversion
            $rgb = $grey->dummy(0,3)

           Unfortunately this trick cannot be used to convert your old B/W photos to color ones in the way you'd
           like. :(

           Note that the memory usage of ndarrays with dummy dimensions is especially sensitive to the  internal
           representation. If the ndarray can be represented as a virtual affine (``vaffine'') ndarray, only the
           control structures are stored. But if $y in

            $x = zeroes(10000);
            $y = $x->dummy(1,10000);

           is  made  physical  by some routine, you will find that the memory usage of your program has suddenly
           grown by 100Mb.

       "diagonal"
           replaces two dimensions (which have to be of equal size) by one dimension  that  references  all  the
           elements  along  the  "diagonal"  along those two dimensions. Here, we have two examples which should
           appear familiar to anyone who has ever done some linear algebra. Firstly, make a unity matrix:

            # unity matrix
            $e = zeroes(float, 3, 3); # make everything zero
            $e->diagonal(0,1) .= 1; # set the elements along the diagonal to 1
            print $e;

           Or the other diagonal:

            $e->slice(':-1:0')->diagonal(0,1) .= 2;
            print $e;

           (Did you notice how we used the slice function to reverse the sequence of lines  before  setting  the
           diagonal of the new child, thereby setting the cross diagonal of the parent ?)  Or a mapping from the
           space of diagonal matrices to the field over which the matrices are defined, the trace of a matrix:

            # trace of a matrix
            $trace = sum($mat->diagonal(0,1));  # sum all the diagonal elements

       "xchg" and "mv"
           xchg exchanges or "transposes" the two  specified dimensions.  A straightforward example:

            # transpose a matrix (without explicitly reshuffling data and
            # making a copy)
            $prod = $x x $x->xchg(0,1);

           $prod should now be pretty close to the unity matrix if $x is an orthogonal matrix. Often "xchg" will
           be used in the context of broadcasting but more about that later.

           mv  works in a similar fashion. It moves a dimension (specified by its number in the parent) to a new
           position in the new child ndarray:

            $y = $x->mv(4,0);  # make the 5th dimension of $x the first in the
                               # new child $y

           The difference between "xchg" and "mv" is that "xchg" only changes the  position  of  two  dimensions
           with  each  other,  whereas "mv" inserts the first dimension to the place of second, moving the other
           dimensions around accordingly.

       "clump"
           collapses several dimensions into one. Its only argument specifies how many dimensions of the  source
           ndarray  should  be  collapsed (starting from the first). An (admittedly unrealistic) example is a 3D
           ndarray which holds data from a stack of image files that you have just read in.  However,  the  data
           from each image really represents a 1D time series and has only been arranged that way because it was
           digitized with a frame grabber. So to have it again as an array of time sequences you say

            pdl> $seqs = $stack->clump(2)
            pdl> help vars
            PDL variables in package main::

            Name         Type   Dimension       Flow  State          Mem
            ----------------------------------------------------------------
            $seqs        Double D [8000,50]            -C           0.00Kb
            $stack       Double D [100,80,50]          P            3.05Mb

           Unrealistic  as  it  may seem, our confocal microscope software writes data (sometimes) this way. But
           more often you use clump to achieve a certain effect when using implicit or explicit broadcasting.

   Calls to indexing functions can be chained
       As you might have noticed in some of the examples above calls to the indexing  functions  can  be  nicely
       chained  since  all of these functions return a newly created child object. However, when doing extensive
       index manipulations in a chain be sure to keep track of what you are doing, e.g.

        $x->xchg(0,1)->mv(0,4)

       moves the dimension 1 of $x to position 4  since  when  the  second  command  is  executed  the  original
       dimension  1  has been moved to position 0 of the new child that calls the "mv" function. I think you get
       the idea (in spite of my convoluted explanations).

   Propagated assignments ('.=') and dummy dimensions
       A subtlety related to indexing is the assignment to ndarrays containing dummy dimensions of size  greater
       than  1.  These assignments (using ".=") are forbidden since several elements of the lvalue ndarray point
       to the same element of the parent. As a consequence the value of those parent  elements  are  potentially
       ambiguous and would depend on the sequence in which the implementation makes the assignments to elements.
       Therefore, an assignment like this:

        $x = pdl [1,2,3];
        $y = $x->dummy(1,4);
        $y .= yvals(3,4);

       can  produce unexpected results and the results are explicitly undefined, and will (as of 2.086) throw an
       error.

       From the point of view of dataflow the introduction of greater-size-than-one dummy dimensions is regarded
       as an irreversible transformation (similar to the terminology in thermodynamics) which precludes backward
       propagation of assignment to a parent (which you had explicitly requested using the ".="  assignment).  A
       similar  problem  to watch out for occurs in the context of broadcasting where sometimes dummy dimensions
       are created implicitly during the broadcast loop (see below).

   Reasons for the parent/child (or "pointer") concept
       It is very memory efficient, with only metadata existing in the child.  It provides a very  flexible  and
       powerful way of accessing portions of ndarray data.

   How to make things physical again
       A "virtual affine" (or "vaffine") ndarray will be automatically "physicalised" (have memory allocated and
       its  data copied in) if that is requested with the "make_physical" in PDL::Core method, if it is severed,
       or if it is passed to an operation as a parameter that is marked as "[phys]". See PDL::PP for  more.   It
       will still have dataflow except in the "sever" case.

Broadcasting

       In  the previous paragraph on indexing we have already mentioned the term occasionally but now its really
       time to talk explicitly about "broadcasting" with ndarrays: within the framework of PDL it could probably
       be loosely defined as an implicit looping facility. It is implicit because  you  don't  specify  anything
       like  enclosing  for-loops but rather the loops are automatically (or 'magically') generated by PDL based
       on the dimensions of the ndarrays involved. This should give you a first  idea  why  the  index/dimension
       manipulating functions you have met in the previous paragraphs are especially important and useful in the
       context  of  broadcasting.  The other ingredient for broadcasting (apart from the ndarrays involved) is a
       function that is broadcasting aware (generally, these  are  PDL::PP  compiled  functions)  and  that  the
       ndarrays  are  "broadcast"  over.   So much about the terminology and now let's try to shed some light on
       what it all means.

   Implicit broadcasting - a first example
       There are two slightly different  variants  of  broadcasting.  We  start  with  what  we  call  "implicit
       broadcasting". Let's pick a practical example that involves looping of a function over many elements of a
       ndarray. Suppose we have an RGB image that we want to convert to grey-scale. The RGB image is represented
       by  a  3-dim  ndarray  "im(3,x,y)"  where the first dimension contains the three color components of each
       pixel and "x" and "y" are width and height of the image, respectively. Next we need  to  specify  how  to
       convert  a color-triple at a given pixel into a grey-value (to be a realistic example it should represent
       the relative intensity with which our color insensitive eye cells would detect that color to achieve what
       we would call a natural conversion from color to grey-scale). An approximation that works quite  well  is
       to compute the grey intensity from each RGB triplet (r,g,b) as a weighted sum

        grey-value = 77/256*r + 150/256*g + 29/256*b =
            inner([77,150,29]/256, [r,g,b])

       where  the  last form indicates that we can write this as an inner product of the 3-vector comprising the
       weights for  red,  green  and  blue  components  with  the  3-vector  containing  the  color  components.
       Traditionally, we might have written a function like the following to process the whole image:

        my @dims=$im->dims;
        # here normally check that first dim has correct size (3), etc
        $grey=zeroes(@dims[1,2]);   # make the ndarray for the resulting grey image
        $w = pdl [77,150,29] / 256; # the vector of weights
        for ($j=0;$j<dims[2];$j++) {
           for ($i=0;$i<dims[1];$i++) {
               # compute the pixel value
               $tmp = inner($w,$im->slice(':,(i),(j)'));
               set($grey,$i,$j,$tmp); # and set it in the grey-scale image
           }
        }

       Now we write the same using broadcasting (noting that "inner" is a broadcasting aware function defined in
       the PDL::Primitive package)

        $grey = inner($im,pdl([77,150,29]/256));

       We  have ended up with a one-liner that automatically creates the ndarray $grey with the right number and
       size of dimensions and performs the loops automatically (these loops are implemented as fast  C  code  in
       the internals of PDL).  Well, we still owe you an explanation how this 'magic' is achieved.

   How does the example work ?
       The  first  thing  to note is that every function that is broadcasting aware (these are without exception
       functions compiled from concise descriptions by  PDL::PP,  later  just  called  PP-functions)  expects  a
       defined (minimum) number of dimensions (we call them core dimensions) from each of its ndarray arguments.
       The  inner  function  expects  two  one-dimensional  (input)  parameters from which it calculates a zero-
       dimensional (output) parameter.  We  write  that  symbolically  as  "inner((n),(n),[o]())"  and  call  it
       "inner"'s signature, where n represents the size of that dimension. n being equal in the first and second
       parameter  means  that those dimensions have to be of equal size in any call. As a different example take
       the outer product which  takes  two  1D  vectors  to  generate  a  2D  matrix,  symbolically  written  as
       "outer((n),(m),[o](n,m))".  The  "[o]"  in  both examples indicates that this (here third) argument is an
       output argument. In the latter example the dimensions of first and second argument don't  have  to  agree
       but you see how they determine the size of the two dimensions of the output ndarray.

       Here  is the point when broadcasting finally enters the game. If you call PP-functions with ndarrays that
       have more than the required core dimensions the first dimensions of the ndarray arguments are used as the
       core dimensions and the additional extra dimensions are broadcast over. Let  us  demonstrate  this  first
       with our example above

        $grey = inner($im,$w); # w is the weight vector from above

       In  this  case  $w is 1D and so supplied just the core dimension, $im is 3D, more specifically "(3,x,y)".
       The first dimension (of size 3) is the required core dimension that matches (as required  by  inner)  the
       first (and only) dimension of $w. The second dimension is the first broadcast dimension (of size "x") and
       the  third  is  here  the  second  broadcast dimension (of size "y"). The output ndarray is automatically
       created (as requested by setting $grey to "null" prior to invocation). The output dimensions are obtained
       by appending the loop dimensions (here "(x,y)") to the core output dimensions  (here  0D)  to  yield  the
       final dimensions of the auto-created ndarray (here "0D+2D=2D" to yield a 2D output of size "(x,y)").

       So the above command calls the core functionality that computes the inner product of two 1D vectors "x*y"
       times  with  $w  and all 1D slices of the form "(':,(i),(j)')" of $im and sets the respective elements of
       the output ndarray "$grey(i,j)" to the result of each computation. We could write that symbolically as

        $grey(0,0) = f($w,$im(:,(0),(0)))
        $grey(1,0) = f($w,$im(:,(1),(0)))
            .
            .
            .
        $grey(x-2,y-1) = f($w,$im(:,(x-2),(y-1)))
        $grey(x-1,y-1) = f($w,$im(:,(x-1),(y-1)))

       But this is done automatically by PDL without writing any explicit Perl loops.  We see that  the  command
       really  creates an output ndarray with the right dimensions and sets the elements indeed to the result of
       the computation for each pixel of the input image.

       When even more ndarrays and extra dimensions are involved things get a  bit  more  complicated.  We  will
       first  give  the  general  rules  how the broadcast dimensions depend on the dimensions of input ndarrays
       enabling you to figure out the dimensionality of an auto-created output ndarray (for  any  given  set  of
       input  ndarrays  and  core dimensions of the PP-function in question). The general rules will most likely
       appear a bit confusing on first sight so that we'll set out to illustrate the usage with a set of further
       examples (which will hopefully also demonstrate that there are indeed  many  practical  situations  where
       broadcasting comes in extremely handy).

   A call for coding discipline
       Before  we  point  out the other technical details of broadcasting, please note this call for programming
       discipline when using broadcasting:

       In order to preserve human readability, PLEASE comment any nontrivial expression in your  code  involving
       broadcasting.   Most importantly, for any subroutine, include information at the beginning about what you
       expect the dimensions to represent (or ranges of dimensions).

       As a warning, look at this undocumented function and try to guess what might be going on:

        sub lookup {
          my ($im,$palette) = @_;
          $palette->xchg(0,1)->index($im->indx->dummy(0,$palette->dim(0)));
        }

       Would you agree that it might be difficult to figure out expected dimensions, purpose of the routine, etc
       ?  (If you want to find out what this piece of code does, see below)

   How to figure out the loop dimensions
       There are a couple of rules that allow you to figure out number and size of loop dimensions (and  if  the
       size  of  your input ndarrays comply with the broadcasting rules). Dimensions of any ndarray argument are
       broken down into two groups in the following:  Core  dimensions  (as  defined  by  the  PP-function,  see
       Appendix A for a list of PDL primitives) and extra dimensions which comprises all remaining dimensions of
       that  ndarray.  For  example  calling  a  function "func" with the signature "func((n,m),[o](n))" with an
       ndarray "$x(2,4,7,1,3)" as "f($x,($o = null))" results in the semantic splitting of x's dimensions  into:
       core dimensions "(2,4)" and extra dimensions "(7,1,3)".

       R0    Core,  or "active", dimensions are identified with the first N dimensions of the respective ndarray
             argument (and are required). Any further dimensions are extra dimensions and used to determine  the
             loop dimensions.

       R1    The  number  of (implicit) loop dimensions is equal to the maximal number of extra dimensions taken
             over the set of ndarray arguments.

       R2    The size of each of the loop dimensions is derived from the size of the  respective  dimensions  of
             the  ndarray  arguments.  The size of a loop dimension is given by the maximal size found in any of
             the ndarrays having this extra dimension.

       R3    If an ndarray doesn't have a loop dimension, in the broadcast loop this ndarray is "promoted" as if
             having a dummy dimension of size equal to the size of that loop dimension.

       R4    For all ndarrays that have a given extra dimension the size must be equal to the size of  the  loop
             dimension  (as  determined  by the previous rule) or 1; otherwise you raise a runtime exception. If
             the size of the extra dimension in an ndarray is 1 it is implicitly treated as a dummy dimension of
             size equal to ("inflated" to) that loop dim size when performing the broadcast loop.

       R5    If output auto-creation is used (by setting the relevant ndarray to "PDL->null" before  invocation)
             the  number  of  dimensions of the created ndarray is equal to the sum of the number of core output
             dimensions + number of loop dimensions. The size of the core output dimensions is derived from  the
             relevant dimension of input ndarrays (as specified in the function definition) and the sizes of the
             other  dimensions are equal to the size of the loop dimension it is derived from. The automatically
             created ndarray will be physical (unless dataflow is in operation).

       In this context, note that you can run into the problem with assignment to ndarrays  containing  greater-
       than-one  dummy  dimensions  (see  above).   Although  your  output  ndarray(s)  didn't contain any dummy
       dimensions in the first place they may end up with implicitly created dummy dimensions according to R3.

       As an example, suppose we have a (here unspecified) PP-function with the signature:

        func((m,n),(m,n,o),(m),[o](m,o))

       and you call it with 3 ndarrays "$x(5,3,10,11)", "$y(5,3,2,10,1,12)", and "$z(5,1,11,12)" as

        func($x,$y,$z,($d=null))

       then the number of loop dimensions is 3 (by "R0+R1" from $y and $z) with sizes "(10,11,12)" (by R2);  the
       two  output  core dimensions are "(5,2)" (from the signature of func) resulting in a 5-dimensional output
       ndarray $c of size "(5,2,10,11,12)"  (see  R5)  and  (the  automatically  created)  $d  is  derived  from
       "($x,$y,$z)" in a way that can be expressed in pdl pseudo-code as

        $d(:,:,i,j,k) .= func($x(:,:,i,j),$y(:,:,:,i,0,k),$z(:,0,j,k))
           with 0<=i<10, 0<=j<=11, 0<=k<12

       If  we  analyze  the  color to grey-scale conversion again with these rules in mind we note another great
       advantage of implicit broadcasting.  We can call the conversion with  an  ndarray  representing  a  pixel
       (im(3)),  a  line  of  rgb pixels ("im(3,x)"), a proper color image ("im(3,x,y)") or a whole stack of RGB
       images ("im(3,x,y,z)"). As long as $im is of the form "(3,...)" the automatically created output  ndarray
       will  contain  the  right  number  of dimensions and contain the intensity data as we expect it since the
       loops have been implicitly performed thanks to implicit broadcasting. You can  easily  convince  yourself
       that  calling  with  a color pixel $grey is 0D, with a line it turns out 1D grey(x), with an image we get
       "grey(x,y)" and finally we get a converted image stack "grey(x,y,z)".

       Let's fill these general rules with some more life by going through a couple  of  further  examples.  The
       reader  may  try  to  figure  out  equivalent  formulations  with  explicit  for-looping  and compare the
       flexibility of those routines using implicit  broadcasting  to  the  explicit  formulation.  Furthermore,
       especially when using several broadcast dimensions it is a useful exercise to check the relative speed by
       doing some benchmark tests (which we still have to do).

       First in the row is a slightly reworked centroid example, now coded with broadcasting in mind.

        # broadcast mult to calculate centroid coords, works for stacks as well
        $xc = sumover(($im*xvals($im->dim(0))->clump(2)) /
              sumover($im->clump(2));

       Let's analyze what's going on step by step. First the product:

        $prod = $im*xvals($im->dim(0))

       This  will actually work for $im being one, two, three, and higher dimensional. If $im is one-dimensional
       it's just an ordinary product (in the sense that every element of $im is multiplied with  the  respective
       element  of  xvals(...)),  if  $im has more dimensions further broadcasting is done by adding appropriate
       dummy dimensions to xvals(...)  according to R3.  More importantly, the two  sumover  operations  show  a
       first  example  of  how  to  make  use  of the dimension manipulating commands. A quick look at sumover's
       signature will remind you that it will only "gobble up" the first dimension of a given input ndarray. But
       what if we want to really compute the sum over all elements of the first two  dimensions?  Well,  nothing
       keeps  us  from passing a virtual ndarray into sumover which in this case is formed by clumping the first
       two dimensions of the "parent ndarray" into one. From the point of view of the parent ndarray the sum  is
       now  computed  over  the first two dimensions, just as we wanted, though sumover has just done the job as
       specified by its signature. Got it ?

       Another little finesse of writing the code like that:  we  intentionally  used  "sumover($pdl->clump(2))"
       instead of sum($pdl) so that we can either pass just an image "(x,y)" or a stack of images "(x,y,t)" into
       this routine and get either just one x-coordinate or a vector of x-coordinates (of size t) in return.

       Another set of common operations are what one could call "projection operations". These operations take a
       N-D  ndarray as input and return a (N-1)-D "projected" ndarray. These operations are often performed with
       functions like sumover, prodover, minimum and maximum.  Using again images as examples we might  want  to
       calculate the maximum pixel value for each line of an image or image stack. We know how to do that

        # maxima of lines (as function of line number and time)
        maximum($stack,($ret=null));

       But  what  if  you want to calculate maxima per column when implicit broadcasting always applies the core
       functionality to the first dimension and broadcasts over all others? How can we achieve that instead  the
       core  functionality  is applied to the second dimension and broadcasting is done over the others. Can you
       guess it? Yes, we make a virtual ndarray that has the second dimension of the  "parent  ndarray"  as  its
       first dimension using the "mv" command.

        # maxima of columns (as function of column number and time)
        maximum($stack->mv(1,0),($ret=null));

       and calculating all the sums of sub-slices over the third dimension is now almost too easy

        # sums of pixels in time (assuming time is the third dim)
        sumover($stack->mv(2,0),($ret=null));

       Finally,  if  you want to apply the operation to all elements (like max over all elements or sum over all
       elements) regardless of the dimensions of the ndarray in question "clump" comes in handy. As  an  example
       look at a definition of "sum" (summarised from Basic/Ufunc/ufunc.pd):

        sub sum {
          PDL::Ufunc::sumover($name->clump(-1),($tmp=null));
          return $tmp; # return a 0D ndarray
        }

       We  have already mentioned that all basic operations support broadcasting and assignment is no exception.
       So here are a couple of broadcasted assignments

        pdl> $im = zeroes(byte, 10,20)
        pdl> $line = exp(-rvals(10)**2/9)
        # broadcasted assignment
        pdl> $im .= $line      # set every line of $im to $line
        pdl> $im2 .= 5         # set every element of $im2 to 5

       By now you probably see how it works and what it does, don't you?

       To finish the examples in this paragraph here is a function to create an RGB image from what is called  a
       palette image. The palette image consists of two parts: an image of indices into a color lookup table and
       the  color  lookup  table  itself. [ describe how it works ] We are going to use a PP-function we haven't
       encoutered yet in the previous examples. It is the aptly named index function, signature "((n),(),[o]())"
       (see Appendix A) with the core functionality that "index(pdl (0,2,4,5),2,($ret=null))"  will  return  the
       element with index 2 of the first input ndarray. In this case, $ret will contain the value 4.  So here is
       the example:

        # a broadcasted index lookup to generate an RGB, or RGBA or YMCK image
        # from a palette image (represented by a lookup table $palette and
        # an color-index image $im)
        # you can say just dummy(0) since the rules of broadcasting make it fit
        pdl> index($palette->xchg(0,1),
                      $im->indx->dummy(0,$palette->dim(0)),
                      $res=null);

       Let's  go  through  it  and  explain the steps involved. Assuming we are dealing with an RGB lookup-table
       $palette is of size "(3,x)". First we exchange the dimensions of the palette so that looping is done over
       the first dimension of $palette (of size 3 that represent r, g, and b components). Now looking at $im, we
       add a dummy dimension of size equal to the length of the  number  of  components  (in  the  case  we  are
       discussing  here  we  could  have  just used the number 3 since we have 3 color components). We can use a
       dummy dimension since for red, green and blue color components we use the same index  from  the  original
       image, e.g.  assuming a certain pixel of $im had the value 4 then the lookup should produce the triple

        [palette(0,4),palette(1,4),palette(2,4)]

       for  the  new  red, green and blue components of the output image. Hopefully by now you have some sort of
       idea what the above piece of code is supposed to do (it is often actually quite complicated  to  describe
       in  detail  how  a  piece  of broadcasting code works; just go ahead and experiment a bit to get a better
       feeling for it).

       If you have read the broadcasting rules carefully, then you might have noticed that  we  didn't  have  to
       explicitly  state  the size of the dummy dimension that we created for $im; when we create it with size 1
       (the default) the rules of broadcasting make it automatically fit to the desired size (by rule R4, in our
       example the size would be 3 assuming a palette of size "(3,x)"). Since  situations  like  this  do  occur
       often  in  practice  this  is actually why rule R4 has been introduced (the part that makes dimensions of
       size 1 fit to the broadcast loop dim size). So we can just say

        pdl> index($palette->xchg(0,1),$im->indx->dummy(0),($res=null));

       Again, you can convince yourself that this routine will create the right output if called  with  a  pixel
       ($im  is 0D), a line ($im is 1D), an image ($im is 2D), ..., an RGB lookup table (palette is "(3,x)") and
       RGBA lookup table (palette is "(4,x)", see e.g. OpenGL). This flexibility is achieved  by  the  rules  of
       broadcasting which are made to do the right thing in most situations.

       To wrap it all up once again, the general idea is as follows. If you want to achieve looping over certain
       dimensions  and  have  the  core functionality applied to another specified set of dimensions you use the
       dimension manipulating commands to create a (or several) virtual ndarray(s) so that  from  the  point  of
       view  of  the  parent  ndarray(s)  you  get what you want (always having the signature of the function in
       question and R1-R5 in mind!). Easy, isn't it ?

       See "dice" in PDL::Slices, and "dice_axis" in PDL::Slices for more  indexed  lookups,  and  "indexND"  in
       PDL::Slices, "whereND" in PDL::Slices, "whichND" in PDL::Slices for even more powerful constructs.

   Output auto-creation and PP-function calling conventions
       At  this  point  we  have  to  divert  to  some  technical detail that has to do with the general calling
       conventions of PP-functions and the automatic creation of output arguments.   Basically,  there  are  two
       ways of invoking PDL routines, namely

        $result = func($x,$y);

       and

        func($x,$y,$result);

       If you are only using implicit broadcasting then the output variable can be automatically created by PDL.
       You  flag  that  to  the  PP-function by setting the output argument to a special kind of ndarray that is
       returned from a call to the function "PDL->null" that returns an essentially "empty" ndarray  (for  those
       interested  in  details  there  is a flag in the C pdl structure for this). The dimensions of the created
       ndarray are determined by the rules of implicit broadcasting: the first dimensions are  the  core  output
       dimensions  to  which  the  broadcasting  dimensions  are  appended  (which are in turn determined by the
       dimensions of the input ndarrays as described above).  So you can say

        func($x,$y,($result=PDL->null));

       or

        $result = func($x,$y)

       which are exactly equivalent.

       Be warned that you can not use  output  auto-creation  when  using  explicit  broadcasting  (for  reasons
       explained in the following section on explicit broadcasting, the second variant of broadcasting).

       In  "tight" loops you probably want to avoid the implicit creation of a temporary ndarray in each step of
       the loop that comes along with the "functional" style but rather say

        # create output ndarray of appropriate size only at first invocation
        $result = null;
        for (0...$n) {
             func($x,$y,$result); # in all but the first invocation $result
             func2($y);           # is defined and has the right size to
                                  # take the output provided $y's dims don't change
             twiddle($result,$x); # do something from $result to $x for iteration
        }

       The take-home message of this section once more: be aware of the limitation on output creation when using
       explicit broadcasting.

   Explicit broadcasting
       Having so far only talked about the first flavour of broadcasting it is now about time to  introduce  the
       second  variant. Instead of shuffling around dimensions all the time and relying on the rules of implicit
       broadcasting to get it all right you sometimes might want to specify  in  a  more  explicit  way  how  to
       perform  the  broadcast  loop.  It is probably not too surprising that this variant of the game is called
       explicit broadcasting.  Now, before we create  the  wrong  impression:  it  is  not  either  implicit  or
       explicit; the two flavours do mix. But more about that later.

       The  two  most used functions with explicit broadcasting are broadcast and unbroadcast.  We start with an
       example that illustrates typical usage of the former:

        [ # ** this is the worst possible example to start with ]
        #  but can be used to show that $mat += $line is different from
        #                               $mat->broadcast(0) += $line
        # explicit broadcasting to add a vector to each column of a matrix
        pdl> $mat  = zeroes(4,3)
        pdl> $line = pdl (3.1416,2,-2)
        pdl> $mat->broadcast(0) += $line

       In this example, "$mat->broadcast(0)" tells PDL that you want the second dimension of this ndarray to  be
       broadcast over first leading to a broadcast loop that can be expressed as

        for (j=0; j<3; j++) {
           for (i=0; i<4; i++) {
               mat(i,j) += src(j);
           }
        }

       "broadcast"  takes  a list of numbers as arguments which explicitly specify which dimensions to broadcast
       over first. With the introduction of explicit broadcasting the dimensions of an ndarray are  conceptually
       split  into  three  different  groups  the  latter  two  of  which we have already encountered: broadcast
       dimensions, core dimensions and extra dimensions.

       Conceptually, it is best to think of those dimensions of an ndarray that have been specified in a call to
       "broadcast" as being taken away from the set of normal  dimensions  and  put  on  a  separate  stack.  So
       assuming we have an ndarray "x(4,7,2,8)" saying

        $y = $x->broadcast(2,1)

       creates  a  new  virtual ndarray of dimension "y(4,8)" (which we call the remaining dims) that also has 2
       broadcast dimensions of size "(2,7)". For the purposes of this document we  write  that  symbolically  as
       "y(4,8){2,7}". An important difference to the previous examples where only implicit broadcasting was used
       is  the  fact  that  the  core  dimensions  are  matched  against  the remaining dimensions which are not
       necessarily the first dimensions of the ndarray. We will  now  specify  how  the  presence  of  broadcast
       dimensions changes the rules R1-R5 for broadcast loops (which apply to the special case where none of the
       ndarray arguments has any broadcast dimensions).

       B0  Core  dimensions  are  matched against the first n remaining dimensions of the ndarray argument (note
           the difference to R1). Any further  remaining  dimensions  are  extra  dimensions  and  are  used  to
           determine the implicit loop dimensions.

       B1a The  number of implicit loop dimensions is equal to the maximal number of extra dimensions taken over
           the set of ndarray arguments.

       B1b The number of explicit loop dimensions is equal to the maximal number of broadcast  dimensions  taken
           over the set of ndarray arguments.

       B1c The total number of loop dimensions is equal to the sum of explicit loop dimensions and implicit loop
           dimensions.  In  the  broadcast loop, explicit loop dimensions are broadcasted over first followed by
           implicit loop dimensions.

       B2  The size of each of the loop dimensions is derived from the size of the respective dimensions of  the
           ndarray  arguments.  It  is  given  by  the  maximal size found in any ndarrays having this broadcast
           dimension (for explicit loop dimensions) or extra dimension (for implicit loop dimensions).

       B3  This rule applies to any explicit loop dimension as well as any  implicit  loop  dimension.  For  all
           ndarrays  that  have  a  given  broadcast/extra  dimension  the size must be equal to the size of the
           respective explicit/implicit loop dimension or 1; otherwise you raise a  runtime  exception.  If  the
           size  of  a  broadcast/extra  dimension  of  an  ndarray  is  one it is implicitly treated as a dummy
           dimension of size equal to the explicit/implicit loop dimension.

       B4  If an ndarray doesn't have a broadcast/extra dimension that corresponds to an explicit/implicit  loop
           dimension, in the broadcast loop this ndarray is treated as if having a dummy dimension of size equal
           to the size of that loop dimension.

       B4a All ndarrays that do have broadcast dimensions must have the same number of broadcast dimensions.

       B5  Output  auto-creation  cannot  be  used if any of the ndarray arguments has any broadcast dimensions.
           Otherwise R5 applies.

       The same restrictions apply with regard to implicit dummy dimensions (created by application  of  B4)  as
       already mentioned in the section on implicit broadcasting: if any of the output ndarrays has an (explicit
       or implicitly created) greater-than-one dummy dimension a runtime exception will be raised.

       Let  us  demonstrate  these  rules  at  work in a generic case.  Suppose we have a (here unspecified) PP-
       function with the signature:

        func((m,n),(m),(),[o](m))

       and you  call  it  with  3  ndarrays  "a(5,3,10,11)",  "b(3,5,10,1,12)",  c(10)  and  an  output  ndarray
       "d(3,11,5,10,12)" (which can here not be automatically created) as

        func($x->broadcast(1,3),$y->broadcast(0,3),$c,$d->broadcast(0,1))

       From the signature of func and the above call the ndarrays split into the following groups of core, extra
       and broadcast dimensions (written in the form "pdl(core dims){broadcast dims}[extra dims]"):

        a(5,10){3,11}[] b(5){3,1}[10,12] c(){}[10] d(5){3,11}[10,12]

       With this to help us along (it is in general helpful to write the arguments down like this when you start
       playing  with  broadcasting and want to keep track of what is going on) we further deduce that the number
       of explicit loop dimensions is 2 (by B1b from $a and $b) with sizes "(3,11)" (by  B2);  2  implicit  loop
       dimensions  (by  B1a  from $b and $d) of size "(10,12)" (by B2) and the elements of are computed from the
       input ndarrays in a way that can be expressed in pdl pseudo-code as

        for (l=0;l<12;l++)
         for (k=0;k<10;k++)
          for (j=0;j<11;j++)         effect of treating it as dummy dim (index j)
           for (i=0;i<3;i++)                         |
              d(i,j,:,k,l) = func(a(:,i,:,j),b(i,:,k,0,l),c(k))

       Ugh, this example was really not easy in terms of bookkeeping. It serves mostly  as  an  example  how  to
       figure out what's going on when you encounter a complicated looking expression. But now it is really time
       to show that broadcasting is useful by giving some more of our so called "practical" examples.

       [  The following examples will need some additional explanations in the future. For the moment please try
       to live with the comments in the code fragments. ]

       Example 1:

        *** inverse of matrix represented by eigvecs and eigvals
        ** given a symmetrical matrix M = A^T x diag(lambda_i) x A
        **    =>  inverse M^-1 = A^T x diag(1/lambda_i) x A
        ** first $tmp = diag(1/lambda_i)*A
        ** then  A^T * $tmp by broadcasted inner product
        # index handling so that matrices print correct under pdl
        $inv .= $evecs*0;  # just copy to get appropriately sized output
        $tmp .= $evecs;    # initialise, no back-propagation
        $tmp->broadcast(0) /= $evals;    #  broadcasted division
        # and now a matrix multiplication in disguise
        PDL::Primitive::inner($evecs->xchg(0,1)->broadcast(-1,1),
                              $tmp->broadcast(0,-1),
                              $inv->broadcast(0,1));
        # alternative for matrix mult using implicit broadcasting,
        # first xchg only for transpose
        PDL::Primitive::inner($evecs->xchg(0,1)->dummy(1),
                              $tmp->xchg(0,1)->dummy(2),
                              ($inv=null));

       Example 2:

        # outer product by broadcasted multiplication
        # stress that we need to do it with explicit call to my_biop1
        # when using explicit broadcasting
        $res=zeroes($x->dim(0),$y->dim(0));
        my_biop1($x->broadcast(0,-1),$y->broadcast(-1,0),$res->(0,1),"*");
        # similar thing by implicit broadcasting with auto-created ndarray
        $res = $x->dummy(1) * $y->dummy(0);

       Example 3:

        # different use of broadcast and unbroadcast to shuffle a number of
        # dimensions in one go without lots of calls to ->xchg and ->mv

        # use broadcast/unbroadcast to shuffle dimensions around
        # just try it out and compare the child ndarray with its parent
        $trans = $x->broadcast(4,1,0,3,2)->unbroadcast;

       Example 4:

        # calculate a couple of bounding boxes
        # $bb will hold BB as [xmin,xmax],[ymin,ymax],[zmin,zmax]
        # we use again broadcast and unbroadcast to shuffle dimensions around
        pdl> $bb = zeroes(double, 2,3 );
        pdl> minimum($vertices->broadcast(0)->clump->unbroadcast(1), $bb->slice('(0),:'));
        pdl> maximum($vertices->broadcast(0)->clump->unbroadcast(1), $bb->slice('(1),:'));

       Example 5:

        # calculate a self-rationed (i.e. self normalized) sequence of images
        # uses explicit broadcasting and an implicitly broadcasted division
        $stack = read_image_stack();
        # calculate the average (per pixel average) of the first $n+1 images
        $aver = zeroes([stack->dims]->[0,1]);  # make the output ndarray
        sumover($stack->slice(":,:,0:$n")->broadcast(0,1),$aver);
        $aver /= ($n+1);
        $stack /= $aver;  # normalize the stack by doing a broadcasted division
        # implicit versus explicit
        # alternatively calculate $aver with implicit broadcasting and auto-creation
        sumover($stack->slice(":,:,0:$n")->mv(2,0),($aver=null));
        $aver /= ($n+1);
        #

   Implicit versus explicit broadcasting
       In this paragraph we are going to illustrate when  explicit  broadcasting  is  preferable  over  implicit
       broadcasting  and vice versa. But then again, this is probably not the best way of putting the case since
       you already know: the two flavours do mix. So, it's more about how to get the best of  both  worlds  and,
       anyway, in the best of Perl traditions: TIMTOWTDI !

       [ Sorry, this still has to be filled in in a later release; either refer to above examples or choose some
       new ones ]

       Finally,  this  may be a good place to justify all the technical detail we have been going on about for a
       couple of pages: why broadcasting ?

       Well, code that uses broadcasting should be (considerably) faster than code that uses explicit  for-loops
       (or similar Perl constructs) to achieve the same functionality. Especially on supercomputers (with vector
       computing  facilities/parallel  processing) PDL broadcasting is implemented in a way that takes advantage
       of the additional facilities of these machines.  Furthermore,  it  is  a  conceptually  simple  construct
       (though  technical details might get involved at times) and can greatly reduce the syntactical complexity
       of PDL code (but keep the admonition for documentation in  mind).  Once  you  are  comfortable  with  the
       broadcasting  way of thinking (and coding) it shouldn't be too difficult to understand code that somebody
       else has written than (provided they gave you an idea what expected input dimensions  are,  etc.).  As  a
       general  tip to increase the performance of your code: if you have to introduce a loop into your code try
       to reformulate the problem so that you can use broadcasting to perform the loop (as with  anything  there
       are  exceptions to this rule of thumb; but the authors of this document tend to think that these are rare
       cases ;).

PDL::PP

   An easy way to define functions that are aware of indexing and broadcasting (and the universe and everything)

       PDL:PP is part of the PDL distribution. It is used to generate functions that are aware of  indexing  and
       broadcasting rules from very concise descriptions. It can be useful for you if you want to write your own
       functions  or  if you want to interface functions from an external library so  that they support indexing
       and broadcasting (and maybe dataflow as well, see PDL::Dataflow). For further details check PDL::PP.

Appendix A

   signatures of standard PDL::PP compiled functions
       A selection of signatures of PDL primitives to show how many dimensions PP compiled functions  gobble  up
       (and  therefore  you can figure out what will be broadcasted over). Most of those functions are the basic
       ones defined in "primitive.pd"

        # functions in primitive.pd
        #
        sumover        ((n),[o]())
        prodover       ((n),[o]())
        axisvalues     ([o](n))                                inplace
        inner          ((n),(n),[o]())
        outer          ((n),(m),[o](n,m))
        innerwt        ((n),(n),(n),[o]())
        inner2         ((m),(m,n),(n),[o]())
        inner2t        ((j,n),(n,m),(m,k),[o]())
        index          (1D,0D,[o])
        minimum        (1D,[o])
        maximum        (1D,[o])
        wstat          ((n),(n),(),[o],())
        assgn          ((),())

        # basic operations
        binary operations ((),(),[o]())
        unary operations  ((),[o]())

AUTHOR & COPYRIGHT

       Copyright   (C)   1997   Christian    Soeller    (c.soeller@auckland.ac.nz)    &    Tuomas    J.    Lukka
       (lukka@fas.harvard.edu).

perl v5.40.0                                       2025-02-04                                 PDL::Indexing(3pm)