Provided by: perl-doc_5.38.2-3.2ubuntu0.1_all

NAME

       perlclassguts - Internals of how "feature 'class'" and class syntax work

DESCRIPTION

       This document provides in-depth information about the way in which the perl interpreter implements the
       "feature 'class'" syntax and overall behaviour.  It is not intended as an end-user guide on how to use
       the feature. For that, see perlclass.

       The reader is assumed to be generally familiar with the perl interpreter internals overall. For a more
       general overview of these details, see also perlguts.

DATA STORAGE

   Classes
       A class is fundamentally a package, and exists in the symbol table as an HV with an aux structure in
       exactly the same way as a non-class package. It is distinguished from a non-class package by the fact
       that the HvSTASH_IS_CLASS() macro will return true on it.

       Extra information relating to it being a class is stored in the "struct xpvhv_aux" structure attached to
       the stash, in the following fields:

           HV          *xhv_class_superclass;
           CV          *xhv_class_initfields_cv;
           AV          *xhv_class_adjust_blocks;
           PADNAMELIST *xhv_class_fields;
           PADOFFSET    xhv_class_next_fieldix;
           HV          *xhv_class_param_map;

       •   "xhv_class_superclass"  will  be "NULL" for a class with no superclass. It will point directly to the
           stash of the parent class if one has been set with the :isa() class attribute.

       •   "xhv_class_initfields_cv" will contain a "CV *" pointing to a function to be invoked as part  of  the
           constructor  of  this  class or any subclass thereof. This CV is responsible for initializing all the
           fields defined by this class for a new instance. This CV will be an anonymous real  function  -  i.e.
           while it has no name and no GV, it is not a protosub and may be directly invoked.

       •   "xhv_class_adjust_blocks" may point to an AV containing CV pointers to each of the "ADJUST" blocks
           defined on the class. If the class has a superclass, this array will additionally contain copies of
           the pointers to the CVs of its parent class. The AV is created lazily the first time an element is
           pushed to it; it is valid for there not to be one, and this pointer will be "NULL" in that case.

           The CVs are stored directly, not via RVs. Each CV will be an anonymous real function.

       •   "xhv_class_fields"  will point to a "PADNAMELIST" containing "PADNAME"s, each being one defined field
           of the class. They are stored in order of declaration. Note however, that the index into  this  array
           will  not necessarily be equal to the "fieldix" of each field, because in the case of a subclass, the
           array will begin at zero but the index of the first field in it will be non-zero if its parent  class
           contains any fields at all.

           For more information on how individual fields are represented, see "Fields".

       •   "xhv_class_next_fieldix" gives the field index that will be assigned to the next field to be added to
           the class. It is only useful at compile-time.

       •   "xhv_class_param_map" may point to an HV which maps field ":param" attribute names to the field index
           of  the field with that name. This mapping is copied from parent classes; each class will contain the
           sum total of all its parents in addition to its own.
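
       The relationship between a field's position in "xhv_class_fields" and its "fieldix" can be illustrated
       with a small standalone model. This is only a sketch, not interpreter code: "ToyClass" and the "toy_*"
       functions are hypothetical names, and it assumes that a subclass continues numbering from its parent's
       next free field index.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FIELDS 16

/* Hypothetical stand-in for the class-related parts of struct xpvhv_aux. */
typedef struct ToyClass {
    const struct ToyClass *superclass;   /* plays the role of xhv_class_superclass */
    size_t next_fieldix;                 /* plays the role of xhv_class_next_fieldix */
    size_t nfields;                      /* number of entries used in fieldix[] */
    size_t fieldix[TOY_MAX_FIELDS];      /* own fields only, in declaration order */
} ToyClass;

/* Start a class. Assumption: a subclass continues numbering where its parent stopped. */
void toy_class_init(ToyClass *cls, const ToyClass *superclass)
{
    cls->superclass = superclass;
    cls->next_fieldix = superclass ? superclass->next_fieldix : 0;
    cls->nfields = 0;
}

/* Declare one field and return the field index assigned to it. */
size_t toy_class_add_field(ToyClass *cls)
{
    assert(cls->nfields < TOY_MAX_FIELDS);
    size_t ix = cls->next_fieldix++;
    cls->fieldix[cls->nfields++] = ix;
    return ix;
}
```

       In this model a base class with two fields assigns them indexes 0 and 1. The first field of a subclass
       then sits at position 0 of the subclass's own list but carries field index 2.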

   Fields
       A field is still fundamentally a lexical variable declared in a scope, and exists in the "PADNAMELIST" of
       its corresponding CV. Methods and other method-like CVs can still capture them exactly as they  can  with
       regular  lexicals.  A  field  is distinguished from other kinds of pad entry in that the PadnameIsFIELD()
       macro will return true on it.

       Extra information relating to it being a field is stored in an additional structure  accessible  via  the
       PadnameFIELDINFO() macro on the padname. This structure has the following fields:

           PADOFFSET  fieldix;
           HV        *fieldstash;
           OP        *defop;
           SV        *paramname;
           bool       def_if_undef;
           bool       def_if_false;

       •   "fieldix"  stores  the  "field  index" of the field; that is, the index into the instance field array
           where this field's value will be stored. Note that the first index in  the  array  is  not  specially
           reserved. The first field in a class will start from field index 0.

       •   "fieldstash" stores a pointer to the stash of the class that defined this field. This is necessary in
           case  there are multiple classes defined within the same scope; it is used to disambiguate the fields
           of each.

               {
                   class C1; field $x;
                   class C2; field $x;
               }

       •   "defop" may store a pointer to a defaulting expression optree for this field.  Defaulting expressions
           are optional; this field may be "NULL".

       •   "paramname" may point to a regular string SV containing the ":param"  name  attribute  given  to  the
           field. If none, it will be "NULL".

       •   One  of "def_if_undef" and "def_if_false" will be true if the defaulting expression was set using the
           "//=" or "||=" operators respectively.

   Methods
       A method is still fundamentally a CV, and has the same basic representation as one. It has an optree  and
       a  pad,  and  is  stored via a GV in the stash of its containing package. It is distinguished from a non-
       method CV by the fact that the CvIsMETHOD() macro will return true on it.

       (Note: This macro should not be confused with the one that was previously  called  CvMETHOD().  That  one
       does not relate to the class system, and was renamed to CvNOWARN_AMBIGUOUS() to avoid this confusion.)

       There  is currently no extra information that needs to be stored about a method CV, so the structure does
       not add any new fields.

   Instances
       Object instances are represented by an entirely new SV type, whose base type is "SVt_PVOBJ". This should
       still be blessed into its class stash and wrapped in an RV in the usual manner for classical objects.

       As these are their own unique container type, distinct from hashes or arrays, the core "builtin::reftype"
       function returns a new value when asked about these. That value is "OBJECT".

       Internally,  such  an object is an array of SV pointers whose size is fixed at creation time (because the
       number of fields in a class is known after compilation). An object instance stores the  max  field  index
       within  it  (for  basic  error-checking  on  access),  and  a fixed-size array of SV pointers storing the
       individual field values.

       Fields of array and hash type directly store AV or HV pointers into the array; they are not stored via an
       intervening RV.
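
       The layout described above can be modelled outside the interpreter as a maximum index plus a fixed-size
       pointer array. The following is a sketch under stated assumptions: "ToyObject" and its functions are
       hypothetical, plain "void *" stands in for "SV *", and storing -1 as the maximum index of an object with
       no fields is a choice made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the body of an object instance. */
typedef struct {
    ptrdiff_t maxfield;   /* highest valid field index; -1 here means "no fields" */
    void **fields;        /* fixed-size array of field value pointers */
} ToyObject;

/* Allocate an instance with nfields slots, all initially NULL. */
ToyObject *toy_object_new(size_t nfields)
{
    ToyObject *obj = malloc(sizeof *obj);
    if (!obj)
        return NULL;
    obj->maxfield = (ptrdiff_t)nfields - 1;
    obj->fields = nfields ? calloc(nfields, sizeof *obj->fields) : NULL;
    if (nfields && !obj->fields) {
        free(obj);
        return NULL;
    }
    return obj;
}

/* Return the address of slot ix, or NULL if ix is outside 0..maxfield. */
void **toy_object_slot(ToyObject *obj, ptrdiff_t ix)
{
    assert(obj != NULL);
    if (ix < 0 || ix > obj->maxfield)
        return NULL;
    return &obj->fields[ix];
}

/* Release the instance and its slot array. */
void toy_object_free(ToyObject *obj)
{
    if (!obj)
        return;
    free(obj->fields);
    free(obj);
}
```

       The range check in toy_object_slot() corresponds to the basic error-checking that the stored maximum
       index makes possible.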

API

       The data structures described above are supported by the following API functions.

   Class Manipulation
       class_setup_stash

           void class_setup_stash(HV *stash);

       Called by the parser on encountering the "class" keyword. It upgrades the stash into being  a  class  and
       prepares it for receiving class-specific items like methods and fields.

       class_seal_stash

           void class_seal_stash(HV *stash);

       Called by the parser at the end of a "class" block or, for unit classes, at the end of the containing
       scope. This function performs the finalisation activities that are required before instances of the
       class can be constructed, but that could not be done until all the information about the members of
       the class is known.

       Any additions to or modifications of the class under compilation must  be  performed  between  these  two
       function calls. Classes cannot be modified once they have been sealed.

       class_add_field

           void class_add_field(HV *stash, PADNAME *pn);

       Called  by  pad.c  as part of defining a new field name in the current pad.  Note that this function does
       not create the padname; that must already be done by pad.c. This API function simply  informs  the  class
       that the new field name has been created and is now available for it.

       class_add_ADJUST

           void class_add_ADJUST(HV *stash, CV *cv);

       Called by the parser once it has parsed and constructed a CV for a new "ADJUST" block. This gets added to
       the list stored by the class.

   Field Manipulation
       class_prepare_initfield_parse

           void class_prepare_initfield_parse();

       Called  by the parser just before parsing an initializing expression for a field variable. This makes use
       of a suspended compcv to combine all the field initializing expressions into the same CV.

       class_set_field_defop

           void class_set_field_defop(PADNAME *pn, OPCODE defmode, OP *defop);

       Called by the parser after it has parsed an initializing expression for the field.  Sets  the  defaulting
       expression  and  mode  of  application.  "defmode"  should  either  be  zero,  or one of "OP_ORASSIGN" or
       "OP_DORASSIGN" depending on the defaulting mode.

       padadd_FIELD

           #define padadd_FIELD

       This flag constant tells the "pad_add_name_*" family of functions that the new name should be added as  a
       field. There is no need to call class_add_field(); this will be done automatically.

   Method Manipulation
       class_prepare_method_parse

           void class_prepare_method_parse(CV *cv);

       Called by the parser after start_subparse() but immediately before doing anything else. This prepares the
       "PL_compcv"  for  parsing  a  method;  arranging  for  the "CvIsMETHOD" test to be true, adding the $self
       lexical, and any other activities that may be required.

       class_wrap_method_body

           OP *class_wrap_method_body(OP *o);

       Called by the parser at the end of parsing a method body into an optree but just before  wrapping  it  in
       the eventual CV. This function inserts extra ops into the optree to make the method work correctly.

   Object Instances
       SVt_PVOBJ

           #define SVt_PVOBJ

       An SV type constant used for comparison with the SvTYPE() macro.

       ObjectMAXFIELD

           SSize_t ObjectMAXFIELD(sv);

       A  function-like  macro  that  obtains  the  maximum  valid  field  index  that  can be accessed from the
       "ObjectFIELDS" array.

       ObjectFIELDS

           SV **ObjectFIELDS(sv);

       A function-like macro that obtains the fields array directly out of an object  instance.  Fields  can  be
       accessed by their field index, from 0 up to the maximum valid index given by "ObjectMAXFIELD".

OPCODES

   OP_METHSTART
           newUNOP_AUX(OP_METHSTART, ...);

       An  "OP_METHSTART" is an "UNOP_AUX" which must be present at the start of a method CV in order to make it
       work properly. This is inserted by class_wrap_method_body(), and even appears before any optree  fragment
       associated with signature argument checking or extraction.

       This  op  is  responsible for shifting the value of $self out of the arguments list and binding any field
       variables that the method requires access to into the pad. The AUX vector will  contain  details  of  the
       field/pad index pairings required.

       This  op  also  performs sanity checking on the invocant value. It checks that it is definitely an object
       reference of a compatible class type. If not, an exception is thrown.

       If the "op_private" field includes the "OPpINITFIELDS" flag, this indicates that the op begins the
       special "xhv_class_initfields_cv" CV. In this case it should additionally take the second value from the
       arguments list, which should be a plain HV pointer (directly, not via an RV), and bind it to the second
       pad slot, where the generated optree will expect to find it.
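
       The binding step performed by this op can be sketched as a loop over field/pad index pairs. The model
       below is standalone and hypothetical throughout: the real layout of the AUX vector is not described
       here, so "ToyFieldBinding" is only an assumed shape, "void *" stands in for "SV *", and an error is
       signalled by a return value rather than an exception. It does rely on the invocant living at pad
       index 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical AUX entry: make pad slot padix refer to instance field fieldix. */
typedef struct {
    size_t padix;
    size_t fieldix;
} ToyFieldBinding;

/* Hypothetical instance: a fixed array of field value pointers. */
typedef struct {
    size_t nfields;
    void **fields;
} ToyInstance;

/*
 * Store the invocant in pad slot 1, then bind each requested field into its
 * pad slot. Returns 0 on success, or -1 if the pad is too small or a pairing
 * refers to a pad slot or field index that is out of range.
 */
int toy_methstart(void **pad, size_t padsize, ToyInstance *self,
                  const ToyFieldBinding *aux, size_t naux)
{
    assert(pad != NULL && self != NULL);
    if (padsize < 2)
        return -1;
    pad[1] = self;
    for (size_t i = 0; i < naux; i++) {
        if (aux[i].padix >= padsize || aux[i].fieldix >= self->nfields)
            return -1;
        /* The pad slot shares the field's value pointer; nothing is copied. */
        pad[aux[i].padix] = self->fields[aux[i].fieldix];
    }
    return 0;
}
```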

   OP_INITFIELD
       An  "OP_INITFIELD"  is  only  invoked as part of the "xhv_class_initfields_cv" CV during the construction
       phase of an instance. This is the time that the individual SVs that make up the  mutable  fields  of  the
       instance   (including   AVs   and   HVs)  are  actually  assigned  into  the  "ObjectFIELDS"  array.  The
       "OPpINITFIELD_AV" and "OPpINITFIELD_HV" private flags indicate whether it is creating an  AV  or  HV;  if
       neither is set then an SV is created.

       If  the op has the "OPf_STACKED" flag it expects to find an initializing value on the stack. For SVs this
       is the topmost SV on the data stack. For AVs and HVs it expects a marked list.

COMPILE-TIME BEHAVIOUR

   "ADJUST" Phasers
       At compile time, parsing of an "ADJUST" phaser is handled in a fundamentally different way to the
       existing perl phasers ("BEGIN", etc.).

       Rather  than  taking  the  usual  route,  the tokenizer recognises that the "ADJUST" keyword introduces a
       phaser block. The parser then parses the  body  of  this  block  similarly  to  how  it  would  parse  an
       (anonymous) method body, creating a CV that has no name GV. This is then inserted directly into the class
       information by calling "class_add_ADJUST", entirely bypassing the symbol table.

   Attributes
       During compilation, attributes of both classes and fields are handled in a different way to existing perl
       attributes on subroutines and lexical variables.

       The parser still forms an "OP_LIST" optree of "OP_CONST" nodes, but these are passed to the
       "class_apply_attributes" or "class_apply_field_attributes" functions. Rather than using a class lookup
       for a method in the class being parsed, a fixed internal list of known attributes is used to find
       functions to apply the attribute to the class or field. In future this may support user-supplied
       extension attributes, though at present it only recognises ones defined by the core itself.

   Field Initializing Expressions
       During compilation, the parser makes use of a suspended compcv when parsing the defaulting expression for
       a  field.  All  the expressions for all the fields in the class share the same suspended compcv, which is
       then compiled up into the same internal CV called  by  the  constructor  to  initialize  all  the  fields
       provided by that class.

RUNTIME BEHAVIOUR

   Constructor
       The  generated  constructor for a class itself is an XSUB which performs three tasks in order: it creates
       the instance SV itself,  invokes  the  field  initializers,  then  invokes  the  ADJUST  block  CVs.  The
       constructor  for  any  class  is  always  the  same  basic  shape,  regardless of whether the class has a
       superclass or not.

       The field initializers are collected into a generated optree-based CV called the field initializer CV.
       This is the CV which contains all the optree fragments for the field initializing expressions. When
       invoked, the field initializer CV might make a chained call to the superclass initializer if one exists,
       before invoking all of the individual field initialization ops. The field initializer CV is invoked with
       two items on the stack: the instance SV and an HV containing the constructor parameters. Note
       carefully: this HV is passed directly, not via an RV reference. This is permitted because both the caller
       and the callee are directly generated code and not arbitrary pure-perl subroutines.

       The ADJUST block CVs are all collected into a single flat list, merging all of the ones  defined  by  the
       superclass as well. They are all invoked in order, after the field initializer CV.
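
       The order of events during construction can be traced with a small standalone model. This is a sketch
       only: every name in it is hypothetical, string labels stand in for CVs, and it assumes that the ADJUST
       entries inherited from the superclass come before the class's own entries in the flat list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ADJUST 8
#define TOY_TRACE_SIZE 128

/* Hypothetical class record; string labels stand in for the CVs. */
typedef struct ToyClass {
    const struct ToyClass *superclass;
    const char *initfields;              /* stands in for xhv_class_initfields_cv */
    const char *adjust[TOY_MAX_ADJUST];  /* flat list: inherited entries, then own */
    size_t nadjust;
} ToyClass;

/* Append one step and a trailing space to the trace buffer. */
void toy_note(char *trace, const char *step)
{
    size_t used = strlen(trace);
    size_t len = strlen(step);
    assert(used + len + 1 < TOY_TRACE_SIZE);
    memcpy(trace + used, step, len);
    trace[used + len] = ' ';
    trace[used + len + 1] = '\0';
}

/* Set the superclass and copy its ADJUST list, keeping this class's list flat. */
void toy_set_superclass(ToyClass *cls, const ToyClass *super)
{
    assert(cls->nadjust == 0);
    cls->superclass = super;
    for (size_t i = 0; i < super->nadjust; i++)
        cls->adjust[cls->nadjust++] = super->adjust[i];
}

/* Add one ADJUST block defined by the class itself. */
void toy_add_adjust(ToyClass *cls, const char *label)
{
    assert(cls->nadjust < TOY_MAX_ADJUST);
    cls->adjust[cls->nadjust++] = label;
}

/* Field initializer: chain to the superclass first, then this class's fields. */
void toy_initfields(const ToyClass *cls, char *trace)
{
    if (cls->superclass)
        toy_initfields(cls->superclass, trace);
    toy_note(trace, cls->initfields);
}

/* Constructor shape: create the instance, initialize fields, run ADJUST blocks. */
void toy_construct(const ToyClass *cls, char *trace)
{
    trace[0] = '\0';
    toy_note(trace, "new");
    toy_initfields(cls, trace);
    for (size_t i = 0; i < cls->nadjust; i++)
        toy_note(trace, cls->adjust[i]);
}
```

       Note that toy_construct() has the same three steps whether or not the class has a superclass. The
       differences are confined to the chained call inside toy_initfields() and to the contents of the
       ADJUST list.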

   $self Access During Methods
       When  class_prepare_method_parse() is called, it arranges that the pad of the new CV body will begin with
       a lexical called $self. Because the pad should be freshly-created at this point, this will have  the  pad
       index of 1.  The function checks this and aborts if that is not true.

       Because  of this fact, code within the body of a method or method-like CV can reliably use pad index 1 to
       obtain the invocant reference. The "OP_INITFIELD" opcode also relies on this fact.

       In similar fashion, during the "xhv_class_initfields_cv" the next pad slot is  relied  on  to  store  the
       constructor parameters HV, at pad index 2.

AUTHORS

       Paul Evans

perl v5.38.2                                       2025-04-08                                   PERLCLASSGUTS(1)