NAME

       PCBGROUP — Distributed Protocol Control Block Groups

SYNOPSIS

       options PCBGROUP

       #include <sys/param.h>
       #include <netinet/in.h>
       #include <netinet/in_pcb.h>

       void
       in_pcbgroup_init(struct inpcbinfo *pcbinfo, u_int hashfields, int hash_nelements);

       void
       in_pcbgroup_destroy(struct inpcbinfo *pcbinfo);

       struct inpcbgroup *
       in_pcbgroup_byhash(struct inpcbinfo *pcbinfo, u_int hashtype, uint32_t hash);

       struct inpcbgroup *
       in_pcbgroup_byinpcb(struct inpcb *inp);

       void
       in_pcbgroup_update(struct inpcb *inp);

       void
       in_pcbgroup_update_mbuf(struct inpcb *inp, struct mbuf *m);

       void
       in_pcbgroup_remove(struct inpcb *inp);

       int
       in_pcbgroup_enabled(struct inpcbinfo *pcbinfo);

       #include <netinet6/in6_pcb.h>

       struct inpcbgroup *
       in6_pcbgroup_byhash(struct inpcbinfo *pcbinfo, u_int hashtype, uint32_t hash);

DESCRIPTION

       This implementation introduces notions of affinity for connections and distributes work so as to reduce
       lock contention, in concert with hardware work distribution strategies such as RSS. In this construction,
       connection  groups  supplement,  rather  than replace, existing reservation tables for protocol 4-tuples,
       offering CPU-affine lookup tables with minimal cache line migration and  lock  contention  during  steady
       state operation.

       Internet  protocols  like  UDP  and  TCP register to use connection groups by providing an ipi_hashfields
       value other than IPI_HASHFIELDS_NONE.  This indicates to the connection group code whether a  2-tuple  or
       4-tuple  is  used  as an argument to hashes that assign a connection to a particular group.  This must be
       aligned with any hardware-offloaded distribution model, such  as  RSS  or  similar  approaches  taken  in
       embedded  network boards.  Wildcard sockets require special handling, as in Willmann 2006, and are shared
       between connection groups while being protected by group-local locks. Connection establishment and
       teardown can be significantly more expensive than without connection groups, but steady-state
       processing can be significantly faster.

       Enabling PCBGROUP in the kernel only provides the infrastructure required to create and  manage  multiple
       PCB  groups.  An implementation needs to fill in a few functions to provide PCB group hash information in
       order for PCBs to be placed in a PCB group.

   Operation
       By default, each PCB info block (struct inpcbinfo) has a single hash table for all PCB entries for the
       given protocol, with a single lock protecting it. This can be a significant source of lock contention on
       SMP hardware. When a PCBGROUP is created, an array of separate hash tables is created, each with its own
       lock. A separate table for wildcard PCBs is provided. By default, a PCBGROUP table is created for each
       available CPU. The PCBGROUP code attempts to calculate a hash value from the given PCB or mbuf when
       looking up a PCBGROUP. While processing a received frame, in_pcbgroup_byhash() can be used in
       conjunction with either a hardware-provided hash value (e.g., the RSS(9) calculated hash value provided
       by some NICs) or a software-provided hash value in order to choose a PCBGROUP table to query. A single
       table lock is held while performing a wildcard match.  However, all  of  the  table  locks  are  acquired
       before  modifying  the wildcard table.  The PCBGROUP tables operate in conjunction with the normal single
       PCB list in a PCB info block.  Thus, inserting and removing a PCB will still  incur  the  same  costs  as
       without  PCBGROUP.   A  protocol  which uses PCBGROUP should fall back to the normal PCB list lookup if a
       call to the PCBGROUP layer does not yield a lookup hit.
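
       The wildcard locking rule described above (one table lock held for a match, every table lock taken
       before a modification) can be illustrated with a small userland sketch. This is not the kernel
       implementation: NGROUPS, struct wild_pcb, wildcard_lookup() and wildcard_insert() are hypothetical
       names, and POSIX mutexes stand in for the kernel's group locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NGROUPS 4	/* hypothetical: one group per CPU on a 4-CPU system */

/* Hypothetical userland stand-in for a wildcard PCB. */
struct wild_pcb {
	unsigned short   lport;
	struct wild_pcb *next;
};

/* One lock per group; the wildcard list itself is shared by all groups. */
static pthread_mutex_t group_lock[NGROUPS] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static struct wild_pcb *wildcard_list;

/* A reader holds only its own group's lock while matching. */
static struct wild_pcb *
wildcard_lookup(unsigned group, unsigned short lport)
{
	struct wild_pcb *p;

	assert(group < NGROUPS);
	pthread_mutex_lock(&group_lock[group]);
	for (p = wildcard_list; p != NULL; p = p->next)
		if (p->lport == lport)
			break;
	pthread_mutex_unlock(&group_lock[group]);
	return (p);
}

/* A writer takes every group lock, in a fixed order, before modifying. */
static void
wildcard_insert(struct wild_pcb *p)
{
	int i;

	for (i = 0; i < NGROUPS; i++)
		pthread_mutex_lock(&group_lock[i]);
	p->next = wildcard_list;
	wildcard_list = p;
	for (i = NGROUPS - 1; i >= 0; i--)
		pthread_mutex_unlock(&group_lock[i]);
}
```

       Because a writer holds all of the locks, no reader in any group can observe the list while it is being
       changed, yet readers in different groups never contend with one another during steady-state lookups.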

   Usage
       Initialize a PCBGROUP in a PCB info block (struct inpcbinfo) by calling in_pcbgroup_init().

       Add a connection to a PCBGROUP with in_pcbgroup_update(). Connections are removed with
       in_pcbgroup_remove(). These functions in turn determine which PCBGROUP bucket the given PCB is placed
       into and calculate the hash value appropriately.

       Wildcard PCBs are hashed differently and placed in a single wildcard PCB list.  If RSS(9) is enabled  and
       in  use, RSS-aware wildcard PCBs are placed in a single PCBGROUP based on RSS information.  Protocols may
       look  up  the  PCB  entry  in  a  PCBGROUP  by  using  the  lookup  functions  in_pcbgroup_byhash()   and
       in_pcbgroup_byinpcb().

IMPLEMENTATION NOTES

       The PCB code in sys/netinet and sys/netinet6 is aware of PCBGROUP and will call into the PCBGROUP code to
       do PCBGROUP assignment and lookup, preferring a PCBGROUP lookup to the default global PCB info table.
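
       The lookup preference can be modelled in userland as follows. This is a minimal sketch under stated
       assumptions, not the code in sys/netinet: struct conn, NGROUPS, conn_insert_global(),
       conn_assign_group() and conn_lookup() are hypothetical names, and simple linked lists stand in for the
       hash tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NGROUPS 2	/* hypothetical group count */

/* Hypothetical userland stand-in for a connection's PCB. */
struct conn {
	uint32_t     faddr;
	uint16_t     fport;
	struct conn *group_next;	/* linkage within one group table */
	struct conn *global_next;	/* linkage within the global list */
};

static struct conn *group_head[NGROUPS];
static struct conn *global_head;

/* Every connection is always entered on the global list. */
static void
conn_insert_global(struct conn *c)
{
	c->global_next = global_head;
	global_head = c;
}

/* A connection may additionally be assigned to one group. */
static void
conn_assign_group(struct conn *c, unsigned group)
{
	assert(group < NGROUPS);
	c->group_next = group_head[group];
	group_head[group] = c;
}

/* Try the group table first; fall back to the global list on a miss. */
static struct conn *
conn_lookup(unsigned group, uint32_t faddr, uint16_t fport)
{
	struct conn *c;

	assert(group < NGROUPS);
	for (c = group_head[group]; c != NULL; c = c->group_next)
		if (c->faddr == faddr && c->fport == fport)
			return (c);
	for (c = global_head; c != NULL; c = c->global_next)
		if (c->faddr == faddr && c->fport == fport)
			return (c);
	return (NULL);
}
```

       In this model a connection that was never assigned to a group, or that is looked up through the wrong
       group, is still found through the global list, at the cost of the slower path.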

       An implementor wishing to experiment with or modify the PCBGROUP assignment should modify this set of
       functions:

             in_pcbgroup_getbucket() and in6_pcbgroup_getbucket()
                       Map  a  given  32  bit  hash  value  to  a  PCBGROUP.   By  default  this   is   hash   %
                       number_of_pcbgroups.  However, this distribution may not align with NIC receive queues or
                       the netisr(9) configuration.

             in_pcbgroup_byhash() and in6_pcbgroup_byhash()
                       Map  a  32  bit  hash  value  and a hash type identifier to a PCBGROUP.  By default, this
                       simply  returns  NULL.   This  function  is  used  by  the  mbuf(9)   receive   path   in
                       sys/netinet/in_pcb.c to map an mbuf to a PCBGROUP.

             in_pcbgroup_bytuple() and in6_pcbgroup_bytuple()
                       Map  the source and destination address and port details to a PCBGROUP.  By default, this
                       does a very simple XOR hash.  This function is used by both the PCB lookup code and as  a
                       fallback in the mbuf(9) receive path in sys/netinet/in_pcb.c.
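
       The two default behaviours named above, a simple XOR hash over the tuple and a hash %
       number_of_pcbgroups bucket mapping, can be sketched in userland C. The functions tuple_hash() and
       hash_to_bucket() are hypothetical illustrations; the exact fields and mixing used by the kernel's
       in_pcbgroup_bytuple() are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical XOR hash over a 4-tuple.  The choice of fields and the
 * way the ports are packed are assumptions made for illustration only.
 */
static uint32_t
tuple_hash(uint32_t laddr, uint16_t lport, uint32_t faddr, uint16_t fport)
{
	return (laddr ^ faddr ^ (((uint32_t)lport << 16) | fport));
}

/* Map a 32-bit hash to a group index: hash % number_of_pcbgroups. */
static unsigned
hash_to_bucket(uint32_t hash, unsigned ngroups)
{
	assert(ngroups > 0);
	return (hash % ngroups);
}
```

       A caller would compute hash_to_bucket(tuple_hash(laddr, lport, faddr, fport), ngroups) to pick a group.
       As noted above, a plain modulo mapping does not necessarily match how a NIC assigns flows to receive
       queues, which is why an implementor may want to replace it.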

SEE ALSO

       mbuf(9), netisr(9), RSS(9)

       Paul  Willmann, Scott Rixner, and Alan L. Cox, “An Evaluation of Network Stack Parallelization Strategies
       in     Modern     Operating     Systems”,     2006     USENIX      Annual      Technical      Conference,
       http://www.ece.rice.edu/~willmann/pubs/paranet_usenix.pdf, 2006.

HISTORY

       PCBGROUP first appeared in FreeBSD 9.0.

AUTHORS

       The  PCBGROUP  implementation  was written by Robert N. M. Watson <rwatson@FreeBSD.org> under contract to
       Juniper Networks, Inc.

       This manual page was written by Adrian Chadd <adrian@FreeBSD.org>.

NOTES

       The RSS(9) implementation currently uses #ifdef blocks to tie into PCBGROUP.  This is a sign that a  more
       abstract programming API is needed.

       There  is  currently  no  support  for re-balancing the PCBGROUP assignment, nor is there any support for
       overriding which PCBGROUP a socket/PCB should be in.

       No statistics are kept to indicate how often PCBGROUP lookups succeed or fail.

Debian                                            July 23, 2014                                      PCBGROUP(9)