Provided by: libchemistry-opensmiles-perl_0.9.0-1_all

NAME

       Chemistry::OpenSMILES - OpenSMILES format reader and writer

SYNOPSIS

           use Chemistry::OpenSMILES::Parser;

           my $parser = Chemistry::OpenSMILES::Parser->new;
           my @moieties = $parser->parse( 'C#C.c1ccccc1' );

           $\ = "\n";
           for my $moiety (@moieties) {
               #  $moiety is a Graph::Undirected object
               print scalar $moiety->vertices;
               print scalar $moiety->edges;
           }

           use Chemistry::OpenSMILES::Writer qw(write_SMILES);

           print write_SMILES( \@moieties );

DESCRIPTION

       Chemistry::OpenSMILES provides support for SMILES chemical identifiers conforming to the OpenSMILES
       v1.0 specification (<http://opensmiles.org/opensmiles.html>).

       Chemistry::OpenSMILES::Parser reads SMILES strings and returns them parsed into arrays of
       Graph::Undirected objects. Each atom is represented by a hash.

       Chemistry::OpenSMILES::Writer performs the inverse operation. Generated SMILES strings are by no means
       optimal.

   Molecular graph
       Disconnected parts of a compound are represented as separate Graph::Undirected objects. Atoms are
       represented as vertices, and bonds are represented as edges.

       Atoms

       Atoms, or vertices of a molecular graph, are represented as hash references:

           {
               "symbol"    => "C",
               "isotope"   => 13,
               "chirality" => "@@",
               "hcount"    => 3,
               "charge"    => 1,
               "class"     => 0,
               "number"    => 0,
           }

       Except for "symbol", "class" and "number", all keys of the hash are optional. Per the OpenSMILES
       specification, the default values for "hcount" and "class" are 0.

       For a chiral atom, the order of its neighbours in the input is preserved in an array stored under the
       "chirality_neighbours" key of the atom hash.
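
       As a minimal sketch of working with such a hash, the snippet below summarises an atom while filling
       in the documented default for "hcount". The helper name "describe_atom" is hypothetical, and treating
       a missing "charge" as 0 is an assumption made for illustration; neither is part of
       Chemistry::OpenSMILES.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chemistry::OpenSMILES): summarise an
# atom hash of the shape shown above.
sub describe_atom {
    my ($atom) = @_;

    my $hcount = $atom->{hcount} // 0;   # documented default
    my $charge = $atom->{charge} // 0;   # assumption: missing charge means neutral

    my $text = $atom->{symbol};
    $text = $atom->{isotope} . $text if defined $atom->{isotope};
    $text .= " H$hcount charge $charge";
    $text .= " chirality $atom->{chirality}" if defined $atom->{chirality};
    return $text;
}

my $full = {
    symbol => 'C', isotope => 13, chirality => '@@',
    hcount => 3,   charge  => 1,  class     => 0, number => 0,
};
my $minimal = { symbol => 'O', class => 0, number => 1 };

print describe_atom( $full ),    "\n";   # 13C H3 charge 1 chirality @@
print describe_atom( $minimal ), "\n";   # O H0 charge 0
```

       Only the three mandatory keys are read unconditionally; every optional key is checked with
       "defined" or given a fallback before use.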

       Bonds

       Bonds, or edges of a molecular graph, rely entirely on the internal representation of
       Graph::Undirected. Bond orders other than single ("-", which is also the default) are represented as
       values of the edge attribute "bond". They correspond to the symbols used in the OpenSMILES
       specification.
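
       The sketch below shows one way to read bond orders back from a parsed moiety. It relies on the
       "has_edge_attribute" and "get_edge_attribute" methods of the Graph module and falls back to "-" when
       no "bond" attribute is set, following the description above. The input "C#C" is only an illustrative
       choice.

```perl
use strict;
use warnings;
use Chemistry::OpenSMILES::Parser;

my $parser   = Chemistry::OpenSMILES::Parser->new;
my ($moiety) = $parser->parse( 'C#C' );

my @orders;
for my $edge ($moiety->edges) {
    my ($u, $v) = @$edge;   # both ends are atom hashes

    # Single bonds carry no "bond" attribute, so fall back to "-"
    my $bond = $moiety->has_edge_attribute( $u, $v, 'bond' )
             ? $moiety->get_edge_attribute( $u, $v, 'bond' )
             : '-';

    push @orders, $bond;
    print "$u->{symbol}$bond$v->{symbol}\n";
}
```

       The order of the two atoms within an edge is decided by Graph::Undirected, so it should not be
       relied upon.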

   Options
       "parse" accepts the following options as key-value pairs in an anonymous hash passed as its second
       parameter:

       "max_hydrogen_count_digits"
           In the OpenSMILES specification, the number of attached hydrogen atoms for an atom in square
           brackets is limited to 9. IUPAC SMILES+ has increased this limit to 99. The value of
           "max_hydrogen_count_digits" instructs the parser to allow a number of digits other than 1 in the
           attached hydrogen count.

       "raw"
           With "raw" set to anything evaluating to true, the parser will convert neither implicit nor
           explicit hydrogen atoms in square brackets to atom hashes of their own. Moreover, it will not attempt
           to unify the representations of chirality. Note, though, that many subroutines of
           Chemistry::OpenSMILES expect non-raw data structures, so processing raw output may produce
           distorted results.
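
       A minimal sketch of passing these options follows. The helper "count_atoms" is hypothetical and
       exists only for this example. Creating a fresh parser for every call is a precaution taken here, not
       a documented requirement. The exact vertex counts depend on the parser version, so none are stated.

```perl
use strict;
use warnings;
use Chemistry::OpenSMILES::Parser;

# Hypothetical helper: parse a single-moiety SMILES string with the given
# options and return the number of vertices in the resulting graph.
sub count_atoms {
    my ($smiles, $options) = @_;
    my $parser   = Chemistry::OpenSMILES::Parser->new;
    my ($moiety) = $parser->parse( $smiles, $options // {} );
    return scalar $moiety->vertices;
}

# Default mode: hydrogens in square brackets become atoms of their own
my $default_count = count_atoms( '[CH4]' );

# Raw mode: hydrogens stay recorded on the atom they are attached to
my $raw_count = count_atoms( '[CH4]', { raw => 1 } );

# Allow up to two digits in the attached hydrogen count
my $wide_count = count_atoms( '[CH4]', { max_hydrogen_count_digits => 2 } );

print "default: $default_count, raw: $raw_count, two digits: $wide_count\n";
```

       Comparing the default and raw counts for the same input is a quick way to see how hydrogen atoms
       are handled in each mode.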

CAVEATS

       Element symbols in square brackets are not limited to the ones known to chemistry. Currently any one-
       or two-letter symbol is allowed.

       Deprecated charge notations ("--" and "++") are supported.

       The OpenSMILES specification mandates a strict order of ring bonds and branches:

           branched_atom ::= atom ringbond* branch*

       Chemistry::OpenSMILES::Parser supports both the mandated structure and the inverted one, where ring
       bonds follow branch descriptions.

       Whitespace is not supported yet. SMILES descriptors must be stripped of whitespace before being read
       with Chemistry::OpenSMILES::Parser.
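
       One way to prepare input is sketched below. The helper "strip_whitespace" is hypothetical and
       assumes the string contains nothing but a SMILES descriptor. If a name or comment follows the
       descriptor after a space, split it off first, since removing all whitespace would fuse the two.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Chemistry::OpenSMILES): remove every
# whitespace character from a SMILES string.
sub strip_whitespace {
    my ($smiles) = @_;
    $smiles =~ s/\s+//g;
    return $smiles;
}

my $clean = strip_whitespace( " C#C . c1ccccc1\n" );
print "$clean\n";   # C#C.c1ccccc1
```

       The cleaned string can then be passed to "parse" as in the SYNOPSIS.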

       The derivation of implicit hydrogen counts for aromatic atoms is not unambiguously defined in the
       OpenSMILES specification. Thus only aromatic carbon is accounted for, and it is treated as if it had
       a valence of 3.

       Chiral atoms with three neighbours are interpreted as having a lone pair of electrons as the fourth
       chiral neighbour. The lone pair is always understood as being the second in the order of neighbour
       enumeration, except when the atom with the lone pair starts a chain. In that case the lone pair is
       the first.

SEE ALSO

       perl(1)

AUTHORS

       Andrius Merkys, <merkys@cpan.org>

perl v5.36.0                                       2023-10-26                         Chemistry::OpenSMILES(3pm)