Provided by: libbtparse-dev_0.89-1build3_amd64 bug

NAME

       bt_misc - miscellaneous BibTeX-like string-processing utilities

SYNOPSIS

          void bt_purify_string (char * string, btshort options);
          void bt_change_case (char transform, char * string, btshort options);

DESCRIPTION

       bt_purify_string()
              void bt_purify_string (char * string, btshort options);

           "Purifies"  a  "string"  in  the  BibTeX  way  (usually  used for generating sort keys).  "string" is
           modified in-place.  "options" is currently unused; just set it  to  zero  for  future  compatibility.
           Purification  consists  of  copying  alphanumeric  characters,  converting hyphens and ties to space,
           copying spaces, and skipping (almost) everything else.

           "Almost" because "special characters"  (used  for  accented  and  non-English  letters)  are  handled
           specially.  Recall that a BibTeX special character is any brace-group that starts at brace-depth zero
           whose first character is a backslash.  For instance, the string

              {\foo bar}Herr M\"uller went from {P{\r r}erov} to {\AA}rhus

           contains  two  special  characters:  "{\foo  bar}"  and  "\AA".  Neither the "\"u" nor the "\r r" are
           special characters, because they are not at the right brace depth.

           Special characters are handled as follows: if the control sequence (the TeX command that follows  the
           backslash) is recognized as one of LaTeX's "foreign letters" ("\oe", "\ae", "\o", "\l", "\ae", "\ss",
           plus uppercase versions), then it is converted to a reasonable English approximation by stripping the
           backslash  and  converting  the  second  character  (if any) to lowercase; thus, "{\AA}" in the above
           example would become simply "Aa".  All other control sequences in a special character  are  stripped,
           as are all non-alphabetic characters.

           For example the above string, after "purification," becomes

              barHerr Muller went from Pr rerov to Aarhus

           Obviously,  something has gone wrong with the word "P{\r r}erov" (a town in the Czech Republic).  The
           accented `r' should be a special character, starting at brace-depth zero.   If  the  original  string
           were instead

              {\foo bar}Herr M\"uller went from P{\r r}erov to {\AA}rhus

           then the purified result would be more sensible:

              barHerr Muller went from Prerov to Aarhus

           Note  the use of a "nonsense" special character "{\foo bar}": this trick is often used to put certain
           text in a string solely for generating sort keys; the text is  then  ignored  when  the  document  is
           processed  by TeX (as long as "\foo" is defined as a no-op TeX macro).  This assumes, of course, that
           the output is eventually processed by TeX; if not, then this trick will backfire on you.

           Also, bt_purify_string() is adequate for generating sort keys when you  want  to  sort  according  to
           English-language  conventions.   To  follow  the  conventions  of  other  languages,  though,  a more
           sophisticated approach will be needed; hopefully,  future  versions  of  btparse  will  address  this
           deficiency.

       bt_change_case()
              void bt_change_case (char transform, char * string, btshort options);

           Converts a string to lowercase, uppercase, or "non-book title capitalization", with special attention
           paid  to BibTeX special characters and other brace-groups.  The form of conversion is selected by the
           single character "transform": 'u' to convert to uppercase, 'l' for  lowercase,  and  't'  for  "title
           capitalization".   "string"  is  modified in-place, and "options" is currently unused; set it to zero
           for future compatibility.

           Lowercase and uppercase conversion are obvious, with the proviso  that  text  in  braces  is  treated
           differently  (explained  below).   Title  capitalization simply means that everything is converted to
           lowercase, except the first letter of the first word, and words  immediately  following  a  colon  or
           sentence-ending punctuation.  For instance,

              Flying Squirrels: Their Peculiar Habits. Part One

           would be converted to

              Flying squirrels: Their peculiar habits. Part one

           Text  within  braces  is  handled  as  follows.   First,  in  a  "special  character"  (see above for
           definition), control sequences that constitute one  of  LaTeX's  non-English  letters  are  converted
           appropriately---e.g., when converting to lowercase, "\AE" becomes "\ae").  Any other control sequence
           in  a  special  character  (including  accents)  is  preserved,  and all text in a special character,
           regardless  of  depth  and  punctuation,  is  converted  to  lowercase  or  uppercase.   (For  "title
           capitalization," all text in a special character is converted to lowercase.)

           Brace  groups that are not special characters are left completely untouched: neither text nor control
           sequences within non-special character braces are touched.

           For example, the string

              A Guide to \LaTeXe: Document Preparation ...

           would, when "transform" is 't' (title capitalization), be converted to

              A guide to \latexe: Document preparation ...

           which is probably not the desired result.  A better attempt is

              A Guide to {\LaTeXe}: Document Preparation ...

           which becomes

              A guide to {\LaTeXe}: Document preparation ...

           However, if you go back and re-read the  description  of  bt_purify_string(),  you'll  discover  that
           "{\LaTeXe}"  here is a special character, but not a non-English letter: thus, the control sequence is
           stripped.  Thus, a sort key generated from this title would be

              A Guide to  Document Preparation

           ...oops!  The right solution (and this applies to any title with a TeX command  that  becomes  actual
           text) is to bury the control sequence at brace-depth two:

              A Guide to {{\LaTeXe}}: Document Preparation ...

SEE ALSO

       btparse

AUTHOR

       Greg Ward <gward@python.net>

btparse, version 0.89                              2024-03-31                           btparse::doc::bt_misc(3)