Provided by: libtext-capitalize-perl_1.5-2_all

NAME

       Text::Capitalize - capitalize strings ("to WORK AS titles" becomes "To Work as Titles")

SYNOPSIS

          use Text::Capitalize;

          print capitalize( "...and justice for all" ), "\n";
             ...And Justice For All

          print capitalize_title( "...and justice for all" ), "\n";
             ...And Justice for All

          print capitalize_title( "agent of SFPUG", PRESERVE_ALLCAPS=>1 ), "\n";
             Agent of SFPUG

          print capitalize_title( "the ring:  symbol or cliche?",
                                  PRESERVE_WHITESPACE=>1 ), "\n";
             The Ring:  Symbol or Cliche?
             (Note, double-space after colon is still there.)

          # To work on international characters, may need to set locale
          use Env qw( LANG );
          $LANG = "en_US";
          print capitalize_title( "ueber maus" ), "\n";
             Ueber Maus

          use Text::Capitalize qw( scramble_case );
          print scramble_case( 'It depends on what you mean by "mean"' );
             It dEpenDS On wHAT YOu mEan by "meAn"

ABSTRACT

         Text::Capitalize is for capitalizing strings in a manner
       suitable for use in titles.

DESCRIPTION

       Text::Capitalize provides some routines for title-like formatting of strings.

       The simple capitalize function just makes the initial character of each word uppercase, and forces the
       rest to lowercase.
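
       As a rough illustration of the behavior just described, and not the module's actual implementation,
       the following self-contained sketch uses a hypothetical helper named sketch_capitalize, built on
       Perl's \u and \L string escapes.  Treating an apostrophe as part of a word is an assumption made
       here so that "didn't" does not become "Didn'T".

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Capitalize: uppercase the first
# character of each word and force the remaining characters to lowercase.
sub sketch_capitalize {
    my ($string) = @_;
    # A "word" here is a word character followed by word characters or
    # apostrophes.  \L lowercases the match, \u uppercases its first char.
    $string =~ s/(\w[\w']*)/\u\L$1/g;
    return $string;
}

print sketch_capitalize("lost watches OF splitsville"), "\n";
# prints: Lost Watches Of Splitsville
```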

       The capitalize_title function applies English title case rules (discussed below) where only the
       "important" words are supposed to be capitalized.  There are also some customization features provided to
       allow the user to choose variant rules.

       Comparing capitalize and capitalize_title:

         Input:             "lost watches of splitsville"
         capitalize:        "Lost Watches Of Splitsville"
         capitalize_title:  "Lost Watches of Splitsville"

       Some examples of formatting with capitalize_title:

         Input:             "KiLLiNG TiMe"
         capitalize_title:  "Killing Time"

         Input:             "we have come to wound the autumnal city"
         capitalize_title:  "We Have Come to Wound the Autumnal City"

         Input:             "ask for whom they ask for"
         capitalize_title:  "Ask for Whom They Ask For"

       Text::Capitalize also provides some functions for special effects such as scramble_case, which typically
       would be used for this sort of transformation:

         Input:            "get whacky"
         scramble_case:    "gET wHaCkY"  (or something similar)

EXPORTS

   default exports
       capitalize
           Makes the initial character of each word uppercase, and forces the rest to lowercase.

           The original routine by Stanislaw Y. Pusep.

       capitalize_title
           Applies English title case rules (See BACKGROUND) where only the "important" words are supposed to be
           capitalized.

           The one required argument is the string to be capitalized.

           Some  customization  options  may  be  passed  in as pairs of names and values following the required
           argument.

           The following customizations are allowed:

           Boolean:

             PRESERVE_WHITESPACE
             PRESERVE_ALLCAPS
             PRESERVE_ANYCAPS

           Array reference:

             NOT_CAPITALIZED

           See "Customizing the Exceptions to Capitalization".

   optional exports
       @exceptions
           The list of minor words that don't usually get capitalized  in  titles  (used  by  capitalize_title).
           Defaults to:

                a an the
                and or nor for but so yet
                to of by at for but in with has
                de von

       %defaults_capitalize_title
           Defines the default arguments for the capitalize_title function.  Initially, this is set up to shut
           off the features PRESERVE_WHITESPACE, PRESERVE_ALLCAPS and PRESERVE_ANYCAPS; it also has @exceptions
           as the NOT_CAPITALIZED list.

       scramble_case
           This routine provides a special effect: sCraMBliNg tHe CaSe

           The algorithm here uses a modified probability distribution to get  a  weirder  looking  effect  than
           simple randomization such as with random_case.

           For a discussion of the algorithm, see "SPECIAL EFFECTS".

       random_case
           Randomizes the case of each character with a 50-50 chance of each one becoming upper or lower case.

       zippify_case
           Function to provide a special effect: "RANDOMLY upcasing WHOLE WORDS at a TIME".

           This uses a similar algorithm to scramble_case, though it also ignores words on the @exceptions list,
           just as capitalize_title does.

BACKGROUND

       The capitalize_title function tries to do the right thing by default: adjust an arbitrary chunk of text
       so that it can be used as a title.  But as with many aspects of human languages, it is extremely
       difficult to come up with a set of programmatic rules that will cover all cases.

   Words that don't get capitalized
       This web page:

         http://www.continentallocating.com/World.Literature/General2/LiteraryTitles2.htm

       presents some admirably clear rules for capitalizing titles:

         ALL words in EVERY title are capitalized except
         (1) a, an, and the,
         (2) two and three letter conjunctions (and, or, nor, for, but, so, yet),
         (3) prepositions.
         Exceptions:  The first and last words are always capitalized even
         if they are among the above three groups.

       But consider the case:

         "It Waits Underneath the Sea"

       Should  the  word  "underneath"  be downcased because it's a preposition?  Most English speakers would be
       surprised to see it that way.  Consequently, the default list of exceptions  to  capitalization  in  this
       module only includes the shortest of the common prepositions (to of by at for but in).

       The default entries on the exception list are:

            a an the
            and or nor for but so yet
            to of by at for but in with has
            de von

       The  observant  may  note that the last row is not composed of English words.  The honorary "de" has been
       included in honor of "Honore de Balzac".  And "von" was added for the sake of equal time.
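
       The quoted rule, combined with the default exception list, can be sketched in a few lines of plain
       Perl.  This is only an illustration under simplifying assumptions: sketch_title_case and %minor are
       hypothetical names, and the sketch ignores punctuation, sentence breaks and the PRESERVE_* options
       that the real capitalize_title handles.

```perl
use strict;
use warnings;

# Minor words taken from the default exception list above (duplicates dropped).
my %minor = map { $_ => 1 } qw(
    a an the
    and or nor for but so yet
    to of by at in with has
    de von
);

# Hypothetical helper, not the module's code: capitalize every word except
# minor words, but always capitalize the first and last word.
sub sketch_title_case {
    my ($string) = @_;
    my @words = split ' ', $string;
    for my $i ( 0 .. $#words ) {
        my $lower = lc $words[$i];
        my $keep_lower = $minor{$lower} && $i != 0 && $i != $#words;
        $words[$i] = $keep_lower ? $lower : ucfirst($lower);
    }
    return join ' ', @words;
}

print sketch_title_case("we have come to wound the autumnal city"), "\n";
# prints: We Have Come to Wound the Autumnal City
print sketch_title_case("ask for whom they ask for"), "\n";
# prints: Ask for Whom They Ask For
```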

   Customizing the Exceptions to Capitalization
       If you have different ideas about the "rules" of English (or perhaps if you're trying to  use  this  code
       with  another  language  with  different rules) you might like to substitute a new exception list of your
       own:

         capitalize_title( "Dude, we, like, went to Old Slavy, and uh, they didn't have it",
                            NOT_CAPITALIZED => [ qw( uh duh huh wha like man you know ) ] );

       This should return:

          Dude, We, like, Went To Old Slavy, And uh, They Didn't Have It

       Less radically, you might like to simply add a word to the list, for example "from":

          use Text::Capitalize 0.2 qw( capitalize_title @exceptions );
          push @exceptions, "from";

          print capitalize_title( "fungi from yuggoth",
                                  NOT_CAPITALIZED => \@exceptions);

       This should output:

           Fungi from Yuggoth

   All Uppercase Words
       In order to work with a wide range of input strings, by default capitalize_title presumes that upper-case
       input needs to be adjusted (e.g. "DOOM APPROACHES!" would become "Doom Approaches!").  But this does not
       allow for possibilities such as an acronym in a title (e.g. "RAM Prices Plummet" ideally should not
       become "Ram Prices Plummet").  If the PRESERVE_ALLCAPS option is set, then an all-uppercase word is
       presumed to be that way for a reason, and is left alone:

          print capitalize_title( "ram more RAM down your throat",
                                  PRESERVE_ALLCAPS => 1 );

       This should output:

             Ram More RAM Down Your Throat

   Preserving Any Usage of Uppercase for Mixed-case Words
       There  are  some  other  odd  cases  that  are difficult to handle well, notably mixed-case words such as
       "iMac", "CHiPs", and so on.  For these purposes,  a  PRESERVE_ANYCAPS  option  has  been  provided  which
       presumes  that  any  usage  of  uppercase  is there for a reason, in which case the entire word should be
       passed through untouched.  With PRESERVE_ANYCAPS on, only the case of all lowercase words  will  ever  be
       adjusted:

          print capitalize_title( "TLAs i have known and loved",
                              PRESERVE_ANYCAPS => 1 );

       This should output:

          TLAs I Have Known and Loved

          print capitalize_title( "the next iMac: just another NeXt?",
                                   PRESERVE_ANYCAPS => 1);

       This should output:

          The Next iMac: Just Another NeXt?
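
       The difference between PRESERVE_ALLCAPS and PRESERVE_ANYCAPS comes down to which words count as
       "uppercase for a reason".  The following sketch shows one plausible way to express the two tests;
       sketch_keep_as_is is a hypothetical helper written for this illustration, and the module itself
       may decide differently.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if a word should be passed through untouched
# under the given options, 0 if its case may be adjusted.
sub sketch_keep_as_is {
    my ( $word, %opt ) = @_;

    # PRESERVE_ANYCAPS: any uppercase letter at all protects the word.
    return 1 if $opt{PRESERVE_ANYCAPS} && $word =~ /[[:upper:]]/;

    # PRESERVE_ALLCAPS: the word must have uppercase letters and no
    # lowercase ones.
    return 1 if $opt{PRESERVE_ALLCAPS}
             && $word =~ /[[:upper:]]/
             && $word !~ /[[:lower:]]/;

    return 0;
}

for my $word (qw( RAM iMac ram )) {
    printf "%-5s allcaps=%d anycaps=%d\n", $word,
        sketch_keep_as_is( $word, PRESERVE_ALLCAPS => 1 ),
        sketch_keep_as_is( $word, PRESERVE_ANYCAPS => 1 );
}
```

       With this test, "RAM" is protected by either option, "iMac" only by PRESERVE_ANYCAPS, and "ram"
       by neither.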

   Handling Whitespace
       By  default,  the  capitalize_title  function  presumes  that  you're  trying to clean up potential title
       strings. As an extra feature it collapses multiple spaces and tabs into single spaces.  If  this  feature
       doesn't  seem desirable and you want it to literally restrict itself to adjusting capitalization, you can
       force that behavior with the PRESERVE_WHITESPACE option:

          print capitalize_title( "it came from texas:  the new new world order?",
                                  PRESERVE_WHITESPACE => 1);

       This should output:

             It Came From Texas:  The New New World Order?

       (Note: the double-space after the colon is still there.)
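
       For illustration, the default clean-up described here amounts to something like the substitution
       below.  The helper name sketch_normalize_space is hypothetical, and the exact pattern the module
       uses is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of spaces and tabs into one space,
# unless the caller asks for whitespace to be preserved.
sub sketch_normalize_space {
    my ( $string, %opt ) = @_;
    $string =~ s/[ \t]+/ /g unless $opt{PRESERVE_WHITESPACE};
    return $string;
}

print sketch_normalize_space("the ring:  symbol or cliche?"), "\n";
# prints: the ring: symbol or cliche?
print sketch_normalize_space( "the ring:  symbol or cliche?",
                              PRESERVE_WHITESPACE => 1 ), "\n";
# prints: the ring:  symbol or cliche?
```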

   Comparison to Text::Autoformat
       As you might expect, there's more than one way to do this, and these two  pieces  of  code  perform  very
       similar functions:

          use Text::Capitalize 0.2;
          print capitalize_title( $t ), "\n";

          use Text::Autoformat;
          print autoformat { case => "highlight", right => length( $t ) }, $t;

       Note:  with  autoformat,  supplying  the  length  of the string as the "right margin" is much faster than
       plugging in an arbitrarily large number.  There doesn't seem to be any other way  of  turning  off  line-
       breaking (e.g. by using the "fill" parameter) though possibly there will be in the future.

       As of this writing, "capitalize_title" has some advantages:

       1.  It  works  on  characters  outside  the English 7-bit ASCII range, for example with my locale setting
           (en_US) the ISO-8859-1 International characters are handled correctly, so that "ueber  maus"  becomes
           "Ueber Maus".

       2.  Minor words following leading punctuation become upper case:

              "...And Justice for All"

       3.  It  works  with  multiple  sentence input (e.g. "And sooner. And later."  should probably not be "And
           sooner. and later.")

       4.  The list of minor words is more extensive (i.e. includes: so, yet, nor), and is also customizable.

       5.  There's a way of preserving acronyms via the PRESERVE_ALLCAPS option and, similarly, mixed-case words
           ("iMac", "NeXt", etc.) with the PRESERVE_ANYCAPS option.

       6.  capitalize_title is roughly ten times faster.

       Another difference is that Text::Autoformat's "highlight" always preserves whitespace, much as
       capitalize_title does with the PRESERVE_WHITESPACE option set.

       However, it should be pointed out that Text::Autoformat is under active maintenance by Damian Conway.  It
       also does far more than this module, and you may want to use it for other reasons.

   Still more ways to do it
       Late-breaking news: The second edition of the  Perl  Cookbook  has  just  come  out.   It  now  includes:
       "Properly Capitalizing a Title or Headline" as recipe 1.14.  You should familiarize yourself with this if
       you want to become a true master of all title capitalization routines.

       (And  I  see  that  recipe  1.13  includes  a  "randcap"  program as an example, which as it happens does
       something like the random_case function described below...)

SPECIAL EFFECTS

       Some functions have been provided to make strings look weird by scrambling their capitalization ("lIKe
       tHiS"): random_case and scramble_case.  The function "random_case" does a straightforward randomization
       of capitalization so that each letter has a 50-50 chance of being upper or lower case.  The function
       "scramble_case" works in much the same way, but does a slightly better job of producing something
       "weird-looking".

       The difficulty is  that  there  are  differences  between  human  perception  of  randomness  and  actual
       randomness.   Consider  the  fact  that  of  the  sixteen  ways  that  the four letter word "word" can be
       capitalized, three of them are rather boring: "word", "Word" and "WORD".  To make  it  less  likely  that
       scramble_case  will produce dull output when you want "weird" output, a modified probability distribution
       has been used that records the history of previous outcomes,  and  tweaks  the  likelihood  of  the  next
       decision  in the opposite direction, back toward the expected average.  In effect, this simulates a world
       in which the Gambler's Fallacy is correct ("Hm... red has come up a lot, I bet that  black  is  going  to
       come up now."). "Streaks" are much less likely with scramble_case than with random_case.

       Additionally, with scramble_case the probability that the first character of the input string will become
       upper-case  has been tweaked to less than 50%.  (Future versions may apply this tweak on a per-word basis
       rather than just on a per-string basis).
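
       The contrast between the two approaches can be sketched as follows.  This is not the module's code:
       the helper names, the running "balance" counter, the nudge factor of 0.2 and the clamp to the range
       0.1 to 0.9 are all assumptions chosen for illustration, and the first-character tweak mentioned
       above is left out.

```perl
use strict;
use warnings;

# Plain randomization: every character is upper- or lowercased with
# probability 1/2, independently of what came before.
sub sketch_random_case {
    my ($string) = @_;
    my $out = '';
    for my $char ( split //, $string ) {
        $out .= rand() < 0.5 ? uc($char) : lc($char);
    }
    return $out;
}

# History-aware variant: track how far past choices lean toward upper or
# lower case, and tilt the next choice in the opposite direction.
sub sketch_scramble_case {
    my ($string) = @_;
    my $balance = 0;      # uppercase choices minus lowercase choices so far
    my $nudge   = 0.2;    # assumed correction strength per unit of balance
    my $out     = '';
    for my $char ( split //, $string ) {
        if ( $char !~ /[[:alpha:]]/ ) {    # non-letters pass through
            $out .= $char;
            next;
        }
        my $p_upper = 0.5 - $nudge * $balance;
        $p_upper = 0.1 if $p_upper < 0.1;  # never make an outcome impossible
        $p_upper = 0.9 if $p_upper > 0.9;
        if ( rand() < $p_upper ) { $out .= uc($char); $balance++ }
        else                     { $out .= lc($char); $balance-- }
    }
    return $out;
}

print sketch_random_case("get whacky"),   "\n";
print sketch_scramble_case("get whacky"), "\n";
```

       Because each uppercase choice lowers the chance of the next one (and vice versa), long runs of a
       single case become rare in the second function, which is the "Gambler's Fallacy" effect described
       above.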

       There is also a function that scrambles capitalization on a  word-by-word  basis  called  "zippify_case",
       which should produce output like: "In my PREVIOUS life i was a LATEX-novelty REPAIRMAN!"
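
       A word-level version might look like the sketch below.  The helper name sketch_zippify_case is
       hypothetical, the plain 50-50 choice stands in for the history-aware distribution, and the
       assumption that words on the exception list are passed through unchanged is this sketch's reading
       of "ignores", not something confirmed by the module's source.

```perl
use strict;
use warnings;

# Minor words from the default exception list (duplicates dropped).
my %minor = map { $_ => 1 } qw(
    a an the and or nor for but so yet
    to of by at in with has de von
);

# Hypothetical helper: upcase whole words at random, skipping minor words.
sub sketch_zippify_case {
    my ($string) = @_;
    my @out;
    for my $word ( split ' ', $string ) {
        if ( !$minor{ lc($word) } && rand() < 0.5 ) {
            push @out, uc($word);
        }
        else {
            push @out, $word;    # assumption: left exactly as given
        }
    }
    return join ' ', @out;
}

print sketch_zippify_case("in my previous life i was a repairman"), "\n";
```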

EXPORT

       By  default,  this  version  of  the  module  provides the two functions capitalize and capitalize_title.
       Future versions will have no further additions to the default export list.

       Optionally, the following functions may also be exported:

       scramble_case
           A function to scramble capitalization in a wEiRD loOOkInG wAy.  Supposed to look a little stranger
           than the simpler random_case output.

       random_case
           Function to randomize capitalization of each letter in the string.  Compare to "scramble_case".

       zippify_case
           A function like "scramble_case" that acts on a word-by-word basis (Somewhat LIKE this, YOU know?).

       It is also possible to export the following variables:

       @exceptions
           The  list  of  minor  words  that  capitalize_title  uses  by  default to determine the exceptions to
           capitalization.

       %defaults_capitalize_title
           The hash of allowed arguments (with defaults) that the capitalize_title function uses.

BUGS

       1. In capitalize_title, quoted sentence terminators are treated as actual sentence breaks, e.g.  in  this
       case:

            'say "yes but!" and "know what?"'

       The  program  sees  the  !  and effectively treats this as two separate sentences: the word "but" becomes
       "But" (under the rule that last words must always be uppercase, even if they're on  the  exception  list)
       and the word "and" becomes "And" (under the first word rule).

       2.  There's no good way to automatically handle names like "McCoy".  Consider the difficulty of
       disambiguating "Macadam Roads" from "MacAdam Rode".  If you need to solve problems like this, consider
       using the case_surname function of Lingua::EN::NameParse.

       3. In general, Text::Capitalize is a very parochial English-oriented module that looks like it belongs in
       the "Lingua::EN::*" tree.

       4.  There's  currently  no way of doing a PRESERVE_ANYCAPS that *also* adjusts capitalization of words on
       the exception list, so that "iMac Or iPod" would become "iMac or iPod".

SEE ALSO

       Text::Autoformat

       "The Perl Cookbook", second edition, recipes 1.13 and 1.14

       Lingua::EN::NameParse

       About "scramble_case": <http://obsidianrook.com/devnotes/talks/esthetic_randomness/>

VERSION

       Version 0.9

AUTHORS

          Joseph M. Brenner
             E-Mail:   doom@kzsu.stanford.edu
             Homepage: http://obsidianrook.com/map

          Stanislaw Y. Pusep  (who wrote "capitalize")
             E-Mail:   stanis@linuxmail.org
             ICQ UIN:  11979567
             Homepage: http://sysdlabs.hypermart.net/

       And many thanks (for feature suggestions and code examples) to:

           Belden Lyman, Yary Hcluhan, Randal Schwartz

COPYRIGHT AND LICENSE

       Copyright 2003 by Joseph Brenner. All rights reserved.

       This library is free software; you can redistribute it and/or modify it under  the  same  terms  as  Perl
       itself.

perl v5.36.0                                       2023-02-03                              Text::Capitalize(3pm)