Ubuntu Manpage: MIME::Head - MIME message header (a subclass of Mail::Header)

Provided by: libmime-tools-perl_5.514-1_all

NAME

       MIME::Head - MIME message header (a subclass of Mail::Header)

SYNOPSIS

       Before reading further, you should see MIME::Tools to make sure that you understand where this module
       fits into the grand scheme of things.  Go on, do it now.  I'll wait.

       Ready?  Ok...

   Construction
           ### Create a new, empty header, and populate it manually:
           $head = MIME::Head->new;
           $head->replace('content-type', 'text/plain; charset=US-ASCII');
           $head->replace('content-length', $len);

           ### Parse a new header from a filehandle:
           $head = MIME::Head->read(\*STDIN);

           ### Parse a new header from a file, or a readable pipe:
           $testhead = MIME::Head->from_file("/tmp/test.hdr");
           $a_b_head = MIME::Head->from_file("cat a.hdr b.hdr |");

   Output
           ### Output to filehandle:
           $head->print(\*STDOUT);

           ### Output as string:
           print STDOUT $head->as_string;
           print STDOUT $head->stringify;

   Getting field contents
           ### Is this a reply?
           $is_reply = 1 if ($head->get('Subject') =~ /^Re: /);

           ### Get receipt information:
           print "Last received from: ", $head->get('Received', 0);
           @all_received = $head->get('Received');

           ### Print the subject, or the empty string if none:
           print "Subject: ", $head->get('Subject',0);

           ### Too many hops?  Count 'em and see!
           if ($head->count('Received') > 5) { ...

           ### Test whether a given field exists
           warn "missing subject!" if (! $head->count('subject'));

   Setting field contents
           ### Declare this to be an HTML header:
           $head->replace('Content-type', 'text/html');

   Manipulating field contents
           ### Get rid of internal newlines in fields:
           $head->unfold;

           ### Decode any Q- or B-encoded-text in fields (DEPRECATED):
           $head->decode;

   Getting high-level MIME information
           ### Get/set a given MIME attribute:
           unless ($charset = $head->mime_attr('content-type.charset')) {
               $head->mime_attr("content-type.charset" => "US-ASCII");
           }

           ### The content type (e.g., "text/html"):
           $mime_type     = $head->mime_type;

           ### The content transfer encoding (e.g., "quoted-printable"):
           $mime_encoding = $head->mime_encoding;

           ### The recommended name when extracted:
           $file_name     = $head->recommended_filename;

           ### The boundary text, for multipart messages:
           $boundary      = $head->multipart_boundary;

DESCRIPTION

       A class for parsing in and manipulating RFC-822 message headers, with some methods geared towards
       standard (and not so standard) MIME fields as specified in the various Multipurpose Internet Mail
       Extensions RFCs (starting with RFC 2045)

PUBLIC INTERFACE

   Creation, input, and output
       new [ARG],[OPTIONS]
           Class  method,  inherited.   Creates  a  new  header  object.  Arguments are the same as those in the
           superclass.

       ambiguous_content
           Instance method.

           Returns true if this header has any the following properties:

               More than one Content-Type, Content-ID, Content-Transfer-Encoding or Content-Disposition header.

               A Content-Type or Content-Disposition header contains a repeated parameter.

           Messages with ambiguous content should be treated as a security risk.  In particular,  if  MIME-tools
           is used in an email security tool, ambiguous messages should not be delivered to end-users.

       from_file EXPR,OPTIONS
           Class  or  instance method.  For convenience, you can use this to parse a header object in from EXPR,
           which may actually be any expression that  can  be  sent  to  open()  so  as  to  return  a  readable
           filehandle.  The "file" will be opened, read, and then closed:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->from_file("/tmp/test.hdr");

           Since this method can function as either a class constructor or an instance initializer, the above is
           exactly equivalent to:

               ### Create a new header by parsing in a file:
               my $head = MIME::Head->new->from_file("/tmp/test.hdr");

           On success, the object will be returned; on failure, the undefined value.

           The OPTIONS are the same as in new(), and are passed into new() if this is invoked as a class method.

           Note:  This  is  really  just  a  convenience  front-end  onto read(), provided mostly for backwards-
           compatibility with MIME-parser 1.0.

       read FILEHANDLE
           Instance (or class) method.  This initializes a header object by reading it  in  from  a  FILEHANDLE,
           until  the  terminating  blank  line  is encountered.  A syntax error or end-of-stream will also halt
           processing.

           Supply this routine with a reference to a filehandle glob; e.g., "\*STDIN":

               ### Create a new header by parsing in STDIN:
               $head->read(\*STDIN);

           On success, the self object will be returned; on failure, a false value.

           Note: in the MIME world, it is perfectly legal for a header to be empty, consisting  of  nothing  but
           the terminating blank line.  Thus, we can't just use the formula that "no tags equals error".

           Warning:  as  of  the  time  of this writing, Mail::Header::read did not flag either syntax errors or
           unexpected end-of-file conditions (an EOF before the terminating blank line).  MIME::ParserBase takes
           this into account.

   Getting/setting fields
       The following are methods related to retrieving and modifying the header fields.  Some are inherited from
       Mail::Header, but I've kept the documentation around for convenience.

       add TAG,TEXT,[INDEX]
           Instance method, inherited.  Add a new occurrence of the field named TAG, given by TEXT:

               ### Add the trace information:
               $head->add('Received',
                          'from eryq.pr.mcs.net by gonzo.net with smtp');

           Normally, the new occurrence will be appended to the existing occurrences.  However, if the  optional
           INDEX  argument  is  0,  then the new occurrence will be prepended.  If you want to be explicit about
           appending, specify an INDEX of -1.

           Warning: this method always adds new occurrences; it doesn't overwrite any existing occurrences... so
           if you just want to change the value of a field (creating it if necessary), then you  probably  don't
           want to use this method: consider using replace() instead.

       count TAG
           Instance method, inherited.  Returns the number of occurrences of a field; in a boolean context, this
           tells you whether a given field exists:

               ### Was a "Subject:" field given?
               $subject_was_given = $head->count('subject');

           The  TAG  is treated in a case-insensitive manner.  This method returns some false value if the field
           doesn't exist, and some true value if it does.

       decode [FORCE]
           Instance method, DEPRECATED.  Go through all the header fields, looking for RFC 1522 / RFC 2047 style
           "Q" (quoted-printable, sort  of)  or  "B"  (base64)  encoding,  and  decode  them  in-place.   Fellow
           Americans,  you probably don't know what the hell I'm talking about.  Europeans, Russians, et al, you
           probably do.  ":-)".

           This method has been deprecated.  See "decode_headers" in MIME::Parser for the full reasons.  If  you
           absolutely must use it and don't like the warning, then provide a FORCE:

              "I_NEED_TO_FIX_THIS"
                     Just shut up and do it.  Not recommended.
                     Provided only for those who need to keep old scripts functioning.

              "I_KNOW_WHAT_I_AM_DOING"
                     Just shut up and do it.  Not recommended.
                     Provided for those who REALLY know what they are doing.

           What this method does.  For an example, let's consider a valid email header you might get:

               From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
               To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
               CC: =?ISO-8859-1?Q?Andr=E9_?= Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
                =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
                =?US-ASCII?Q?.._cool!?=

           That basically decodes to (sorry, I can only approximate the Latin characters with 7 bit sequences /o
           and 'e):

               From: Keith Moore <moore@cs.utk.edu>
               To: Keld J/orn Simonsen <keld@dkuug.dk>
               CC: Andr'e  Pirard <PIRARD@vm1.ulg.ac.be>
               Subject: If you can read this you understand the example... cool!

           Note:  currently,  the  decodings  are done without regard to the character set: thus, the Q-encoding
           "=F8" is simply translated to the octet (hexadecimal "F8"), period.  For piece-by-piece decoding of a
           given field, you want the array context of MIME::Words::decode_mimewords().

           Warning: the CRLF+SPACE separator that splits up long encoded words into shorter sequences  (see  the
           Subject:  example above) gets lost when the field is unfolded, and so decoding after unfolding causes
           a spurious space to be left in the field.  THEREFORE:  if  you're  going  to  decode,  do  so  BEFORE
           unfolding!

           This method returns the self object.

           Thanks to Kent Boortz for providing the idea, and the baseline RFC-1522-decoding code.

       delete TAG,[INDEX]
           Instance method, inherited.  Delete all occurrences of the field named TAG.

               ### Remove some MIME information:
               $head->delete('MIME-Version');
               $head->delete('Content-type');

       get TAG,[INDEX]
           Instance method, inherited.  Get the contents of field TAG.

           If a numeric INDEX is given, returns the occurrence at that index, or undef if not present:

               ### Print the first and last 'Received:' entries (explicitly):
               print "First, or most recent: ", $head->get('received', 0);
               print "Last, or least recent: ", $head->get('received',-1);

           If no INDEX is given, but invoked in a scalar context, then INDEX simply defaults to 0:

               ### Get the first 'Received:' entry (implicitly):
               my $most_recent = $head->get('received');

           If  no  INDEX  is  given,  and  invoked  in  an  array context, then all occurrences of the field are
           returned:

               ### Get all 'Received:' entries:
               my @all_received = $head->get('received');

           NOTE: The header(s) returned may end with a newline.  If you don't want this, then chomp  the  return
           value.

       get_all FIELD
           Instance method.  Returns the list of all occurrences of the field, or the empty list if the field is
           not present:

               ### How did it get here?
               @history = $head->get_all('Received');

           Note: I had originally experimented with having get() return all occurrences when invoked in an array
           context... but that causes a lot of accidents when you get careless and do stuff like this:

               print "\u$field: ", $head->get($field);

           It also made the intuitive behaviour unclear if the INDEX argument was given in an array context.  So
           I opted for an explicit approach to asking for all occurrences.

       print [OUTSTREAM]
           Instance  method,  override.   Print the header out to the given OUTSTREAM, or the currently-selected
           filehandle if none.  The OUTSTREAM may be a filehandle, or any object  that  responds  to  a  print()
           message.

           The  override actually lets you print to any object that responds to a print() method.  This is vital
           for outputting MIME entities to scalars.

           Also, it defaults to the currently-selected filehandle if none is  given  (not  STDOUT!),  so  please
           supply a filehandle to prevent confusion.

       stringify
           Instance method.  Return the header as a string.  You can also invoke it as "as_string".

           If  you  set  the variable $MIME::Entity::BOUNDARY_DELIMITER to a string, that string will be used as
           line-end delimiter.  If it is not set, the line ending will be a newline character (\n)

       unfold [FIELD]
           Instance method, inherited.  Unfold (remove newlines in) the text of all  occurrences  of  the  given
           FIELD.  If the FIELD is omitted, all fields are unfolded.  Returns the "self" object.

   MIME-specific methods
       All of the following methods extract information from the following fields:

           Content-type
           Content-transfer-encoding
           Content-disposition

       Be  aware that they do not just return the raw contents of those fields, and in some cases they will fill
       in sensible (I hope) default values.  Use get() or mime_attr() if you need to grab and  process  the  raw
       field text.

       Note:  some  of  these  methods  are provided both as a convenience and for backwards-compatibility only,
       while others (like recommended_filename()) really do have to be in MIME::Head  to  work  properly,  since
       they  look  for their value in more than one field.  However, if you know that a value is restricted to a
       single field, you should really use the Mail::Field interface to get it.

       mime_attr ATTR,[VALUE]
           A quick-and-easy interface to set/get the attributes in structured MIME fields:

               $head->mime_attr("content-type"         => "text/html");
               $head->mime_attr("content-type.charset" => "US-ASCII");
               $head->mime_attr("content-type.name"    => "homepage.html");

           This would cause the final output to look something like this:

               Content-type: text/html; charset=US-ASCII; name="homepage.html"

           Note that the special empty sub-field tag indicates the anonymous first sub-field.

           Giving VALUE as undefined will cause the contents of the named subfield to be deleted:

               $head->mime_attr("content-type.charset" => undef);

           Supplying no VALUE argument just returns the attribute's value, or undefined if it isn't there:

               $type = $head->mime_attr("content-type");      ### text/html
               $name = $head->mime_attr("content-type.name"); ### homepage.html

           In all cases, the new/current value is returned.

           The special sub-field tag @duplicate_parameters (which can never be a real tag) returns  an  arrayref
           of  tags  that  were  duplicated  in the header, or "undef" if no such tags were found.  For example,
           given the header:

               Content-Type: multipart/mixed; boundary="foo"; boundary="bar"

           Then:

               $head->mime_attr('content-type.@duplicate_parameters')

           would return:

               [ 'boundary' ]

           A duplicate "boundary" tag should be treated as a security risk,  as  should  duplicate  Content-Type
           headers in a message.  Since such messages cannot be parsed unambiguously, we strongly recommend that
           they never be delivered to end-users.

       mime_encoding
           Instance  method.   Try  real  hard  to  determine  the  content  transfer  encoding (e.g., "base64",
           "binary"), which is returned in all-lowercase.

           If no encoding could be found, the default of "7bit" is returned I quote from RFC 2045 section 6.1:

               This is the default value -- that is, "Content-Transfer-Encoding: 7BIT"
               is assumed if the Content-Transfer-Encoding header field is not present.

           I do one other form of fixup: "7_bit", "7-bit", and "7 bit" are corrected  to  "7bit";  likewise  for
           "8bit".

       mime_type [DEFAULT]
           Instance  method.   Try  "real  hard" to determine the content type (e.g., "text/plain", "image/gif",
           "x-weird-type", which is returned in all-lowercase.  "Real hard" means that if no content type  could
           be found, the default (usually "text/plain") is returned.  From RFC 2045 section 5.2:

              Default RFC 822 messages without a MIME Content-Type header are
              taken by this protocol to be plain text in the US-ASCII character
              set, which can be explicitly specified as:

                 Content-type: text/plain; charset=us-ascii

              This default is assumed if no Content-Type header field is specified.

           Unless  this  is a part of a "multipart/digest", in which case "message/rfc822" is the default.  Note
           that you can also set the default, but you  shouldn't:  normally  only  the  MIME  parser  uses  this
           feature.

       multipart_boundary
           Instance  method.   If  this is a header for a multipart message, return the "encapsulation boundary"
           used to separate the parts.  The boundary is returned exactly as given in the "Content-type:"  field;
           that is, the leading double-hyphen ("--") is not prepended.

           Well, almost exactly... this passage from RFC 2046 dictates that we remove any trailing spaces:

              If a boundary appears to end with white space, the white space
              must be presumed to have been added by a gateway, and must be deleted.

           Returns  undef  (not  the  empty  string)  if  either  the message is not multipart or if there is no
           specified boundary.

       recommended_filename
           Instance method.  Return the recommended external filename.  This is used when  extracting  the  data
           from  the  MIME  stream.   The filename is always returned as a string in Perl's internal format (the
           UTF8 flag may be on!)

           Returns undef if no filename could be suggested.

NOTES

       Why have separate objects for the entity, head, and body?
           See the documentation for the MIME-tools distribution for the rationale behind this decision.

       Why assume that MIME headers are email headers?
           I quote from Achim Bohnet, who gave feedback on v.1.9 (I think he's using the word "header"  where  I
           would use "field"; e.g., to refer to "Subject:", "Content-type:", etc.):

               There is also IMHO no requirement [for] MIME::Heads to look
               like [email] headers; so to speak, the MIME::Head [simply stores]
               the attributes of a complex object, e.g.:

                   new MIME::Head type => "text/plain",
                                  charset => ...,
                                  disposition => ..., ... ;

           I  agree  in principle, but (alas and dammit) RFC 2045 says otherwise.  RFC 2045 [MIME] headers are a
           syntactic subset of RFC-822 [email] headers.

           In my mind's eye, I see an abstract class, call it MIME::Attrs, which does what Achim suggests...  so
           you could say:

                my $attrs = new MIME::Attrs type => "text/plain",
                                            charset => ...,
                                            disposition => ..., ... ;

           We  could  even  make it a superclass of MIME::Head: that way, MIME::Head would have to implement its
           interface, and allow itself to be initialized from a MIME::Attrs object.

           However, when you read RFC 2045, you begin to see how much  MIME  information  is  organized  by  its
           presence  in particular fields.  I imagine that we'd begin to mirror the structure of RFC 2045 fields
           and subfields to such a degree that this might not  give  us  a  tremendous  gain  over  just  having
           MIME::Head.

       Why all this "occurrence" and "index" jazz?  Isn't every field unique?
           Aaaaaaaaaahh....no.

           Looking  at a typical mail message header, it is sooooooo tempting to just store the fields as a hash
           of strings, one string per hash entry.  Unfortunately, there's the little matter of  the  "Received:"
           field, which (unlike "From:", "To:", etc.) will often have multiple occurrences; e.g.:

               Received: from gsfc.nasa.gov by eryq.pr.mcs.net  with smtp
                   (Linux Smail3.1.28.1 #5) id m0tStZ7-0007X4C;
                    Thu, 21 Dec 95 16:34 CST
               Received: from rhine.gsfc.nasa.gov by gsfc.nasa.gov
                    (5.65/Ultrix3.0-C) id AA13596;
                    Thu, 21 Dec 95 17:20:38 -0500
               Received: (from eryq@localhost) by rhine.gsfc.nasa.gov
                    (8.6.12/8.6.12) id RAA28069;
                    Thu, 21 Dec 1995 17:27:54 -0500
               Date: Thu, 21 Dec 1995 17:27:54 -0500
               From: Eryq <eryq@rhine.gsfc.nasa.gov>
               Message-Id: <199512212227.RAA28069@rhine.gsfc.nasa.gov>
               To: eryq@eryq.pr.mcs.net
               Subject: Stuff and things

           The  "Received:"  field  is used for tracing message routes, and although it's not generally used for
           anything other than human debugging, I didn't want to inconvenience anyone who actually wanted to get
           at that information.

           I also didn't want to make this a special case; after all, who knows what  other  fields  could  have
           multiple  occurrences in the future?  So, clearly, multiple entries had to somehow be stored multiple
           times... and the different occurrences had to be retrievable.

AUTHOR

       Eryq (eryq@zeegee.com), ZeeGee Software Inc (http://www.zeegee.com).  Dianne Skoll (dianne@skoll.ca)

       All rights reserved.  This program is free software; you can redistribute it and/or modify it  under  the
       same terms as Perl itself.

       The more-comprehensive filename extraction is courtesy of Lee E. Brotzman, Advanced Data Solutions.

perl v5.38.2                                       2024-02-09                                    MIME::Head(3pm)

NAME

SYNOPSIS

DESCRIPTION

PUBLIC INTERFACE

NOTES

SEE ALSO

AUTHOR