
SquirrelMail MIME Support Introduction
======================================

This document is intended for people who want to understand how the MIME
code works. It is technical documentation of how mime.php works and how
it parses a MIME-encoded message.

Object Structure
----------------

Two objects are used: "message" and "msg_header". Here is a brief
overview of what each object contains.

msg_header
    Contains variables for all the necessary parts of the header of a
    message. This includes (but is not limited to) the following: to,
    from, subject, type (type0), subtype (type1), filename ...

message
    Contains the structure of the message. It has two parts: $header
    and $entities[]. $header is of type msg_header, and $entities[] is
    an array of message objects. The $entities[] array is optional. If
    it does not exist, then we are at a leaf node and have an actual
    attachment (entity) that can be displayed. Here is a tree view of
    how this object is structured.

    header
    entities
        |
        +--- header
        |
        +--- header
        |    entities
        |        |
        |        +--- header
        |        |
        |        +--- header
        |
        +--- header

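To make the tree concrete, here is a minimal sketch of the same shape in Python. It is only an illustration: the real objects live in mime.php and are written in PHP, and the class names MsgHeader and Message, the helper count_leaves, and the sample MIME types are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MsgHeader:
    """Hypothetical stand-in for msg_header; only a few fields are shown."""
    type0: str = "text"
    type1: str = "plain"
    filename: str = ""


@dataclass
class Message:
    """Hypothetical stand-in for message: a header plus optional children."""
    header: MsgHeader
    entities: List["Message"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        # No child entities means this node is a displayable entity.
        return not self.entities


def count_leaves(msg: Message) -> int:
    """Count the displayable entities under msg by walking the tree."""
    if msg.is_leaf():
        return 1
    return sum(count_leaves(child) for child in msg.entities)


# Same shape as the tree view: three children, the second of which
# has two children of its own. The MIME types are invented sample data.
tree = Message(
    MsgHeader("multipart", "mixed"),
    [
        Message(MsgHeader("text", "plain")),
        Message(
            MsgHeader("multipart", "alternative"),
            [
                Message(MsgHeader("text", "plain")),
                Message(MsgHeader("text", "html")),
            ],
        ),
        Message(MsgHeader("application", "octet-stream", "cc.diff")),
    ],
)
```

Using an empty list for a missing $entities[] keeps the leaf test to a single check, which is one possible design choice and not necessarily what mime.php does.
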
Getting the Structure
---------------------

Previously (version 0.4 and below), SquirrelMail handled all the parsing
of the email message itself. It would read the entire message in, search
for boundaries, and create an array similar to the $message object
described above. This was very inefficient, since the whole message had
to be read before its structure was known.

Currently, all the parsing of the body of the message takes place on the
IMAP server itself. According to RFC 2060 section 7.4.2, we can use the
BODYSTRUCTURE data item of the FETCH command, which returns the structure
of the body (imagine that). The RFC describes in detail how the
bodystructure is formatted, and we have based our new MIME support on
this specification.

A simple text/plain message would have a BODYSTRUCTURE similar to the
following:

    ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)

A more complicated multipart message with an attachment would look like:

    (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)("TEXT"
    "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff")
    "<960723163407.20117h@cac.washington.edu>" "Compiler diff" "BASE64"
    4554 73) "MIXED")

Our MIME functionality implements several functions that recursively run
through this text and parse out the structure of the message. If you
want to learn more about how the structure of a message is returned by
BODYSTRUCTURE, please see RFC 2060 section 7.4.2.

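The recursive idea can be sketched in a few lines of Python. This is not the code in mime.php; the names tokenize, parse and describe are made up for this example. It also makes simplifying assumptions: the response contains no IMAP literals ({n} followed by raw data), quoted strings are not unescaped, the parentheses are balanced, and nested message/rfc822 parts are treated as plain leaves.

```python
import re

# One token is "(", ")", a quoted string, or a bare atom such as NIL or 1152.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')


def tokenize(text):
    return TOKEN.findall(text)


def parse(tokens, pos=0):
    """Parse one item starting at tokens[pos]; return (value, next_pos)."""
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse(tokens, pos)  # recurse into nested lists
            items.append(item)
        return items, pos + 1
    if tok.startswith('"'):
        return tok[1:-1], pos + 1
    if tok.upper() == "NIL":
        return None, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    return tok, pos + 1


def describe(node, entity_id=""):
    """Return [(entity_id, 'type/subtype')] for every leaf entity."""
    if isinstance(node[0], list):
        # Multipart: the leading lists are child parts; the first
        # non-list element is the multipart subtype (e.g. "MIXED").
        leaves = []
        n = 0
        for child in node:
            if not isinstance(child, list):
                break
            n += 1
            child_id = f"{entity_id}.{n}" if entity_id else str(n)
            leaves.extend(describe(child, child_id))
        return leaves
    return [(entity_id or "1", f"{node[0]}/{node[1]}".lower())]


simple = '("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)'
structure, _ = parse(tokenize(simple))
print(describe(structure))  # [('1', 'text/plain')]
```

Each "(" starts a new recursive call, so the nesting of the parsed lists mirrors the nesting of the message parts.
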
NOTE: SquirrelMail passes the MIME Torture Test written by Mark
      Crispin (author of the IMAP protocol). This message is crazy! It
      has about 30 parts nested inside each other. It is a very good
      test, and SquirrelMail passed it. It can be found here:
      ftp://ftp.lysator.liu.se/mirror/unix/imapd/mime/torture-test.mbox

Getting the Body
----------------

Once the structure of the message has been read into the $message
object, we need to display the body of one entity. There are a number of
ways we decide which entity to display at a given time, and they are not
covered here.

Each entity has its own ID. Entity IDs look something like "1.2.1", or
"4.1", or just "2". You can find a detailed description of how entities
are identified in RFC 2060 section 6.4.5. To fetch the body of a
particular entity, we use the FETCH data item "BODY[<section>]". For
instance, to retrieve entity 1.2.1, we would send the IMAP server the
command: "a001 FETCH <msg_id> BODY[1.2.1]".

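As a rough illustration, the command string for a given entity could be assembled like this. The helpers fetch_command and parent_chain are hypothetical and not part of mime.php, and the message number 42 is invented sample data. The sketch only handles purely numeric entity IDs, not specifiers such as HEADER or TEXT.

```python
import re

# Dot-separated non-zero part numbers, e.g. "2", "4.1" or "1.2.1".
SECTION = re.compile(r"[1-9][0-9]*(\.[1-9][0-9]*)*")


def fetch_command(tag, msg_id, entity_id):
    """Build the FETCH command that asks for the body of one entity."""
    if not SECTION.fullmatch(entity_id):
        raise ValueError(f"not a numeric entity ID: {entity_id!r}")
    return f"{tag} FETCH {msg_id} BODY[{entity_id}]"


def parent_chain(entity_id):
    """List the ancestors of an entity, outermost first."""
    parts = entity_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts))]


print(fetch_command("a001", 42, "1.2.1"))  # a001 FETCH 42 BODY[1.2.1]
print(parent_chain("1.2.1"))               # ['1', '1.2']
```

The ID itself encodes the path through the $entities[] arrays: "1.2.1" is the first child of the second child of the first top-level entity.
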
The server's response contains the entire body of that entity as a
string. Depending on what is in the header (for example, a BASE64
encoding), we may need to decode it or process it further.

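The decoding step might look roughly like the following Python sketch. It uses only the standard base64 and quopri modules. The function name decode_body and its parameters are assumptions for this example, and mime.php may well handle more cases than the three shown.

```python
import base64
import quopri


def decode_body(raw: bytes, encoding: str, charset: str = "us-ascii") -> str:
    """Undo the transfer encoding named in the header, then decode to text."""
    enc = encoding.upper()
    if enc == "BASE64":
        data = base64.b64decode(raw)
    elif enc == "QUOTED-PRINTABLE":
        data = quopri.decodestring(raw)
    else:
        # 7BIT, 8BIT and BINARY bodies need no transfer decoding.
        data = raw
    # Replace undecodable bytes instead of failing on a bad charset.
    return data.decode(charset, errors="replace")
```

The encoding and charset arguments correspond to values such as "BASE64" and "US-ASCII" that appear in the BODYSTRUCTURE response for the entity.
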
Closing Notes
-------------

That is basically how it works. There is a variable in mime.php called
$debug_mime that is defined at the top of that file. If you set it to
true, it will output all kinds of valuable information while it tries to
decode the MIME message.

The code in mime.php is pretty well documented, so you might want to
poke around there as well to find out more details of how this works.

If you have questions about this, please direct them to our mailing
list: squirrelmail-users@lists.sourceforge.net