Message IDs in Mail and Content IDs (cid) in HTML Mail

19 Jun 2010 - 15:34

This article provides a very short and simple PHP snippet. Its purpose is to spare you from reading several Internet mail standards (particularly those on multipart/related and on the Internet Message Format, RFC 2822).

Message IDs and Content IDs have the same structure, so the same snippet serves both.

Message IDs are part of the e-mail headers. Each one should be a unique identifier for a single e-mail. Mail servers that forward a mail keep the ID unchanged. List servers that distribute mail keep it as well (at least the reasonably designed ones). But if you write a new mail (or reply to one), a new message ID is created.

Among other things, these IDs allow your mail client to thread messages. If you reply to a mail, your mail program adds an In-Reply-To header naming the message ID you replied to. (At least, it should do so.)
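
As a rough illustration of threading, the sketch below builds the two reply headers defined in RFC 2822, In-Reply-To and References. The function name build_reply_headers and the sample IDs are made up for this example; a real mail client does more than this.

```php
<?php
// Hypothetical helper: build the threading headers for a reply.
// $parent_id   Message ID of the mail being answered, without angle brackets
// $parent_refs IDs from the parent's References header (oldest first), may be empty
function build_reply_headers($parent_id, array $parent_refs = array()) {
    // The reply lists every ancestor, ending with the direct parent.
    $refs = array_merge($parent_refs, array($parent_id));
    $wrapped = array();
    foreach ($refs as $id) {
        $wrapped[] = '<' . $id . '>';   // IDs appear in angle brackets in headers
    }
    return array(
        'In-Reply-To' => '<' . $parent_id . '>',
        'References'  => implode(' ', $wrapped),
    );
}

// Sample IDs, invented for the example.
$headers = build_reply_headers('b-2@a.j-schell.de', array('a-1@a.j-schell.de'));
foreach ($headers as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

The reply itself still gets a fresh message ID of its own; only these two headers point back to the earlier mails.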

Scripts that send out mail sometimes need to have the message ID under code control. Archiving the mail in a database may be one reason. For newsletters, I used self-created message IDs to keep track of address record numbers in case of bounced mail.

Uniqueness of message IDs is basically guaranteed in two steps. One part of the ID is a domain name that is under your control. Since I have the domain “j-schell.de”, I would use it for this part of the ID. Nobody else in the world has that domain. If I needed to split work among some friends or colleagues, we would simply create subdomain names: “a.j-schell.de”, “b.j-schell.de” and so on. These names are not required to correspond to existing machines. It is only necessary that each of us has his or her own name, so we do not interfere with each other.

Another part of the message ID is created by each of us. I only need to make sure that it is unique among all the mail that I create. The domain name part ensures that I do not have to care about others. Let me call this part the “local part”.

Message IDs look much like mail addresses, but they are not. They have an “@” sign, preceded by the local part and followed by the domain name part. For the local part, RFC 2822 suggests combining the current date and time with some other unique identifier; a random value serves that purpose here.

The concept of message IDs has been recycled for HTML mails to embed pictures. In the standard HTML tag like “<img src="…”, the src value is a “cid:” URL containing a content ID that identifies an attachment to the mail. This works only if the ID is unique within the mail and appears both within the img tag and in the attachment. Although worldwide uniqueness is not really necessary, the RFC simply requires it to be of the same type as a message ID.

To embed pictures in an HTML mail, a mail of the type multipart/related must be created. The pictures are simply attached with their content ID in the Content-ID header of their part, and that ID is used in the HTML part of the message.
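
To make that layout concrete, here is a minimal sketch that assembles such a body by hand. The function name build_related_body, the boundary string, the sample ID and the dummy image data are all assumptions for illustration; libraries like PHPMailer do this work for you and handle many details omitted here.

```php
<?php
// Hypothetical helper: assemble the body of a multipart/related mail with one
// embedded picture. $cid is a content ID without angle brackets.
function build_related_body($html, $image_data, $image_type, $cid, $boundary) {
    $eol = "\r\n";
    $parts = array();
    // Part 1: the HTML text, which refers to the picture by its cid: URL.
    $parts[] = 'Content-Type: text/html; charset=UTF-8' . $eol
             . 'Content-Transfer-Encoding: 8bit' . $eol . $eol
             . $html;
    // Part 2: the picture, carrying the same ID in its Content-ID header.
    $parts[] = 'Content-Type: ' . $image_type . $eol
             . 'Content-Transfer-Encoding: base64' . $eol
             . 'Content-ID: <' . $cid . '>' . $eol
             . 'Content-Disposition: inline' . $eol . $eol
             . rtrim(chunk_split(base64_encode($image_data), 76, $eol));
    $body = '';
    foreach ($parts as $part) {
        $body .= '--' . $boundary . $eol . $part . $eol;
    }
    // The closing delimiter ends with two extra hyphens.
    return $body . '--' . $boundary . '--' . $eol;
}

$cid  = 'logo-4c1cd2a8-1e240-3b9ac9ff@b.j-schell.de';   // sample ID
$html = '<html><body><img src="cid:' . $cid . '" alt="Logo"></body></html>';
$body = build_related_body($html, 'not really a PNG', 'image/png', $cid, 'b1_example');
// The matching top-level mail header would be:
// Content-Type: multipart/related; type="text/html"; boundary="b1_example"
```

Note that the ID appears in angle brackets in the Content-ID header but without them in the cid: URL.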

Script tools like PHPMailer make it simple to create HTML mail with embedded pictures. I found several tutorials on the Web on how to do that. But in many cases, the cid used was plainly not built in conformance with the Internet standards. The following PHP snippet should create a proper Message ID or Content ID.

$proc_name is a string chosen by you. If it is specific to the script that sends the mail, two different scripts will never create the same ID. And that is the purpose. $dom_name is some domain name or machine name in the Internet naming scheme that is under your control. Often it is the name of the mail server used, but again, the purpose is uniqueness. The name must be under your control so you can be sure that nobody else is using it, as in the “b.j-schell.de” example.

$proc_name may also be used to add some information like the address record number, as mentioned at the beginning.


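    // Build an ID of the form local-part@domain for a Message-ID or Content-ID.
    function make_mid($proc_name, $dom_name) {
        // Random component, in case two IDs are created in the same microsecond.
        $rn = mt_rand(100000000, 999999999);
        if (function_exists('microtime')) {
            // microtime() returns "fraction seconds" as one string.
            list($mtm, $tm) = explode(' ', microtime());
        } else {
            $tm = time();
            $mtm = 0;
        }
        // Turn the fraction of a second into whole microseconds.
        $mtm = (int) ((float) $mtm * 1000000);
        // Hex encoding keeps the numeric fields short.
        return sprintf('%s-%x-%x-%x@%s', $proc_name, $tm, $mtm, $rn, $dom_name);
    }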
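
For the bounce tracking mentioned earlier, the ID has to be taken apart again. The following is a minimal sketch of how that might look; parse_mid is a hypothetical helper, not part of the original snippet, and the sample ID (with “nl-4711” standing for a newsletter script and address record 4711) is invented.

```php
<?php
// Hypothetical counterpart to make_mid(): split an ID back into its fields.
// Accepts the ID with or without the angle brackets used in mail headers.
// Returns an array, or null if the string does not have the expected shape.
function parse_mid($mid) {
    $pattern = '/^<?(.+)-([0-9a-f]+)-([0-9a-f]+)-([0-9a-f]+)@([^>]+)>?$/';
    if (!preg_match($pattern, trim($mid), $m)) {
        return null;
    }
    return array(
        'proc_name' => $m[1],            // may itself contain hyphens
        'time'      => hexdec($m[2]),    // Unix time stamp in seconds
        'microsec'  => hexdec($m[3]),
        'random'    => hexdec($m[4]),
        'dom_name'  => $m[5],
    );
}

// Sample ID as it might come back in a bounced mail.
$parts = parse_mid('<nl-4711-4c1cd2a8-1e240-3b9ac9ff@b.j-schell.de>');
echo $parts['proc_name'], "\n";   // prints "nl-4711"
```

Because the three hex fields are matched from the right, hyphens inside $proc_name do no harm.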

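Important: mail header lines should stay short. RFC 2822 recommends no more than 78 characters per line, so a budget of 76 is on the safe side. For message IDs and content IDs, the header name with its colon and the following space already takes 12 characters. This leaves 64 characters for the ID itself. The time stamp and the random value are hex encoded just to save characters in this part of the ID; together with the hyphens and the “@” sign they take up to 25 characters. This should leave 39 characters for your domain name and the proc_name (summed up), or 37 if you also count the angle brackets that enclose the ID in the header.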
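
The length budget can be checked in code before a mail goes out. This is only a sketch: header_fits is a hypothetical helper, the limit of 76 is taken from the paragraph above, and the check also counts the two angle brackets that enclose the ID in the header.

```php
<?php
// Hypothetical check: does "Name: <id>" fit into the given line budget?
function header_fits($header_name, $id, $limit = 76) {
    $line = $header_name . ': <' . $id . '>';
    return strlen($line) <= $limit;
}

// Longest fixed portion of an ID from make_mid(): hex fields of at most
// 8, 5 and 8 digits, plus three hyphens and the "@" sign.
$fixed = 8 + 5 + 8 + 3 + 1;   // 25 characters

// Sample IDs, invented for the example.
$short = 'nl-4711-4c1cd2a8-1e240-3b9ac9ff@b.j-schell.de';
$long  = str_repeat('x', 40) . '-4c1cd2a8-1e240-3b9ac9ff@b.j-schell.de';

var_dump(header_fits('Message-ID', $short));   // bool(true)
var_dump(header_fits('Message-ID', $long));    // bool(false)
```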

PHPMailer is available at
http://phpmailer.sourceforge.net/

In recent code I switched to Swiftmailer, which is quite powerful:
http://www.swiftmailer.org/