Dan's Mail Format Site | Attachments | Files

Dan's Mail Format Site:

Attachments: Files

If regular e-mail is the electronic equivalent of a letter, then e-mail with a file attachment is electronic parcel-post. With file attachments you can send pretty much anything you can store on your computer, including pictures, music, movies, programs, spreadsheets, databases, and more. Here's how it works, and some of the pros and cons of using e-mail to send files.

How Attachments Work

Normally, an attachment to an e-mail message is included using MIME (Multipurpose Internet Mail Extensions), which is described more fully in an article in the Headers section. Specifically, a message with attachments is sent with an overall MIME type of multipart/mixed, indicating that it contains items of several different types, and within it the first part is the main message body. Note that this is different from multipart/alternative, which denotes multiple parts of which the mail reader should choose one (such as a plain-text and an HTML version of the message), or multipart/related, which encloses multiple items intended to be rendered inline as part of a single document. It's possible for all three types of multipart MIME messages to be nested, one inside another, in a single message, if it's a message with both text and HTML versions, inline images attached to the HTML version, and file attachments added on top of that.
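To make the nesting concrete, here is a minimal sketch using Python's standard email package. The subject, body text, file names, and payload bytes are all placeholders invented for illustration; the point is only the shape of the structure that comes out when a message has text and HTML versions, an inline image, and a file attachment.

```python
from email.message import EmailMessage

# Hypothetical message; every name and payload below is a placeholder.
msg = EmailMessage()
msg["Subject"] = "Family pictures"
msg.set_content("Plain-text version of the body.")
msg.add_alternative('<p>HTML version. <img src="cid:logo"></p>', subtype="html")

# Attach an inline image to the HTML part only; this wraps that part
# in multipart/related.
html_part = msg.get_payload()[1]
html_part.add_related(b"placeholder image bytes", maintype="image",
                      subtype="gif", cid="<logo>")

# Adding a file attachment wraps everything so far in multipart/mixed.
msg.add_attachment(b"placeholder file bytes", maintype="application",
                   subtype="pdf", filename="report.pdf")


def outline(part, depth=0):
    """Return the content types of a message as an indented list of lines."""
    lines = ["  " * depth + part.get_content_type()]
    for sub in part.iter_parts():
        lines.extend(outline(sub, depth + 1))
    return lines


structure = outline(msg)
print("\n".join(structure))
```

The printed outline should show multipart/mixed at the top, with multipart/alternative as its first part (holding the message body in its two versions) and the file attachment after it.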

Each attachment has its own MIME headers, giving the content type and, usually, other parameters such as the filename and perhaps a description. Different mail readers have different ways of presenting attachments, showing varying amounts of this header information along with an interface to view or save an attachment. Some mail readers can open various sorts of attachments (e.g., images) directly, while others launch external programs to open an attachment.
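As a rough illustration of what a mail reader has to work with, the sketch below parses a small hand-written message and lists the per-attachment headers. The boundary string, file name, and description are made up for the example, and the base64 payload is only a stand-in that is never decoded here.

```python
from email import message_from_string
from email.policy import default

# A hand-written sample message; boundary, names, and payload are invented.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"

See the attached picture.
--XYZ
Content-Type: image/gif; name="kids.gif"
Content-Description: Picture of my kids
Content-Disposition: attachment; filename="kids.gif"
Content-Transfer-Encoding: base64

R0lGODlhAQABAIAAAAAAAP///ywAAAAAAQABAAACAUwAOw==
--XYZ--
"""

message = message_from_string(RAW, policy=default)

# Collect (content type, filename, description) for each attachment.
attachments = [
    (part.get_content_type(),
     part.get_filename(),
     str(part.get("Content-Description", "")))
    for part in message.iter_attachments()
]
print(attachments)
```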

The sending end has the responsibility of properly tagging the attachments with a MIME type that represents what they are. Normally the mail program will do that automatically when you add a file; however, like all point-and-drool interfaces, this may sometimes conceal badly incorrect program actions beneath a seemingly-simple user interface. If you're attaching an unusual file type, or the file is named with an unusual extension not matching the usual naming scheme for the type of data it represents, then there's a good chance your program will get it wrong. Some mail programs give you a pull-down list to let you specify the data type if you know it; it's a good idea to use that instead of trusting the program to make its own guess.
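Many programs guess the type purely from the filename extension, which is exactly where things go wrong with unusual names. The sketch below shows that style of guessing with Python's standard mimetypes module; the guess_mime helper and its override parameter are hypothetical names standing in for the "pull-down list" idea, not any particular mail program's behavior.

```python
import mimetypes


def guess_mime(filename, override=None):
    """Pick a MIME type for a file, preferring an explicit user choice.

    Falls back to the generic application/octet-stream when the
    extension is not recognized.
    """
    if override:
        return override
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


print(guess_mime("photo.jpg"))
# A JPEG saved under an unrecognized extension gets a generic label...
print(guess_mime("holiday.weirdext"))
# ...unless the sender specifies what it really is.
print(guess_mime("holiday.weirdext", override="image/jpeg"))
```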

Since the e-mail protocols were designed originally for sending plain ASCII text only, dumping raw binary data into them is not reliable; the data could get messed up in transit by programs that mistake it for a sequence of control characters. Hence, the content of an attachment needs to be encoded with a transfer encoding that converts it to a sequence of normal text characters; usually, base64 encoding is used. The header line Content-transfer-encoding: base64 is used to indicate this. The other common transfer encoding, quoted-printable, is used more often for mostly plain-text messages with a few unsafe characters, though it's technically possible to encode binary files this way too (but less efficient). Base64-encoded data uses 64 different characters (upper and lower case letters are considered different characters for this purpose) to represent the digits of the huge number represented by the binary data, so every three bytes of input become four characters of output. It looks like this in its raw form:

mgFUAQAF/iAgjmRpnmiqrmzrvnAsz3Rt33iu73eSUsCgcEgsGo/IpHLJbDqf0Kh0Sq1ar9gs1YfS
er/gsHhMLpvPZu4JzW673/C4fL5Um4KJvH7P7/v/gIGCg4SFhoeIiYqLjI2Oj5CRkoYAeD9ACRWa
m5ydnp+goaKjpKWmp6ipqqusra6vsLGys6aVmJcUmbS7vL2+v8DBwsPEtLa5uLrFy8zNzs/Q0dKb
xwkB19jX1dPc3d7f4OHB1QLl5uXb4urr7O3u0dUD8vMDAunv+Pn6+/yh1QQAA9a716+gwYMIn1Ur
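For a feel of how this works in practice, here is a small sketch with Python's standard base64 and quopri modules. The input is just every byte value from 0 to 255, chosen arbitrarily as a stand-in for "some binary file".

```python
import base64
import quopri

# Three input bytes always become four output characters.
three_bytes = base64.b64encode(b"Man")

# Arbitrary stand-in for binary attachment data.
data = bytes(range(256))

# encodebytes() wraps its output into lines, as in the sample above.
b64 = base64.encodebytes(data)
first_line = b64.splitlines()[0]

# Quoted-printable turns each unsafe byte into "=XX", so binary data
# swells more than it does under base64.
qp = quopri.encodestring(data)

print(three_bytes, len(first_line))
print(len(data), len(b64), len(qp))
```

Decoding with base64.decodebytes() gives back the original bytes exactly, which is the whole point: the text form is only a wrapper for the trip through the mail system.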

Some Times Not To Use Attachments

File attachments can be a very useful feature. However, they also have their problems. The next article will discuss a very big problem, viruses. In addition to that, attachments tend to be bulky, and are often in proprietary formats requiring the recipient to have particular software to read them. Not all users have high-bandwidth broadband connections; some are still using slow dialups, and there are parts of the world where you still must pay by the minute to connect. Just because you've got a very fast connection for a flat monthly fee, you shouldn't forget that others aren't so fortunate. (When I was your age, I connected to long-distance dialup BBSs with a 300 baud modem... and I liked it! Well, actually, I have no idea what your age is, so this may or may not be true... but it's the "geek" equivalent of old lines like "I had to walk ten miles through six feet of snow to get to school... uphill both ways!")

Additionally (and not very surprisingly to me), AOL's mail program tends to screw up pretty badly on file attachments, both inbound and outbound, between AOLers and anybody on non-AOL portions of the Internet. As I explained in another article, AOL began as a proprietary service unconnected to anything else, and its mail program still sometimes behaves accordingly. They devised their own proprietary way of adding attachments to mail messages, and then halfheartedly converted them to and from normal Internet attachment style when dealing with messages entering and leaving their service. This means that attached files will sometimes come through OK, but other times get "munged" in wild and wacky ways. Problems seem to be greater when sending multiple attachments in one message; a single attachment seems to be dealt with correctly, but more than one triggers weird actions such as everything getting shoved into a single ZIP archive file. Anyway, though this shows once again that AOL sucks, it also shows that attachments can cause problems for some recipients that normal plain-text message bodies don't have.

A while back, a company I do business with sent e-mail to its customers to inform them of a change in their pricing structure. This is a perfectly reasonable thing for them to do... except that they chose to do it by an MS Word attachment instead of the "Keep It Simple, Stupid" method of giving the information in a plain-text message body. And it wasn't like the information was the sort that required a fancy format... there were no elaborate charts and graphs here, just a fairly brief list of services and their associated prices. It could have been done fine in plain text, which would take an order of magnitude less bandwidth to transmit and be viewable without any special software. (Not everyone has MS Word, not all of those who do have the latest version -- Microsoft takes great pains to make each version's file format incompatible with the ones that went before -- and some are even using operating systems where this software is not available.)

So don't use attachments for things that could and should be sent using the main message body... reserve them for things that can't be done with plain text, like sending family pictures to your grandparents. And even then be aware of what your recipient wants to receive and is able to deal with; it's always best to ask first before sending any really massive files.

UUEncoding

Most attachments these days are done using MIME headers as explained above. However, there's an older form of file attachment that's sometimes still encountered, called "UUEncoding". It predates the existence of structured message formats and content-type headers, so it's simply a way to embed a bunch of binary data in the middle of a plain text message, without anything actually indicating what type of data it is; it's up to the sender and recipient to communicate this to one another somehow, such as with a comment like "Here's a picture of my kids, as a UUEncoded JPEG."

If you find something like the following in a message body, you've probably received a UUEncoded file:

begin 666 image.gif
M1TE&.#EA`0`!`'<`,2'^&E-O9G1W87)E.B!-:6-R;W-O9G0@3V9F:6-E`"'Y
M! $`````+ `````!``$`@X&!@0$"`P$"`P$"`P$"`P$"`P$"`P$"`P$"`P$"
9`P$"`P$"`P$"`P$"`P$"`P$"`P("1 $`.P``
`
end

Who put the "666" at the top of this... could it be... SATAN????? No, actually, it's a Unix-style file mode, indicating who has privileges to read, write, or execute it... that's a normal part of a Unix directory listing, but it's pretty pointless in a file being transferred as plain text. (The sender has no business telling the recipient what file permission settings to give the file once it's decoded; that's the recipient's own business.)
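Here is a minimal sketch of how a decoder might pull such a file out of a message body, using Python's binascii module for the per-line work. The uuencode and uudecode helpers are hypothetical names written for this example, and the sample data is generated on the spot rather than taken from the listing above. Note that the decoder reports the mode but does nothing with it.

```python
import binascii
import stat


def uuencode(data, name, mode="644"):
    """Wrap bytes in a begin/end block, 45 input bytes per line."""
    lines = [f"begin {mode} {name}"]
    for i in range(0, len(data), 45):
        chunk = binascii.b2a_uu(data[i:i + 45]).decode("ascii")
        lines.append(chunk.rstrip("\n"))
    lines.append("`")   # zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines)


def uudecode(text):
    """Return (name, mode, data) for the first begin/end block in text."""
    lines = iter(text.splitlines())
    for line in lines:
        fields = line.split(" ", 2)
        if len(fields) == 3 and fields[0] == "begin" and fields[1].isdigit():
            mode, name = fields[1], fields[2]
            break
    else:
        raise ValueError("no begin line found")
    out = bytearray()
    for line in lines:
        if line == "end":
            return name, mode, bytes(out)
        out += binascii.a2b_uu(line)
    raise ValueError("no end line found")


original = bytes(range(100))
body = "Here's a picture:\n" + uuencode(original, "image.gif", "666") + "\nBye\n"
name, mode, decoded = uudecode(body)

# "666" read as an octal Unix mode: read and write for everybody.
permissions = stat.filemode(stat.S_IFREG | int(mode, 8))
print(name, mode, permissions, decoded == original)
```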

Some reader programs automatically find UUEncoded files embedded within e-mail and newsgroup messages and present the files as attachments to be opened or saved. While sometimes convenient, this can also be perilous. I've remarked elsewhere on the problems of trying to detect and specially render special content within a plain-text message body, as with AOL's rendering of some HTML tags and some mail readers' conversion of emoticons to graphics. It's possible that some piece of normal text might be mistakenly identified as something to be specially rendered, causing problems. It's not like MIME-encoded messages where the parts are clearly delineated and marked as to their format; it takes "rule-of-thumb" heuristics to try to figure out just what something really is when it's embedded within a message body, and these can fail.

Notably, some versions of Microsoft's mail programs mistakenly think they see UUEncoded data when a message has a line starting with "begin" (a common enough English word that might turn up in messages); perhaps they check some other things, like whether it's followed by a number, but I'm not entirely certain. (Apparently, following it with two spaces will suffice.) Thus, a sequence like this might trigger UUDecoding:

...and it's about time that we
begin 1001 different campaigns,
to win on every battlefront...

If the program had any sense, it would notice that what follows is not remotely in the format of a UUEncoded file, and there's no "end" in sight. In fact, the first character of each UUEncoded line indicates the number of data bytes encoded in that line, and if this doesn't match the actual line length it's a good indication that it's either a malformed UUEncoding, or not UUEncoded at all. But this is Microsoft we're talking about, whose idea of "smart" is the "smart quotes" that violate character set standards, so you shouldn't expect much of them. Thus, people have on a number of occasions been surprised to find a plain text message terminating abruptly in the middle as their mail program asks them what they want to do with the binary attachment that is mysteriously present.
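A stricter check along those lines might look like the sketch below. It is only one possible heuristic, with function names invented for this example: it insists on a "begin" line with an octal mode, data lines whose length matches their leading length character, and a closing "end". It is deliberately strict, so it would also reject output from encoders that strip trailing spaces.

```python
import binascii
import re

BEGIN_RE = re.compile(r"^begin [0-7]{3,4} \S.*$")


def line_is_plausible(line):
    """Check one data line against its leading length character."""
    if not line or any(not 32 <= ord(ch) <= 96 for ch in line):
        return False
    byte_count = (ord(line[0]) - 32) & 63      # 'M' means 45 bytes
    expected_chars = 4 * ((byte_count + 2) // 3)
    return len(line) - 1 == expected_chars


def contains_uuencoded_block(text):
    """True only for begin + plausible data lines + end."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if not BEGIN_RE.match(line):
            continue
        for j in range(i + 1, len(lines)):
            if lines[j] == "end":
                if j > i + 1:
                    return True
                break
            if not line_is_plausible(lines[j]):
                break
    return False


false_alarm = ("...and it's about time that we\n"
               "begin 1001 different campaigns,\n"
               "to win on every battlefront...\n")

# Build a genuine block from arbitrary sample bytes for comparison.
sample = b"Hello, world! " * 5
data_lines = [binascii.b2a_uu(sample[i:i + 45]).decode("ascii").rstrip("\n")
              for i in range(0, len(sample), 45)]
genuine = "\n".join(["begin 644 hello.txt", *data_lines, "`", "end"])

print(contains_uuencoded_block(false_alarm), contains_uuencoded_block(genuine))
```

In the false-alarm text, "1001" happens to pass as an octal mode, but the following line starts with a character outside the UUEncoding alphabet, so the block is rejected before anyone gets asked about a mystery attachment.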

For this reason, as well as the fact that UUEncoding may sometimes be used to sneak in viruses and other malicious attachments (the lack of MIME headers lets the attachments get through filtering proxies that are sensitive to dangerous data types), don't be surprised if fewer and fewer mail programs support UUEncoding or UUDecoding at all. So if you receive an attachment in this format you might have to save the message to a file and find an external UUDecoder to run it through, which is easier to find in a Unix environment than under Windows.

What the heck is WINMAIL.DAT?

Some of you may have received messages containing an attached file named winmail.dat, and wondered just what it is and what to do with it. It's attached with MIME type application/ms-tnef (even though the official MIME type list shows no entry for this, only one for application/vnd.ms-tnef, with the "vnd." part indicating that it's a vendor-specific format), which might lead you to decide that this is some kind of Microsoftism. You'd be correct; this attachment is added to messages sent through Microsoft Exchange servers, if configured to do so (and I think, as usual for Microsoft, that this is the default configuration; usually, anything annoying or standards-violating that a Microsoft program is capable of doing is made the default behavior). What it does is to encode various "enhanced" message features in an entirely proprietary way which only Microsoft programs can make any sense of. HTML e-mail, for all its problems, is at least a nonproprietary format that's supported by many programs; not so for this winmail.dat abomination. If you get attachments like this from anybody, tell them to cut it out and use standards-compliant methods if they need to send you anything other than plain text. This would preferably involve uninstalling everything from Microsoft and getting more reasonable software, but if they're too unreasonable to do this, there is a way to turn off winmail.dat attachments.
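If you want to spot these attachments automatically (say, to decide when to send the "cut it out" reply), a check on the MIME type and filename described above is enough. This is a small sketch with Python's standard email package; the is_tnef helper and the sample message are invented for the example, and the payload is a placeholder rather than real TNEF data.

```python
from email.message import EmailMessage

TNEF_TYPES = {"application/ms-tnef", "application/vnd.ms-tnef"}


def is_tnef(part):
    """Guess whether a MIME part is a winmail.dat-style attachment."""
    filename = (part.get_filename() or "").lower()
    return part.get_content_type() in TNEF_TYPES or filename == "winmail.dat"


# Hypothetical incoming message with a placeholder payload.
msg = EmailMessage()
msg.set_content("Please see the attached price list.")
msg.add_attachment(b"placeholder bytes", maintype="application",
                   subtype="ms-tnef", filename="winmail.dat")

flagged = [part.get_filename() for part in msg.iter_attachments()
           if is_tnef(part)]
print(flagged)
```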

If you insist on dealing with such attachments rather than demanding that the senders not send them, there is a program that supposedly lets you open them even if your mail program can't deal with them.

MacOS Attachments

Microsoft isn't unique, however, in trying to impose proprietary attachment formats. MacOS has its own unique file-system stuff, most notably a "resource fork" for every file, which it attempts to preserve when sending files by e-mail (or, for that matter, when exporting to a non-Mac-format disk), causing problems for non-Mac recipients who get some complicated multi-part nested structure instead of a simple attachment. Reports are that the Mail application on MacOS X produces particularly problematic output. Ironically, Microsoft's Entourage mail program is much more standards-compliant in that regard.

Hall of Shame

Do better yourself by looking at others who show, by example, what not to do!

NOTE: The inclusion of a site in my "Hall of Shame" links should not be construed as any sort of personal attack on the site's creator, who may be a really great person, or even an attack on the linked Web site as a whole, which may be a source of really great information and/or entertainment. Rather, it is simply to highlight specific features (intentional or accidental) of the linked sites which cause problems that could have been avoided by better design. If you find one of your sites is linked here, don't get offended; improve your site so that I'll have to take down the link!

  • Contractors using the Danbro umbrella company receive regular payslips in the form of a PDF packaged in a winmail.dat file. A colleague who was having trouble reading these attachments emailed the company to complain, and in doing so gave links explaining how to correct the software misconfiguration that leads to it (and asking that the info be forwarded to the company's technical department). However, they refused to listen, and instead switched to using snail-mail to send this particular victim his payslips. (Contributed by Stewart Gordon)

Links

Next: Watch out... the attachments you receive by e-mail might be harmful to your computer! Learn about computer viruses, and how to avoid them.

This page was first created 14 Jun 2003, and was last modified 23 Oct 2016.
Copyright © 2003-2016 by Daniel R. Tobias. All rights reserved.

webmaster@mailformat.dan.info