Dan's Mail Format Site:
Body: HTML Mail
E-mail has traditionally been a plain-text medium, ever since it was introduced on the ARPAnet in the 1970s (and possibly even earlier than that on individual time-sharing mainframe computers). However, some people wanted a way to use fancier formatting in their messages. Various proprietary formats were tried, but HTML ended up as the "standard" manner of doing this. How does HTML e-mail work, and is it a good idea or a bad one? This article discusses the hows, whys, and why-nots.
HTML E-mail: The Basics
HTML (HyperText Markup Language) is, of course, the format used for Web pages. It was invented in 1990 by Tim Berners-Lee, the creator of the Web. E-mail had existed for over a decade before that, so HTML e-mail is obviously a latecomer. However, it became possible to use HTML for the main body of an e-mail message once MIME headers were introduced. These headers (which I discuss more elsewhere) specify what data format is being used, so the receiving program knows whether the message is in plain text or HTML. Starting in the late 1990s, mail programs began to support the sending and receiving of HTML messages, using the MIME type text/html.
Obviously, when the first HTML-format mail started going out, it faced a problem: most users at the time were still using programs that didn't support the reading of HTML. This was resolved by using a multipart message, with both plain-text and HTML versions. The content type header of the message as a whole is multipart/alternative, and the plain-text and HTML versions are carried as separate parts within it.
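For the curious, here is a minimal sketch of how a two-version message of this kind can be put together with Python's standard email package. The addresses, subject, and body text are made-up placeholders, and this is only one way of doing it, not a description of what any particular mail program does internally.

```python
from email.message import EmailMessage


def build_alternative_message(plain_text: str, html_text: str) -> EmailMessage:
    """Return a message carrying the same content as plain text and as HTML."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "recipient@example.com"     # placeholder address
    msg["Subject"] = "Plain text and HTML together"
    # The first call creates an ordinary text/plain body.
    msg.set_content(plain_text)
    # Adding an alternative turns the message into multipart/alternative,
    # with the plain-text part first and the HTML part after it.
    msg.add_alternative(html_text, subtype="html")
    return msg


message = build_alternative_message(
    "Hello, world.\n",
    "<html><body><p>Hello, <b>world</b>.</p></body></html>\n",
)
print(message.get_content_type())
print([part.get_content_type() for part in message.iter_parts()])
```

A program that doesn't understand HTML can show the first part and ignore the second, which is the whole point of the arrangement.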
Mail programs that support the sending of HTML e-mail generally have a configuration setting to determine whether to send outbound messages in plain text or HTML form. Many default to HTML form (as discussed and criticized below) but can be configured to send only plain text at the user's option. A few, however, send in HTML form and are difficult or impossible to configure any other way. Some smarter programs decide which format to use based on what is needed for the current message; their message editor has the ability to add special formatting (bold, italics, headers, etc.) and hyperlinks, and uses HTML format if any of these features are used, but plain text if they are not.
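The "smarter program" behavior described above can be sketched roughly as follows. The Run class and the function names are hypothetical, invented purely for illustration; a real message editor would track formatting in its own way.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from html import escape
from typing import List, Optional


@dataclass
class Run:
    """A stretch of text plus whatever formatting the (hypothetical) editor applied."""
    text: str
    bold: bool = False
    italic: bool = False
    link: Optional[str] = None


def needs_html(runs: List[Run]) -> bool:
    # HTML is only worth sending if some formatting feature was actually used.
    return any(run.bold or run.italic or run.link for run in runs)


def render_plain(runs: List[Run]) -> str:
    return "".join(run.text for run in runs)


def render_html(runs: List[Run]) -> str:
    pieces = []
    for run in runs:
        piece = escape(run.text)
        if run.bold:
            piece = f"<b>{piece}</b>"
        if run.italic:
            piece = f"<i>{piece}</i>"
        if run.link:
            piece = f'<a href="{escape(run.link)}">{piece}</a>'
        pieces.append(piece)
    return "<html><body><p>" + "".join(pieces) + "</p></body></html>\n"


def compose(runs: List[Run]) -> EmailMessage:
    msg = EmailMessage()
    msg.set_content(render_plain(runs))
    if needs_html(runs):
        # Formatting was used, so add an HTML version alongside the text.
        msg.add_alternative(render_html(runs), subtype="html")
    return msg


print(compose([Run("Just a quick note.\n")]).get_content_type())
print(compose([Run("Read "), Run("this", bold=True)]).get_content_type())
```

A message with no formatting goes out as ordinary text/plain; only the one with a bold word picks up an HTML version.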
What is HTML e-mail good for?
Some will reply "Absolutely nothing!" The next section below gives some reasons for this position. However, HTML e-mail wouldn't have become as popular as it now is if it had no advantages at all. It can have a useful purpose. A long message with a complex structure can be more readable and understandable if there are headers, emphasized passages, italicized citations, bulleted lists, and other structural elements made possible by HTML. True hyperlinks in HTML messages may work better than inserted URLs in a plain text message (which might get broken in the middle if they are too long to fit on a line). Charts, graphs, and illustrations added via inline images may be an essential part of the information content of an article or report. Even the more exotic things one can do in HTML, such as the embedding of sound or video data, can have their uses; an electronic greeting card just wouldn't be the same in plain text.
Then why do so many people hate it?
Unfortunately, for every message that uses HTML effectively, there are hundreds that use it in a useless or counterproductive way. While the use of well-structured, valid HTML can enhance the readability and understandability of a message, few e-mail writers have any interest in taking the time to do this; e-mail is generally a medium of quick comments tossed off without much effort. Usually, the writer will just type in some text and hit "Send", without any attempt at special formatting or structural elements such as headers. If the writer's program defaults to sending mail in HTML form, the resulting message will just consist of plain text with some pointless HTML tags wrapped around it. Often, such messages will actually be less readable than normal non-HTML text; the reader's mail program will be configured to display plain text in a sensible font, while HTML e-mail contains font tags that try to force the display into a font face, size, or color that is harder to make out. For this reason, many people who use mail programs that give them the option to see the plain or fancy versions of a multipart/alternative message opt to see the plain version.
There is one category of e-mail senders that actually does take the time and effort to carefully craft an HTML message that takes advantage of the strengths of this medium -- but it's likely you don't want to see the results of their work. These are the advertisers and marketers who clog your inbox with spam promoting the junk they're selling. Just as TV commercials are among the most slickly and expensively produced things on the air, and junk paper mail is much slicker and more colorful than ordinary personal letters, junk e-mail makes much more use of whatever fancy formatting can be wrung out of today's mail reader programs than any other sort of e-mail does. This, in fact, is probably a major factor driving the development of "enhanced" e-mail, and the reason vendors like Microsoft turn HTML e-mail on by default: the better for advertisers to make their pitches more intrusive and annoying. After all, MS and their big-business friends have their own marketing mail they want to send you if they can con you into "opting-in". That other marketers with fewer scruples follow by deluging everybody (whether opted in, out, or none-of-the-above) with tons of HTML-formatted pitches for herbal remedies, porn, gambling, and hot investments isn't their problem.
And, also, a multipart text-and-HTML message is likely to be at least three times the size of the same message as plain text; after all, it includes the plain text version, plus an HTML version that repeats all the same text plus a whole mess of code like this:
Hence, HTML messages are wasteful of bandwidth and disk space. If they used clean, logical, valid HTML, they'd be nowhere near as wasteful, but in practice many mail programs generate incredibly messy and standards-noncompliant code. And in some cases, if you turn on HTML mail, even the alternative plain text version that accompanies it is malformatted; several programs screw up the line length of messages when HTML is enabled.
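If you want to see the size difference for yourself, a rough sketch like the following compares the same short text sent plain and sent as text plus HTML. The HTML wrapper here is an invented stand-in for the sort of markup mail programs generate, so the exact numbers mean little; only the direction of the difference matters.

```python
from email.message import EmailMessage

TEXT = "Lunch at noon on Friday?\n"
# Hypothetical example of program-generated markup around the same sentence.
HTML = (
    "<html><head><title>Message</title></head><body>"
    '<div><font face="Arial" size="2" color="#000000">'
    "Lunch at noon on Friday?</font></div></body></html>\n"
)

plain_only = EmailMessage()
plain_only.set_content(TEXT)

text_and_html = EmailMessage()
text_and_html.set_content(TEXT)
text_and_html.add_alternative(HTML, subtype="html")

# Serialized sizes in bytes, including MIME headers and part boundaries.
plain_size = len(plain_only.as_bytes())
combined_size = len(text_and_html.as_bytes())
print(plain_size, combined_size)
```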
But on the other hand...
...there are online newsletters that go out to willing subscribers (in some cases they even pay to subscribe!), some of which use carefully-crafted HTML to present useful things like headers, emphasis, and illustrations. It's just like paper mail: you might subscribe to some magazines that come out on slick paper with fancy layouts, like junk mail, but you want to receive them. So HTML e-mail isn't always evil. Still, if you're publishing an e-mail newsletter, you should give your recipients a choice of whether to get it in text or HTML form; some may prefer plain text or have a mail program that doesn't deal well with HTML. And for your normal non-newsletter correspondence, stick to plain text (configuring your mail program away from defaulting to HTML if you're using a program that does this) unless you actually use the enhancements of HTML for something that helps your message (putting the whole thing in a cutesy script-style font or with a background image that looks like notepaper probably doesn't qualify).
There are two ways to include images in HTML e-mail. One way is to include the images as file attachments associated
with the HTML message (to give some more technical detail, this calls for the message to have content type
Whew... quite a bit of technical stuff, but fortunately you don't generally have to know it unless you're creating a program or script to generate this sort of mail (something I've actually done myself)... as an end user, you probably just have to drag the image into the message you're composing and the program does it all for you... hopefully correctly (though you never know, especially when it's a program from Microsoft).
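For anyone who does need to write such a script, here is a minimal sketch in Python. It assumes the usual standards-based arrangement, in which the HTML and its image travel together in a multipart/related part and the HTML points at the image with a cid: URL matching the image's Content-ID header. The image bytes, domain, and text are placeholders, and this is not necessarily how any given mail program builds its messages.

```python
from email.message import EmailMessage
from email.utils import make_msgid

# Placeholder bytes standing in for a real image file; only the PNG
# signature is present, so this is not a displayable picture.
IMAGE_BYTES = b"\x89PNG\r\n\x1a\n"

report = EmailMessage()
report["Subject"] = "Report with chart"
report.set_content("The chart is included as an image.\n")

# make_msgid() returns an ID wrapped in angle brackets; the cid: URL in the
# HTML uses the same ID without the brackets.
image_cid = make_msgid(domain="example.com")
html = (
    "<html><body><p>Here is the chart:</p>"
    '<img src="cid:{cid}" alt="chart"></body></html>\n'
).format(cid=image_cid[1:-1])
report.add_alternative(html, subtype="html")

# Attaching the image to the HTML part converts that part into
# multipart/related, holding the HTML followed by the image.
html_part = report.get_payload()[1]
html_part.add_related(IMAGE_BYTES, maintype="image", subtype="png", cid=image_cid)

print(report.get_content_type())
print(html_part.get_content_type())
print([part.get_content_type() for part in html_part.iter_parts()])
```

The outer message is still multipart/alternative, so a reader that prefers plain text never has to touch the image at all.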
The other way, sometimes termed "Lazy HTML", is not to attach the images to the message, but instead include references
to images on the Web with normal
As you can see, there are arguments to be made for both approaches, but on the whole, attached images usually work better than remote ones.
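To tell which approach a given message uses, one can look at where its image references point. The following is a rough sketch using Python's standard HTML parser; the class and function names and the sample markup are invented for illustration, and it only recognizes cid:, http:, and https: sources.

```python
from html.parser import HTMLParser


class ImageSourceCollector(HTMLParser):
    """Sort <img> sources into attached (cid:) and remote (http/https)."""

    def __init__(self):
        super().__init__()
        self.attached = []
        self.remote = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        lowered = src.lower()
        if lowered.startswith("cid:"):
            self.attached.append(src)
        elif lowered.startswith(("http://", "https://")):
            self.remote.append(src)


def classify_images(html_text):
    collector = ImageSourceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.attached, collector.remote


sample_html = (
    '<html><body><img src="cid:chart@example.com">'
    '<img src="http://www.example.com/logo.gif" /></body></html>'
)
print(classify_images(sample_html))
```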
Again, as usual, Microsoft mail clients have their own nonstandard ways of doing things, providing yet another way
images get attached: the "outbind:" URI scheme. I haven't been able to find any actual documentation of this
(apparently unregistered) scheme, but it seems to be prefixed to a URI of an image on the Web, like
Single-Part HTML-Only Messages
There are a few mail programs (Hotmail seems to be the main offender) that send HTML mail as a single part, not a multipart message with both text and HTML versions. Their creators probably justified this on the grounds that hardly any mail program these days doesn't support HTML, so there's no need to waste space attaching a text version too. However, doing this is a bad idea for a number of reasons:
Thus, you should avoid this format. If your mail program only sends HTML mail this way, it's all the more reason to switch to plain text.
Unfortunately, some mail programs that send multi-part messages with a plain-text version along with an HTML version do the plain-text one badly, and you never notice if your own mail program shows you only the HTML version while viewing messages. Sometimes, the plain-text message has no clear separation between quoted material and responses, if this distinction in the HTML version was made through things like colors and fonts that go away when the HTML tags are stripped. Other bizarre things sometimes show up in the plain-text version, like the word "Message" being added awkwardly at the beginning of the text because that was the TITLE element of the HTML version and the part of the mail program that creates the text version stupidly grabs it as part of the text. But, even worse, there are some messages (usually part of bulk mailings, but this doesn't mean it's just spam; it happens in legitimate bulk mailings such as subscribed-to newsletters) that have a completely empty plain-text version, so that if your mail program is configured to show plain text in preference to HTML, you see nothing at all. This is apparently the result of a program that's set up to include both formats, but requires the sender to set up the contents of each version separately (not a bad idea for bulk mailings, as it allows the sender to create well-formatted versions for each instead of having the text version created automatically, and often badly, from the HTML version), but the sender failed to supply any plain text, so that part ended up empty. If you're going to do that, you shouldn't include a plain text version at all. Some mail programs can cope with the lack of a plain text version better than an empty one; when you choose to display plain text in preference to HTML, it still displays HTML if that's all there is, but displays a plain text version (even an empty one) instead if present.
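A mail reader could sidestep the empty-text problem by treating a blank plain-text part the same as a missing one. Here is a minimal sketch of that idea in Python; the function name and the sample messages are made up, and I'm not claiming any existing program works this way.

```python
from email.message import EmailMessage


def choose_body(msg: EmailMessage):
    """Prefer plain text, but fall back to HTML if the text part is blank."""
    plain = msg.get_body(preferencelist=("plain",))
    if plain is not None and plain.get_content().strip():
        return plain
    html = msg.get_body(preferencelist=("html",))
    # If there is no HTML either, return whatever plain part exists (or None).
    return html if html is not None else plain


# A bulk mailing whose sender never filled in the plain-text version.
newsletter = EmailMessage()
newsletter.set_content("\n")
newsletter.add_alternative(
    "<html><body><h1>May Newsletter</h1></body></html>\n", subtype="html"
)

# A well-behaved message with real text in both versions.
note = EmailMessage()
note.set_content("Meeting moved to 3pm.\n")
note.add_alternative("<html><body><p>Meeting moved to 3pm.</p></body></html>\n", subtype="html")

print(choose_body(newsletter).get_content_type())
print(choose_body(note).get_content_type())
```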
Recipients who Complain about Plain Text?
I thought I'd heard everything, pro and con, about text vs. HTML e-mail, but in this forum somebody actually said that recipients of his business-related mails got "disgusted" by plain-text e-mails he sent. That's an experience I've never had. He later clarified that he had recipients who experienced difficulties when they replied to plain-text messages; their HTML-format signature blocks came out as a mess of ugly code. This sounds to me like either a broken mail reader or a misconfigured one; perhaps it has the ability to create and specify separate signature blocks for plain text and HTML messages but the user foolishly configured the HTML version of his signature for plain-text use. Anyway, that shouldn't be the sender's problem.
Email Rejection: An Amusing Example
As I've noted, some recipients won't accept HTML-formatted e-mail or other mail with non-text attachments, because it triggers filters designed to keep out spam or viruses. Among those who bounce non-text messages are some companies' technical support and customer service departments, who will send back messages with attachments and tell you to resend them as plain text. One amusing example of such is Bonzi, which has a free download that supposedly "enhances" your PC experience (I don't recommend you install it; it's reputed to be annoying adware, and maybe "spyware" too; once it gets into your system, it won't go away, and might also be sending personal info of yours to its manufacturer). Anyway, the automated response they send to anybody who e-mails them in non-text form (which I found out because apparently some virus e-mailed itself to them forging my address as the "From" line, triggering this response to me even though I had never e-mailed them in my life) includes this passage:
BonziMAIL messages include attachments and are not accepted by our mail system. Please open your regular e-mail program and write us a message using only plain text.
So, apparently, their own e-mail program, that's included as part of the software you download from them, produces mail of a format that they, themselves, reject!
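A filter of the general kind described here, one that accepts only single-part plain-text mail, might be sketched like this in Python. I have no idea how Bonzi's actual system decides what to reject; the function name and the sample messages are purely illustrative.

```python
from email.message import EmailMessage


def is_plain_text_only(msg: EmailMessage) -> bool:
    """True only for a single-part text/plain message with no attachments."""
    return not msg.is_multipart() and msg.get_content_type() == "text/plain"


plain_mail = EmailMessage()
plain_mail.set_content("My download won't install.\n")

html_mail = EmailMessage()
html_mail.set_content("My download won't install.\n")
html_mail.add_alternative(
    "<html><body><p>My download won't install.</p></body></html>\n", subtype="html"
)

attachment_mail = EmailMessage()
attachment_mail.set_content("Log file attached.\n")
attachment_mail.add_attachment(
    b"placeholder log data",
    maintype="application",
    subtype="octet-stream",
    filename="install.log",
)

for mail in (plain_mail, html_mail, attachment_mail):
    print(mail.get_content_type(), is_plain_text_only(mail))
```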
And Now for Something Even More Amusing...
Even worse than ordinary HTML e-mail are the heaps of somewhat HTML-like garbage spewed out by some mail programs, especially the ones from Microsoft. A particularly bizarre part of this is the peculiar, pointless series of proprietary conditional comments such as
And, speaking of Microsoft's mail programs, just when you think they couldn't get any worse, they've "downgraded" the HTML support in the newest version of Outlook (in 2007) so that it renders things less compatibly with normal browsers (even Microsoft's own one); instead, it uses the crippled MS Word HTML renderer. This has the effect of breaking a lot of fancy-formatted HTML messages for users of that crappy program, and is one more reason for senders to stick to plain text.
Next: What do you call a mail reader that doesn't handle real HTML, but tries to render a limited subset of HTML tags -- even in plain text messages? AOL calls it "HTML Lite", but "Half-Assed HTML" is a better name for it.
This page was first created 26 May 2003, and was last modified 25 Aug 2009.