Dan's Mail Format Site: Headers: From / To / CC / BCC
Where did it come from, and where did it go? Where an e-mail message is concerned, those questions are answered
by the From, To, CC, and BCC headers.

About E-Mail Addresses

The contents of the

Let's start with the simplest thing that can go into an address header, a single e-mail address. (We'll get into the more advanced things, like multiple addresses and addresses with names, later.) You probably already know that an e-mail address commonly looks like this:

    jsmith@example.net

Mailbox Names

The thing to the left of the at sign (@) is a mailbox name, which is often (but not always) the same as the
username or userid the recipient uses to log in to the server (don't call it a "screen name"; that brands you
as an AOLer!). It could also be a mail alias that forwards to a different username, on the same server or a different
one. Only the server to which it's sent knows whether a particular name is valid and to whom it goes; there's no universal Internet standard of mailbox
naming (though there are a few names, like

Hostnames

To the right of the at sign is a domain name or hostname. In the "old days", when most e-mail users got their mail
by logging into a big multi-user timesharing computer with a "dumb terminal", then running a mail program on the
remote computer from the command line, the hostname usually named the specific computer they were logging into, and
might have several dots in it representing levels of subdomains, like vaxb.cs.example.edu. Nowadays,
however, people usually get their mail on their own personal computers using programs (either a mail program or a
browser connecting to a webmail interface) that connect to a mail server, and the addresses have a simple domain name
like example.net without a server hostname in it. This is accomplished by the DNS (Domain Name System) feature called "MX records" (for
"Mail eXchange"), allowing a domain name to be associated with a mail server that handles mail to addresses in that
domain, possibly a different server than is used for other Internet activity (e.g., Web sites) at that domain.
Really high-traffic domains, like

Abbreviated Addresses

People can sometimes get away with leaving out parts of an e-mail address for "local mail" within one system or network
(just as you can, in most places, dial a local phone call without including the area code). If the part of the destination
to the right of the at sign is exactly the same as the hostname of the server to which you're connecting to send the mail
(e.g., you're connecting to mail.example.net and the address is user@mail.example.net), it'll
usually work to leave out the at sign and the stuff to the right of it and just send to user. It might also
sometimes be possible to leave out parts at the right end of the domain name; if you're on a university network, sending
your mail through a server within the example.edu domain, you might be able to send mail to
joe@cs.example.edu by typing the address as joe@cs. These things were much more
common back in the days when people did their e-mailing on big multiuser mainframes, and many of their messages were
"local" to the same machine or another one in the same local area network. Now that most people connect to remote ISPs,
this is much less likely to be useful (except within a few big online services like AOL, where you can leave out the

As another form of abbreviation, many mail programs let you type a name or nickname into an address field in order to send to the corresponding person. This can be a dangerous feature if the program doesn't show you the actual address it's sending to and give you a chance to change it before actually sending; many people have unintentionally sent private things to the wrong person, or even to an entire mailing list, to their embarrassment. It's even possible for a mail program to get the mistaken idea that a full, correct e-mail address, like user@example.com, is actually an alias that "really means" some other address or set of addresses entirely, depending on what happens to be in your address book. Not very surprisingly to me, Microsoft programs are particularly prone to this sort of thing. So be very careful who you're sending to!

Nonetheless, this has no impact on mail header syntax, because the mail program must convert such "shortcuts" to the actual address to send to, and put this address in the headers, before transmitting the message.

Common Errors in Addressing

Unfortunately, Internet users these days aren't particularly good at typing an e-mail address correctly, whether it's one they're trying to send to or their own address, typed into a Web form or their e-mail program's configuration. In both my personal and work experience, I've had to deal many times with the problems caused by faulty e-mail entry, when customers and Web site users just can't get it right.

One particularly rampant error these days is to put

Another common error is to put spaces and punctuation in the middle of the address -- e.g., if your username is johnsmith, typing it as John Smith. Yes, that's how you type the name in plain-English documents, but as a mailbox name it is generally one word. (As I'll show later, there's actually a way to get spaces into the middle of perfectly legal e-mail addresses, but that takes some special syntax.)

Still another error is to leave out all or part of the hostname portion of the address. AOLers especially do this,
omitting the

Be sure to find out exactly what your e-mail address is (ask your ISP if you're not sure), and type it in correctly every time you need to provide it. And be sure you pay careful attention to anybody else's e-mail address you want to send to, in order to remember and type it correctly. You'll save yourself, and your recipients, much grief.

Address Validity Checking

Because of the above-mentioned problem of widespread errors in e-mail address entry, many sites make a variety of validity checks before accepting a user-supplied address on a form. Unfortunately, this too can cause problems if not done well. There are plenty of cases where perfectly valid addresses get rejected due to improperly implemented "validity checks". RFC 3696 covers this issue.
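
As a rough illustration of a check that avoids the over-strict rejections described above, here is a minimal sketch in Python. The function name `looks_like_address` and its rules are assumptions chosen for illustration, not something taken from RFC 3696 or from any particular site: it only asks for an at sign, a non-empty mailbox part, and a dotted domain made of letter/digit/hyphen labels, and it puts no limit on the length of the top level domain.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$")


def looks_like_address(address: str) -> bool:
    """Permissive sanity check; deliberately does NOT restrict TLD length."""
    # Split at the LAST at sign, since a quoted mailbox part may contain one.
    mailbox, at, domain = address.strip().rpartition("@")
    if not at or not mailbox or not domain:
        return False
    labels = domain.split(".")
    # Assumption for this sketch: require at least two labels (e.g. example.net).
    if len(labels) < 2:
        return False
    return all(_LABEL.match(label) for label in labels)


if __name__ == "__main__":
    for candidate in ("jsmith@example.net", "user@example.museum", "John Smith"):
        print(candidate, looks_like_address(candidate))
```

A check this loose will let some undeliverable addresses through, which is the intended trade-off here: it is meant to catch obvious typing slips without turning away valid addresses.
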
One type of valid address that's sometimes improperly rejected is an address using one of the
new top level domains that is four or more letters
long, like
I'm using a

Another site that wouldn't accept the address is the signup page for the "Anything Points" promotion at eBay, and my attempts to complain about that hit a bureaucratic maze of twisty little passages, all different. The customer-service Web form in that site yielded a canned response saying I had to go to PayPal's site and use their customer service form; that, in turn, yielded a canned response saying that I had to go to eBay and use their form; and that gave a canned response saying that they're having temporary server problems and I need to wait a while and try again. None of these responses gave the slightest indication that any clueful human being had ever actually read my complaint, and weeks later, their site still won't accept my address.

The lesson here is to be very careful when programming address validators, to ensure you don't end up unnecessarily turning away customers. This site has an example of the sort of "validation" you don't want to use; it rejects my .name address.

Baby, remember my name...

I mentioned that you can include a name as well as an address. It's done like this:

    "John Q. Smith" <jqsmith@example.net>

The name (with or without quotes around it) is followed by the address enclosed in angle brackets. The quotes are required if the name has anything other than letters, numbers, and a few other permitted characters.

I used to mistakenly believe (based on a misreading of the RFC) that, if there's a space in the name (as there usually is except in the case of pop stars who go "surnameless"), you must put quotes around the name. This is not the case; multi-word names are allowed without quotes, though if you include punctuation such as commas or periods, then the quotes are required; hence, in the above example, you need the quotes because of the period after the initial "Q". It's never invalid to include the quotes even if they're not required in a given case.
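
To see the quoting rule in action, here is a small sketch using Python's standard `email.utils` module (the choice of Python is mine; the addresses are the made-up ones used in this article). `formataddr` adds quotes only when the name contains characters that need them, and `parseaddr` splits a header value back into a name and an address.

```python
from email.utils import formataddr, parseaddr

# The period after "Q" forces quotes around the name.
with_period = formataddr(("John Q. Smith", "jqsmith@example.net"))

# A multi-word name with no special characters is left unquoted.
plain_name = formataddr(("John Smith", "jsmith@example.net"))

# parseaddr() recovers the (name, address) pair from a header value.
name, address = parseaddr('"John Q. Smith" <jqsmith@example.net>')

if __name__ == "__main__":
    print(with_period)
    print(plain_name)
    print(name, address)
```
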

Another form of name-and-address combination that's sometimes encountered looks like this:

    jqsmith@example.net (John Q. Smith)

This makes use of the feature in the standards that lets parenthesized comments be inserted into headers. Hence, "John Q. Smith" in the above line is a comment, which should properly be ignored by programs that process the message. Most mail programs, however, treat a string in this position as a person's name, to be displayed in the same manner as one in the more proper format of the previous example.
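
Python's standard library is one example of software that treats the parenthesized comment as a name. This short sketch (my own illustration, not something the standards require) parses both forms with `email.utils.parseaddr` and compares the results.

```python
from email.utils import parseaddr

# The "proper" form: name, then address in angle brackets.
angle_form = parseaddr('"John Q. Smith" <jqsmith@example.net>')

# The older form: bare address followed by a parenthesized comment.
comment_form = parseaddr("jqsmith@example.net (John Q. Smith)")

if __name__ == "__main__":
    print(angle_form)
    print(comment_form)
```
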

Usually, of course, the end user does not type any of these formats into their mail program when sending a message;
they rely on the program to generate it somehow. In the

Note: One thing that will put you at risk of having your mail program inflict nonstandard header lines on your messages is to attempt to include quotation marks within your name, like Jesse "The Body" Ventura. If inserted directly into the header, within double quotes, you'd get "Jesse "The Body" Ventura", which actually parses into two quoted strings, "Jesse " and " Ventura", with The Body sitting in the middle with uncertain purpose. The correct way to include quotes within quotes would be to put a backslash before each of the inner quotes, like "Jesse \"The Body\" Ventura". Well-behaved mail programs ought to insert the backslashes when sending messages with such names, and remove them when displaying the name to the recipient. However, you can't count on mail programs being well-behaved, so if you type in "naked" quotes, they may be sent as-is (violating the standards); and if you insert backslashes (by hand if necessary), this may cause some of your recipients to see the name with backslashes in it, which is rather ugly. It's best to avoid the double-quote character within names; use a single quote (') instead: Jesse 'The Body' Ventura. (And don't use so-called "smart quote" characters, discussed further in the article on character sets; they're not part of the standard ASCII character set.)

When displaying a message, various programs deal with names and addresses in different ways. Some of them (particularly those from Microsoft, which like to hide "technical details" from users) tend to show only the personal name, and not the associated address; this can cause problems (such as people sending messages without knowing who they're actually sending to, as with the address-book shortcuts noted above).
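
The backslash rule for quotes within quotes, described in the note above, can be sketched with Python's standard `email.utils` helpers. This is only an illustration of what a well-behaved program might do when sending and displaying such a name; the address `jventura@example.com` is made up.

```python
from email.utils import formataddr, quote, unquote

display_name = 'Jesse "The Body" Ventura'

# quote() backslash-escapes embedded backslashes and double quotes.
escaped = quote(display_name)
header_name = '"' + escaped + '"'

# unquote() strips the outer quotes and removes the escaping again,
# which is what a program should do before showing the name.
restored = unquote(header_name)

# formataddr() applies the same escaping when building a full address.
header_value = formataddr((display_name, "jventura@example.com"))

if __name__ == "__main__":
    print(header_name)
    print(restored)
    print(header_value)
```
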

Sometimes a forwarded message will have its headers mangled in this way -- I've sometimes received messages that have gone through one or more "hands" before reaching me, in which the name of the original sender is shown, but no address at all -- and the message tells me that I have to send some sort of reply to the original sender, something that I have no way of doing. Very annoying.

Perhaps to try to get around this problem, I've occasionally observed mail programs using a bizarre "double-addressed" form like this:

    "John Q. Smith <jqsmith@example.net>" <jqsmith@example.net>

Yahoo Groups did this for a while when you sent messages through its Web interface, though they gave it up eventually, apparently not wishing to keep on in one of those counterproductive "arms races" that periodically break out (the ongoing fight between webmasters cluelessly blocking browsers based on user-agent string sniffing, and browser makers making their user agent strings "spoof" more popular browsers, is a good example of such a race). If the battle were to be seriously joined, one would expect Microsoft to release a new mail program that parses out the second embedded address too, leaving only the name once again, and then Yahoo responding by finding a way to embed a third copy of the address.

I've also, on a few occasions, seen addresses with different e-mail addresses in the name section, like:

    "jjones@example.org" <ksmith@example.net>

This usually indicates either an intentional scam (they're trying to make you think the message came from

When you're having more than one...

If you need to put more than one e-mail address in a header, for instance when you wish to send a message to several people at once, the addresses are separated with commas, like this (with "simple" addresses):

    jsmith@example.net, mjones@example.org

Or this (with names and addresses):

    "John Smith" <jsmith@example.net>, "Mark Jones" <mjones@example.org>

The space after the comma is optional, but improves readability. (The whitespace characters, space and tab, are allowed before or after each address within a header.)
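
As a small sketch of building and reading a comma-separated address list, the following uses Python's standard `email.utils` module; the recipient list is just the pair of made-up addresses from the examples above.

```python
from email.utils import formataddr, getaddresses

recipients = [
    ("John Smith", "jsmith@example.net"),
    ("Mark Jones", "mjones@example.org"),
]

# Join the individual addresses with a comma (plus a space for readability).
header_value = ", ".join(formataddr(pair) for pair in recipients)

# getaddresses() takes a list of header values and splits them at the commas.
parsed = getaddresses([header_value])
simple = getaddresses(["jsmith@example.net, mjones@example.org"])

if __name__ == "__main__":
    print(header_value)
    print(parsed)
    print(simple)
```
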
Note that the comma (,) is the proper separator between multiple addresses, not the semicolon (;) or any
other character. There are some mail programs that display lists of addresses with semicolons, and accept this format
when taking input of addresses, but any standards-compliant program will actually use commas when transmitting the
address list in a message header. There are reports that some mail programs launched by Web browsers in response
to

The From Header

Now that we've given the syntax of addresses as used in headers, describing the specific headers is pretty simple.
The

Address Munging

Regrettably, there's a growing trend for normally-honest people to intentionally make their e-mail address unusable as written... usually not in the "From" field of regular e-mails, but in things like newsgroup and Web forum posts. This practice, referred to as "munging", is done to avoid spam "harvesters" that lift addresses from such places to add to their spamming lists. Unfortunately, it also creates a pain-in-the-butt for legitimate correspondents who wish to make contact.

Use of a false return address in either e-mail or newsgroups is against the Internet standards, and is basically doing the same thing that spammers themselves do when they falsify their return addresses. It's something I refuse to do myself; every Web page and newsgroup posting I make has a valid e-mail address on it, even though I know this means I'll be deluged with spam (most of which I manage to filter with my mail program's filtering rules). I also refuse to jump through hoops to de-munge addresses of others; if they won't provide a valid address in their site or posting, I won't even try to contact them by e-mail.

Sometimes I've found cases of people "munging" their address within a newsgroup or e-mail list message,
like "...If you have any opinions about this, email me at johndoeMYSHIRT@example.comMYPANTS -- remove
my shirt and pants first." -- but at the same time their messages were posted with their full, untrammeled email
address in the

However, the silliest "address munging" I've seen is in the text of articles in print publications -- that is, on paper. Does the author think that spammers will use OCR scanners to harvest their address? Actually, the likely cause is that articles are often written for both electronic and paper publication, and the author wasn't taking any chances of letting his un-munged address get online, even if this had the side effect of silliness in the paper version.

Addresses that Aren't Munged, but Look Like It

Occasionally, somebody will attempt to do something "cleverer" than address-munging; they'll aim to come up with
a perfectly valid address that looks invalid, so that spam harvesters ignore it. As I've said above, something
as simple as using an uncommon top level domain like
The plus sign, quotes, spaces, colon, and slashes are all part of the mailbox portion of his address, which is handled
by a server that is set up to allow users to put whatever they want after a plus sign and deliver it to the mailbox
named prior to that sign, but allow spam-filtering rules to operate based on the contents of the part after the plus.
He regularly changes his address (the "00" gets changed based on the current year) and spam-filters obsolete versions
of the address, so even if a spam harvester actually accepts his dubious-looking address, it stops working eventually.
(This "plus" trick actually works on many servers these days, since it's a standard feature of newer versions of the
Unix

It may, however, be hard to convince legitimate correspondents that such a weird address is actually real, or to get them to type it in correctly when they attempt to use it.

Another thing you might, on rare occasions, see in an e-mail address is a percent sign (%); this indicates a message intended to be forwarded to its destination by another server instead of being sent directly. This was more common in the "old days" when there were lots of computer networks that weren't directly on the Internet but could be reached via gateway systems; you might send to user%example.theothernet@gateway.example.net, where the actual non-Internet destination address is user@example.theothernet (in those days, many networks used pseudo-domain-style addresses ending in things like .bitnet or .uucp even though no such top level domains existed on the Internet) and the messages got there by way of gateway.example.net. Now that pretty much everything is directly connected to the Internet, you don't see percent-sign-based addresses very often, but it's still a legal syntactic form and might have specialized applications.

The stuff after the percent sign and before the at sign might be in a very un-Internet-like format as might be needed for a remote network; for instance, the Unix-based UUCP network traditionally used what they called "bang paths", consisting of a string of node names separated by exclamation points, like this!that!other!theend; the message would get passed in turn to each server on the list, in the order stated, finally getting to the destination at the end (maybe days later).

Why's There an Angle Bracket Before a "From"?

Perhaps you've noticed this in an e-mail message occasionally. You reach a point in the message where the word "From" appears at the beginning of a line, and it's inexplicably got an angle bracket (greater-than sign) before it, like this:

    The lyrics go like this:
    >From the halls of Montezuma to the shores of Tripoli...

Usually a greater-than sign at the start of a line signifies quoted material (and, if your mail program shows quotes in a different
color from other text, it might have even colored in that line above), but in this case the line with the angle bracket
is not quoted from another message; it's new material. So why is the angle bracket there? A weird typo? No, certain
(mostly Unix-based) mail server programs will insert the greater-than sign before a leading "From", to "protect" it from
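being interpreted as a header. But the header which it's in danger of being misinterpreted as is not the From header described above; it's the line beginning with "From " that the Unix mailbox format uses as a boundary marker between messages.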
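
Here is a minimal sketch of that kind of "protection", assuming the classic Unix mailbox convention in which a line beginning with "From " (the word followed by a space) marks the start of a new message. The function name `protect_from_lines` is made up for illustration; real mail servers differ in the details.

```python
def protect_from_lines(body: str) -> str:
    """Prefix '>' to body lines that could be mistaken for a message boundary."""
    protected = []
    for line in body.split("\n"):
        # Only a leading "From " (with the trailing space) is a problem;
        # "From" elsewhere in a line is left alone.
        if line.startswith("From "):
            line = ">" + line
        protected.append(line)
    return "\n".join(protected)


if __name__ == "__main__":
    message = "The lyrics go like this:\nFrom the halls of Montezuma to the shores of Tripoli..."
    print(protect_from_lines(message))
```
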

This was a rather poor design decision by whoever created the Unix mailbox format in the first place; when they chose a boundary marker, they should have picked something much less likely to turn up in the middle of a message than a common English word like "From". But we seem to be stuck with it now; even if you don't get your mail from a Unix server, it's likely to be "mangled" by some server it passes through during its route from sender to recipient anyway. It just goes to show that Microsoft and AOL don't have a monopoly on inflicting screwy and annoying "standards" on the computer world; even the Unix geeks have done some of it.

The To Header

The person or persons the message is intended for are addressed in the To header.

The CC Header

The

The BCC Header

The
In addition to privacy and security, the

Other Related Headers

There are a few more address-related headers:
Links
Next: Another important header is the one that tells you what the message is about, the Subject line.
This page was first created 29 Jun 2003, and was last modified 11 Jul 2010.