Email Theory

Categories: Linux


Introduction

This article describes how email works, and what various email-related expressions mean. The intention is to make documentation and articles related to email configuration easier to understand. In particular, the info here should help anyone setting up their own personal email server.

History of Email

Email is one of the oldest networking technologies. In fact, it was one of the driving reasons why the internet was invented - for researchers at various universities to exchange messages with each other.

Email predates much of the internet as we know it. In particular, it was designed at a time when many computers (even those at universities) had dial-up connections; the mailserver at university A would periodically phone the mailserver at university B and then exchange any queued emails. Repeat for each “direct peer”. Email addressed to a system which is not a “direct peer” would instead be delegated to one of those direct peers for forwarding (multi-hop delivery).

Email also predated all the problems that now plague the internet - spammers, those who want to read our email (eg to find credit-card numbers), and those who want to modify our email (eg to attach trojans to valid emails).

The combination of old design, old features, and new security problems means that email servers are unfortunately some of the most complicated beasts around. There are some very good people working on the major mailserver programs, and the various programs can usually be made to work in a safe and sensible manner. However the configuration process is often very cryptic and old-fashioned, and the manuals are full of advice on how to do things that are not relevant for most people.

Mailservers are also heavily used by large companies and ISPs, so often advice for the “small user” case gets lost.

Email Server Infrastructure

Here is the rough structure that is necessary to send and receive email. I’ll then discuss why each part is there, how the parts communicate, and what alternatives exist.

This diagram assumes a common configuration consisting of Postfix (for core email send/receive) + Dovecot (for POP/IMAP access to mailboxes) + Roundcube (for html-based access to mailboxes). However the description is sufficiently high-level that it should apply to most alternatives. The spamassassin application (or an equivalent) is also often used, but is not shown in this diagram.

An Example Message Flow

To put the diagram above into context, let’s look at how an email gets sent and received. For just about every step in this process there are options and quirks which are discussed in more detail later.

Outgoing Mail Example

On a laptop, an email client application (eg Thunderbird) is used to write an email. When the “send” button is clicked:

  • the client application connects to the mailserver host on port 587 (aka “submission port”).
  • the server accepts the network connection, and sends a “greeting” in the SMTP protocol
  • the client sends an SMTP EHLO command stating its “name” (which is not very important)
  • the client sends an SMTP STARTTLS command, and the client/server then exchange a few datapackets to set up encryption. After this is completed, all other data on the socket is unreadable to snoopers.
  • the client sends an SMTP AUTH command, providing a username and credentials (eg password). There are various kinds of authentication possible here.
  • the client sends MAIL, RCPT and DATA commands to provide the “return address”, “destination address” and “email body” information respectively.
  • the client closes its socket.
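The steps above can be sketched with Python's standard smtplib module. This is only a minimal illustration, not how any particular mail client is implemented; the `submit` and `build_message` helpers, the `smtp_factory` parameter and all hostnames and account names are assumptions made for the example.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create a simple text email (helper invented for this sketch)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(msg, host, username, password, port=587, smtp_factory=smtplib.SMTP):
    """Submit one message via the submission port, mirroring the steps above."""
    client = smtp_factory(host, port)     # connect; the server sends its greeting
    try:
        client.ehlo()                     # EHLO: announce the client's name
        client.starttls()                 # STARTTLS: switch the socket to TLS
        client.ehlo()                     # repeat EHLO over the encrypted channel
        client.login(username, password)  # AUTH
        client.send_message(msg)          # MAIL FROM, RCPT TO and DATA
    finally:
        client.quit()                     # QUIT, after which the socket is closed
```

A call might look like `submit(build_message("me@home.example", "coyote@acme.example", "Hello", "Hi there"), "mail.home.example", "me", "secret")`. The `smtp_factory` parameter exists only so the flow can be exercised without a real server.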

The mailserver computes a DKIM signature for the email (using a local private key), adds it to the email headers, and puts the received email on an “outbound queue”. DKIM is optional, but recommended.

At some later time the email server then processes its outbound queue. For each email it:

  • extracts the destination address(es) of the email, eg “coyote@acme.example”
  • performs a DNS lookup of the MX record for the domain-part (“acme.example”). This returns the name of the host for that address.
  • performs another DNS lookup of the A record for the hostname to get an IP address.
  • opens a socket to port 25 (the “smtp” port) on that server, and passes the emails over to the destination email server using the SMTP protocol.
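The lookup chain above (address, then MX record, then A record, then port 25) can be sketched as follows. Python's standard library has no MX lookup, so `lookup_mx` is a hypothetical callable standing in for a real DNS resolver; `domain_of` and `resolve_destination` are likewise names invented for this sketch.

```python
import socket


def domain_of(address):
    """Return the domain-part of an address such as 'coyote@acme.example'."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower()


def resolve_destination(address, lookup_mx, lookup_a=socket.gethostbyname):
    """Map a destination address to (mail host, IP address, port)."""
    domain = domain_of(address)
    mail_host = lookup_mx(domain)      # MX record: domain -> mailserver hostname
    ip_address = lookup_a(mail_host)   # A record: hostname -> IPv4 address
    return mail_host, ip_address, 25   # server-to-server SMTP uses port 25
```

A real mailserver also has to cope with several MX records per domain and with lookup failures; both are left out here.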

The receiving server will perform some security checks on incoming mail (sadly necessary in the modern network world) which are described below - they are the same checks our server will do on its incoming mail. Hopefully all checks will pass, and the email is stored for viewing by the intended recipient.

Note that the mailserver needs to know all valid (username, credential) pairs in order to verify the user during the AUTH step - but does not need to know anything else about the user (eg where their incoming email is stored). The topic of storing users and credentials is discussed further below, as there are many possibilities here.

The mailserver must also have an SSL certificate for the STARTTLS step.

Incoming Mail Example

On the mailserver, a service (daemon process) waits for incoming connections (“listens”) on port 25 (“smtp port”). When a connection arrives, a well-configured smtpd server typically:

  • Makes a DNS PTR-record check (reverse map from incoming IP address to DNS name) to detect people using desktops as email-servers (almost certainly spammers); if reverse lookup fails then the connection is rejected (see later for more info on PTR records).
  • Checks the connecting client (its IP address, or the hostname from the PTR lookup) against lists of “known spammers” (blocklist check); if the test fails then the connection is rejected.
  • Checks the destination domain of the emails passed in. In our case, we are not running a “mail relay” so any incoming email for anything but the domain we want to receive email for is immediately rejected. We should never receive email here for anything but the local domain, so any such sender is almost certainly a spammer hoping this is an “open relay”.
  • Verifies that the recipient (to-address) is a “known user”; if not found then the email is rejected as “undeliverable”.
  • Checks SPF, DKIM and DMARC records for the incoming mail - any suspect emails are dropped.
  • Stores the email on the “incoming queue” and returns success
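The order of these checks can be written down as a small decision function. This is a sketch of the logic only, not of how Postfix or any other server implements it; the `Policy` class, the `check_incoming` function and the reason strings are all invented for the example, and the DNS and SPF/DKIM/DMARC results are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    local_domain: str
    known_users: set
    blocklisted_hosts: set = field(default_factory=set)


def check_incoming(ptr_name, recipient, auth_ok, policy):
    """Return an (accepted, reason) pair for one incoming message.

    ptr_name is the result of the PTR lookup (None if it failed) and auth_ok
    is the combined outcome of the SPF/DKIM/DMARC checks.
    """
    if ptr_name is None:
        return False, "no PTR record for client address"
    if ptr_name in policy.blocklisted_hosts:
        return False, "client is on a blocklist"
    local_part, _, domain = recipient.rpartition("@")
    if domain.lower() != policy.local_domain:
        return False, "relaying denied"        # we are not an open relay
    if local_part not in policy.known_users:
        return False, "unknown user"           # reject rather than bounce later
    if not auth_ok:
        return False, "failed SPF/DKIM/DMARC"
    return True, "queued"
```

The cheap checks come first, so that obviously bad connections are refused before any message data is accepted.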

Later:

  • Mail is passed through an external spamassassin daemon, ie emails are passed to it and they are passed back - with modified headers that indicate the estimated probability that the email is spam.
  • Mail is passed to a “delivery agent” - in this case, something that communicates with the Dovecot lmtp daemon via the LMTP protocol (a variant of SMTP designed for local delivery on one host only).
  • Dovecot stores the email in a location that depends on the account-name and the desired storage-format.

Note that the smtpd server needs to know all valid users (so it can reject email with invalid to-addresses), but nothing more. The Dovecot lmtpd server needs additional information about users (such as where to store their email and in which format). However neither needs to know passwords for the users to handle incoming mail.

Viewing Mail

  • user clicks “load mail”/refresh/etc in their desktop client (or the client does it automatically on a timer)
  • desktop application connects to dovecot POP3 port, uses STARTTLS (the STLS command in POP3) to enable encryption, and then authenticates with a username/password.
  • desktop application then sends a “list all emails” and dovecot returns a list of message-ids, using the account’s local store of emails.
  • desktop application repeatedly issues “get {id}” (and optionally “delete {id}”) to get any messages it doesn’t already have locally.
  • desktop application then displays the messages to the user.

The client could alternatively use the IMAP protocol, ie talk to Dovecot on a different port and with a more powerful set of commands - but that doesn’t change the flow much.
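The POP3 flow above can be sketched with Python's standard poplib module. This is an illustration under assumptions rather than what any real client does: `open_pop3` and `fetch_new` are names invented here, and the sketch identifies messages by the unique ids returned by the UIDL command.

```python
import poplib


def open_pop3(host, port=110):
    """Open an unencrypted POP3 connection; fetch_new() then upgrades it."""
    return poplib.POP3(host, port)


def fetch_new(pop, username, password, seen_uids):
    """Download messages whose unique id is not in seen_uids.

    Returns a dict mapping unique id -> raw message bytes.
    """
    pop.stls()                                   # enable TLS encryption
    pop.user(username)
    pop.pass_(password)
    fetched = {}
    _resp, listing, _octets = pop.uidl()         # "list all emails"
    for entry in listing:
        number, uid = entry.decode().split(" ", 1)
        if uid in seen_uids:
            continue                             # already stored locally
        _resp, lines, _octets = pop.retr(int(number))   # "get {id}"
        fetched[uid] = b"\r\n".join(lines)
    pop.quit()
    return fetched
```

A client that wants the server copy removed would additionally call `pop.dele(number)` before `quit()`.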

A Web Interface

The process for writing/viewing emails via a web-interface is fairly similar to the desktop-client flow. A webserver on the mailserver includes a bunch of CGI pages for some application like “roundcube”. User requests for a URL execute the corresponding script on the server. For reading of emails, the scripts typically communicate with Dovecot via IMAP and then render HTML back to the user.

When the user has written an email using the web interface and clicks “send”, the server-side script that is triggered does not communicate with Dovecot but instead with the mailserver directly, via the “sendmail” application. The mailserver will recognise that the email is coming from the localhost, and applies quite different checks - it doesn’t apply RBL checks, and doesn’t limit the destination address to local addresses only. If the destination is local, then the email is delivered locally as with the normal incoming email flow, but in the more common case it just places the submitted mail on its outgoing mail queue.
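Handing a message to the local mailserver through the “sendmail” command can be sketched as below. Roundcube itself is not written in Python, so this only illustrates the mechanism; the path `/usr/sbin/sendmail` is a common location but an assumption here, and `send_via_sendmail` and its `run` parameter are invented for the example.

```python
import subprocess


def send_via_sendmail(msg, sendmail_path="/usr/sbin/sendmail", run=subprocess.run):
    """Pipe an email.message.EmailMessage to the local sendmail command."""
    # -t: take the recipients from the To/Cc/Bcc headers of the message
    # -i: do not treat a line containing only "." as the end of the input
    return run([sendmail_path, "-t", "-i"], input=msg.as_bytes(), check=True)
```

No network connection, TLS or AUTH step is involved: the mailserver trusts the submission because it comes from a process on the same host.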

Direct Delivery (the death of relayed mail) and Smart Hosts

Originally, mail was a multi-hop process.

Now that the internet connects any server with any other server, mail is usually directly delivered from the server that first accepted it from the client app to the server responsible for the destination address domain. The old multi-hop approach is therefore mostly dead - with a few exceptions. A mailserver which runs as a mail relay (ie acts as middle-man in a multi-hop delivery) is nowadays called a “smart host” (see later) - and is always (or should be) very carefully configured to relay only under specific conditions - eg when the source server has a specific IP address, or the destination domain for the email is a specific domain.

A mailserver configured to relay email without strict constraints is called an “open relay”, and will almost immediately be detected and used by spammers to relay their email. As soon as you register an MX record for your email domain, spammers will start probing your email server to see if it can be used as a relay. If it can, your server will soon be registered on an anti-spam blocklist, and it is time to buy a new domainname.

One exception to the “direct delivery” rule is that a large institution (company, university, etc) might have a mailserver-per-department which then forwards to a central mailserver. And in reverse, email may be delivered to the central mailserver and then forwarded to the appropriate per-department server.

Another exception is ISP-specific “smart hosts”. ISPs offer a mailserver which desktop applications can connect directly to, as in the flow described above. However a few of their customers might also be running their own mailservers. ISP firewalls often prevent their customers from connecting to port 25 on any server (see below). Some ISPs allow customers who pay for a static IP address to disable this rule, allowing customer mailservers to deliver “directly”, as described earlier. Other ISPs refuse to relax this rule for any customer, in which case the mailserver must be configured to relay mail via the ISP’s mailserver - which must then be configured to allow that.

While talking about “direct delivery”, why do email clients need a local mailserver at all?

Client email applications don’t typically deliver email directly to the target email server - they could (eg from my desktop direct to @acme.example rather than to my local email server), but:

  • the receiver won’t be able to verify me via (username, password) - and I don’t want to register with every server I might send an email to;
  • the receiver won’t be able to verify me via my network address as my desktop is using a dynamic address (ie is not in DNS);

And in the reverse direction, email servers cannot deliver email directly to desktops, because a desktop has a network address that can’t be derived in any way from the email address, and firewalls will block such connections anyway. Such systems are also often turned off for long periods.

Why a Separate Submission Port (587)?

Originally, client applications for submitting email would just connect to port 25 of some server running email software and submit their messages there. SMTP servers wouldn’t even require passwords - email was accepted from anyone, for anyone. Eventually, authentication was added - but on the same port, which was tricky because some connections to a mailserver were from desktops wanting to deliver “outbound” mail while others were from remote SMTP servers wanting to deliver “inbound” mail. The desktops (outbound mail) can authenticate via registered (username, password) pairs - but the inbound servers cannot as there are far too many of them. The solution was to have security-rules that applied different checks depending on whether email addresses indicated they were “outbound” (reject unless AUTH previously done) or “inbound” (reject unless DNS checks of PTR/A records pass). This is possible but complex.

And in addition, there were simply too many amateur system-admins out there running “open mail relays” on port 25, and too many spammers probing for them. Serious ISPs therefore started blocking outbound connections from their domestic (dynamic-ip) customers to port 25 - and sometimes even for customers with static IP addresses.

The answer was to separate the traffic: server-to-server should use port 25, and client-to-server should use port 587. The SMTP server rules become easier, and ISPs can continue to block port 25 while allowing 587 so that their customers who bring work laptops home can still contact their company mailserver submission port.

Blocklist lookup

DNS Blocklists (DNSBLs aka RBLs) are a nice way to filter out a lot of incoming spam.

A DNS lookup of the PTR record for the incoming IP address is done; this should return a hostname. If no PTR record is found at all, then the incoming connection is not from a “serious server” and the connection is dropped. A PTR record maps an IP address back to a hostname (ie a “reverse lookup”). Only the owner of an IP address range is permitted to register a PTR record for an address in their range. An ISP often registers no PTR record (or only a generic, auto-generated one) for the “dynamic IP addresses” it hands out to domestic customers, so this simple check blocks many spammers using domestic ISP accounts to send spam mail. Even more importantly, it also blocks many of those spammers who have taken over poorly-secured private desktops connected to domestic ISP accounts, and are using them as a mail botnet. For customers who have paid their ISP for a static IP address, the ISP might provide them with a way to publish a PTR record for that address - or might not.

Unfortunately, the list of all addresses and domains associated with spam is just too large to distribute directly; instead the providers of such blocklist information usually use a clever trick to allow email servers to know whether some client is in the list or not - they distribute the information via DNS, by publishing a special DNS entry for each listed client. For each blocklist the mailserver has been configured with, a DNS lookup of the A record for a name of the form “{client}.{bldomain}” is made, where {bldomain} is the domain of the blocklist-provider and {client} identifies the computer trying to send email to our mailserver: for the common IP-based lists this is the IP address with its octets reversed (eg 192.0.2.99 becomes “99.2.0.192”), while domain-based lists use the hostname (the result of the PTR lookup on the incoming network connection). If such a record exists then this is a known spammer, and the connection is dropped. In effect, the RBL managers are using DNS as a “distributed database” in which they publish the names of bad actors. The owner of a domain-name can publish any records they like as long as the names end in their domain-name. So “spamhaus.org” (one of the best-known RBLs) can publish “99.2.0.192.zen.spamhaus.org” to indicate that they consider the address 192.0.2.99 to be a spammer, or “some.bad.domain.dbl.spamhaus.org” to indicate the same for the domain “some.bad.domain”.
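Many widely used blocklists are keyed by IP address rather than by hostname: the octets of the address are reversed, the blocklist domain is appended, and an A-record lookup is made for the resulting name. A minimal sketch of that variant follows; the function names and the zone `dnsbl.example` are placeholders, and IPv6 is not handled.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blocklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Return True if the blocklist publishes an A record for this address."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: not listed. Other lookup failures also end up here,
        # so this sketch fails open when DNS is unavailable.
        return False
    return True
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example")` produces `99.2.0.192.dnsbl.example`.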

Blocklists can also be implemented at the spamassassin level, but doing it earlier is nice for performance. Actually, on BSD systems, OpenSMTPD works with “spamd” which then communicates with the “pf” firewall to ensure that repeat emails from a bad source are dropped really early in processing. AFAIK this isn’t available on Linux.

Blocklist providers generally provide their service for free (the info in DNS is effectively free). Big email consumers are expected to download a copy of the blocklist instead (for a fee) to lighten the load on DNS.

To reduce network traffic, DNS responses should be cached on your local system - ie the mailserver should be running a caching DNS server.

StartTLS and Certificates

Data passed between a desktop email client application and the “submission port” of an email server should be encrypted, to hide both login passwords and email contents. There are two ways to provide an encrypted connection for SMTP messages:

  • the mailserver listens on a port for incoming connections, and immediately starts setting up TLS encryption, or
  • the mailserver listens on a port for incoming connections, and initially starts exchanging SMTP messages in plain-text. When the client application sends a STARTTLS command, then the TLS encryption is set up and the remainder of the SMTP messages are encrypted.

Setting up TLS encryption usually requires that the server side (the machine being connected to) has an SSL Certificate which includes its identity (servername) and is signed by some other certificate which is signed by some other certificate etc until eventually it is signed by a “well known certificate” that the client application already has installed.

SSL Certificates can be purchased for a reasonable fee. Since 2016, it has also been possible to obtain an SSL certificate for free from the letsencrypt project.

SPF, DKIM and DMARC

SPF (Sender Policy Framework) allows the owner of an email-domain to indicate exactly which email-servers are permitted to send emails with a MAIL-FROM (aka “Return-Path”) address belonging to that domain. This is done by configuring a DNS record of type TXT with special contents. If a spammer then tries to send emails with fake MAIL-FROM data that references your email domain, any email-server which performs SPF checks will see that these emails are “fake” and can reject them. While this is useful, it does not prevent spammers from sending emails with your domain in the “From:” field that the recipient sees - DMARC (paired with either SPF or DKIM) is necessary for that.
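To make the “special contents” concrete: an SPF record is a TXT string such as `v=spf1 ip4:192.0.2.0/24 -all`, meaning that servers in 192.0.2.0/24 may send for the domain and everything else fails. The sketch below evaluates a deliberately simplified subset of SPF (only the ip4, ip6 and all mechanisms); the function name and the sample record are made up for illustration, and real SPF has further mechanisms that need extra DNS lookups.

```python
import ipaddress


def spf_check(record, client_ip):
    """Evaluate a simplified SPF record against the connecting IP address."""
    results = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"                          # not an SPF record at all
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"                        # the default qualifier is "pass"
        if term[0] in results:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":                      # matches every client
            return results[qualifier]
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return results[qualifier]
        # any other mechanism is ignored by this sketch
    return "neutral"                           # nothing matched
```

With the sample record, a client at 192.0.2.10 evaluates to "pass" and one at 198.51.100.7 to "fail".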

DKIM performs a similar task to SPF. You register a public-key for your email domain in DNS, and then as each email is sent via your email server, the server computes a checksum and signs it with the private part of that key, and adds that signature as an extra email header. Any email-server that receives a mail with a signed DKIM header can verify that the header is valid (was originally signed by your server); spammers do not have access to your private key, so cannot generate an email with a signed header for your domain. Unfortunately, like SPF, this alone does not prevent spammers from sending emails with your domain in the “From:” field and simply not including a DKIM header. To protect the “From:” field, DMARC is needed.

DMARC allows the owner of an email-domain to publish a DMARC record for that email domain in DNS. A DMARC-enabled mailserver extracts the domain-name of the “From:” header from each email, then checks whether a DMARC record exists for it. If one does, then the incoming email must either (a) pass the SPF check, with a MAIL-FROM domain that matches the “From:” domain, or (b) have a valid DKIM header (signed as described above) whose signing domain matches the “From:” domain. Or in short, either (DMARC+SPF) or (DMARC+DKIM) are needed to prevent spammers from using your domain in their “From:” headers. Note however that the “From:” header was not originally designed to be used this way, and a few valid usecases are not possible when DMARC is enabled for a domain.
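The either/or rule can be expressed in a few lines. This sketch assumes the SPF and DKIM results have already been computed, and it compares domains for exact equality; DMARC's default mode is more lenient and also accepts related subdomains, so treat this as the strictest reading. The function names are invented for the example; the `_dmarc.` prefix is where DMARC records are published.

```python
def dmarc_query_name(from_domain):
    """DNS name at which a domain's DMARC policy (a TXT record) is published."""
    return "_dmarc." + from_domain.lower()


def dmarc_passes(from_domain, spf_result, spf_domain, dkim_result, dkim_domain):
    """Return True if SPF or DKIM passed for the same domain as the From: header."""
    from_domain = from_domain.lower()
    spf_ok = spf_result == "pass" and spf_domain.lower() == from_domain
    dkim_ok = dkim_result == "pass" and dkim_domain.lower() == from_domain
    return spf_ok or dkim_ok
```

Note that a message can pass SPF for the spammer's own domain and still fail here, because that domain does not match the one in the “From:” header.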

DNS records : A, PTR, MX

DNS is a world-wide distributed database which provides information related to networking. The records in this database are keyed by IP-addresses or domain-names, and theoretically those organisations that have rights to update the DNS database only allow the owner of a domain or ip-address to publish such records (sadly not 100% effective, but nevertheless reasonably true).

  • NS records map a domain-name to a DNS server which can serve up the following records for that domain
  • A records map a hostname to an IPv4 address
  • AAAA records map a hostname to an IPv6 address
  • PTR records map an IPv4 or IPv6 address to a hostname (aka “reverse lookup”)
  • CNAME records map a hostname to another hostname (define “aliases”)
  • MX records map an email-domain to a hostname (ie point to the mailserver host for that email domain)
  • TXT records hold small amounts of arbitrary data associated with a hostname (eg a public key or an SPF declaration)

When setting up a mail-server, it is necessary to register appropriate DNS records. A new MX record is definitely needed, and other records may need to be registered if they do not already exist.

Note that there is no direct link between an MX (email) domain and a hostname domain (A or AAAA). It is common for email addresses of form someone@acme.example to be handled by a server which has name mail.acme.example or similar, but that is not a requirement. There is in fact a very common exception - an MX domain belonging to some company or person often points to a commercially-run email server, eg from Google. The owner of the MX domain must of course first have an account with the email-server provider (otherwise the emails will just be rejected), but this provides “personalized” email addresses without the effort of running an email server.

User Authentication and Information

As mentioned earlier, for outbound email the smtp server needs a (user,credentials) pair to authenticate the user (because sending to non-local addresses should not be permitted for unknown remote users, to prevent spam). No other information about the user is required.

For inbound email, the SMTP server really should reject email for invalid destinations, ie return an error to the remote SMTP server, rather than accept the email and then later fail to deliver it. Rejecting while the SMTP data-transfer is still in progress makes the invalid-destination the sending server’s problem, which is an improvement from the receiver’s point of view. When an email is accepted but then cannot be delivered, the only options are to either (a) silently discard the email, or (b) send an “undeliverable message” (aka bounce) email back to the sender address. Discarding email is not a good user-experience for a sender who has just mistyped an address. Sending bounce emails is unfortunately a bad idea in this era of spammers, as they often use fake sender-addresses pointing at real but innocent people. This means that an SMTP server handling inbound mail ideally needs a way to determine which email-addresses are valid, ie which users exist on the local system, before handing the email off to a Mail Delivery Agent. Note that the user does not need to be authenticated, ie no password-test is done, just a test for existence. The SMTP servers are therefore usually configured with some kind of “user database”.

When handling inbound email the Mail Delivery Agent (MDA) needs to know the user’s desired email storage format and location, quotas, etc. in order to be able to write the email to the appropriate location. The MDA must therefore also be configured with some kind of “user database”.

In addition, Dovecot needs (username, password, userinfo) in order to support POP3 and IMAP. A webmail interface (eg roundcube) also needs username/password info when a user “logs in” to it.

The primary “userinfo” is: directory to write into, and uid/gid values to assign to the file(s).

I have been careful to separate the concepts of (user,creds) and (user,info) because some tools such as Postfix are also careful to separate these concepts, and allow different sources of information to be configured for each.

There are many possible combinations/solutions here.

The traditional solution is to use /etc/passwd:

  • /etc/passwd can be used for userlist, and pam for credentials – ie require all users to be local
  • userinfo can be assumed to be uid/gid/homedir from /etc/passwd (directories relative to homedir for that user)
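On a Linux system the /etc/passwd variant can be sketched with Python's standard pwd module. The `Maildir` directory name under the home directory is an assumption (a common convention, but configurable), and `user_exists` and `userinfo` are names invented for this sketch.

```python
import os
import pwd


def user_exists(username):
    """Existence test only, as needed by the SMTP server for inbound mail."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        return False
    return True


def userinfo(username):
    """Return the fields a delivery agent needs; raises KeyError if unknown."""
    entry = pwd.getpwnam(username)
    return {
        "uid": entry.pw_uid,
        "gid": entry.pw_gid,
        # assumed layout: mail is stored relative to the user's home directory
        "maildir": os.path.join(entry.pw_dir, "Maildir"),
    }
```

Neither function touches passwords, which matches the separation between (user,creds) and (user,info) described above.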

A solution using similar files but not directly linked to linux accounts is:

  • one text file for (user,creds) and another for (user,info) which is shared by all components.

Another popular solution is an SQL database shared by all components. Unlike the textfiles, this makes updating of information (eg via Dovecot interface) easier to implement.

And of course there is LDAP.

Some SMTP servers can be configured to delegate username/creds and username/info lookups to an external process via SASL (a protocol originally invented for the Cyrus email server). Postfix supports this, and this can be used to unify the user creds and info with

Transfer Agents and Delivery Agents

Mailservers have had many security-holes over the years; they are exposed to data from the internet and must parse such untrusted data in order to process it. To grant such an application superuser rights in order to deliver mail is obviously a significant escalation in the dangers. Mailservers are therefore often implemented as a set of cooperating processes, each with just the minimum of system-access-rights to do its specific job.

The two primary components in a mailsystem are the Mail Transfer Agents (MTAs) and Mail Delivery Agents (MDAs). As a very brief summary:

  • MTAs accept email over a network and write it to a local queue
  • MDAs take email from a local queue and write it to a user-specific “mailbox”

Mailbox Formats

Over the many decades that email has existed, several ways to represent emails on disk have been developed. The primary ones are:

  • mbox: all email for a user is appended to a single file, with each message introduced by a “From ” separator line.
  • maildir: each email is stored as a separate file within a directory
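The difference between the two formats is easy to see with Python's standard mailbox module, which can write both. This is only a demonstration on throwaway paths; the helper names, file names and addresses are invented for the example.

```python
import mailbox
import os
from email.message import EmailMessage


def make_message(subject):
    """Create a small test message (helper invented for this sketch)."""
    msg = EmailMessage()
    msg["From"] = "roadrunner@acme.example"
    msg["To"] = "coyote@acme.example"
    msg["Subject"] = subject
    msg.set_content("beep beep\n")
    return msg


def deliver_both(base_dir, messages):
    """Store the same messages in an mbox file and in a maildir."""
    mbox_path = os.path.join(base_dir, "inbox.mbox")
    maildir_path = os.path.join(base_dir, "Maildir")
    mbox = mailbox.mbox(mbox_path)                          # one file for everything
    maildir = mailbox.Maildir(maildir_path, create=True)    # one file per message
    for msg in messages:
        mbox.add(msg)
        maildir.add(msg)
    mbox.flush()
    mbox.close()
    maildir.close()
    return mbox_path, maildir_path
```

After delivering two messages, the mbox is a single file containing both, while the maildir's `new` subdirectory holds two separate files.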

Undeliverable Mail

When a remote system submits an email and the SMTP server detects that there is no such local user, then an error can immediately be reported. However in some cases email is first accepted, and buffered and the user-check is only done later after the remote system has disconnected. In this case, the choices are to send a “bounce email” to the original sender address, or just ignore the email. Bounces were popular earlier, but with the prevalence of spam which uses faked “from” addresses this just causes more problems than it solves.

Further Reading