The Central Email Scanner
Introduction
The majority of email in Cambridge (including email entering,
leaving, and within the University) passes through a central relay
known as ppswitch. This relay runs software that scans
email to protect the University against spam and viruses. This page
contains details of the policies implemented by the scanner, including
which email is blocked and any modifications made to email that is passed
through. The Computing Service maintains other pages with
general information about junk email
and advice on protecting your computer
from viruses.
The email scanner is only a first line of defence. You should still run a virus scanner on your computer because there are ways of getting infected other than via email. You can get anti-virus software from the Computing Service. Users who receive a lot of email may also benefit from running their own spam filter, since a personalized filter can be more closely tuned to the kinds of email you receive. See FAQ: What is junk mail and what can I do about it? for pointers to further information.
If you have any problems with blocked or filtered email,
please contact the Computing Service help desk,
<help-desk@ucs.cam.ac.uk>
in the first instance.
Anti-spam measures
This section gives details of the way the email scanner identifies and blocks spam. If you just want to know how to set up a filter using the information provided by the scanner, see the page on spam filtering.
The email scanner uses a mixture of techniques to reduce the amount of spam that users have to deal with:
DNS blacklists
We use DNS blacklists to identify the IP addresses of computers on the Internet that we will not accept email from. There are a number of reasons that an IP address may be blacklisted: the computer may be misconfigured or compromised in such a way as to make it open to abuse by spammers; the address may be listed by its owner as one that should never send email; or the address may be allocated to an organization that is known to send spam. We also use DNS blacklists to identify email that appears to come from a domain owned by spammers.
The Computing Service only uses DNS blacklists that have a good
reputation for not gratuitously listing legitimate IP addresses or
domains. Even so, there is the occasional communication problem caused
by the blacklists, in which case you can contact
<postmaster@cam.ac.uk>
for assistance (no messages to that email address are blocked).
However, note that we do not configure ppswitch with
special exceptions to the DNS blacklists because that would be a
duplication of effort; in the case of an erroneous listing you must
deal with the DNS blacklist administrators via the web sites below.
At the moment ppswitch uses the
Spamhaus ZEN,
URIBL.com, and
SURBL.org blacklists to
block email. They are made available to us via
a national subscription supported by JANET.
Each blacklist is actually a combination of several lists that follow
complementary policies. See the blacklist web pages for details.
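As an illustration of how a DNS blacklist is consulted in general, the sketch below builds the DNS name that a client would look up for an IPv4 address: the octets are reversed and the blacklist's zone is appended. A listed address resolves; an unlisted one does not. This is a minimal sketch of the common DNSBL convention, not a description of how ppswitch is configured. The function name and the choice of zen.spamhaus.org as the default zone are assumptions made for the example, and no DNS query is actually sent.

```python
import ipaddress


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name a DNSBL client would look up for an IPv4 address.

    Hypothetical helper for illustration; the default zone is an assumption.
    """
    # Validate the input; malformed addresses raise ValueError.
    addr = ipaddress.IPv4Address(ip)
    # DNSBL convention: reverse the octets, then append the blacklist zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    print(dnsbl_query_name("192.0.2.99"))
```

Whether an address is listed would then be decided by resolving the resulting name with an ordinary DNS lookup.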
SpamAssassin
SpamAssassin is a program that performs a large number of tests on a message to decide if it is spam. These tests look at the content of the message, various technical details in its headers, and query databases on the Internet, including several other DNS blacklists. Many of the tests identify features of the message that are common in spam and some of them identify non-spam features. Each test has an associated score which is positive for spam and negative for non-spam. The scores of all the tests that succeed are added together to produce an aggregate score for the message as a whole. The scores are tuned so that legitimate messages score less than 5 and messages that score 5 or above are almost certainly spam.
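The scoring arithmetic described above can be sketched as follows. This is only an illustration of adding up the scores of the tests that matched and comparing the total with 5; it is not SpamAssassin's own code. The function names and the negative-scoring test name are hypothetical.

```python
SPAM_THRESHOLD = 5.0  # "messages that score 5 or above are almost certainly spam"


def aggregate_score(matched_tests):
    """Sum the scores of the tests that matched, to one decimal place."""
    return round(sum(score for _name, score in matched_tests), 1)


def looks_like_spam(matched_tests):
    """True if the aggregate score reaches the threshold of 5."""
    return aggregate_score(matched_tests) >= SPAM_THRESHOLD


# Two spam indicators and one (hypothetical) non-spam indicator.
EXAMPLE = [
    ("MIME_HTML_ONLY", 1.7),
    ("URIBL_BLACK", 2.0),
    ("HYPOTHETICAL_HAM_RULE", -1.0),
]

if __name__ == "__main__":
    print(aggregate_score(EXAMPLE), looks_like_spam(EXAMPLE))
```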
Although SpamAssassin is reasonably effective, it cannot identify
spam or legitimate email with 100% accuracy, so ppswitch
uses its results conservatively. Messages that score more than a safe
global threshold (currently set to 10) are rejected. Messages that
score less than the global threshold are delivered as usual, with
extra headers added to record the message's score.
These headers can be used to filter spam to a folder other than your
inbox. The way to set this up is described on
another page.
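The conservative policy above amounts to a single comparison against the global threshold. The sketch below is an illustration only: the function name is hypothetical, the recorded header is reduced to the bare score, and treating a score of exactly 10 as "deliver" is an assumption, since the text only covers scores above and below the threshold.

```python
GLOBAL_THRESHOLD = 10.0  # "currently set to 10"


def apply_global_threshold(score, headers):
    """Return (action, headers) for a message with the given spam score."""
    if score > GLOBAL_THRESHOLD:
        # High-scoring messages are rejected outright.
        return "reject", headers
    # Everything else is delivered, with the score recorded in a header.
    # Assumption: a score of exactly 10 falls into this branch.
    annotated = dict(headers)
    annotated["X-Cam-SpamDetails"] = f"score {score:.1f}"
    return "deliver", annotated


if __name__ == "__main__":
    print(apply_global_threshold(9.2, {"Subject": "hello"}))
    print(apply_global_threshold(12.3, {"Subject": "hello"}))
```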
The Computing Service only makes basic changes to the SpamAssassin configuration to tailor it to our local needs; for example, we have configured it to use the JANET blacklists. We do not make more extensive changes to the tests because that would be duplicating the work of the SpamAssassin developers and it would make it harder to keep the software up-to-date. For this reason we are not generally interested in individual messages that score unexpectedly high or low and are erroneously classified as spam or not, since there is little we can do about them.
For more information, see the SpamAssassin FAQ. If you receive a legitimate message that was classified as spam, perhaps you set your filtering threshold too low; see also the FAQ. If you receive some spam that was classified as legitimate email, perhaps you set your filtering threshold too high; see also the FAQ. Though it is a chore to have to go through your spam mailbox every few days to delete messages, SpamAssassin isn't perfect so you would risk losing real email if high-scoring messages were deleted unseen; see also the FAQ.
Anti-virus measures
The scanner blocks any email that contains malware according to ClamAV. In addition to the standard ClamAV malware databases, we also use third-party databases distributed by Sanesecurity. "Malware" includes viruses, worms, trojans, and phishing.
Some of the ClamAV databases also identify spam messages. If a message matches one of these tests, its SpamAssassin score is increased, and the message is only blocked if it scores high enough.
Unfortunately, sometimes new malware is sent to us before the ClamAV database is updated to detect it. If you receive a suspicious message you can submit it to the ClamAV maintainers for inclusion in the next update.
As a further level of protection, the scanner also blocks messages that contain potentially-dangerous attachments, based on the name and type of the file they contain. This extra protection helps when there are delays getting a virus database update from the vendor, and it reduces the ways in which malicious email can trick users.
If you want to send a message containing a dangerous file, you can
avoid the file type and name restrictions by putting it in a
zip file before sending. Note that the virus scanner can
look for viruses inside zip files and other types of
archive, but the file type and name restrictions only apply to the
outer wrapping of the attachment.
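The effect of applying the name restriction only to the outer wrapping can be sketched as below. The list of blocked extensions is invented for the example (the actual list used by the scanner is not given here), the function name is hypothetical, and the check on file type is not modelled at all.

```python
from pathlib import PurePosixPath

# Hypothetical list for illustration; not the scanner's real configuration.
BLOCKED_EXTENSIONS = {".exe", ".scr", ".pif"}


def attachment_blocked(filename):
    """Check only the attachment's own name, never names inside an archive."""
    suffix = PurePosixPath(filename.lower()).suffix
    return suffix in BLOCKED_EXTENSIONS


if __name__ == "__main__":
    print(attachment_blocked("setup.exe"))  # restricted name
    print(attachment_blocked("setup.zip"))  # outer wrapping is a zip file
```

In this sketch a file called setup.exe is blocked, but the same file sent inside setup.zip passes the name check, because only the name of the zip file is examined. The virus scan would still look inside the archive.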
Message annotations
The email scanner adds some headers to each message that passes through, containing information about what the scanner found. You can see them by viewing the full headers of the message. If a message is scanned more than once (e.g. because it has been re-sent) then it will have more than one set of scanner headers.
Each of the headers starts X-Cam-. The X-
indicates that this is a non-standard header. The -Cam-
is to distinguish the Cambridge scanner installation from other
email scanners that might work on the same message.
The X-Cam-ScannerInfo: header contains the URL of this
web page, so that people can find out the operational details of the
scanner without needing to know anything about Cambridge University or
the Computing Service.
The X-Cam-AntiVirus: header summarizes the findings of
the virus scanner. It usually says "no malware found" if the
message passed the virus filter OK. In some circumstances
(see below)
it may say "not scanned" if the message was not scanned
for viruses, or "found to be infected with ..." if a virus
was found but the message was not blocked.
The X-Cam-SpamDetails: header contains the results
from SpamAssassin. It will say "not scanned" if the message
comes from within the University (see below);
otherwise it will look something like this:
X-Cam-SpamDetails: score 9.2 from SpamAssassin-3.2.5-668092
 * 0.0 MISSING_DATE Missing Date: header
 * 1.6 TVD_RCVD_IP TVD_RCVD_IP
 * 1.1 HTML_EXTRA_CLOSE BODY: HTML contains far too many close tags
 * 0.0 HTML_MESSAGE BODY: HTML included in message
 * 0.4 HTML_FONT_SIZE_HUGE BODY: HTML font size is huge
 * 1.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
 * 2.0 URIBL_PH_SURBL Contains an URL listed in the PH SURBL blocklist
 *      [URIs: fyzikskool.com]
 * 2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
 *      [URIs: fyzikskool.com]
 * 0.1 RDNS_DYNAMIC Delivered to trusted network by host with
 *      dynamic-looking rDNS
 * 0.3 DYN_RDNS_SHORT_HELO_HTML Sent by dynamic rDNS, short HELO, and HTML
The text includes the overall score assigned to the message, the version of SpamAssassin, and the list of tests that the message matched with the score, codename, and explanation for each test.
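If you want to process this header in a script, a parser along the following lines may be a useful starting point. It is a sketch that assumes the header value is laid out with the summary first and one test per line, each introduced by an asterisk; the exact folding and spacing may differ in practice. The function name and the shortened sample value are for illustration only.

```python
import re

# Shortened sample of a header value, assuming one test per line.
SAMPLE_VALUE = """score 9.2 from SpamAssassin-3.2.5-668092
 * 0.0 MISSING_DATE Missing Date: header
 * 1.6 TVD_RCVD_IP TVD_RCVD_IP
 * 2.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
 *      [URIs: fyzikskool.com]"""

SUMMARY_RE = re.compile(r"score (-?\d+(?:\.\d+)?) from (\S+)")
TEST_RE = re.compile(r"^\s*\*\s+(-?\d+(?:\.\d+)?)\s+([A-Z0-9_]+)\s+(.*)$")


def parse_spam_details(value):
    """Return (score, version, tests), or None if there is no score."""
    summary = SUMMARY_RE.search(value)
    if summary is None:
        return None  # for example, the value "not scanned"
    tests = []
    for line in value.splitlines()[1:]:
        match = TEST_RE.match(line)
        if match:  # continuation lines such as "[URIs: ...]" are skipped
            tests.append(
                (float(match.group(1)), match.group(2), match.group(3).strip())
            )
    return float(summary.group(1)), summary.group(2), tests


if __name__ == "__main__":
    print(parse_spam_details(SAMPLE_VALUE))
```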
If the message has a spam score greater than one, a fourth header
is added. The X-Cam-SpamScore: header contains a sequence
of the letter "s" (for "spam") equal in length to
the message's score rounded down to a whole number, e.g.
sssssssss for a score of 9.2. This header is intended to
make it easy for users to configure their spam filters.
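The reason for this format is that a filter only needs a simple prefix match: a header value that starts with five "s" characters means a score of at least 5. The sketch below shows how the value is derived from the score and how a filter might test it. The function names and the example threshold of 5 are assumptions; choose a threshold that suits your own email.

```python
import math


def spam_score_header(score):
    """Return the X-Cam-SpamScore value, or None when no header is added."""
    if score <= 1:
        return None  # the header is only added for scores greater than one
    return "s" * math.floor(score)


def should_file_as_spam(header_value, threshold=5):
    """True if the header shows a score of at least `threshold`."""
    return header_value is not None and header_value.startswith("s" * threshold)


if __name__ == "__main__":
    value = spam_score_header(9.2)
    print(value, should_file_as_spam(value))
```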
Coverage of the scanner
The following explanation is true for the majority of email in the
University. However, various colleges and departments run their own
email systems independently of the Computing Service. Many of them
"hub" through ppswitch (i.e. send and receive
email via the central scanner), but some do not. The question of which
email will be scanned is therefore not simple to answer, which
reflects the organizational complexity of Cambridge University.
You can find out if a message has been scanned by viewing its
full headers. If the
X-Cam-AntiVirus: and X-Cam-SpamDetails:
fields appear and do not say "not scanned" then the message
has been scanned for viruses and spam, respectively. In some cases of
complicated forwarding or re-sending, a message may pass through the
scanner more than once, in which case more than one set of scanner
headers will appear.
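The same check can be made in a script. The sketch below uses Python's standard email module to read the two headers from a raw message and reports whether any copy of each header says something other than "not scanned". The function name and the sample message are made up for the example.

```python
from email import message_from_string

# Made-up sample message for illustration.
SAMPLE_MESSAGE = (
    "X-Cam-AntiVirus: no malware found\n"
    "X-Cam-SpamDetails: not scanned\n"
    "Subject: example\n"
    "\n"
    "Body text.\n"
)


def scan_status(raw_message):
    """Report whether the message was scanned for viruses and for spam."""
    msg = message_from_string(raw_message)
    status = {}
    for header, label in (
        ("X-Cam-AntiVirus", "viruses"),
        ("X-Cam-SpamDetails", "spam"),
    ):
        # A message may carry more than one set of scanner headers.
        values = msg.get_all(header, [])
        status[label] = any("not scanned" not in value for value in values)
    return status


if __name__ == "__main__":
    print(scan_status(SAMPLE_MESSAGE))
```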
Email usually comes into the University from outside via
mx.cam.ac.uk, which is one of ppswitch's
names. This email is subjected to all the tests described above:
DNS blacklists,
SpamAssassin,
and anti-virus scans.
Email from non-hubbed institutions within the University may also
arrive at the scanner via mx.cam.ac.uk, in which case it
is subject to the same checks as email from outside the
University.
Other email from within the University reaches the scanner via
smtp.hermes.cam.ac.uk or ppsw.cam.ac.uk.
This is only subject to the anti-virus checks.
There are a few exceptions to the above. Email that is sent to
special contact addresses (i.e. postmaster@ any local
domain) is scanned but never blocked. There is a similar exemption for
certain full-disclosure mailing lists, such as
BugTraq.
In some situations, such as email sent via
lists.cam.ac.uk, the scanner knows it has previously
analysed a message and does not re-scan it.
In a few cases email comes into the University via a non-hubbed
institution and is subsequently relayed via
ppsw.cam.ac.uk with insufficient spam filtering. There is
a special arrangement to scan this email with SpamAssassin, though it is not blocked.
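The general rule for which checks apply, before any of the exceptions above, can be summarized as a lookup on the name through which the email reached the scanner. This is a simplified sketch of the rules as described on this page, not the scanner's configuration; the function name and the labels for the checks are invented for the example, and none of the exceptions is modelled.

```python
ALL_CHECKS = ("dns-blacklists", "spamassassin", "anti-virus")

# General rule only; exceptions such as postmaster@ addresses are not modelled.
CHECKS_BY_ENTRY_POINT = {
    "mx.cam.ac.uk": ALL_CHECKS,
    "smtp.hermes.cam.ac.uk": ("anti-virus",),
    "ppsw.cam.ac.uk": ("anti-virus",),
}


def checks_for(entry_point):
    """Return the checks applied to email arriving via the given name."""
    # Host names are case-insensitive and may carry a trailing dot.
    return CHECKS_BY_ENTRY_POINT[entry_point.lower().rstrip(".")]


if __name__ == "__main__":
    for name in CHECKS_BY_ENTRY_POINT:
        print(name, checks_for(name))
```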
If you have any questions or suggestions about these arrangements,
please contact
<mail-support@ucs.cam.ac.uk>.
The title of this document is:
Central Email Scanner
URL:
http://www.cam.ac.uk/cs/email/scanner/

