Craig Box's journeys, stories and notes...


Posts Tagged ‘email’

Everything about Exim

Wednesday, November 29th, 2006

Courtesy of Daniel, one excellent Exim cheatsheet.

Exipick, and importing Apache certificates into IIS

Thursday, October 12th, 2006

Greig's cool find of the day:

Exim comes with a script called exipick, which lets you see just the parts of the mail queue that match a particular pattern. For example, we want to be notified of messages that are queued on a backup MX but aren't just bounces to fake addresses that will eventually time out:

exipick '!$local_error_message'

Which makes looking at mail queues much easier:

root@elston:~# exipick | wc -l
96
root@elston:~# exipick '!$local_error_message' | wc -l
0
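
To turn that check into the notification mentioned above, something like the following could run from cron on the backup MX. It is only a sketch: the queue_needs_alert helper and the threshold of 0 are illustrative choices, not part of Exim, and wc -l counts lines of exipick output rather than messages.

```shell
#!/bin/sh
# Hypothetical helper (not part of Exim): succeed when the count in $1
# is greater than the threshold in $2 (default 0).
queue_needs_alert() {
    [ "$1" -gt "${2:-0}" ]
}

# Only query the queue where exipick is installed.
if command -v exipick >/dev/null 2>&1; then
    # wc -l counts lines of exipick output, not messages; tr strips the
    # padding some wc implementations add.
    lines=$(exipick '!$local_error_message' | wc -l | tr -d ' ')
    if queue_needs_alert "$lines"; then
        echo "backup MX: $lines lines of non-bounce mail in the queue"
    fi
fi
```

Cron mails any output to the job's owner, so a plain echo is enough to raise the alarm, and an empty queue stays silent.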

My find is a little less interesting, and a little more "just googled it", but if you have certificates in Apache crt/key format and want to import them into IIS, you can bundle them into a PKCS#12 file with openssl:

/etc/ssl/site.net.nz# openssl pkcs12 -export -out site.p12 -inkey site.key -in site.crt

Read more at Michael's meanderings, including about the useful SSLDiag utility.
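
Before copying the .p12 file to the Windows box, it may be worth checking that it reads back. Here is a rough sketch of the round trip using a throwaway self-signed certificate; the file names, the site.example subject and the changeit password are all placeholders, not values from a real site.

```shell
#!/bin/sh
# Round-trip sketch using a throwaway certificate; every name and the
# password below are placeholders.
if command -v openssl >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1

    # Throwaway key and self-signed certificate standing in for
    # site.key and site.crt.
    openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
        -subj "/CN=site.example" -keyout site.key -out site.crt 2>/dev/null

    # The same export as above, with the password given non-interactively.
    openssl pkcs12 -export -out site.p12 -inkey site.key -in site.crt \
        -passout pass:changeit

    # Read the bundle back and print the subject of the certificate in it.
    p12_subject=$(openssl pkcs12 -in site.p12 -passin pass:changeit -nokeys \
        | openssl x509 -noout -subject)
    echo "$p12_subject"
fi
```

For a real certificate you would probably leave off -passout and -passin and let openssl prompt, since a password on the command line is visible to other users in the process list.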

Graphing and analysing SpamAssassin

Friday, July 21st, 2006

Here's something simple that I never thought of - props to my workmate Tom for coming up with this.

[Image: SpamAssassin scores plot]

This is a gnuplot graph of our SpamAssassin scores. The code used to generate it is on the bottom of the SpamAssassin notes page at the WLUG wiki.

The grouping around -100 is caused by the whitelist rule, which scores messages down 100 points (ensuring they are never marked as spam). Usefully, this rule doesn't count towards the threshold needed to be reached before a message is learnt as ham by the Bayesian categoriser.

We seem to have a reasonably normal distribution of good mail, between about -5 and +5, and a reasonably normal distribution of spam, between 10 and 60. This means our filter is working really well. What I took from this is that it was safe to raise the ham learning threshold (bayes_auto_learn_threshold_nonspam): it defaults to 0.1, but I've set ours to 1, as we have a lot of rules that push every message's score up by about the same amount.
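
To eyeball the same distribution without gnuplot, a quick text histogram will do. This sketch is not the code from the WLUG wiki; it assumes the scores have already been extracted one per line, and the score_histogram name and the 5-point bucket width are arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: read one SpamAssassin score per line on stdin and
# print "bucket count" pairs. Each bucket is 5 points wide and is
# labelled by its lower edge.
score_histogram() {
    awk '
        NF {
            # Floor the score to a multiple of 5. int() truncates toward
            # zero, so negative scores may need one extra step down.
            b = int($1 / 5) * 5
            if ($1 < b) b -= 5
            count[b]++
        }
        END { for (b in count) print b, count[b] }
    ' | sort -n
}

# Made-up sample scores, purely for illustration.
printf '%s\n' -100 -2.1 0.4 3.9 12.5 14 57.3 | score_histogram
```

With real data, a healthy filter should show one clump of buckets around zero, another well above the spam threshold, and not much in between.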

Also useful is sa-stats.pl, which generates a summary table of how often rules were hit on messages that were either marked as ham or spam. As of today:

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME                COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1 RAZOR2_CHECK               153  38.65  76.50   1.00
   2 BAYES_99                   150  37.41  75.00   0.00
   3 RAZOR2_CF_RANGE_51_100     149  37.41  74.50   0.50
   4 RAZOR2_CF_RANGE_E8_51_100  128  31.92  64.00   0.00
   5 URIBL_JP_SURBL             125  31.17  62.50   0.00
   6 URIBL_BLACK                120  29.93  60.00   0.00
   7 URIBL_SC_SURBL             105  26.18  52.50   0.00
   8 URIBL_OB_SURBL             105  26.18  52.50   0.00
   9 HOST_EQ_D_D_D_D            102  28.93  51.00   6.97
  10 RCVD_IN_SORBS_DUL           92  23.19  46.00   0.50

TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME                COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1 AWL                        193  57.86  19.50  96.02
   2 BAYES_00                   183  45.64   0.00  91.04
   3 RELAY_IS_203                78  20.20   1.50  38.81
   4 FH_RELAY_NODNS              75  25.44  13.50  37.31
   5 HTML_MESSAGE                72  35.66  35.50  35.82
   6 UPPERCASE_25_50             60  14.96   0.00  29.85
   7 FORGED_RCVD_HELO            56  36.16  44.50  27.86
   8 USER_IN_WHITELIST           23   5.74   0.00  11.44
   9 NO_REAL_NAME                20  13.22  16.50   9.95
  10 SPF_HELO_PASS               19   5.49   1.50   9.45

I toyed with changing the scores on rules that hit often on both ham and spam, such as FORGED_RCVD_HELO, but at the moment they contribute only very small weightings overall.
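
As a sanity check on how the sa-stats.pl columns relate to each other: the figures in the tables above are consistent with a sample of 200 spam and 201 ham messages. Those totals are inferred from the percentages, not printed by the tool, and the pct helper is just for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print $1 as a percentage of $2, to two decimals.
pct() {
    awk -v hits="$1" -v total="$2" \
        'BEGIN { printf "%.2f\n", 100 * hits / total }'
}

# Totals inferred from the tables (an assumption, not tool output).
spam_total=200
ham_total=201

# RAZOR2_CHECK as a share of spam: COUNT in the spam table is spam hits.
pct 153 "$spam_total"
# AWL as a share of ham: COUNT in the ham table is ham hits.
pct 193 "$ham_total"
# RAZOR2_CHECK as a share of all mail: 153 spam hits plus the 2 ham hits
# implied by its 1.00 %OFHAM figure.
pct $((153 + 2)) $((spam_total + ham_total))
```

This prints 76.50, 96.02 and 38.65, matching the %OFSPAM, %OFHAM and %OFMAIL columns for those rules. COUNT is hits within the table's own class, while %OFMAIL is measured against everything.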