Craig Box's journeys, stories and notes...


Posts Tagged ‘linux’

Graphing and analysing SpamAssassin

Friday, July 21st, 2006

Here's something simple that I never thought of - props to my workmate Tom for coming up with this.

[Figure: SpamAssassin scores plot]

This is a gnuplot graph of our SpamAssassin scores. The code used to generate it is at the bottom of the SpamAssassin notes page on the WLUG wiki.

The grouping around -100 is caused by the whitelist rule, which scores messages down 100 points (ensuring they are never marked as spam). Usefully, this rule doesn't count towards the threshold needed to be reached before a message is learnt as ham by the Bayesian categoriser.

We seem to have a reasonably normal distribution of good mail, between about -5 and +5, and a reasonably normal distribution of spam, between 10 and 60. The two groups barely overlap, which means our filter is working really well. What I took from this is that it was safe to raise the ham learning threshold (bayes_auto_learn_threshold_nonspam) - it defaults to 0.1, but I've set ours to 1, as we have a lot of rules that score all messages up quite equally.
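The real gnuplot script lives on the WLUG wiki page mentioned above. As a rough sketch of the bucketing step only, assuming you have already pulled one score per line out of your mail log (that extraction is not shown, and the function name and bucket width are made up for illustration), something like this counts how many messages land in each score range:

```shell
#!/bin/sh
# score_histogram: read one SpamAssassin score per line on stdin and print
# "bucket_lower_edge count" pairs, sorted by bucket.
# The one-score-per-line input format is an assumption for this sketch.
score_histogram() {
    width="${1:-5}"
    awk -v w="$width" '
        # Only consider lines that are a plain number.
        /^-?[0-9]+(\.[0-9]+)?$/ {
            # int() truncates toward zero, so negative values that are not
            # an exact multiple of the width need to move down one bucket.
            b = int($1 / w)
            if ($1 < 0 && b * w != $1) b--
            count[b * w]++
        }
        END {
            for (edge in count) print edge, count[edge]
        }
    ' | sort -n
}
```

The two-column output could then be handed to gnuplot (for example with "using 1:2 with boxes"), though that is a guess at a workable approach rather than what the wiki script does.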

Also useful is sa-stats.pl, which generates a summary table of how often rules were hit on messages that were either marked as ham or spam. As of today:

TOP SPAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME                COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1 RAZOR2_CHECK               153  38.65  76.50   1.00
   2 BAYES_99                   150  37.41  75.00   0.00
   3 RAZOR2_CF_RANGE_51_100     149  37.41  74.50   0.50
   4 RAZOR2_CF_RANGE_E8_51_100  128  31.92  64.00   0.00
   5 URIBL_JP_SURBL             125  31.17  62.50   0.00
   6 URIBL_BLACK                120  29.93  60.00   0.00
   7 URIBL_SC_SURBL             105  26.18  52.50   0.00
   8 URIBL_OB_SURBL             105  26.18  52.50   0.00
   9 HOST_EQ_D_D_D_D            102  28.93  51.00   6.97
  10 RCVD_IN_SORBS_DUL           92  23.19  46.00   0.50
TOP HAM RULES FIRED
----------------------------------------------------------------------
RANK RULE NAME                COUNT  %OFMAIL %OFSPAM  %OFHAM
----------------------------------------------------------------------
   1 AWL                        193  57.86  19.50  96.02
   2 BAYES_00                   183  45.64   0.00  91.04
   3 RELAY_IS_203                78  20.20   1.50  38.81
   4 FH_RELAY_NODNS              75  25.44  13.50  37.31
   5 HTML_MESSAGE                72  35.66  35.50  35.82
   6 UPPERCASE_25_50             60  14.96   0.00  29.85
   7 FORGED_RCVD_HELO            56  36.16  44.50  27.86
   8 USER_IN_WHITELIST           23   5.74   0.00  11.44
   9 NO_REAL_NAME                20  13.22  16.50   9.95
  10 SPF_HELO_PASS               19   5.49   1.50   9.45

I toyed with changing the scores on rules that fire often on both ham and spam, such as FORGED_RCVD_HELO, but they contribute only very small weightings overall at the moment.
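To pick those two-faced rules out of the tables above without eyeballing them, here is a small awk sketch. It assumes the six-column layout shown above (RANK, RULE NAME, COUNT, %OFMAIL, %OFSPAM, %OFHAM); the function name and the default cut-off of 25% are hypothetical, not anything sa-stats.pl provides.

```shell
#!/bin/sh
# ambiguous_rules: read sa-stats.pl rule tables on stdin and print the names
# of rules whose %OFSPAM and %OFHAM are both at least MIN_PCT.
# MIN_PCT (first argument, default 25) is a made-up knob for this sketch.
ambiguous_rules() {
    min_pct="${1:-25}"
    awk -v min="$min_pct" '
        # Data rows start with a numeric rank and have exactly six fields;
        # headers and separator lines are skipped. A rule that appears in
        # both tables is printed only once.
        NF == 6 && $1 ~ /^[0-9]+$/ && $5 + 0 >= min && $6 + 0 >= min && !seen[$2]++ {
            print $2
        }
    '
}
```

Run against the output above with the default cut-off, only HTML_MESSAGE and FORGED_RCVD_HELO clear the bar on both sides.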

AWStats on Ubuntu

Friday, July 21st, 2006

AWStats is a "free powerful and featureful tool that generates advanced web, streaming, ftp or mail server statistics, graphically". It's commonly used for generating pretty logs of your Apache web server. (See the AWStats demo if you're unfamiliar and interested.)

I got it going with my Ubuntu virtual web hosting setup this morning, and wrote a page about AWStats, Apache 2 and Ubuntu or Debian on the WLUG wiki. Enjoy.

Audio on Ubuntu

Thursday, July 20th, 2006

Ian recently mentioned a new Skype beta for Linux that uses ALSA. Did you know that ALSA has supported software mixing "out of the box" since 1.0.9rc2? This means every Ubuntu release from Breezy onwards has done sound mixing, and you didn't even know it. If Linux can play sound to your sound card, it will automatically mix multiple sound streams at once, in hardware if possible, on the CPU if not.

GNOME uses the Enlightened Sound Daemon (ESD) to provide its audio notifications. ESD is, amongst other things, a software mixer - before ALSA, it would take control of the sound device and applications would connect to ESD, which would mix the sound together. Since Breezy, Ubuntu has used ESD with an ALSA backend, meaning that sound mixing "just works" for any application using ESD or ALSA for sounds. The only leftovers were applications that wanted to write directly to the /dev/dsp device, which can only ever be used by one process at a time. Skype was the last application I could name that didn't talk to ALSA natively, and unfortunately it had issues operating with ESD's dsp emulator, esddsp.

ESD hasn't been maintained for some years, and is probably going to be replaced with the new PulseAudio, formerly known as Polypaudio, a program designed to be a drop-in ESD replacement.

Then, of course, there is GStreamer, which can loosely be compared to DirectX's DirectShow. gstreamer-properties (or Preferences -> Multimedia Systems Selector) lets you set GStreamer to output to ALSA. I assume it's the default in recent Ubuntu releases, so you can play as many sounds, via as many methods, as you like.

crb@machine:~$ apt-cache search gstreamer | grep alsa
gstreamer0.10-alsa - GStreamer plugin for ALSA
gstreamer0.8-alsa - ALSA plugin for GStreamer

Which is it, though? 🙂

Shutting Debconf up

Monday, July 17th, 2006

Debian's package system, as well as resolving dependencies automatically, has reasonable management of configuration files - not as good as Gentoo's, unfortunately, which has some smarts about merging changes, but at least it stops and tells you what is changing. It does this for files that are labelled as 'conffiles'.

If you're upgrading a lot of similar machines, you can work out the answers you want on the first one, and then tell the others to accept or reject the changes accordingly.

For example, hdparm gets an init script in Dapper that it didn't have in Hoary, so we can safely force an answer of 'yes' for that package:

apt-get install -y hdparm -o Dpkg::Options::="--force-confnew"

However, the firewall rules have been customized locally, and overwriting them with defaults would be bad!

apt-get install -y linuxserver-firewall -o Dpkg::Options::="--force-confold"

ClamAV's packages are a bit smarter, using the newer ucf configuration system, which, among other things, can handle a three-way merge - letting you compare new, current and original, in a way that can roll your changes in a bit better. (It's also designed more for files edited or created in postinst, and not just plain configuration files.) The syntax for automatically accepting conffile changes is different for ucf:

UCF_FORCE_CONFFOLD=yes apt-get install -y clamav-base

Look at 'man ucf' and 'man dpkg' for more force options.
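If you end up with a list of answers from the first machine, a tiny wrapper can turn it into commands for the rest. This is only a sketch: the function name, the "new"/"old" policy words and the list format are all invented here, it covers plain dpkg conffiles only (ucf-managed packages still need the environment variable shown above), and it prints the commands rather than running them so you can check them first.

```shell
#!/bin/sh
# conf_install_cmd: print (not run) the apt-get command for one package.
#   policy "new" -> take the package's version of changed conffiles
#   policy "old" -> keep the locally modified conffiles
# The function name and policy words are hypothetical.
conf_install_cmd() {
    policy="$1"
    pkg="$2"
    case "$policy" in
        new) opt="--force-confnew" ;;
        old) opt="--force-confold" ;;
        *)   echo "unknown policy: $policy" >&2; return 1 ;;
    esac
    printf 'apt-get install -y %s -o Dpkg::Options::="%s"\n' "$pkg" "$opt"
}

# Example answers worked out on the first machine, one "package policy" per line.
while read -r pkg policy; do
    conf_install_cmd "$policy" "$pkg"
done <<'EOF'
hdparm new
linuxserver-firewall old
EOF
```

Pipe the output to sh once you're happy with it.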

Using udev to set network card order

Friday, July 14th, 2006

Don't you hate it when you update a Linux machine, and the order in which the network cards are detected changes?

Code:

# sysfs reports MAC addresses in lower case, so lower-case what ifconfig prints
ifconfig | awk '/HWaddr/ { printf "KERNEL==\"eth*\", SYSFS{address}==\"%s\", NAME=\"%s\"\n", tolower($5), $1 }' > /etc/udev/rules.d/10-network-cards.rules

The cables don't change around, so neither should the order in which they come up.
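If you'd rather look at the rules before writing them into /etc/udev/rules.d, the same awk can be wrapped in a function that reads ifconfig-style text on stdin. This is a sketch under a few assumptions: the input uses the older net-tools layout with an "HWaddr" column, the function name is made up, and the address is lower-cased on the assumption that sysfs reports MAC addresses in lower case while ifconfig prints them in upper case. The MAC address in the usage line is invented.

```shell
#!/bin/sh
# udev_rules_from_ifconfig: read ifconfig-style output on stdin and print one
# udev rule per interface that reports a hardware address.
# Assumes lines like "eth0  Link encap:Ethernet  HWaddr 00:0C:29:AB:CD:EF".
udev_rules_from_ifconfig() {
    awk '/HWaddr/ {
        # $1 is the interface name and $5 the MAC address; the address is
        # lower-cased before it goes into the rule.
        printf "KERNEL==\"eth*\", SYSFS{address}==\"%s\", NAME=\"%s\"\n", tolower($5), $1
    }'
}

# Preview with a made-up interface line instead of the live system:
echo 'eth0      Link encap:Ethernet  HWaddr 00:0C:29:AB:CD:EF' | udev_rules_from_ifconfig
```

On a real machine, "ifconfig | udev_rules_from_ifconfig" gives the same rules as the one-liner, minus the redirect.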

lvm2 pre-installation script returned exit status 10

Friday, July 7th, 2006

Tracking down bugs in Debian and Ubuntu packages is fun for the whole family. Found this one while upgrading from Hoary to Dapper on a test box:

root@unassigned-firewall:~ # apt-get install lvm2
..
Preparing to replace lvm2 2.00.32-1 (using .../lvm2_2.02.02-1ubuntu1_i386.deb) ...
dpkg: error processing /cdrom//pool/main/l/lvm2/lvm2_2.02.02-1ubuntu1_i386.deb (--unpack): subprocess pre-installation script returned error exit status 10
Errors were encountered while processing:
/cdrom//pool/main/l/lvm2/lvm2_2.02.02-1ubuntu1_i386.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

Straight to Google. Nothing for this package, but some other packages with a similar error are reported. Eventually, I find a similar example, and work through the steps:

root@unassigned-firewall:~ # export DEBCONF_DEBUG=developer
root@unassigned-firewall:~ # apt-get install lvm2
..
Preparing to replace lvm2 2.00.32-1 (using .../lvm2_2.02.02-1ubuntu1_i386.deb) ...
debconf (developer): frontend started
debconf (developer): frontend running, package name is lvm2
debconf (developer): starting /var/lib/dpkg/tmp.ci/preinst upgrade 2.00.32-1
debconf (developer): <-- VERSION 2.0
debconf (developer): --> 0 2.0
debconf (developer): <-- CAPB backup
debconf (developer): --> 0 multiselect escape backup
debconf (developer): <-- TITLE LVM2
debconf (developer): --> 0
debconf (developer): <-- FSET lvm2/kernel seen false
debconf (developer): --> 10 lvm2/kernel doesn't exist
dpkg: error processing /cdrom//pool/main/l/lvm2/lvm2_2.02.02-1ubuntu1_i386.deb (--unpack):
subprocess pre-installation script returned error exit status 10
debconf (developer): frontend started
debconf (developer): frontend running, package name is lvm2
debconf (developer): starting /var/lib/dpkg/info/lvm2.postinst abort-upgrade 2.02.02-1ubuntu1
Errors were encountered while processing:
/cdrom//pool/main/l/lvm2/lvm2_2.02.02-1ubuntu1_i386.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

Aha! Eventually, the problem presents itself, in the preinst - but of the new package, not the one that is in /var/lib/dpkg/info. The db_fset call here is the one the debug log shows failing with status 10:

if ! dpkg --compare-versions $(uname -r) ge '2.6.12'; then
    db_fset lvm2/kernel seen false
    db_input critical lvm2/kernel || true
    db_go
    exit 1
fi

Which neatly matches this Debian bug. I built me a package without this block (you're going to be running a new kernel when the upgrade that includes this package is done - the new version wouldn't cleanly backport), and the upgrade continued.

The moral of this story is I should have gone to Launchpad first, as the bug is recorded there. Google just didn't see it.

Rosebud

Wednesday, June 28th, 2006

Novell has released a public preview of version 10 of SUSE Linux Enterprise Desktop (previously Novell Linux Desktop, now "SLED"). Along with this are some preview videos.

I like what they've done with the main application starter menu, but I also like what Gimmie is doing in this area. Check out Alex Graveley's Gimmie GUADEC slides for some idea of the direction that launch panels might be going. If it gets combined with MacSlow's Cairo dock, we could see some excellent GNOME app launch/management lovin'.

Also, Ubuntu fits snugly on one CD. Why does SLED need five? Can I make do with just one?

Falcon repository builder

Wednesday, June 21st, 2006

While Matt has built a repository system based on reprepro, thanks to Seveas (beware, that link is in Dutch), I've got Falcon working as a Debian/Ubuntu package repository. And in the process, increased the number of bugs fourfold!

The great thing is that he's fixed two of them already and there will probably be an update released today based on that. What great service.

(Is your blog staying at the top of the Planet longer than it should? Is your feed showing the time in the correct time zone? If not, you're posting from 12 hours in the future!)

Envision a world where IBM staffers run HP laptops...

Monday, June 5th, 2006

This may be close to the truth, if it turns out to be true that Lenovo intend to shun Linux, supporting only Windows on their Thinkpads and desktop machines.

In saying that, they hardly support anything anyway, but the general consensus is the build quality is decreasing with cost cutting, and I expect that Linux people will start shunning Lenovo anyway, in both developing drivers for, and buying, the hardware. But IBM were meant to move to a total Linux desktop internally by 2005 - a target they missed - and much of their development work is done on Linux desktops. I'm sure they have an arrangement to buy gear from Lenovo at or near cost, and probably without Windows licenses, and I'm also sure that the world's largest IT services company can probably orchestrate their own driver writing and distribution maintenance. However, with the speed at which Lenovo are distancing themselves from the powerful IBM brand, it's not hard to imagine that IBM staff will end up using another vendor's desktops and laptops in the future.

There were a lot of Thinkpads at Linux.conf.au in January.

At about the same time, the Chinese government purchasing agency announced that all new PCs they purchase must be Linux compatible, and Lenovo are a supplier on that list. Go figure.

Edit: turns out that the manager commenting was responsible for only the cheap 3000 series, and his comments weren't meant for Thinkpad. Foot in mouth please.

Saviour of the universe

Sunday, May 28th, 2006

Macromedia have announced that there will be a Flash Player 9 for Linux (but not till 2007), and Darron Schall has announced something to do with it when it arrives. There is also a new Flash on Linux engineer at Adobe, who sounds like he wants to consider community opinion on various issues.