- To: slug-chat@xxxxxxxxxxx
- Subject: [chat] Re: [SLUG] Meaning of Nonsense in s p a m
- From: Malcolm V <farkit@xxxxxxxxxxxx>
- Date: 17 Jun 2003 00:54:06 +1000
On Mon, 2003-06-16 at 10:27, Andrew McNaughton wrote:
> Talking of which, does anyone know any good or interesting approaches to
> identifying these junk strings?
<snipped>
I think these strings are added mainly to fool simple string-matching
filters. The fact that they may also bloat the token databases of
statistical filtering programs is probably seen as a bonus by the
spammers.
If you don't want to bloat your database, I guess you can always drop
"words" seen only once in the last x months. Otherwise leave them in:
these strings are usually a set length and only ever associated with
spam, so after a while they will actually aid detection...
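As a rough illustration of that pruning idea, here is a minimal Python
sketch. It is not taken from any particular filter: the database layout
(token mapped to a count and a last-seen date), the function name
prune_rare_tokens, and the 90-day default standing in for "x months" are
all assumptions made for the example. It reads "seen once in the last x
months" as "seen only once, and not seen again within the window".

```python
from datetime import datetime, timedelta

def prune_rare_tokens(db, now, max_age_days=90, min_count=2):
    """Return a copy of db without stale one-off tokens.

    db maps token -> (count, last_seen). This layout is hypothetical,
    not the format of any real filter. A token is kept if it has been
    seen at least min_count times, or if its last sighting is recent.
    """
    cutoff = now - timedelta(days=max_age_days)
    return {
        token: (count, last_seen)
        for token, (count, last_seen) in db.items()
        if count >= min_count or last_seen >= cutoff
    }

# Made-up sample data, purely for illustration.
now = datetime(2003, 6, 17)
db = {
    "mortgage": (412, datetime(2003, 6, 15)),  # common token, kept
    "xqzjkwpl": (1, datetime(2003, 1, 2)),     # old one-off junk, dropped
    "qhrbtvnn": (1, datetime(2003, 6, 10)),    # recent one-off, kept for now
}
pruned = prune_rare_tokens(db, now)
```

Recent one-off tokens are kept so that a junk string which does get
reused has a chance to build up a count before the next pruning run.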
I think they are handy for human parsing too, as they are usually in the
subject line, so if they slip past your filters you can tell at a
glance that it is spam. It breaks any attempt at "human engineering"
their subject.
Cheers,
Malcolm V.