Comments on The Other Kelly Yancey: The price of spam
Blog author: Kelly Yancey (http://www.blogger.com/profile/08648597728708472240)
Feed updated: 2023-10-07T00:29:30.516-07:00

[2007-10-06T12:52:00.000-07:00]

The important thing about greylisting is that it's insanely cheap. SpamAssassin takes a lot of memory and CPU cycles to do its work. Greylisting just takes a database lookup and a write, plus a delayed delete for more spam.

It's not at all surprising that if you have an expensive mechanism like SpamAssassin without protecting it by cheaper mechanisms, your Soekris can barely keep up.

I will say that your problem is totally solvable. I've had the same e-mail address since 1994 and it's all over the net, and I only get 1 or 2 spams through to my main mailbox per day. Legitimate messages misclassified as spam are maybe 2 to 3 per week.

I use SPF, greylisting, rejecting mail from the SpamCop "top 200" site list, SpamAssassin, ClamAV, auto-whitelisting of people I mail to, and a custom set of black and white lists along with a long custom set of header and body rules.

It's a lot of work, but it's amazingly effective.

Sean

Posted by: Anonymous

[2007-10-03T15:23:00.000-07:00]

Try Google Apps; it's Gmail with your own domain name. I average about 10 spams a week.

You do need to check your spam folder regularly though.
Some of my client's emails ended up inside.

Posted by: Anonymous

[2007-10-03T09:53:00.000-07:00]

Yes, you need to train spambayes with ham and spam. I keep my spams to do that when I want to play with another setup.

$ sb_filter.py -d $HOME/.hammie.db -n
Created new database in /home/bsergean/.hammie.db

$ sb_mboxtrain.py -d $HOME/.hammie.db -g ~/Mail/sent-mail/cur/ -s ~/Mail/Spam/cur/
Training ham (/home/bsergean/Mail/sent-mail/cur/):
 Reading as MH mailbox
 Trained 8020 out of 8020 messages
Training spam (/home/bsergean/Mail/Spam/cur/):
 Reading as MH mailbox
 Trained 1847 out of 1847 messages

I am using KMail right now; here are the two rules I use when something bad happens.

(re)train as a bad (spam) message
sb_filter.py -d $HOME/.hammie.db -s

(re)train as a good (ham) message
sb_filter.py -d $HOME/.hammie.db -g

There are two options for the database format, pickle and Berkeley DB; each has its own advantages (I forgot which, it's in the man page :).

One last word ... if it ain't broke, don't fix it. It's always a pain to reconfigure something already working ... And greylisting seems like a good thing too, according to another comment.

Benjamin.

Posted by: bsergean (https://www.blogger.com/profile/11190017812209279955)

[2007-10-02T23:53:00.000-07:00]

I've got the same setup - FreeBSD/SpamAssassin, and I averaged over 2,300 mails flagged as spam per day for the past month. I maybe get ~25/day that aren't caught, and about one false positive per month.

All the mail that gets flagged as spam is forwarded to a Google account.
Virtually all of it gets flagged as spam by Google as well, and I review what doesn't every few days. I figure if my spam filters and Google's spam filters both agree that it's spam, then I can safely ignore it.

The server I use doesn't really have any trouble processing that volume of mail. You are running spamd so you don't have to fire up a Perl process for each mail, right? If you're still having difficulty, try greylisting; I've heard it's a very inexpensive way of cutting out a lot of spam.

Posted by: Anonymous

[2007-10-02T23:40:00.000-07:00]

I'm using a Soekris box running OpenBSD as a firewall. When I enabled greylisting in spamd, the number of spam mails plummeted by a factor of 100 -- and it's stayed that low ever since.

Posted by: Anonymous

[2007-10-02T17:23:00.000-07:00]

Kelly: That little mail server you've got is pretty neat! I'm assuming that back in the day your P100 wasn't doing much in the way of spam prevention, except maybe blacklisting. Sorting mail is much less resource-intensive :-)

I was the email admin for a Fortune 500 company about a year ago; roughly 80% of the inbound email was spam, something like 45,000 messages/day. We actually had two layers of spam protection, one right behind the other (not my design, but it worked). Maybe you need a couple more Soekris in series to cut the volume down further ;-)

As for me, I've been using Yahoo for something like 9 years. I get maybe 2-3 pieces of spam a week.

- A

Posted by: Anonymous

[2007-10-02T14:43:00.000-07:00]
Kumar: Thanks for the comment. Your blog is really interesting. I've been thinking about switching over to Gmail or Yahoo! Mail, but have resisted the idea because changing my e-mail address feels too much like capitulation.

Of course, forwarding all of my mail from my home server to Gmail (or wherever) would work, but spam would consume twice as much bandwidth as it currently does (once to download it, and again to forward it to the hosted mail service). :(

Posted by: Kelly Yancey (https://www.blogger.com/profile/08648597728708472240)

[2007-10-02T14:38:00.000-07:00]

To the anonymous poster suggesting spambayes: I haven't tried that particular filter, but I did try dspam a while back. It advertised that it was written all in C so it was faster than SpamAssassin (which is written in Perl).

It was fast all right: it didn't do much of anything. I realize that I have to train it, but that is a pain to do unless you use mutt as your mail reader (and configure keys for flagging messages as false negatives and positives). I really didn't enjoy reading through 1000+ messages a day to train my filter. One thing SpamAssassin has going for it is that it uses both a Bayesian filter and static analysis, so I get decent results immediately.

That said, disabling some of the SpamAssassin online checks may be in order. It appears that SpamAssassin is CPU bound rather than network I/O bound, though, so I doubt it will help much.

Posted by: Kelly Yancey (https://www.blogger.com/profile/08648597728708472240)

[2007-10-02T14:25:00.000-07:00]

You should maybe give spambayes a try.
It's written in Python and doesn't do the SpamAssassin online checks, which might not be that good.

Posted by: Anonymous

[2007-10-02T14:16:00.000-07:00]

You say you get over 1000 pieces of spam per day. Let's say you get 2000, and that 50 percent of your CPU cycles go to processing it. There are 86,400 seconds in a day; 50 percent of that is 43,200, which divided by 2000 is over 20 CPU seconds per email. You really need a faster machine.

Posted by: Anonymous

[2007-10-02T14:09:00.000-07:00]

This is exactly why I use Gmail. I guess that makes me lazy and cheap! My email address is all over the web, but due to their filtering (I imagine it learns from all the many Gmail users) I only get about 4 spam messages a week, on average.

Posted by: Kumar McMillan (https://www.blogger.com/profile/18371805776129363077)
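
For anyone who wants to rerun the back-of-the-envelope estimate from the 14:16 comment with their own numbers, here is a minimal Python sketch. The function name `cpu_seconds_per_message` is hypothetical, and the inputs (2,000 messages per day, 50 percent of CPU time) are the commenter's assumed figures, not measurements from the Soekris box.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def cpu_seconds_per_message(messages_per_day: int, cpu_fraction: float) -> float:
    """Average CPU seconds spent per message.

    messages_per_day: how many messages the filter processes in a day.
    cpu_fraction: share of the day's CPU time spent filtering (0.0 to 1.0).
    """
    if messages_per_day <= 0:
        raise ValueError("messages_per_day must be positive")
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    return SECONDS_PER_DAY * cpu_fraction / messages_per_day


if __name__ == "__main__":
    # Assumed figures from the comment: 2,000 spams/day using half the CPU.
    print(cpu_seconds_per_message(2000, 0.5))  # 21.6
```

With these assumed inputs the result is 21.6 CPU seconds per message, which matches the "over 20" figure in the comment.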