SpamBayes

SpamBayes
Original author(s): Tim Peters
Initial release: September 2002
Stable release: 1.0.4 / March 2005
Preview release: 1.1a6 / December 6, 2008
Written in: Python
Platform: Cross-platform
Available in: English only
Type: E-mail filtering
License: PSFL
Website: spambayes.sourceforge.net

SpamBayes is a Bayesian spam filter written in Python which uses techniques laid out by Paul Graham in his essay "A Plan for Spam". It has subsequently been improved by Gary Robinson and Tim Peters, among others.

The most notable difference between a conventional Bayesian filter and the filter used by SpamBayes is that SpamBayes has three classifications rather than two: spam, non-spam (called ham in SpamBayes), and unsure. The user trains the filter by marking messages as either ham or spam; when filtering a message, the filter generates one score for ham and another for spam.
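The training and scoring steps described above can be illustrated with a toy filter. This is a minimal sketch, not SpamBayes' actual code or API: the class `ToyFilter`, its methods, the whitespace tokenizer, the add-one smoothing, and the geometric-mean way of combining token probabilities are all assumptions chosen for illustration.

```python
import math
from collections import Counter


class ToyFilter:
    """Hypothetical toy filter; names and formulas are illustrative only."""

    def __init__(self):
        self.spam_tokens = Counter()  # number of spam messages containing each token
        self.ham_tokens = Counter()   # number of ham messages containing each token
        self.n_spam = 0
        self.n_ham = 0

    @staticmethod
    def _tokens(text):
        # Assumed tokenizer: lowercase, split on whitespace, ignore repeats.
        return set(text.lower().split())

    def train(self, text, is_spam):
        """Record one message that the user has marked as spam or ham."""
        if is_spam:
            self.spam_tokens.update(self._tokens(text))
            self.n_spam += 1
        else:
            self.ham_tokens.update(self._tokens(text))
            self.n_ham += 1

    def _spam_prob(self, token):
        # Add-one smoothing keeps every probability strictly between 0 and 1.
        s = (self.spam_tokens[token] + 1) / (self.n_spam + 2)
        h = (self.ham_tokens[token] + 1) / (self.n_ham + 2)
        return s / (s + h)

    def scores(self, text):
        """Return (spam_score, ham_score), each between 0 and 1."""
        probs = [self._spam_prob(t) for t in self._tokens(text)]
        if not probs:
            return 0.0, 0.0  # no evidence in either direction
        n = len(probs)
        # Assumed combining rule: one minus a geometric mean.
        spam_score = 1 - math.exp(sum(math.log(1 - p) for p in probs) / n)
        ham_score = 1 - math.exp(sum(math.log(p) for p in probs) / n)
        return spam_score, ham_score


f = ToyFilter()
f.train("cheap pills buy now", is_spam=True)
f.train("meeting agenda for monday", is_spam=False)
spam_score, ham_score = f.scores("buy cheap pills")
```

In this sketch the two scores are computed separately, so a message containing strong evidence for both classes, or for neither, does not collapse into a single number.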

If the spam score is high and the ham score is low, the message will be classified as spam. If the spam score is low and the ham score is high, the message will be classified as ham. If the scores are both high or both low, the message will be classified as unsure.
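The decision rule above can be sketched as a small function. This is an illustration under stated assumptions: the function name `classify` and the cutoffs `high=0.9` and `low=0.1` are hypothetical, and treating every in-between combination as unsure is a choice made here, not something the description specifies.

```python
def classify(spam_score, ham_score, high=0.9, low=0.1):
    """Three-way decision from two scores in [0, 1].

    The thresholds are hypothetical values chosen for illustration.
    """
    if spam_score >= high and ham_score <= low:
        return "spam"
    if ham_score >= high and spam_score <= low:
        return "ham"
    # Both high, both low, or anything in between: defer to the user.
    return "unsure"


print(classify(0.97, 0.02))  # strong spam evidence, little ham evidence
print(classify(0.03, 0.98))  # strong ham evidence, little spam evidence
print(classify(0.95, 0.96))  # conflicting evidence
print(classify(0.04, 0.05))  # hardly any evidence
```

Widening the gap between `low` and `high` trades fewer misclassifications for more messages left as unsure.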

This approach leads to a low number of false positives and false negatives, but it may leave a number of messages classified as unsure, each of which needs a human decision.

Web filtering

Some work has gone into applying SpamBayes to filtering web content through a proxy web server.

References

  1. "Download CHANGELOG.TXT (SpamBayes anti-spam)".
  2. Robinson, Gary (1 March 2003). "A Statistical Approach to the Spam Problem". Linux Journal. ISSN 1075-3583.
  3. Montanaro, Skip (2003-12-07). "[spambayes-dev] Web filtering". Retrieved 2023-04-18.
  4. "[spambayes-dev] Web filtering". 7 December 2003.
  5. "OSDIR". 6 November 2020.
