Misplaced Pages

Sparse binary polynomial hashing

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Sparse binary polynomial hashing (SBPH) is a generalization of Bayesian spam filtering that can match mutating phrases as well as single words.

SBPH automatically generates a large number of features from an incoming text, then uses statistics to weight each feature according to its predictive value for spam/nonspam classification.
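The two steps can be illustrated with a minimal Python sketch. It is not taken from the article. It assumes the scheme usually described for SBPH: a sliding window of five tokens in which the newest token is always kept and each earlier token is either kept or replaced by a placeholder. The names `sbph_features`, `feature_weights` and `SKIP`, the whitespace tokenizer, and the add-one smoothing used for the weights are all hypothetical choices made for illustration, and the sketch omits the hashing step that gives the technique its name.

```python
from collections import Counter
from itertools import product

SKIP = "<skip>"  # hypothetical placeholder for an omitted token


def sbph_features(tokens, window=5):
    """Return sparse phrase features for a list of tokens.

    Assumption: the window size of 5 and the "newest token always kept"
    rule follow the commonly described scheme, not the article itself.
    """
    features = []
    for end in range(len(tokens)):
        start = max(0, end - window + 1)
        context = tokens[start:end]  # earlier tokens inside the window
        newest = tokens[end]
        # Every earlier token is either kept or replaced by SKIP,
        # so a full window yields 2 ** (window - 1) features.
        for mask in product((False, True), repeat=len(context)):
            parts = [tok if keep else SKIP for tok, keep in zip(context, mask)]
            parts.append(newest)
            features.append(" ".join(parts))
    return features


def feature_weights(spam_texts, ham_texts, window=5):
    """Estimate a spam weight in (0, 1) for each feature.

    Assumption: a simple smoothed document-frequency ratio stands in
    for whatever statistic a real filter would use.
    """
    spam_counts, ham_counts = Counter(), Counter()
    for text in spam_texts:
        spam_counts.update(set(sbph_features(text.split(), window)))
    for text in ham_texts:
        ham_counts.update(set(sbph_features(text.split(), window)))
    weights = {}
    for feat in spam_counts.keys() | ham_counts.keys():
        s = (spam_counts[feat] + 1) / (len(spam_texts) + 2)
        h = (ham_counts[feat] + 1) / (len(ham_texts) + 2)
        weights[feat] = s / (s + h)  # above 0.5 leans spam
    return weights


if __name__ == "__main__":
    for feature in sbph_features("buy cheap pills".split(), window=3):
        print(feature)
    weights = feature_weights(["buy cheap pills"], ["buy fresh bread"])
    print(weights["buy cheap"], weights["buy"])
```

Because skipped positions are kept as placeholders, a phrase such as "buy <skip> pills" still matches when the middle word changes, which is how this kind of feature set can follow mutating phrases rather than only single words.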

External links

A paper on the subject as it relates to spam (some article text comes from this document, which is under the GFDL)
Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification. No Starch Press. 2005. p. 108. ISBN 978-1-59327-052-0.
