Django Spam Classifier

This is an implementation of a naive Bayes classifier and a Fisher classifier, based on the algorithms in Programming Collective Intelligence.

It uses Django models to persist the training data and allows simple classification of input text.

Grab spamclassifier from Bitbucket.

For example, using the naive Bayes classifier:

from spamclassifier.classifier import NaiveBayesClassifier, get_words

classifier = NaiveBayesClassifier([get_words])
classifier.train('Nobody owns the water.', 'good')
classifier.train('the quick rabbit jumps fences', 'good')
classifier.train('buy pharmaceuticals now', 'bad')
classifier.train('make quick money at the online casino', 'bad')
classifier.train('the quick brown fox jumps', 'good')

# Score the same text against each category
classifier.prob('quick rabbit', 'good')
classifier.prob('quick rabbit', 'bad')
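
To show roughly what prob computes, here is a minimal in-memory sketch of the naive Bayes scoring described in Programming Collective Intelligence. It is not the spamclassifier code: it keeps counts in dictionaries rather than Django models, and the class name NaiveBayesSketch, the tokenizer rules (lowercase words of 3 to 19 characters) and the smoothing defaults (weight 1.0, assumed probability 0.5) are assumptions taken from the book's approach.

```python
import re
from collections import defaultdict


def get_words(text):
    # Assumed tokenizer: unique lowercase words of 3 to 19 characters.
    return {w.lower() for w in re.split(r'\W+', text) if 2 < len(w) < 20}


class NaiveBayesSketch:
    """Hypothetical in-memory stand-in; the real library persists counts."""

    def __init__(self, extract=get_words):
        self.extract = extract
        # word -> category -> number of documents containing the word
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        # category -> number of training documents
        self.category_counts = defaultdict(int)

    def train(self, text, category):
        for word in self.extract(text):
            self.feature_counts[word][category] += 1
        self.category_counts[category] += 1

    def weighted_prob(self, word, category, weight=1.0, assumed=0.5):
        # P(word | category), pulled toward `assumed` when the word is rare.
        docs = self.category_counts.get(category, 0)
        basic = self.feature_counts[word].get(category, 0) / docs if docs else 0.0
        total = sum(self.feature_counts[word].values())
        return (weight * assumed + total * basic) / (weight + total)

    def prob(self, text, category):
        # P(category) times the product of P(word | category); not normalized.
        all_docs = sum(self.category_counts.values())
        if not all_docs:
            return 0.0
        score = self.category_counts.get(category, 0) / all_docs
        for word in self.extract(text):
            score *= self.weighted_prob(word, category)
        return score


nb = NaiveBayesSketch()
nb.train('Nobody owns the water.', 'good')
nb.train('the quick rabbit jumps fences', 'good')
nb.train('buy pharmaceuticals now', 'bad')
nb.train('make quick money at the online casino', 'bad')
nb.train('the quick brown fox jumps', 'good')

print(nb.prob('quick rabbit', 'good'))  # roughly 0.156
print(nb.prob('quick rabbit', 'bad'))   # roughly 0.05
```

The two scores are not normalized and do not sum to 1. They are only meaningful relative to each other, so the higher score indicates the more likely category.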

And using the Fisher classifier:

from spamclassifier.classifier import FisherClassifier, get_words

classifier = FisherClassifier([get_words])
classifier.train('Nobody owns the water.', 'good')
classifier.train('the quick rabbit jumps fences', 'good')
classifier.train('buy pharmaceuticals now', 'bad')
classifier.train('make quick money at the online casino', 'bad')
classifier.train('the quick brown fox jumps', 'good')

# Pick the best-matching category for each text
classifier.classify('quick rabbit')
classifier.classify('buy pharmaceuticals')
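
For comparison, the Fisher method from Programming Collective Intelligence can be sketched the same way. Again this is an illustration under assumptions, not the spamclassifier source: FisherSketch and inv_chi2 are hypothetical names, counts live in memory instead of Django models, and the per-category minimum thresholds the book describes are left out for brevity.

```python
import math
import re
from collections import defaultdict


def get_words(text):
    # Assumed tokenizer: unique lowercase words of 3 to 19 characters.
    return {w.lower() for w in re.split(r'\W+', text) if 2 < len(w) < 20}


def inv_chi2(chi, df):
    # Inverse chi-square function for an even number of degrees of freedom.
    m = chi / 2.0
    total = term = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


class FisherSketch:
    """Hypothetical in-memory stand-in; the real library persists counts."""

    def __init__(self, extract=get_words):
        self.extract = extract
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        self.category_counts = defaultdict(int)

    def train(self, text, category):
        for word in self.extract(text):
            self.feature_counts[word][category] += 1
        self.category_counts[category] += 1

    def _freq(self, word, category):
        # Fraction of documents in `category` that contain `word`.
        docs = self.category_counts.get(category, 0)
        return self.feature_counts[word].get(category, 0) / docs if docs else 0.0

    def category_prob(self, word, category, weight=1.0, assumed=0.5):
        # P(category | word), treating categories as equally likely,
        # then smoothed toward `assumed` for rarely seen words.
        freq_sum = sum(self._freq(word, c) for c in list(self.category_counts))
        basic = self._freq(word, category) / freq_sum if freq_sum else 0.0
        total = sum(self.feature_counts[word].values())
        return (weight * assumed + total * basic) / (weight + total)

    def fisher_prob(self, text, category):
        # Combine per-word probabilities: -2 * sum(ln p) follows a chi-square
        # distribution with 2n degrees of freedom if the words are independent.
        words = self.extract(text)
        log_sum = sum(math.log(self.category_prob(w, category)) for w in words)
        return inv_chi2(-2.0 * log_sum, 2 * len(words))

    def classify(self, text, default=None):
        best, best_score = default, 0.0
        for category in list(self.category_counts):
            score = self.fisher_prob(text, category)
            if score > best_score:
                best, best_score = category, score
        return best


fisher = FisherSketch()
fisher.train('Nobody owns the water.', 'good')
fisher.train('the quick rabbit jumps fences', 'good')
fisher.train('buy pharmaceuticals now', 'bad')
fisher.train('make quick money at the online casino', 'bad')
fisher.train('the quick brown fox jumps', 'good')

print(fisher.classify('quick rabbit'))         # good
print(fisher.classify('buy pharmaceuticals'))  # bad
```

Unlike the naive Bayes score, each Fisher score falls between 0 and 1 on its own, which is what makes per-category cutoffs practical if you want to add them.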