Consider a scenario where you have a blog and want to prevent spam comments, but don't want to use CAPTCHAs because they are bad from an accessibility standpoint. What is the solution? In steps `acts_as_classifiable'. Use a Bayesian classifier to distinguish between spam and non-spam, and only if a comment is flagged as spam, fall back to a CAPTCHA-based solution or reject it. Or maybe you want to track the preferences of each user and make suggestions to them based on that. `acts_as_classifiable' can help in both scenarios and several others. I currently use it for the web application at Kreeti.com.
To use this plugin, you need to have the gem `classifier' and its dependencies installed. The command below should do it.
gem install classifier --include-dependencies
The plugin itself can be downloaded from
Next, your database needs a table named `classifier_models'. It is used as a persistent store for the built classifier model.
create_table :classifier_models, :force => true do |t|
  t.column :identifier, :int
  t.column :classifiable_type, :string, :null => false
  t.column :data, :blob
end
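How the plugin serializes the model into the `data' column is not shown in this post. As a rough sketch, assuming Ruby's built-in Marshal is used (my assumption, not something confirmed by the plugin), a trained model can be turned into a binary string that fits a blob column and restored later. `ToyModel' is a hypothetical stand-in for the real classifier object.

```ruby
# Hypothetical stand-in for a trained classifier; not the plugin's class.
class ToyModel
  attr_reader :word_counts

  def initialize
    # A default value (not a default block) keeps the hash Marshal-friendly.
    @word_counts = Hash.new(0)
  end

  def train(text)
    text.downcase.scan(/\w+/).each { |word| @word_counts[word] += 1 }
  end
end

model = ToyModel.new
model.train("cheap pills cheap")

# Serialize to a binary string; this is what a blob column could hold.
blob = Marshal.dump(model)

# Later, rebuild the model from the stored bytes.
restored = Marshal.load(blob)
puts restored.word_counts["cheap"]
```

Whatever format is used, the point is that the whole built model round-trips through a single column, so it does not have to be retrained on every request.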
Now, to use this plugin in your model, put:
class Comment < ActiveRecord::Base
  acts_as_classifiable :fields => ["text"], :categories => ["Spam", "Legit"]
end
Let us assume that we have an instance of the above model in `@comment'. The classifier can then be trained by calling the `train' method on it, passing the category the comment belongs to.
It is better to add some helper methods to the model that do this for you.
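Such helpers might look like the sketch below. The names `mark_as_spam!' and `mark_as_legit!' are made up for illustration, and the `FakeClassifiable' module is only a stub so the example runs on its own; in a real model, `train' would come from `acts_as_classifiable'.

```ruby
# Stub standing in for what acts_as_classifiable adds to the model.
# It only records calls; the real `train` comes from the plugin.
module FakeClassifiable
  def trained
    @trained ||= []
  end

  def train(category, identifier = nil)
    trained << [category, identifier]
  end
end

class Comment
  include FakeClassifiable

  # Hypothetical helpers; the names are illustrative only.
  def mark_as_spam!
    train :spam
  end

  def mark_as_legit!
    train :legit
  end
end

comment = Comment.new
comment.mark_as_spam!
p comment.trained
```

Controllers can then call `comment.mark_as_spam!' without knowing the category names.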
You can also untrain an instance (use this with care).
Bulk training and untraining are also possible:
Comment.train @comments, @classifications
where both @comments and @classifications are arrays, such that @classifications contains the category of each comment in @comments.
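The pairing of the two arrays can be pictured as follows. This is only a sketch of the correspondence, using a hypothetical `train_all' helper and a stub `train'; how the plugin implements bulk training internally is not covered in this post.

```ruby
# Stub comment that remembers the category it was trained with.
ToyComment = Struct.new(:text, :trained_as) do
  def train(category)
    self.trained_as = category
  end
end

# Hypothetical helper: element i of `classifications` labels element i of `comments`.
def train_all(comments, classifications)
  unless comments.size == classifications.size
    raise ArgumentError, "need one classification per comment"
  end
  comments.zip(classifications).each { |comment, category| comment.train(category) }
end

comments = [ToyComment.new("Buy cheap pills"), ToyComment.new("Nice article, thanks")]
classifications = [:spam, :legit]
train_all(comments, classifications)
p comments.map(&:trained_as)
```

The length check is my own addition; it guards against the two arrays silently drifting out of step.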
When the classifier is used to classify a comment, it returns either "Spam" or "Legit". Bulk classification is also possible.
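For readers curious about what happens underneath, here is a minimal naive Bayes text classifier in plain Ruby. It is a teaching sketch under my own simplifying assumptions (word counts, add-one smoothing, equal class priors), not the code of the `classifier' gem or of this plugin, and `TinyBayes' is a hypothetical name.

```ruby
# Minimal naive Bayes text classifier; a teaching sketch, not the gem's code.
class TinyBayes
  def initialize(*categories)
    # Word counts per category, e.g. @counts["Spam"]["cheap"] => 2
    @counts = {}
    categories.each { |category| @counts[category] = Hash.new(0) }
  end

  def train(category, text)
    tokens(text).each { |word| @counts.fetch(category)[word] += 1 }
  end

  # Reverses an earlier `train` call by decrementing the counts.
  def untrain(category, text)
    words = @counts.fetch(category)
    tokens(text).each do |word|
      words[word] -= 1 if words[word] > 0
      words.delete(word) if words[word].zero?
    end
  end

  # Picks the category with the highest summed log-probability.
  # Class priors are treated as equal to keep the sketch short.
  def classify(text)
    vocabulary = @counts.values.flat_map(&:keys).uniq.size
    @counts.keys.max_by do |category|
      words = @counts[category]
      total = words.values.sum
      # Add-one smoothing so unseen words do not zero out a category.
      tokens(text).sum { |word| Math.log((words[word] + 1.0) / (total + vocabulary + 1.0)) }
    end
  end

  private

  def tokens(text)
    text.downcase.scan(/[a-z']+/)
  end
end

bayes = TinyBayes.new("Spam", "Legit")
bayes.train("Spam", "Buy cheap pills now")
bayes.train("Spam", "Cheap watches, buy now")
bayes.train("Legit", "Thanks for the helpful article")
bayes.train("Legit", "I disagree with the second point")

puts bayes.classify("cheap pills")
puts bayes.classify("helpful article")
```

Summing logarithms rather than multiplying probabilities avoids numeric underflow on longer comments.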
If you want the Comment class to have multiple classifiers, one for each user, all of the above methods accept an additional `identifier' argument.
@comment.train :legit, @user.id
This will create and store a classifier for that particular model, identified by the `identifier'. This is useful when you want to track the preferences of each user and then make suggestions.
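One way to picture per-identifier classifiers is a lookup table that lazily creates a separate model for each identifier. The sketch below illustrates the idea only; it is not the plugin's storage code (which keeps models in the `classifier_models' table), and `ModelRegistry' and `ToyModel' are hypothetical names.

```ruby
# Hypothetical toy model: counts how often each category was trained.
class ToyModel
  attr_reader :tallies

  def initialize
    @tallies = Hash.new(0)
  end

  def train(category)
    @tallies[category] += 1
  end
end

# Hypothetical registry: one independent model per identifier (e.g. a user id).
class ModelRegistry
  def initialize
    # The block creates and remembers a fresh model on first access.
    @models = Hash.new { |hash, identifier| hash[identifier] = ToyModel.new }
  end

  def model_for(identifier = nil)
    @models[identifier]
  end
end

registry = ModelRegistry.new
registry.model_for(1).train(:legit)
registry.model_for(2).train(:spam)
registry.model_for(1).train(:legit)

p registry.model_for(1).tallies
p registry.model_for(2).tallies
```

Training for one user leaves every other user's model untouched, which is what makes per-user suggestions possible.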
The Bayesian classifier in the `classifier' gem needs some work, but more about that later.
If you have any questions or issues with this plugin, please post a comment or email me.
Update: New release Clusterer + other plugins