Smorgasbord - Politics, Lisp, Rails, Fencing, etc.

My musings on an assortment of things, ranging from politics, computer technology, and programming to sports.

Tuesday, February 19, 2008

On this day:

Whistleblower site forcibly taken offline in the US

Read and for more details. The Wikileaks site can still be accessed at and the page on the Swiss bank Julius Baer, which led to the legal action, and the corresponding legal notice can be found at

Thursday, March 22, 2007

Clusterer + other plugins

Lately I have been very busy with commercial work, so I have not been able to give much time to open source projects. Still, I managed to clean up the clusterer gem somewhat and have released version 0.1.9, which is actually a release candidate for 0.2.0. The gem has changed quite a lot since the initial release, and it is not backwards compatible. There are several new features, including Bayesian classifiers. The documentation is very sparse at the moment, but the examples and tests should help.

There is a new version of acts_as_classifiable. The old version had an incompatibility with Rails 1.2.*, which is fixed in the new release. The plugin now depends on the clusterer gem instead of the classifier gem.
The new plugin can be downloaded from Rubyforge:

There is also a new release of acts_as_clusterable, which can be downloaded from:

Please try out the gem and the plugins, and if you face any problems let me know.

Saturday, September 09, 2006

Changing and Rising China

My affair with China continues: another story, this one from the NYT.

Tuesday, September 05, 2006

Google Eavesdropping - the technology

Just a few days back it came to light that Google would be listening to noises in the room to find out what users are watching on TV and offer web-based supplemental information (see 'do no evil' or read Techcrunch or Slashdot).

For people who were at the Twelfth International World Wide Web Conference (WWW2003, Budapest, Hungary) this should come as no surprise. Sergey Brin co-authored a paper there titled Query-Free News Search.

The abstract of the paper clearly says: "in this paper we discuss finding news articles on the web that are relevant to news currently being broadcast." Broadcast here refers to TV broadcast. The authors discuss extracting "queries from the ongoing stream of closed captions" and, based on those, showing relevant pages and information.

Currently they are talking about audio. If one reads the very last paragraph of the paper, one clearly finds: "Finally, as voice recognition systems improve, the same kind of topic finding and query generation algorithms described in this paper could be applied to conversations, providing relevant information immediately upon demand."

So, have voice recognition systems improved so much that they can make sense of low-quality noise and pick out words from it? That would be magic.

The key thing is "listening on TV". If one carefully reads the paper that was the source of this story, one finds that they actually compare audio recorded from the user's machine against the audio of programs that were telecast (in the last X days, on various channels). Doing this fast is a challenge, but with the resources Google has at its disposal it is not very hard. From the matched shows they then pick up the captions and run the search in a fashion similar to the one described in Query-Free News Search.

Another aspect of this search algorithm is to maintain "terms from previous text segments to aid in generating a query for the current text segment, the notion being that the context leading up to the current text may contain terms that are still valuable in generating the query". This is similar in concept to what Google Personalized Search currently does.
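
To illustrate the idea of carrying terms over from earlier segments, here is a minimal Ruby sketch. It is not the algorithm from the paper: the QueryBuilder class, the decay factor, the tiny stopword list, and the "top N terms" rule are all assumptions made up for this example.

```ruby
# Hypothetical sketch: build a query for each caption segment while keeping
# (down-weighted) terms from the segments that came before it.
class QueryBuilder
  # Tiny illustrative stopword list (an assumption, not from the paper).
  STOPWORDS = %w[the a an of to in and is on for].freeze

  def initialize(decay: 0.5, size: 3)
    @decay = decay              # how quickly older terms lose weight
    @size = size                # number of terms per query
    @weights = Hash.new(0.0)    # term => current weight
  end

  # Feed one text segment and return the query terms for it.
  def add_segment(text)
    # Age the terms seen in earlier segments.
    @weights.transform_values! { |w| w * @decay }
    # Add the terms of the current segment at full weight.
    text.downcase.scan(/[a-z]+/).each do |term|
      @weights[term] += 1.0 unless STOPWORDS.include?(term)
    end
    # Highest weight first; ties are broken alphabetically.
    @weights.sort_by { |term, w| [-w, term] }.first(@size).map(&:first)
  end
end

builder = QueryBuilder.new
first_query  = builder.add_segment("The election results in Ohio")
second_query = builder.add_segment("Turnout in Ohio")
# first_query  is ["election", "ohio", "results"]
# second_query is ["ohio", "turnout", "election"]
```

In the second query "election" survives from the first segment even though the second segment never mentions it, which is the context effect the quote describes.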

If some people were wondering why they don't "train their system to listen in on everything that goes on at home": things still have some way to go before that is feasible.

Monday, September 04, 2006

Do No Evil

Do no evil. Hey, but I am writing on a site owned by Google. Read the comments, and also Slashdot.

Friday, August 25, 2006

India vis-a-vis China

An old but nicely summarized comparison, drawing on articles by Yasheng Huang, a business professor at MIT Sloan, and on his book.

Maybe someday when I am in the mood to write I will post my own rant on some of the problems which small startups face in India.

Thursday, August 24, 2006


This plugin makes it very easy to cluster ActiveRecord objects. It uses the clusterer Ruby gem.

Download and install the gem and its dependencies. Copy the plugin into vendor/plugins from the following svn repository

Add the following lines to your model:

acts_as_clusterable :fields => ['title','text']

If no fields are given then it will use all text and string fields present in the model.
Now, doing clustering is as easy as:


By default the number of clusters is Math.sqrt(number of objects). To get a custom number of clusters, pass that number to the method as an argument, i.e.,


For better results use hierarchical clustering; for faster results use k-means.

Update: New release Clusterer + other plugins

Tuesday, August 22, 2006

Ruby Clustering Library for Text Data

A few days back I came across the Carrot Clustering Framework, which inspired me to write something similar for Ruby. So I started this project, and so far I have implemented the basic K-Means and Hierarchical Clustering algorithms.

The first release can be downloaded from Rubyforge using the following command

gem install clusterer

The gem requires the stemmer gem as a dependency.

There are also two example files which show how to use the library by clustering search results returned by Yahoo and Google. To try an example, the corresponding API key is needed.

Basically, one passes an array of strings to the clustering algorithm, and it returns the indices of the elements, grouped by cluster.

Clusterer::Clustering.kmeans_clustering(["hello world","mea culpa","goodbye world"])
Clusterer::Clustering.hierarchical_clustering(["hello world","mea culpa","goodbye world"])

The result might be something like [[1,3],[2]].

The method signature for K-means is as follows

def kmeans_clustering(docs, k = nil, max_iter = 10, &similarity_function)

K-means is a simple hill-climbing algorithm. It can get stuck at a local maximum, but it is fast. The maximum number of iterations is there to ensure the algorithm does not get stuck in a state where it oscillates.

When k=nil the algorithm finds k = Math.sqrt(docs.size) clusters.
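
To make the roles of k and max_iter concrete, here is a minimal k-means sketch in plain Ruby. It is not the gem's implementation: the helper names (term_vector, cosine, centroid, kmeans_sketch), the bag-of-words vectors, cosine similarity, seeding from the first k documents, and rounding Math.sqrt to the nearest integer are all assumptions made for illustration. It also ignores the optional similarity block and returns zero-based indices.

```ruby
# Bag-of-words vector for a document: term => count.
def term_vector(doc)
  doc.downcase.scan(/\w+/).each_with_object(Hash.new(0)) { |t, v| v[t] += 1 }
end

# Cosine similarity between two sparse vectors.
def cosine(a, b)
  dot = a.sum { |t, w| w * b.fetch(t, 0) }
  norm = Math.sqrt(a.values.sum { |w| w * w }) * Math.sqrt(b.values.sum { |w| w * w })
  norm.zero? ? 0.0 : dot / norm
end

# Mean of a list of sparse vectors.
def centroid(vectors)
  total = Hash.new(0.0)
  vectors.each { |v| v.each { |t, w| total[t] += w } }
  total.transform_values { |w| w / vectors.size }
end

# Illustrative k-means: returns clusters as arrays of document indices.
def kmeans_sketch(docs, k = nil, max_iter = 10)
  vectors = docs.map { |d| term_vector(d) }
  k ||= Math.sqrt(docs.size).round          # assumed rounding rule
  k = [[k, 1].max, docs.size].min
  centroids = vectors.first(k)              # assumed seeding: first k documents
  assignment = []
  max_iter.times do
    # Assign every document to its most similar centroid.
    new_assignment = vectors.map do |v|
      (0...k).max_by { |c| cosine(v, centroids[c]) }
    end
    break if new_assignment == assignment   # converged, nothing moved
    assignment = new_assignment
    # Recompute each centroid from its members (keep the old one if empty).
    centroids = (0...k).map do |c|
      members = vectors.each_index.select { |i| assignment[i] == c }.map { |i| vectors[i] }
      members.empty? ? centroids[c] : centroid(members)
    end
  end
  (0...k).map { |c| (0...docs.size).select { |i| assignment[i] == c } }.reject(&:empty?)
end

clusters = kmeans_sketch(["hello world", "mea culpa", "goodbye world"])
# clusters is [[0, 2], [1]]
```

The loop stops either when no document changes cluster or when max_iter is reached, which is why the iteration cap guards against oscillation.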

def hierarchical_clustering(docs, k = nil, &similarity_function)

Hierarchical clustering gives much better results, but is comparatively slower when the data volume is high.
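
The following sketch shows where the extra cost comes from. It is a bare-bones agglomerative (bottom-up) clustering in plain Ruby, not the gem's code: the names jaccard and hierarchical_sketch, the Jaccard word-overlap similarity, average linkage, and the rounding of Math.sqrt are assumptions chosen for the example. Indices are zero-based.

```ruby
require 'set'

# Jaccard similarity between two sets of words (assumed similarity measure).
def jaccard(a, b)
  union = (a | b).size
  union.zero? ? 0.0 : (a & b).size.to_f / union
end

# Start with one cluster per document and repeatedly merge the two most
# similar clusters (average linkage) until only k clusters remain.
def hierarchical_sketch(docs, k = nil)
  words = docs.map { |d| d.downcase.scan(/\w+/).to_set }
  k ||= Math.sqrt(docs.size).round   # assumed rounding rule
  k = [k, 1].max
  clusters = (0...docs.size).map { |i| [i] }
  while clusters.size > k
    # Score every pair of clusters by the mean similarity of their members.
    a, b = clusters.combination(2).max_by do |x, y|
      x.product(y).sum { |i, j| jaccard(words[i], words[j]) } / (x.size * y.size)
    end
    clusters.delete(a)
    clusters.delete(b)
    clusters << (a + b).sort
  end
  clusters.sort
end

groups = hierarchical_sketch(["hello world", "mea culpa", "goodbye world"])
# groups is [[0, 2], [1]]
```

Every merge step compares all pairs of clusters, so the work grows much faster with the number of documents than a k-means pass does.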

If you are using this gem on a live, public-facing site, let me know; I would like to link to it.

Update: New release Clusterer + other plugins

Sunday, August 20, 2006

IE the Cancer

Internet Explorer is a cancer: it has extremely poor support for web standards. It is a nightmare for web developers and designers.

It doesn't seem that things are likely to improve much with the IE7 release.

As the above and this slashdot articles point out

Hopefully one day, most web users will get smarter and switch to Firefox or some other standards friendly browser.

Friday, August 04, 2006

Rails model - next and previous objects

It's been a long time since I posted anything, so I thought I would put something out.

In one of my Rails applications, while showing an object I need to show links to the previous and next objects. I think this is a common task in many Rails applications. The way I do it is to extend the ActiveRecord Base class so that all my models have these methods. There are numerous ways to include this; choose whichever suits your purpose or taste.

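module ActiveRecord #:nodoc:
  class Base
    # Id of the record with the next-higher id, or nil if this is the last one.
    def next_id
      @next ||= self.class.find(:first, :select => 'id', :conditions => ['id > ?', id], :order => 'id')
      @next ? @next.id : nil
    end

    # Id of the record with the next-lower id, or nil if this is the first one.
    def previous_id
      @previous ||= self.class.find(:first, :select => 'id', :conditions => ['id < ?', id], :order => 'id desc')
      @previous ? @previous.id : nil
    end
  end
end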

Call this file 'base_ext.rb' and put it in the 'lib' folder of your application.

Now, in your model files add the line

require 'base_ext'

at the top. This ensures that the file is loaded before your model class is declared.

Now, in your views you can use something like this:

<%= link_to '« Previous',{:id => @creation.previous_id},:class => "alignleft" if @creation.previous_id %>
<%= link_to 'Next »',{:id => @creation.next_id},:class => "alignright" if @creation.next_id %>
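
As one of the other ways to include this, the same lookup can be packaged as a mixin instead of reopening ActiveRecord::Base. The sketch below runs without Rails: the Neighbours module, the all_ids helper, and the in-memory Creation class are hypothetical stand-ins for a real model and its SQL queries. It also shows why the lookup searches for the nearest id rather than using id + 1 or id - 1: ids may have gaps after records are deleted.

```ruby
# Mixin version of the next/previous lookup. The including class must respond
# to `all_ids` (a hypothetical helper standing in for the SQL queries).
module Neighbours
  # Smallest id greater than ours, or nil if we are the last record.
  def next_id
    self.class.all_ids.select { |other| other > id }.min
  end

  # Largest id smaller than ours, or nil if we are the first record.
  def previous_id
    self.class.all_ids.select { |other| other < id }.max
  end
end

# In-memory stand-in for a model, only here so the sketch runs without Rails.
class Creation
  include Neighbours
  attr_reader :id

  @ids = []

  class << self
    def all_ids
      @ids
    end

    def create(id)
      @ids << id
      new(id)
    end
  end

  def initialize(id)
    @id = id
  end
end

first  = Creation.create(1)
middle = Creation.create(4)   # ids 2 and 3 are "deleted", leaving a gap
last   = Creation.create(9)
# middle.previous_id is 1, middle.next_id is 9, last.next_id is nil
```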