Tuesday, September 05, 2006
Just a few days back it came to light that Google may listen to the audio in the room to find out what users are watching on TV and offer web-based supplemental information (see 'do no evil', or read Techcrunch or Slashdot).
For people who were at the Twelfth International World Wide Web Conference (WWW2003, Budapest, Hungary), this should come as no surprise. Sergey Brin had co-authored a paper there titled Query-Free News Search.
The abstract of the paper says it plainly: "in this paper we discuss finding news articles on the web that are relevant to news currently being broadcast." Broadcast here refers to TV broadcast. In the paper they discussed extracting "queries from the ongoing stream of closed captions" and, based on those queries, showing relevant pages and information.
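As a rough illustration of that closed-caption idea (my own toy sketch, not the paper's actual algorithm, and with a made-up IDF table standing in for real corpus statistics): score the terms of a caption segment with a tf-idf style weight and keep the top few as the query.

```python
from collections import Counter

# Toy IDF table. In a real system these weights would come from a large
# news corpus; every value here is invented for illustration.
IDF = {"hurricane": 3.2, "landfall": 4.1, "coast": 2.5,
       "the": 0.01, "is": 0.02, "expected": 1.1, "to": 0.01, "make": 0.9}

def caption_query(segment, top_n=3):
    """Extract a short search query from one closed-caption text segment
    by ranking its terms with a tf-idf style score."""
    terms = segment.lower().split()
    tf = Counter(terms)
    scored = {t: tf[t] * IDF.get(t, 1.0) for t in tf}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]

print(caption_query("the hurricane is expected to make landfall on the coast"))
# -> ['landfall', 'hurricane', 'coast']
```

Common words like "the" get tiny IDF weights, so the query ends up built from the topical terms of the segment.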
Currently they are talking about audio. If one reads the very last paragraph of the paper, one clearly finds: "Finally, as voice recognition systems improve, the same kind of topic finding and query generation algorithms described in this paper could be applied to conversations, providing relevant information immediately upon demand."
So, have voice recognition systems improved so much that they can make sense of low-quality audio and pick out words from it? That would be magic.
The key thing is "listening to TV". If one were to carefully read the paper, which was the source of this story, one would find that they actually compare the audio recorded from the user's machine to the audio of what was telecast (in the last X days, on various channels). Doing this fast is a challenge, but with the resources Google has at its disposal it is not very hard. Then, from the matched shows, they pick up the captions and do the search in a fashion similar to the one described in Query-Free News Search.
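The matching step can be pictured like this (purely illustrative; a real system would use robust acoustic fingerprints, and the channel names and sample data below are made up). Here a "fingerprint" is just the up/down pattern of frame energies, which tolerates volume differences but little else.

```python
# Sketch: match a recorded audio snippet against an index of fingerprints
# of recently broadcast audio, and return the best-matching show.

def fingerprint(samples, frame=4):
    """Reduce raw samples to a coarse, volume-insensitive bit pattern:
    1 where frame energy rises, 0 where it falls or stays flat."""
    energies = [sum(abs(s) for s in samples[i:i + frame])
                for i in range(0, len(samples) - frame + 1, frame)]
    return tuple(1 if b > a else 0 for a, b in zip(energies, energies[1:]))

def best_match(snippet, broadcast_index):
    """Pick the indexed broadcast whose fingerprint agrees with the
    snippet's fingerprint in the most positions."""
    fp = fingerprint(snippet)
    def score(item):
        _, candidate_fp = item
        return sum(a == b for a, b in zip(fp, candidate_fp))
    return max(broadcast_index.items(), key=score)[0]

# Hypothetical index built from the last few days of broadcasts.
index = {
    "channel-5-news": fingerprint([0, 3, 9, 2, 1, 8, 7, 2, 0, 5, 6, 1]),
    "channel-7-show": fingerprint([9, 9, 1, 1, 9, 9, 1, 1, 9, 9, 1, 1]),
}
# A noisy, rescaled recording of the channel-5 audio still matches it.
recording = [0, 2, 10, 3, 1, 9, 6, 3, 1, 4, 7, 2]
print(best_match(recording, index))
# -> channel-5-news
```

The point of the sketch is only the architecture: fingerprint the snippet, fingerprint everything recently broadcast, and look up the nearest match; the fast nearest-neighbour lookup at Google's scale is the hard engineering part.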
Another aspect of this search algorithm is that it maintains "terms from previous text segments to aid in generating a query for the current text segment, the notion being that the context leading up to the current text may contain terms that are still valuable in generating the query". This is a concept similar to what is currently used in Google Personalized Search.
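One way to picture that context-carrying idea (again my own toy version, with an arbitrary decay factor, not the paper's exact weighting): terms from earlier caption segments keep a decayed weight, so the query for the current segment can still include topical words that just scrolled by.

```python
# Sketch: build a query for the latest caption segment while letting
# terms from earlier segments contribute with decayed weight.
# The decay factor 0.5 is an arbitrary choice for illustration.

def build_query(segments, decay=0.5, top_n=3):
    weights = {}
    for segment in segments:
        # Age the existing terms before adding the new segment's terms.
        weights = {t: w * decay for t, w in weights.items()}
        for term in segment.lower().split():
            weights[term] = weights.get(term, 0.0) + 1.0
    return sorted(weights, key=weights.get, reverse=True)[:top_n]

segments = ["volcano erupts in iceland",
            "ash cloud grounds flights",
            "airlines cancel flights again"]
print(build_query(segments))
```

Here "flights" tops the query because it appears in the current segment and also carries leftover weight from the previous one.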
If some people were wondering why they don't just "train their system to listen in on everything that goes on at home", things still have some way to go before that is feasible.