Tuesday, January 4, 2011

naive bayes knows restaurants better than 5,000 mechanical turks

Yelp recently sponsored a bake-off between a Naive Bayes classifier and workers on the crowd-sourcing site Mechanical Turk. The task was to classify web sites by business category (e.g., is it a restaurant or a doctor's office?). The classifier beat the Turkers handily:

Money quote:
In almost every case, the algorithm, which was trained on a pool of 12 million user-submitted Yelp reviews, correctly identified the category of a business a third more often than the humans. In the automotive category, the computer was twice as likely as the assembled masses to correctly identify a business.

There are a number of caveats (for one, why did 99% of the Turkers who applied for the task fail the basic test? ESL issues, perhaps?). Still, it's an interesting result.
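
For anyone who hasn't played with Naive Bayes, here is a rough sketch of the general idea: count the words seen in each category, then pick the category whose word counts best explain a new piece of text. This is not Yelp's code. The toy training examples, the function names (tokenize, train, classify), and the add-one smoothing are all my own assumptions for illustration, and a real system trained on millions of reviews would look quite different.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Crude whitespace tokenizer; an assumption made to keep the sketch short.
    return text.lower().split()


def train(examples):
    """Count documents and words per category from (text, category) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, category in examples:
        doc_counts[category] += 1
        for word in tokenize(text):
            word_counts[category][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the category with the highest log-probability score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, float("-inf")
    for category in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[category][word]
            # Add-one (Laplace) smoothing so unseen words do not zero out a category.
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical toy data, invented for this sketch (not from Yelp).
TOY_EXAMPLES = [
    ("great pasta and friendly waiters", "restaurant"),
    ("the burger menu is delicious", "restaurant"),
    ("oil change and brake repair done quickly", "automotive"),
    ("honest mechanic fixed my engine", "automotive"),
]

model = train(TOY_EXAMPLES)
print(classify("delicious pasta menu", model))
print(classify("brake and engine repair", model))
```

The "naive" part is the assumption that words are independent given the category, which is why the score is just a sum of per-word log probabilities.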

HT kdnuggets
