Monday, March 17, 2008

The Ling-O-Sphere Revisited

In December I posted about my wish for a linguistics blog aggregator that "automatically checks a given set of linguistics websites, then updates a topic cloud which clusters posts according to relevance for a particular topic" (see my full post and relevant comments here).

I see now that William Cohen at his Cranial Darwinism blog has recently posted two new academic papers on the automatic discovery of blog topics (aka, latent topic modeling) as well as automatic methods of modeling blog influence. Daume has posted on related topics in the past as well (see here for one relevant post).

Having skimmed the first paper a bit, I see lots of scary words and phrases like "Latent Dirichlet Allocation" and "probabilistic framework"; I'm neck deep in finishing my dissertation (or failing to finish it; I'll be able to distinguish the two in about 3 weeks), so my interest in struggling through challenging papers is low, but they look well worth the read ... someday ... sigh.

