I spent a good deal of Sunday afternoon trolling around linguistics blogs. While there are dozens of linguists with blogs, it’s hard to keep track of them all. The Linguist List has a modest static list here. When I scan the blog roll at Language Log, it’s not even clear which ones are dedicated primarily to linguistics, since many of the blog names are intentionally obscure. Also, many are defunct or stale, as wishydig recently noted. I found a couple which had no posts in 2 years, and many with none for months. (UPDATE: while doing something else mildly productive, I literally clicked on EVERY single blog listed in Language Log's blog roll. If you deleted each one that was either dormant for at least 6 months or had little linguistics content, you’d delete at least 70%).
I put the term “linguistics” into each of the three major social bookmarking sites above and frankly, the results were far from encouraging. Even though Technorati has a “blogs” tab, the first page of hits did not really turn up linguistics blogs, as far as I could tell (the second page was more relevant). The Digg results were disappointing, to say the least: one reference to a Chomsky interview and one to a study on swearing, but again, none of the top hits appeared to be from blogs I would consider “linguistic blogs” (e.g., none are on the Language Log Other language blogs list). The del.icio.us results at least put Language Log on top, but most of the first-page results were resource pages for computational linguistics, not blogs per se.
Imagine a site which automatically checks a given set of linguistics websites, then updates a topic cloud which clusters posts according to relevance for a particular topic, with links to each post within the cloud, plus a blog roll of all participating blogs on the right margin. I could imagine this happening in one of two ways (I prefer the first, but it's computationally complicated):
1) Search the participating blogs and perform some sort of cluster analysis of the words in each post, taking all the posts together as a corpus (perhaps an LSA-style analysis), then create the cloud.
2) Create a fixed set of topic key words, and search for semantically similar words in each post. I could imagine WordNet being used for this.
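
To make the two options a bit more concrete, here is a rough Python sketch of what each might look like on a toy scale. Everything in it is an assumption for illustration: the sample posts, the function names, and the topic table are all made up. For option 1, it uses plain TF-IDF weighting with cosine similarity, which is a simpler stand-in and not LSA proper (LSA would add a dimensionality-reduction step on top). For option 2, a small hand-written table of related words stands in for the WordNet lookups.

```python
import math
import re
from collections import Counter

# Hypothetical sample posts; a real version would pull text from each blog's feed.
POSTS = {
    "post_a": "Chomsky on syntax and generative grammar",
    "post_b": "Generative grammar and syntax trees revisited",
    "post_c": "Vowel shifts and phonetics in northern dialects",
}

# Hypothetical hand-written table of related words, standing in for WordNet (option 2).
TOPICS = {
    "syntax": {"syntax", "grammar", "generative", "parsing"},
    "phonetics": {"phonetics", "vowel", "consonant", "phonology"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(posts):
    """Option 1 stand-in: weight each word by term frequency times inverse document frequency."""
    counts = {name: Counter(tokenize(text)) for name, text in posts.items()}
    doc_freq = Counter(word for c in counts.values() for word in c)
    n = len(posts)
    return {
        name: {w: tf * math.log(n / doc_freq[w]) for w, tf in c.items()}
        for name, c in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def topic_cloud(posts, topics):
    """Option 2: file each post under every topic whose word set it overlaps."""
    cloud = {topic: [] for topic in topics}
    for name, text in posts.items():
        words = set(tokenize(text))
        for topic, related in topics.items():
            if words & related:
                cloud[topic].append(name)
    return cloud


if __name__ == "__main__":
    vectors = tfidf_vectors(POSTS)
    print(cosine(vectors["post_a"], vectors["post_b"]))
    print(topic_cloud(POSTS, TOPICS))
```

In the first approach, posts whose similarity scores are high would be grouped together in the cloud without anyone deciding the topics in advance. In the second, the topics are fixed up front and the quality of the cloud depends entirely on how good the related-word sets are, which is where something like WordNet would come in.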