Saturday, November 28, 2009

Google Linguistics 2

(screen shot from WebCorp)

I have posted before about the use of Google as a linguistics search engine here. Today, I ran across WebCorp Live, which allows a user to perform some linguistically interesting searches over the web as a corpus. From their site:

WebCorp LSE is a fully-tailored linguistic search engine to cache and process large sections of the web. WebCorp LSE offers:

* enhanced sentence boundary detection
* date identification
* 'boilerplate' removal
* collocation and other statistical analyses
* grammatical tagging
* language detection
* full pattern matching and wildcard search

In spirit, this is quite similar to Mark Davies' excellent BYU Corpus resources. If I get a chance to play with it some more, I might try running some of my old dissertation searches through it. That should be a good test.

UPDATE: See my original post, Google Linguistics, which talks more specifically about using Google for research.
