Thursday, December 9, 2010
a brief history of stanford linguistics dissertations
The above image comes from the Stanford Dissertation Browser and is centered on Linguistics. The tool performs a kind of textual analysis of Stanford dissertations: every dissertation is treated as a weighted mixture of unigram language models, one for each Stanford department. This lets us infer that, say, dissertation X is 60% computer science, 20% physics, and so on. Essentially, the visualization shows word overlap between departments, measured by letting the dissertations in one department borrow words from another department.
Thus, the image above suggests that Linguistics borrows more words from Computer Science, Education, and Psychology than it does from other disciplines. What was most interesting was using the Back button to create a moving picture of dissertation language over the last 15 years. You'll see a lot of bouncing back and forth. Stats makes a couple of jumps here and there.
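For the curious, here is a rough sketch of what "a weighted mixture of unigram language models" could look like in code. To be clear, this is not the Dissertation Browser's actual implementation: the toy corpora, the add-one smoothing, the EM fitting loop, and the function names `unigram_model` and `mixture_weights` are all my own assumptions, chosen only to illustrate the idea. Each department gets a word-frequency model, and a dissertation's mixture weights say how much of its vocabulary is best explained by each department.

```python
from collections import Counter

def unigram_model(docs, vocab, alpha=1.0):
    """Smoothed word probabilities for one department (add-alpha smoothing, an assumption)."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values()) + alpha * len(vocab)
    return {word: (counts[word] + alpha) / total for word in vocab}

def mixture_weights(doc, models, iters=50):
    """Estimate per-department weights for one document with a simple EM loop."""
    if not doc:
        raise ValueError("document must contain at least one word")
    depts = list(models)
    weights = {d: 1.0 / len(depts) for d in depts}  # start from a uniform mixture
    for _ in range(iters):
        resp = {d: 0.0 for d in depts}
        for word in doc:
            # E-step: share of this word attributed to each department
            scores = {d: weights[d] * models[d][word] for d in depts}
            z = sum(scores.values())
            for d in depts:
                resp[d] += scores[d] / z
        # M-step: new weight = average share over all words in the document
        weights = {d: resp[d] / len(doc) for d in depts}
    return weights

# Hypothetical toy data, not real dissertation text.
corpora = {
    "linguistics": [["syntax", "phoneme", "grammar", "corpus"],
                    ["morphology", "syntax", "grammar"]],
    "computer science": [["algorithm", "parser", "corpus", "model"],
                         ["algorithm", "model", "compiler"]],
    "physics": [["quantum", "particle", "energy"],
                ["energy", "field", "particle"]],
}
dissertation = ["syntax", "grammar", "parser", "corpus", "model", "grammar"]

# Shared vocabulary, so every model assigns nonzero probability to every word.
vocab = {w for docs in corpora.values() for doc in docs for w in doc} | set(dissertation)
models = {dept: unigram_model(docs, vocab) for dept, docs in corpora.items()}
weights = mixture_weights(dissertation, models)

for dept, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{dept}: {weight:.0%}")
```

On this toy example, the made-up dissertation comes out mostly linguistics with a sizable computer science share and almost no physics, which is the kind of "borrowing" the visualization draws as distance between departments.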
HT Razib Khan