Wednesday, October 3, 2007

Buffalo Buffalo Bayes

The (somewhat) famous Buffalo sentence below seems to say something about frequency and meaning, I’m just not sure what:

Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo

The conditional probability of “buffalo” in the context of “buffalo” is exactly 1 (hey, I ain’t no math genius and I didn’t actually walk through Bayes’ theorem for this, so whaddoo I know; I’m just sayin’, it seems pretty obvious, even to The Lousy Linguist).

Also, there is no word-to-word conditional probability anywhere in the sentence that is not 1; so from where does structure emerge? Perhaps the (obvious) point is that a sentence like this could not be used to learn language. One needs to know the structures first in order to interpret. Regardless of your pet theory of learning, this sentence will crash your learner.
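A minimal sketch of the bigram arithmetic behind that claim (case-folded, pure Python; the variable names are mine, not anything standard):

```python
from collections import Counter

sentence = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo"
tokens = [w.lower() for w in sentence.split()]

# Count each adjacent word pair, and each word used as a left-hand context
# (the final token never serves as a context, hence tokens[:-1]).
bigrams = Counter(zip(tokens, tokens[1:]))
contexts = Counter(tokens[:-1])

# P(next | context) for every observed bigram
cond = {(c, n): count / contexts[c] for (c, n), count in bigrams.items()}
print(cond)  # → {('buffalo', 'buffalo'): 1.0}
```

Every bigram in the sentence is (“buffalo”, “buffalo”), so the only estimated conditional probability is 1.0; the distribution is maximally uninformative about the sentence's syntax.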

There are only two sets of cues that could help: orthographic and prosodic. There are three capitalized words, so that indicates some differentiation, but not enough by itself. A learner would have to have some suprasegmental prosodic information to help identify constituents. But how much would be enough?

Imagine a corpus of English sentences along these polysemic lines (with prosodic phrases annotated). Would prosodic phrase boundaries be enough for a learner to make some fair predictions about syntactic structure?

UPDATE (Nov 16, 2009): It only now occurs to me, years later, that the very first Buffalo has no preceding context "buffalo". Better late than never??
