(a pod of whales from About.com) [reposted from last year with update]
[UPDATE: kottke points to the same blog with added pics here.]
10 years ago, when I was teaching English in China, I was surprised by how interested my students were in learning about phrases like "a pod of whales," "a cup of coffee," and "a pride of lions." When I mentioned a phrase like this, they would perk up immediately (difficult to do in the oppressive Guangzhou 广州 summer heat). This was a year before I began studying linguistics proper so I had no clue what a collective noun was, nor did I know what a classifier was, nor did I know that Chinese languages like Mandarin and Cantonese have elaborate systems of nominal classifiers (this Wiki page is a good primer). I just thought it was a cute diversion to talk about at the end of an evening's class.
It turns out that collective nouns have very interesting properties which linguists love to obsess over (I regret I do not have access to a copy of The Cambridge Grammar of the English Language because I suspect Huddleston and Pullum have some fascinating points).
Now, via kottke, I discovered a blog called All Sorts dedicated to culling collective nouns from Twitter feeds. It relies on a little NLP and some crowdsourcing. It appears to be restricted to the syntactic construction "a X of Y". Since it relies so heavily on syntax, it gathers examples that are weak at best. For example, in what way are the following collective nouns?
a conspiracy of theorists
a tantrum of 2 year olds
a pratfall of clowns
My first-pass reading of those three phrases is not as collective nouns, but rather as periphrastic genitives (e.g., "a mayor of Buffalo once said..."). The "a X of Y" syntax is, by itself, ambiguous between the periphrastic genitive and collective noun constructions (as well as simple PP attachment like "a webcomic of romance"). Do people prefer the use of "a X of Y" for one of these constructions? I suspect any preference would be based on the semantic features of the nouns involved (once you read the word "group", you pretty much know you've got a collective noun on your hands).
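To make the ambiguity concrete, here is a minimal sketch of a purely syntactic "a X of Y" matcher. I have no idea how All Sorts actually does its matching; the regular expression and the function name find_a_x_of_y are my own assumptions, for illustration only.

```python
import re

# Hypothetical surface pattern (not All Sorts' actual method):
# "a" or "an", one word (X), "of", then one word (Y).
A_X_OF_Y = re.compile(r"\ban?\s+(\w+)\s+of\s+([\w']+)", re.IGNORECASE)

def find_a_x_of_y(text):
    """Return an (X, Y) tuple for every surface match of 'a/an X of Y'."""
    return A_X_OF_Y.findall(text)

examples = [
    "a pod of whales",               # collective noun
    "a mayor of Buffalo once said",  # periphrastic genitive
    "a webcomic of romance",         # simple PP attachment
]
for phrase in examples:
    print(phrase, "->", find_a_x_of_y(phrase))
```

All three phrases match, which is the point: the pattern alone can't tell the constructions apart, so anything that separates them has to come from the nouns themselves.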
I wonder if anyone has done online reading tasks with subjects reading the two kinds of phrases and experimenting with different features to see what cues one reading over another. Imagine creating a set of stimuli containing sentence frames that could take either a collective noun or a periphrastic genitive and alternating each, controlling for features like animacy.
I'll take a crack at one such frame. My goal is to create sentence pairs involving minimal pairs of "a X of Y" constructions which differ only in the Y noun and where the first is a collective noun while the second is a periphrastic genitive. This relies critically on finding an X word that can be a collective noun like "group" as well as a possessive. Hmmmmmm, this ain't gonna be easy....
a. That cup of coffee that I broke has been cleaned up.
b. That cup of John's that I broke has been cleaned up.
My original hypothesis was that people will delay on (b) [meaning, their reading of the following region will be slower than (a)]. But I dunno, because some will be confused at "broke" in (a) as well. Part of this will depend on where the delay occurs.
Now, you go write up 100 of these pairs, norm them for acceptability, set up a moving-window reading test in E-Prime, run at least 30 subjects, then call me when you got results. I've done my part.
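If you want a head start, here is a rough sketch of the stimulus side in plain Python. This is not E-Prime and it collects no reading times; it only fills one sentence frame with the two Y nouns and builds the word-by-word displays a moving-window task would step through. The names make_pair and moving_window are hypothetical helpers, and the frame is just the (a)/(b) pair from above.

```python
def make_pair(frame, collective_y, genitive_y):
    """Fill one sentence frame with each Y noun to get a minimal pair."""
    return frame.format(Y=collective_y), frame.format(Y=genitive_y)

def moving_window(sentence):
    """Return one display per word: current word visible, all others dashed out."""
    words = sentence.split()
    displays = []
    for i in range(len(words)):
        shown = [w if j == i else "-" * len(w) for j, w in enumerate(words)]
        displays.append(" ".join(shown))
    return displays

# The frame from sentences (a) and (b); {Y} marks the slot that varies.
frame = "That cup of {Y} that I broke has been cleaned up."
sentence_a, sentence_b = make_pair(frame, "coffee", "John's")

# Show the first few displays a subject would page through for (a).
for display in moving_window(sentence_a)[:4]:
    print(display)
```

Dashing out each word at its own length keeps the sentence layout visible while hiding the content, which is the usual reason for choosing a moving window over a centered word-by-word display. A real experiment would record a keypress time per display and compare the region right after the Y noun.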