Friday, February 15, 2013

Crazy Question - The Primacy of Nouns or Verbs?

A question was posted recently in a Text Analytics group discussion on a well-known social networking site (sorry, not Facebook). I posted an answer and thought it was worth broadcasting to a wider audience, but I recognize the murky new technology ethics involved. It is a closed networking site, though it has lots of open access for non-members. I don't want to steal the thunder of the original poster, who asked the question in all earnestness. And I gave my full answer within the confines of that site, but it is MY answer, after all. I feel I own it. And it is a question-and-answer interaction that I think a lot of non-linguists might benefit from, and I have the means to distribute it beyond the semi-walled garden of the original site. Imma post this modified version* and let you, dear reader, decide the ethics for yourselves**.

Original Question
I have a crazy question: In language, which is formed first, a noun or a verb? I think it is a noun. We know 'Google' as a noun from when it started as a brand name. Now we use it as a verb. When a language evolves, naming should happen first, right?  Naming of actions and entities. That itself is noun. After that, we are defining different forms of verbs.
My Answer
As a linguist, I'd have to disambiguate the question before beginning an answer, because getting the question right is often the best starting point. There are (at least) three variants of the question, none of which has a simple answer:
  1. In the contemporary evolution of a new language (e.g., pidgins), what parts of speech (POS) form first?
  2. In the development of language in a child, which POS is learned/utilized first?
  3. In the brain, which POS is the base or most salient form of ambiguous words like "Google"? 
To begin an answer, it's important not to confuse the cognitive act of labeling events in the world (the gavagai problem) with labeling parts of speech (POS) like Noun/Verb/Adjective. POS are used by linguists to identify how words behave in the grammatical structure of a sentence. POS tell us what structural rules a word follows in a particular grammatical structure, not what a word refers to in the world. POS are syntactic objects, not semantic ones. Once this distinction is clear, the issues become plainer.

I’ll use the English “Google” example from the original question to illustrate (but let’s be aware that brand names like Google, Kleenex, and Xerox have their own weird, unique linguistic life, so this is not a perfect example).

When “Google” is used as a noun in English, it can function as the Subject of a sentence. For example, “Google rolled out a new service today.” As a noun, it can be counted and take plural morphology and count determiners like “one” and “two”. For example, “There are not two Googles, there is only one Google.” As a verb, it can take tense morphology like past tense -ed. For example, “I googled around for a new phone yesterday.”
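For readers who think in code, the syntactic tests above can be sketched as a toy classifier. This is only an illustration under loud assumptions: the function name `guess_pos`, the tiny determiner list, and the default-to-noun rule are all invented here, and real POS taggers rely on far richer context than two cues.

```python
# Toy illustration: guess NOUN vs. VERB from syntactic cues only,
# never from what the word refers to in the world.

# Count determiners taken from the examples above (a hypothetical, tiny list).
COUNT_DETERMINERS = {"one", "two"}


def guess_pos(tokens, index):
    """Guess 'NOUN' or 'VERB' for tokens[index] using local structural cues."""
    word = tokens[index].lower()
    prev = tokens[index - 1].lower() if index > 0 else None

    # Cue 1: a preceding count determiner ("two Googles", "one Google")
    # signals noun behavior.
    if prev in COUNT_DETERMINERS:
        return "NOUN"

    # Cue 2: past-tense -ed morphology ("googled") signals verb behavior.
    if word.endswith("ed"):
        return "VERB"

    # Simplifying assumption: with no other cue, treat the word as a noun
    # (e.g., the sentence-initial Subject in "Google rolled out ...").
    return "NOUN"


subject = "Google rolled out a new service today".split()
counted = "There are not two Googles".split()
tensed = "I googled around for a new phone yesterday".split()

print(guess_pos(subject, 0))  # "Google" as Subject
print(guess_pos(counted, 4))  # "Googles" after a count determiner
print(guess_pos(tensed, 1))   # "googled" with past-tense morphology
```

The point of the sketch is that every rule looks at structure (neighboring words, morphology) and none looks at meaning, which is the sense in which POS are syntactic objects.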

Contrast this purely syntactic analysis with how words are learned by children. When a child is at the one word stage, she may use a single word to label a whole series of events and objects (known as holophrasis). For example, imagine playing with a one year old by picking her up, twirling her around, then setting her down and she giggles. After you set her down, she holds her arms up to you and says “up.” What she wants is for you to go through the whole series of events again. She is not using that one word to refer to one discrete object in the world. She is referring to a holistic series of events.

In that situation, what POS is “up”? Is it a preposition? A verb? A noun? It’s none of the above. It is simply not appropriate in a linguistic sense to give the word “up” any POS under these circumstances because it is not functioning within the grammar of a structured sentence.

To return to the original question: what the questioner is calling naming is not the same thing as the POS “noun”. Labeling [events and objects in the world] and labeling POS are fundamentally different. There is a rough but buggy correlation between the two in some languages, but it's all very messy.

As to what comes first in the evolution of language, that’s a deeply complicated topic with no clear answers. We have little to no direct evidence for how languages evolved originally. This is not to say that there aren't some very smart theories. For a readable lay introduction, I recommend the book “Adam’s Tongue” by Derek Bickerton (I do not endorse the conclusions in that book, but I do recommend it as a good, readable intro to the issues of language evolution for the lay reader). For more detailed analysis and up-to-date discussions of language evolution I can highly recommend the group blog Replicated Typo.

That's my first-pass attempt at answering the question as originally posed. I'm happy to accept dissents, revisions, updates, addenda, rude noises, porn links, kitten pictures, and hot stock picks.

*I naturally re-worded and added clarification to my hastily composed original, so this version is its own unique product.
**I cleaned up the original question for clarity and brevity.


Matías Guzmán said...

It's an interesting question, although I'm not too sure we actually have POS tags in our brains.

Chris said...

Matías, I agree. The psychological reality of parts of speech is an ongoing and unsettled research question. One thing is for sure: the traditional set of 8 or 9 parts of speech in English is insufficient to account for the observed variety of word class behaviors in English (let alone other languages). From a computational perspective, the Penn Treebank required 36 parts of speech to annotate its data. Are those 36 psychologically real? No clue. Not sure anyone has ever tried to experimentally validate that set. It would be an interesting research goal.

Cory Cuthbertson said...

For the question posed from the point of view of language origins, it's logically impossible to have 'one' part of speech type. As POS is evidenced by how it's used in a sentence with other words, without that contrast you can't call a single existing category a 'noun' or 'verb' (unless you're using a solely semantic definition of POS, which is answering a different question entirely - namely what were the semantic categories of the first words - and that is a comment on function and not structure). I wrote an abstract on this and presented a poster, but have yet to expand it to a full paper...:

Chris said...

Cory, sorry for the delay in responding. Your abstract looks great. Have you read Bickerton's book? He makes the claim that what's crucial in the evolution of language is the rise of abstract thinking. Whatever else they are, POSs are abstract categories. A challenging topic, though. No real way to test.

Sumant S Kulkarni said...

Interesting post. However, I'm really not sure how we store these concepts in the brain. Do we observe things in terms of their parts of speech and store them so (even if there are 36 or more POS)? I feel we do not attach any part of speech when we observe things. We must have some other attributes which characterize each of the concepts. Maybe POS is an inference on these attributes.

Chris said...

How the brain interacts with POS is a deeply interesting and complex topic. I do know that there are phenomena like slips of the tongue where nouns replace nouns and verbs replace verbs, but nouns replacing verbs is rare, so POS (or grammatical structure of some sort) is clearly stored somehow.

Jim Mischler, Northwestern State University of Louisiana said...

Have you read Tom Givon's "The Genesis of Syntactic Complexity" or Heine and Kuteva's "The Genesis of Grammar: A Reconstruction"? If so, what do you think of their views?
