Thursday, November 26, 2009

Gee Whiz, Alien Language

(image of USC professor Paul R. Frommer from LA Times)

There are certain topics in linguistics that are far more interesting to non-linguists than to linguists themselves. Animal language is a classic example, as is language evolution. And third on the list is alien languages from movies (as opposed to Kirby's artificial languages). For example, for decades now people have been fascinated by Marc Okrand's Klingon (this guy took it a little too far though; isn't this child abuse?).

When people hear that someone has "invented a language," they seem shocked, shocked! to discover that such a thing occurred. As if it's a difficult feat. There seems to be a gee-whiz factor. In fact, the average second-year grad student in linguistics can do it, and typically they do, just for fun. Logicians are required to do it. Here, let's make up a language right now:

Language X

bbhl = /bel/, intransitive verb, 'to run', (actor)
hhli = /hla:/, transitive verb, 'to hit', (undergoer, actor)
ttrsh = /dos/, proper noun, 'Wally'
pploi = /pli/, proper noun, 'Sparky'
8_9 = /ha_mu/, particle, simple past

S --> V + N
S --> V + N + N
V --> prt+V+prt

There. Done. I just invented language X and it took all of 20 minutes. Now, which of the following sentences are grammatical in language X, and what do they mean? Which rules do the ungrammatical sentences break?
  1. bbhl ttrsh
  2. ttrsh bbhl
  3. 8hhli9 pploi ttrsh
  4. ttrsh pploi
  5. 8hhli9 ttrsh pploi
  6. hhli9 ttrsh pploi
Answers below.

The latest variation of this hoopla comes to us from James Cameron's new big-budget movie Avatar. Cameron recruited a linguist from USC, Paul Frommer, to create a language for his goofy blue aliens. But an article about this from the LA Times involved a bit of an exaggeration: "USC professor creates an entire alien language for 'Avatar'" (my emphasis).

Wow! An entire language, you say? That's gotta be at least 30 or 40 thousand words and at least a couple thousand rules, right? Nope. In fact, the language only contains about 1,000 words. From the article itself: "Between the scripts for the film and the video game, Frommer has a bit more than 1,000 words in the Na'vi language, as well as all the rules and structure of the language itself." It seems a tad redundant to say "rules and structure" of a language, but that's neither here nor there. As far as I can tell (after just a little bit of Googling) the Na'vi language has not been released, so it's not possible to follow up on just how extensive this language is beyond the word count reported in the article. I'm sure a grammar is on the way. Sci-fi fans are notoriously detail oriented.

But it brings up a more serious issue: what counts as a language? Language X above certainly counts as a language in the simple sense of having a lexicon and a set of rules for combining its items. Heck, I even threw in some phonetics. If we want to claim that language X is not an entire language, we're gonna have to come up with some guidelines for what counts as an entire language. The logicians have their rules for formal languages, of course, but we need some natural human language guidelines. I'm sure the pidgin/creole experts have thoughts on this, and it is one of the things that pidgin & creole expert Derek Bickerton ruminates on in his book Adam's Tongue. See my reviews here. He's concerned with what proto-language must have looked like when humans first used language.

Now, I do not mean to belittle Professor Frommer's accomplishment. I can certainly imagine spending a lot of time and energy on creating a language. But it's not rocket science. It's closer to knitting.

  1. bbhl ttrsh = 'Wally runs'
  2. *ttrsh bbhl -- bad because all sentences in X begin with a verb
  3. 8hhli9 pploi ttrsh = 'Wally hit Sparky'
  4. *ttrsh pploi -- bad because all sentences in X must have a verb
  5. 8hhli9 ttrsh pploi = 'Sparky hit Wally'
  6. *hhli9 ttrsh pploi -- bad because past tense morpheme is not properly realized
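
For the programmers in the audience, here is a rough Python sketch of a recognizer for language X. It is only one reading of the rules above, and it makes a few assumptions of my own: the number of nouns has to match the verb's argument structure, the 8_9 particle is treated as an optional circumfix that needs both halves, and the names VERBS, NOUNS, parse_verb, analyze, and the "unmarked" tense label are all made up for the illustration.

```python
# Toy recognizer for language X. A sketch under the assumptions stated above,
# not a general-purpose parser.

# Verb stems mapped to (gloss, argument roles in surface order).
VERBS = {
    "bbhl": ("run", ("actor",)),
    "hhli": ("hit", ("undergoer", "actor")),
}

# Proper nouns mapped to their glosses.
NOUNS = {"ttrsh": "Wally", "pploi": "Sparky"}


def parse_verb(token):
    """Return (stem, is_past), or None if the token is not a well-formed verb."""
    if token in VERBS:
        return token, False
    # Simple past is the circumfix 8_9: both halves must be present.
    if token.startswith("8") and token.endswith("9") and token[1:-1] in VERBS:
        return token[1:-1], True
    return None


def analyze(sentence):
    """Return a dict of verb, tense, and roles, or None if ungrammatical."""
    tokens = sentence.split()
    if not tokens:
        return None
    # S --> V + N and S --> V + N + N: every sentence starts with a verb.
    verb = parse_verb(tokens[0])
    if verb is None:
        return None
    stem, is_past = verb
    gloss, roles = VERBS[stem]
    args = tokens[1:]
    # Assumption: the noun count must match the verb's argument structure.
    if len(args) != len(roles) or any(arg not in NOUNS for arg in args):
        return None
    analysis = {"verb": gloss, "tense": "past" if is_past else "unmarked"}
    for role, arg in zip(roles, args):
        analysis[role] = NOUNS[arg]
    return analysis


if __name__ == "__main__":
    quiz = [
        "bbhl ttrsh",
        "ttrsh bbhl",
        "8hhli9 pploi ttrsh",
        "ttrsh pploi",
        "8hhli9 ttrsh pploi",
        "hhli9 ttrsh pploi",
    ]
    for number, sentence in enumerate(quiz, start=1):
        print(number, sentence, analyze(sentence))
```

Running the script prints an analysis for sentences 1, 3, and 5 and None for the three starred ones. The whole thing fits on a screen, which is sort of the point.
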

UPDATE: Cute HTML note. My original argument structure definitions used angle brackets, and I only just now realized they didn't show up in the post because, of course, those are interpreted as HTML tags. So I used parens.

UPDATE 2: a commenter points out a more complete interview with Frommer here.

UPDATE 3: I scooped Ben Zimmer on this one (HT Language Hat), another LL scoop for me.

UPDATE 4: Ben Zimmer has posted a guest post by Frommer in which he gives a brief description of the language here.


cjrecord said...

"Q5. 8hhli9 ttrsh pploi"
"A5. hhli ttrsh pploi = 'Sparky hits Wally'"

Think you've got a transcription error, there. The answer should be past tense, no?

Chris said...

oops, yep. Nice catch. Thanks!

U.S.O. Project said...

U.S.O. Project meets Paul Frommer, Alien Language Creator for Avatar.
