Thursday, July 7, 2011

more on auto metaphor recognition methods

A quick follow-up to my previous post on automatic metaphor recognition wrt the IARPA Metaphor Program. The paper Automatic Metaphor Recognition Based on Semantic Relation Patterns by Tang et al. challenges the dominant Selectional Preferences method by substituting their own Semantic Relation Patterns. They point out the problems with Selectional Preferences (unfortunately, I don't think they solved those problems with their own method; more on that in a bit).

Again I'll give the Ling 101, computational linguistics for dummies version (as I understand it ...): Selectional Preferences assumes that words frequently co-occur with other words that are literally associated with the same semantic domain. For example,
  1. That ship has sailed the mighty ocean.
  2. That boat has sailed across Lake Erie.
  3. That captain has sailed many seas.
In these three sentences, the verb sailed occurs with three different subjects (ship, boat, captain) and three different objects (ocean, lake, seas), but all of them evoke the SAILING domain. So a computer could use this info to create a model of the verb sail that would match up the semantics of its expected subjects and objects, then compare them to a new sentence. If the computer encountered the new sentence

    4. That student sailed through final exams.

it could automatically use the model created from sentences 1-3 above to notice that the verb sailed occurs with a subject and object not from the SAILING domain, but rather from the STUDENT domain. Then it could use a metaphor mapping component to recognize that HUMANS as MACHINES is an acceptable mapping and thus conclude that #4 might be coherent under a metaphorical interpretation.
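
Here's a toy sketch of that idea in Python, just to make the logic concrete. Everything in it is my own assumption for illustration: the hand-built DOMAIN lexicon, the (subject, verb, object) triples standing in for parsed sentences, and the function names. A real system would learn preferences from a large parsed corpus, and this sketch skips the metaphor mapping step entirely.

```python
from collections import Counter

# Hypothetical hand-built lexicon mapping nouns to semantic domains.
# A real system would derive this from a corpus or a lexical resource.
DOMAIN = {
    "ship": "SAILING", "boat": "SAILING", "captain": "SAILING",
    "ocean": "SAILING", "lake": "SAILING", "seas": "SAILING",
    "student": "STUDENT", "exams": "STUDENT",
}

# (subject, verb, object) triples standing in for parsed sentences 1-3.
TRAINING = [
    ("ship", "sail", "ocean"),
    ("boat", "sail", "lake"),
    ("captain", "sail", "seas"),
]


def build_preferences(triples):
    """Count which semantic domains each verb's arguments come from."""
    prefs = {}
    for subj, verb, obj in triples:
        counts = prefs.setdefault(verb, Counter())
        counts[DOMAIN[subj]] += 1
        counts[DOMAIN[obj]] += 1
    return prefs


def is_metaphor_candidate(prefs, triple):
    """Flag a triple whose arguments fall outside the verb's preferred domain."""
    subj, verb, obj = triple
    expected_domain = prefs[verb].most_common(1)[0][0]
    return DOMAIN[subj] != expected_domain or DOMAIN[obj] != expected_domain


prefs = build_preferences(TRAINING)
print(is_metaphor_candidate(prefs, ("student", "sail", "exams")))  # sentence 4
print(is_metaphor_candidate(prefs, ("ship", "sail", "ocean")))     # sentence 1
```

Sentence 4 gets flagged because neither student nor exams belongs to the domain that sail prefers, while sentence 1 passes as literal.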

Tang et al. rightly point out that matching frequency-based selectional preferences is not the same thing as identifying literal meaning. First, they note that sometimes a metaphorical pairing is actually MORE FREQUENT than a literal pairing. They use some Chinese examples, but I think the English translation makes the point. Take the following two uses of close:
  • The plane is close to the tower.
  • Opinions are close.
In their corpus, Chinese uses like 'opinions are close' were more frequent, even though this is a non-literal use of close. Frequency would lead the Selectional Preferences method to believe that the opinions-type use is literal simply because it is more frequent. This outcome is predicted by Lakoff & Johnson, btw, because one of the core tenets of their seminal work on metaphors was that metaphors are NOT special uses of language, but rather quite common and normal.
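
To see why raw frequency misleads, here's a minimal sketch. The counts are invented by me purely for illustration (they are not figures from Tang et al.), and frequency_says_literal is a hypothetical stand-in for any decision rule that treats the most common pairing as the literal one.

```python
from collections import Counter

# Invented (subject, predicate) counts for illustration only.
# They are NOT taken from Tang et al.'s corpus.
observed = Counter({
    ("opinions", "close"): 8,  # non-literal use, but more frequent
    ("plane", "close"): 2,     # literal, spatial use
})


def frequency_says_literal(pair, counts, threshold=0.5):
    """Naive rule: a pairing is 'literal' if it makes up at least
    `threshold` of all observed uses of that predicate."""
    predicate = pair[1]
    total = sum(n for (_, pred), n in counts.items() if pred == predicate)
    return counts[pair] / total >= threshold


print(frequency_says_literal(("opinions", "close"), observed))
print(frequency_says_literal(("plane", "close"), observed))
```

With these made-up counts, the rule labels the opinions use as literal and the plane use as the odd one out, which is exactly backwards.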

Tang et al.'s solution is a new method they call Semantic Relation Patterns. Their explanation is brief and highly technical, making it a slog to get through, but it hinges on incorporating an existing semantic relations knowledge base, HowNet, and adding a probabilistic model. Note, I had trouble getting the HowNet website to load, but here is a PDF explanation.

HowNet is an on-line common-sense knowledge base unveiling inter-conceptual relations and inter-attribute relations of concepts as connoting in Chinese and English bilingual lexicons.

In my quick read, the two methods differed only minimally in the crucial ways (namely, they are both lexicalist and local). Semantic Relation Patterns are still based on lexical semantics and still derived entirely locally. I don't see how SRP would handle this metaphor from my earlier post any better than SP:

Imagine a situation in a biology class where two students, Alger and Miriam, were originally going to be partners for a lab assignment. Then they got into an argument. A third student, Annette, asks Miriam:
  • Annette: Are you still going to be lab partners with Alger?
  • Miriam: No. That ship has sailed.
In this scenario, the sentence "That ship has sailed" is entirely coherent and literal from a selectional preferences perspective (i.e., ships really do sail). Yet it is clearly being used metaphorically (there is literally no ship). Here, the metaphor is only detectable if we link two sentences together via co-reference. The phrase "that ship" does not co-refer to a real ship in the discourse. Rather, it refers to the possible event of be-lab-partners-with-Alger. Unless we can link phrases between sentences and between types (i.e., allowing an NP to co-refer to an event), we are not going to get a computer to recognize these types of metaphors (which I suspect are quite common).

I appreciate Tang et al.'s critique of the SP method and their attempt to get beyond it, but I think their methodology fails to make the improvements to automatic metaphor recognition that will be crucial to creating a full-scale tool that handles real-world metaphor.


Xuri Tang, Weiguang Qu, Xiaohe Chen, & Shiwen Yu (2010). Automatic Metaphor Recognition Based on Semantic Relation Patterns. International Conference on Asian Language Processing.

3 comments:

ianbellamy said...

Interesting. On first look, it seems as if the authors are really focusing on the location aspect of semantics, especially in the given example. More to come.

outerhoard said...

What does "opinion(s) are close" mean?

I would understand, "Your opinions are close to mine", but "opinions are close" as an isolated phrase is meaningless as far as I can tell.

(Also, I've always wanted to ask: why are your blog posts always repeated in the comments section? Looks like some kind of add-on bug, but it's been that way for years and I've never seen another blogger affected by it.)

Chris said...

@outerhoard: agreed that the generic use of "opinions" out of context is odd (keep in mind that the original was Chinese). I think we must assume there is some known group of people which "opinions" refers to. Like saying "the opinions of top lawmakers are close...";

Also agreed that repeating the post is awkward. There must be a setting I can change somewhere. I'll root around.

@ianbellamy: not sure what you mean by "location aspect of semantics" but I'm curious...
