The purpose of the paper was to run a sort of bake-off between three methods of compiling source/target word lists (to be used by a selectional-preference metaphor recognition system): a) a word association experiment, b) a dictionary of synonyms, and c) a reference corpus.
Ultimately they found that their corpus-based method was the most successful as measured by recall/precision, but there was a more striking result, rather buried in the paper, that I feel deserves more analysis. They created a gold standard by hand-tagging a 30,000-word "baseline" corpus. Here's what they found:
> At the first attempt, inter-annotator agreement was only 17%. After refining the annotation instructions, we made a second attempt, which resulted in an agreement level of 48%, which is still a strikingly low value. These results indicate that the definition of "metaphoricity" is problematic in itself [emphasis added].
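The reported figures appear to be raw percent agreement. If so, the picture is arguably even worse than it looks: with a binary metaphorical/literal tag, two annotators agree a good portion of the time by pure chance, which is why chance-corrected measures such as Cohen's kappa are the usual standard. A minimal sketch of the difference (the annotator tags below are invented for illustration; the paper's actual annotation data isn't available here):

```python
# Raw percent agreement vs. Cohen's kappa for two annotators, binary labels.
# Annotator tag sequences are hypothetical, for illustration only.
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items the two annotators labeled identically."""
    assert len(a) == len(b)
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Chance-corrected agreement: (observed - expected) / (1 - expected)."""
    po = percent_agreement(a, b)
    n = len(a)
    ca, cb = Counter(a), Counter(b)
    # Expected agreement if each annotator assigned labels at random
    # according to their own observed label frequencies.
    pe = sum((ca[label] / n) * (cb[label] / n) for label in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical tags: 1 = metaphorical, 0 = literal
ann1 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
ann2 = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
print(percent_agreement(ann1, ann2))  # 0.7
print(cohens_kappa(ann1, ann2))       # noticeably lower than raw agreement
```

In this toy example a raw agreement of 70% corresponds to a kappa of only 0.4, because half of that agreement is expected by chance. By that logic, the paper's 17% and 48% raw figures would translate into kappa values near or below zero.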
They reported three general sources of inter-annotator DISagreement:
- Direct vs. Indirect Reference: For example, in the case of the conceptual metaphors ANGER IS HEAT or CONFLICT IS FIRE, the source domain should be an expression referring to a sort of "heated thing". However, in some cases, one or the other annotator included words indirectly suggesting the presence of heat, such as kiolt ('extinguish'), kihűl ('get cold') etc.
- Lexical Ambiguity: For example, the expression eljutottam a mai napig ('I've gotten to this day') may or may not represent a CHANGE IS MOTION metaphor depending on whether the Hungarian verb jut (literally: get somewhere, reach a place by moving the entire body) is taken only to denote physical movement or to be ambiguous.
- Discrepancies in Classification: ...it is difficult to make an informed decision on whether the following example contains a CHANGE IS MOTION or a PROGRESS IS MOTION FORWARD metaphor, neither of which appears to be an intuitively correct choice: a járvány végigsöpört szülővárosukon ('the epidemic swept through their hometown').
Of the four or five articles I've reviewed on automatic metaphor identification, this is the only one which reported on the results of human-tagging a corpus for metaphor. This strikes me as the sort of thing that should be a first step for anyone seriously interested in this program (certainly anyone interested in the IARPA Metaphor Program). I don't doubt that others have done this, but it seems to be under-reported, suggesting it is not being treated as a core part of the problem.
I've complained in my previous posts that there is an overly restricted definition of metaphor underlying contemporary approaches to automatic identification, but even within a highly restricted definition like those used by Babarczy et al. and others, there appear to be problems at the heart of identification for humans. So what exactly is being identified?
Anna Babarczy, Ildikó Bencze M., István Fekete, & Eszter Simon (2010). The Automatic Identification of Conceptual Metaphors in Hungarian Texts: A Corpus-Based Analysis. Proceedings of the LREC 2010 Workshop.