But, he’s also taking the commonly used term "natural language processing" and insisting that it NOT refer to what 99% of people who use the term use it for, but rather only a very narrow interpretation consisting of something like "computer systems that mimic human language processing." This is fundamentally unfair.
In the 1980s I was convinced that computers would soon be able to simulate the basics of what (I hope) you are doing right now: processing sentences and determining their meanings.
I feel Pullum is moving the goal posts on us when he says “there is, to my knowledge, no available system for unaided machine answering of free-form questions via general syntactic and semantic analysis” [my emphasis]. Pullum’s agenda appears to be to create a straw-man NLP world where NLP techniques are only admirable if they mimic human processing. And this is unfair for two reasons.
One: Getting a machine to process language like humans is an interesting goal, but it is not necessarily a useful goal. Getting a machine to provide human-like output (regardless of how it gets there) is a more valuable enterprise.
Two: A general syntactic and semantic analysis of human language DOES. NOT. EXIST. To draw back the curtain hiding Pullum’s unfair illusion, I ask Pullum to explain exactly how HUMANS process his first example sentence:
Which UK papers are not part of the Murdoch empire?
Perhaps the most frustrating part of Pullum’s analysis so far is that he fails to point the blame where it more deservedly belongs: at linguists themselves. How dare Pullum complain that engineers at Google don’t create algorithms that follow "general syntactic and semantic analysis" when you could make the claim against linguists that they have failed to provide the world with a unified "general syntactic and semantic analysis" to begin with!
Ask Noam Chomsky, Ivan Sag, Robert van Valin, and Adele Goldberg to provide a general syntactic and semantic analysis of Pullum’s sentence and you will get four vastly different responses. Don’t blame Google for THAT! While commercial vendors may be overly-focused on practical solutions, it is at least as true that academic linguists are overly-focused on theory. Academic linguists rarely produce the sort of syntactic and semantic analyses that are useful (or even comprehensible … let alone UNIFIED!) to anyone outside of a small group of devotees of their pet theory. Pullum is well known to be a fierce critic of such linguistic theory solipsism, but that view is wholly unrepresented in this series of posts.
In his more recent post, Pullum insists again that commercial NLP is tied to keyword searching, but this remains naïve. Pullum does his readers a disservice by glossing over the now almost 70 years of research on information theory underpinning much of contemporary NLP.
Also, Pullum unfairly puts Google search at the center of the NLP world as if that alone represents the wide array of tools and techniques that exist right now. This is more propaganda than fact. He does a disservice by not reviewing the immense value of n-gram techniques, dependency parsers, WordNet, topic models, etc.
When he laments that Google search doesn’t "rely on artificial intelligence, it relies on your intelligence", Pullum also fails to relate the lessons of Cyc Corp and the Semantic Web community which have spent hundreds of millions of dollars and decades trying to develop smart artificial intelligence approaches with comparatively little success (compared to the epic scale success of Google et al). In this, Pullum stacks the deck. He laments the failure of NLP to include AI without reviewing the failure of AI to enhance NLP.
I actually agree that business goals (like those of Google) have steered NLP in certain directions away from the goal of mimicking human language, but to dismiss this enterprise as a failure is unfair. It may be that NLP does not mimic humans, but until [we] linguists provide engineers with a unified account of human language, we can hardly complain that they go looking elsewhere for inspiration.
And for the record, there does exist exactly the kind of NLP work that attempts to incorporate more human-style understanding (for example, this). But boy, it ain’t easy, so don’t hold your breath, Geoff.
If Geoff has some free time in June, I recommend he attend The 1st Workshop on Metaphor in NLP 2013.