Monday, September 15, 2014

can you still do linguistics without math?

A reader emailed me an interesting question that's worth sharing with a wider audience:
It nearly broke my heart to hear that maths may be a required thing in linguistics, maths has pulled me back from a few opportunities in the past before linguistics, I'd been interested in engineering, marine biology, etc. I was just wondering if there was any work around, anything that would help me with linguistics that didn't require maths. Just. any advice at all, for getting into the field of linguistics with something as troubling as dyscalculia.
The reader makes a good point I hadn't thought about. I remember my phonetics teacher telling us that she often recruited students into linguistics by telling them that it's one of the few fields that teach non-mathematical data analytics. That was something that appealed to me.

I'm not familiar with dyscalculia so I can't speak to how it impacts the study of linguistics directly. But even linguists who don't perceive themselves as "doing math" often still are, in the form of complicated measurements and such, like in phonetics and psycholinguistics. Generally though, I think that there are still many opportunities to do non-mathematical linguistics, especially in fields like sociolinguistics, language policy, and language documentation. Let us not forget that the vast majority of the world's languages remain undocumented, so we need an army of linguists to work with speakers the world over to record, analyze, and describe the lexicons, grammars, and sound systems of those languages. We also need to better understand child language acquisition, slang, pragmatic inferences, and a host of other deeply important linguistic issues. Studying those topics still requires a lot of good old-fashioned, non-mathematical linguistics skills.

Unfortunately, those are woefully underpaid skills as well. One of the reasons math is taking over linguistics is simple economics: that's where the money is. Both the job market and the research grant market are trending heavily towards quantitative skills and tools, regardless of the discipline. That's just a fact we all have to deal with. I didn't go to grad school in order to work at IBM. That's just where the jobs are. I couldn't get hired at a university to save my life right now, but I can make twice what a professor makes at IBM. So here I am (don't get me wrong: I have the enviable position of getting paid well to work on real language problems, so I ain't complaining).

Increasingly, the value of descriptive linguistic skills is in the creation of corpora that can be processed automatically with tools like AntConc and such. You can do a lot of corpus linguistics these days without explicit math because the software does much of the work for you. But you will still need to understand the underlying math concepts (like why true "keywords" are not simply frequency searches). For details, I can highly recommend Lancaster University's MOOC "Corpus linguistics: method, analysis, interpretation" (it's free and online right now).
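To make the keyword point concrete, here is a minimal sketch of why keywords aren't just frequency counts: raw frequency surfaces common function words like "the", while a keyness statistic (here Dunning's log-likelihood, one common choice in corpus tools) compares counts against a reference corpus and surfaces words that are distinctive of the target corpus. The two tiny "corpora" below are invented for illustration.

```python
import math
from collections import Counter

def log_likelihood(a, b, total_a, total_b):
    """Dunning's log-likelihood (G2) for a word occurring a times in the
    target corpus and b times in the reference corpus."""
    e1 = total_a * (a + b) / (total_a + total_b)  # expected count, target
    e2 = total_b * (a + b) / (total_a + total_b)  # expected count, reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

target = Counter("the whale the sea the harpoon the whale the ship".split())
reference = Counter("the cat the dog the house the tree the road the car".split())
n_t, n_r = sum(target.values()), sum(reference.values())

# Raw frequency ranks "the" first; log-likelihood ranks "whale" first,
# because "the" is equally common in the reference corpus.
by_freq = max(target, key=target.get)
by_ll = max(target, key=lambda w: log_likelihood(target[w], reference.get(w, 0), n_t, n_r))
print(by_freq, by_ll)  # → the whale
```

This is the intuition behind the "keyword" lists that tools like AntConc produce: the software hides the arithmetic, but interpreting the output still requires knowing what the statistic is doing.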

The real question is: what do you want to do with linguistics? Do you want to get a PhD then become a professor? That's a tough road (and not just in linguistics; the academic marketplace is imploding due to funding issues). There aren't that many universities that hire pure descriptive linguists anymore. Those jobs do exist, but they're rare. SUNY Buffalo, Oregon, and New Mexico are three US schools that come to mind as still having descriptive field linguist faculties. But the list is short.

If you want to teach a language, that's the most direct route to getting a job, but you'll need the TESOL Certificate too, and frankly, those tend to be low-paid, part-time jobs. It's hard to build a secure career off of that.

That leaves industry. There are industry jobs for non-quantitative linguists, but they're unpredictable. Marketing agencies occasionally hire linguists to do research on cross-linguistic brand names and such. Check out this old post for some examples.

I hope this helps. I recommend asking this question over at The Linguist List too because I have my own biases. It's smart to get a wide variety of perspectives.

Tuesday, September 2, 2014

neural nets and question answering

I just read A Neural Network for Factoid Question Answering by Iyyer et al. (presented at EMNLP 2014).

I've been particularly keen on research about question answering (QA) in NLP for a long time because my first ever NLP gig was as a grad student intern at a now-defunct question answering start-up in 2000 (QA was all the rage during the 90s tech bubble). QA is somewhat special among NLP fields because it is a combination of all of the others put together into a single, deeply complex pipeline.

When I saw this paper tweeted by Mat Kelcey, I was excited by the title, but after reading it, I suspect the constraints of their task make it not quite applicable to commercial QA applications.

Here are some thoughts on the paper, but to be clear: these comments are my own and do not represent in any way those of my employer.

What they did:
Took question/answer pairs from a college Quiz Bowl game and trained a neural network to find answers to new questions. More to the point, "given a description of an entity, [they trained a neural net to] identify the person, place, or thing discussed".

The downside:
  1. They used factoid questions from a game called Quiz Bowl
  2. Factoid questions assume small, easily identifiable answers (typically one word or maybe a short multi-word phrase)
  3. If you’re unfamiliar with the format of these quiz bowl games, you can play something similar at bars like Buffalo Wild Wings. You get a little device for inputting an answer and the questions are presented on TVs around the room. The *questions* are composed of 4-6 sentences, displayed one at a time. The faster you answer, the more points you get. The sentences in the question are hierarchically ordered in terms of information contained. The first sentence gives very little information away and is presented alone for maybe 5 seconds. If you can’t answer, the second sentence appears for 5 seconds giving a bit more detail. If you still can’t answer, the third sentence appears providing even more detail, but fewer points. And so on.
  4. Therefore, they had large *questions* composed of 4-6 sentences, providing more and more details about the answer. This amount of information is rare (though they report results of experimental guesses after just the first sentence, I believe they still used the entire *question* paragraph for training).
  5. They had fixed, known answer sets to train on. Plus (annotated) incorrect answers to train on.
  6. They whittled down their training and test data to a small set of QA pairs that *fit* their needs (no messy data) - "451 history answers and 595 literature answers that occur on average twelve times in the corpus".
  7. They could not handle multi-word named entities (so they manually pre-processed their corpus to convert these into single strings).
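The multi-word named entity preprocessing mentioned in point 7 is typically a simple pass that collapses each entity into a single token so a word-level model can treat it as one vocabulary item. A minimal sketch (the entity list here is invented for illustration, not taken from the paper):

```python
# Collapse known multi-word named entities into single underscore-joined
# tokens, so "Battle of the Bulge" becomes one vocabulary item.
ENTITIES = ["battle of the bulge", "world war ii", "joseph heller"]

def collapse_entities(text, entities=ENTITIES):
    lowered = text.lower()
    # Replace longer entities first so substrings don't clobber them.
    for ent in sorted(entities, key=len, reverse=True):
        lowered = lowered.replace(ent, ent.replace(" ", "_"))
    return lowered

print(collapse_entities("A question on World War II might mention the Battle of the Bulge."))
# → a question on world_war_ii might mention the battle_of_the_bulge.
```

Doing this by hand, as the authors did, is exactly the kind of manual cleanup that limits how well the setup transfers to messy real-world data.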
The upside:

  1. Their use of dependency trees instead of bag o' words was nice. As a linguist, I want to see more sophisticated linguistic information used in NLP.
  2. They jointly learned answer and question representations in the same vector space rather than learning them separately because "most answers are themselves words (features) in other questions (e.g., a question on World War II might mention the Battle of the Bulge and vice versa). Thus, word vectors associated with such answers can be trained in the same vector space as question text enabling us to model relationships between answers instead of assuming incorrectly that all answers are independent."
  3. I found their error analysis in sections “5.2 Where the Attribute Space Helps Answer Questions” and 5.3 "Where all Models Struggle” especially thought provoking. More published research should include these kinds of sections.
  4. Footnote 7 is interesting: "We tried transforming Wikipedia sentences into quiz bowl sentences by replacing answer mentions with appropriate descriptors (e.g., 'Joseph Heller' with 'this author'), but the resulting sentences suffered from a variety of grammatical issues and did not help the final result." Yep, syntax. Find-and-replace not gonna cut it.
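The shared vector space idea from point 2 can be sketched with toy vectors: if answers live in the same space as question words, answering reduces to a nearest-neighbor lookup. Everything below (the 3-d vectors, the vocabulary, the averaging) is invented for illustration and is much simpler than the paper's learned dependency-tree representations.

```python
import math

# Toy 3-d vectors in a single shared space: question words and answer
# entities are the same kind of object, so scoring a candidate answer is
# just cosine similarity between the averaged question vector and the
# candidate's vector. Vectors are hand-picked for illustration.
VECS = {
    "battle": [0.9, 0.1, 0.0],
    "ardennes": [0.8, 0.2, 0.1],
    "novel": [0.0, 0.9, 0.2],
    "world_war_ii": [0.9, 0.2, 0.1],  # an answer that is also a word in other questions
    "catch-22": [0.1, 0.9, 0.3],      # another candidate answer
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def answer(question_words, candidates):
    # Average the question word vectors, then pick the closest answer.
    q = [sum(VECS[w][i] for w in question_words) / len(question_words) for i in range(3)]
    return max(candidates, key=lambda c: cosine(q, VECS[c]))

print(answer(["battle", "ardennes"], ["world_war_ii", "catch-22"]))
# → world_war_ii
```

The payoff of the shared space is exactly what the authors describe: because "world_war_ii" can appear both as an answer and as a word inside other questions, its vector gets trained by both roles instead of answers being treated as independent labels.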

NLPers: How would you characterize your linguistics background?

That was the poll question my hero Professor Emily Bender posed on Twitter March 30th. 573 tweets later, a truly epic thread had been cre...