Monday, October 29, 2007

Computational Linguistics vs. NLP

What is the difference between Computational Linguistics and Natural Language Processing? (Hint: There is no official answer to this question).
I had my 476th version of this conversation just now (because we’re in the hiring process for a new “CL lead” and having challenges defining the job), and I made the off-the-cuff claim that it’s the same as the difference between science and engineering. An engineer tries to build things, while a scientist is in essence a reverse-engineer, dedicated to trying to figure out how the world works. Human language is a system that already exists, and it works in some way that no one really understands. Linguists and cognitive scientists have been studying it for decades (well, you could make the claim for millennia). They are now joined by a group of specialists whose skill set involves computer programming and statistics.
Computational linguistics, then, involves trying to figure out how human language works using computational tools (e.g., automated methods of corpus analysis like Tgrep2 [UPDATE 12/02/2010: dead link; for Tgrep2 tutorials, see HERE] and Perl scripting, learning models, etc.), while NLP involves building tools that take language as input or produce it as output, such as voice user interfaces, machine translators, and entity recognizers. A single person can be both a computational linguist and an NLP developer.
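To make the contrast concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the three-sentence toy corpus, the function names, and the crude heuristics are not taken from Tgrep2 or any real toolkit. The first function asks a CL-style question about how language behaves (which frame does a verb appear in?), while the second is an NLP-style tool that does something with language input (a deliberately naive entity recognizer).

```python
import re
from collections import Counter

# Toy corpus: hypothetical sentences, used only for illustration.
CORPUS = [
    "Alice gave the book to Bob",
    "Bob gave Alice the book",
    "Carol sent the letter to Alice",
]


def tokenize(sentence):
    """Split a sentence into word tokens (runs of letters only)."""
    return re.findall(r"[A-Za-z]+", sentence)


def dative_counts(corpus, verbs=("gave", "sent")):
    """CL-style question: how often does each verb occur with a 'to' phrase
    versus without one? This is a crude string heuristic, not a parse."""
    counts = Counter()
    for sentence in corpus:
        tokens = [t.lower() for t in tokenize(sentence)]
        for verb in verbs:
            if verb in tokens:
                after_verb = tokens[tokens.index(verb):]
                frame = "prepositional" if "to" in after_verb else "double_object"
                counts[(verb, frame)] += 1
    return counts


def find_names(sentence):
    """NLP-style tool: a naive entity recognizer that tags every capitalized
    token. It would wrongly tag ordinary sentence-initial words."""
    return [t for t in tokenize(sentence) if t[0].isupper()]


if __name__ == "__main__":
    print(dative_counts(CORPUS))
    print(find_names(CORPUS[0]))
```

The code looks much the same in both halves; what differs is the goal. The counts are evidence about how the language works, while the name list is an output someone would use.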
That’s my answer, for now… (my previous thoughts are here).

9 comments:

Valeria said...

Hey there,
I stumbled across your post while researching my newfound interest in computational linguistics (yes, I Googled just that and your blog post came up). I also read your bio section, and it seems like you'd be just the person I would like to talk to.
You see, I just graduated with a BA in Spanish and Linguistics and a BS in Psychobiology - originally with the grand plan of going to medical school...however, at this point, I decided that's not the path for me, and my passion truly lies in linguistics, languages - more analytical practice. I'm from the SF Bay Area, and there are vast opportunities here (and many places in the world, I'm sure) to utilize that passion in the comp ling industry.
Having said all that, I am pretty ignorant of the actual field - I have little programming knowledge and even less experience, but am more than willing to take classes/learn on my own (have already started to do so). What can you tell me about a career in comp ling? Is it something that is heavily coding-based, or more of a higher-level linguistic modeling? (e.g., would one work on the grand scheme of search engine issues/linguistic framework, or write code for such issues?)
Anything else you can tell me would be invaluable!!

If you have actually read this long message, I thank you a thousand times. I'm hoping to get as much information from professionals in the field before diving head first into this profession.

If you prefer, you can email me at valgofman@gmail.com

Thanks again:)

Valerie

Chris said...

Valeria, thanks for the comment. I certainly do have thoughts for anyone thinking of entering the CL/NLP field. First, I suggest you read this previous post of mine, On Jobs and NLP, where I respond to someone very much like you asking a similar question.

The bottom line is, yes, learn to program, in C++, Python, and Perl. Also learn about machine learning algorithms and statistical inference algorithms (it's easier than you think; a good two-year CL master's will do it). You will soon be good friends with latent semantic analysis. (pssst, search engines are less CL than you might think.)

I'm not sure this is a good time to be in the NLP job market, but it sounds like you're willing to do an MS first, so you could be positioned to enter a strong 2012 job market.

I have more thoughts, and some links, but I'll need to root around a bit. I'll try sending you an email tomorrow.

For now, go to my blogroll and link through to LingPipe where you'll find the work of two of the smartest CL gurus on the planet.

Chris said...

The link in my previous comment appears to be bad. Try this one: On Jobs and NLP Degrees.

Valeria said...

Thank you so much! I'll definitely look into the page you linked and see if that gives me any more questions :)

BThree said...

Chris,

Love the blog. I have some questions to ask you along the lines of Valeria's. I am just graduating from my second Master's program in a computational linguistics-related field (Speech), and have been looking for consulting opportunities in the DC area - my goal is to set up a small consulting practice. I did this before back in '07, when I worked as both an Arabic linguist and NLP software developer. I have 4 years of NLP development experience, an M.Sc. in Arabic linguistics, and now this new M.Sc. in speech technology, which I think makes me a good candidate for many consulting positions, including those involving speech recognition. However, just my initial forays into the market have given me the impression that things have changed drastically. As a consultant yourself, can you offer any insight into how things might have changed in the Washington DC area?

Elles Belles said...

Hey,

I am currently studying Computational Linguistics, and I was wondering if you had any advice as far as finding an internship? I really want to get one, but they seem to be rare.

Thanks,

Ellie

lingpipe said...

I think "CL is science" and "NLP is engineering" is a nice distinction. Linguistics parted ways with computation circa Chomsky's Aspects. The CL Journal and ACL now focus almost exclusively on NLP and pretty much have done since the advent of "machine learning". That raises the issue of how machine learning differs from stats.

Thanks for the props, Chris, but there are way smarter people than me and Breck doing computational linguistics and NLP (sorry, Breck).

I wrote a blog post three years ago outlining what I thought would be a reasonable NLP curriculum:

http://lingpipe-blog.com/2008/10/13/computational-linguistics-curriculum/

Actually learning stats or machine learning well takes a bit longer than a couple of years, though you can develop a practical knowledge even quicker than that.
I found learning stats in depth hard enough that I joined the stats department at Columbia to learn more.

lingpipe said...

@Elles Belles: If you extend your scope to NLP as defined above, there are tons of internships. Microsoft, Google, IBM, and AT&T all have research labs that offer internships. But they tend to look for people with deep computational skills. Many smaller companies (like ours) also have internships.

Chris said...

Thanks for the updates @LingPipe, good info. I tweeted the link to your post as well.

I'm taking the Machine Learning course online at Stanford this semester (along with 80,000 other people worldwide, haha) but they decided to delete the NLP section. Boo.