The Lousy Linguist

Thursday, April 28, 2011

Shakespeare and the brain? maybe not...

Jezebel, The Guardian, and other sources have been promoting recent neuroscience involving reading Shakespeare. This research has popped up in the blogosphere before and was widely misunderstood. I fear this time is no better, so I offer this re-posting of my original response from late 2007:

Even though he blogs at a mere undergrad level, I basically respect Andrew Sullivan as a blogger. He blogs about a diverse set of topics and has thoughtful and intelligent (even if controversial) comments and analysis. And he’s prolific, to say the least (surely the advantage of being a professional blogger, rather than stealing the spare moment at work while your test suite runs its course). That said, he can sometimes really come across as a snobbish little twit. Like yesterday, when he linked to an article about Shakespearean language which talks about a psycholinguistics study initiated by an English professor, Philip Davis. As is so often the case, the professor has wildly exaggerated the meaning of the study. Please see Language Log’s post Distracted By The Brain for related discussion. Here’s a crucial quote from that post:

The neuroscience information had a particularly striking effect on non-experts’ judgments of bad explanations, masking otherwise salient problems in these explanations.

My claim: the neuroscience study discussed in the Davis article distracts the reader from Davis’s essentially absurd interpretations, and Andrew Sullivan takes the bait, hook, line and sinker (and looks like a twit in the end).

The article does not go into the crucial details of the study, but it says that it involves EEG (electroencephalography), MEG (magnetoencephalography), and fMRI (functional magnetic resonance imaging), noting that only the EEG portion has been completed. A pretty impressive array of tools for a single psycholinguistics study, I must say. Most published articles in the field would involve one or maybe two of these, but all three for a single study? Wow, impressive.

It’s not clear to me if this was a well designed study or not (my hunch is, no, it is a poorly designed study, but without the crucial details, I really don’t know). However, it is undeniable that professor Davis has gone off the deep end of interpretation. The study does not even involve Shakespearean English!!! It involves Modern English! Then Davis makes the following claims (false, all of them, regardless of the study):

["word class conversion"] is an economically compressed form of speech, as from an age when the language was at its most dynamically fluid and formatively mobile; an age in which a word could move quickly from one sense to another… (underlines added)

This is the classic English professor bullshit. I don’t even know what “economically compressed” means (Davis gives no definition); it has no meaning in linguistics that I know of. The quote also suggests Shakespeare’s English had some sort of magical linguistic qualities that today’s English does not possess. FALSE! Modern English allows tremendous productivity of constructions, neologisms, and ambiguity. A nice introduction to ambiguity can be found here: Ambiguous Words by George A. Miller.

Davis ends with a flourish of artistic bullshit hypothesizing:

For my guess, more broadly, remains this: that Shakespeare's syntax, its shifts and movements, can lock into the existing pathways of the brain and actually move and change them—away from old and aging mental habits and easy long-established sequences.

Neuroplasticity is only just now being studied in depth and it’s far from well understood, but the study in question says NOTHING about plasticity!!! There’s also no reason to believe that Shakespeare’s language does anything that other smart, well crafted language does not do. And we’re a generation at least away from having the tools to study any of this.

I’m accustomed to simply letting these all too common chunks of silliness go without comment, but then Andrew had to slip in his unfortunate bit of snooty arrogance. After pasting a chunk of the obvious linguistics bullshit on his site (then follow-up comments), he has to finish with "I knew all that already". Exactly what did you know, Andrew? Since all of the major claims Davis makes are obvious bullshit, what exactly do you claim to have had prior knowledge of? What did Andrew know, and when did he know it?

Really, Andrew, did you never take so much as a single linguistics course during all your years at Harvard and Oxford? The University of Maryland has excellent psycholinguists, as does Georgetown. Please, consider sitting in on a course, won’t you?

Monday, April 11, 2011

Google linguist interview

Purpose: This post reviews my experience interviewing for a Linguist position at Google in Santa Monica, CA on February 29, 2008. I've long meant to post this but only now got around to it. There are lots of Google interview stories on the web. It appears to be its own genre. This is my contribution to the genre.

I originally wrote it as an email to a friend who wanted to know how my big day at Google went. It’s rather long, but then again, you don’t have to read it, you clearly have better things to do…

I found a job posting on the Google jobs board for a full time Linguist. I applied and was given a phone interview with a recruiter around late January, 2008:

Thank you for your interest in Google. I'd like to set up a time for us to discuss Google Linguist opportunities and your qualifications. Please let me know a day/time when you would be available to speak with me as well as the best phone number for me to contact you. I'll email you back to confirm.

I hope to hear from you soon!

Cheers,
JF
Google Staffing

During that phone interview the recruiter shared a Google doc which I was instructed to complete in about 45 minutes…

open science

Recently in North Carolina, moximer & David Dobbs and others discussed the value of opening up science research (such that all research is freely available for searching and interpretation, even draft versions and failed experiments, at least under the strong proposal). It's an interesting discussion (audio is a bit crappy, but whaddayagonnado?):

What's Keeping Us from Open Science? Is It the Powers That Be, Or Is It... Us? from Smartley-Dunn on Vimeo.

Hence, I thought it might be nice to list some open source journals offering free access to scientific research:

PLoS is a nonprofit organization of scientists and physicians committed to making the world's scientific and medical literature a freely available public resource.

The Internet Archive, a 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, and the general public.

CiteSeer: The NEC Scientific Literature Digital Library incorporating autonomous citation indexing, awareness and tracking, citation context, related document retrieval.

arXiv.org e-Print archive: Open access to 664,014 e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics.

Directory of open access journals: This service covers free, full text, quality controlled scientific and scholarly journals. We aim to cover all subjects and languages. There are now 6271 journals in the directory. Currently 2722 journals are searchable at article level.

Wikipedia Open access journals. A list of open access journals...

Google "open journals"

Cognitive Science Network directed by Mark Turner (HT Sport Linguist).

Free Full Text: a search engine returning full text scientific articles with no access fees.

Saturday, March 12, 2011

Korean in Killeen

Having spent nearly 4 months of the last year and a half working at Fort Hood, in Killeen, Texas, I finally decided to leave the safe confines of the hotel-centric chain restaurants and Target/Wal-Mart shopping centers and take a drive to historic downtown Killeen. I found pretty much what I expected to find: empty one-story storefronts, dusty unused parking spaces, and lots and lots of Hangul ... (screeching sound) ... huh?

Yep, turns out historic downtown Killeen, heartland of America, is being somewhat revitalized by Korean immigration. My favorite grocery store by far is the Korean O-Mart (not the one pictured above, btw), where I can find genuinely fresh vegetables and dumplings (as well as shiitake mushrooms, plenty of seaweed for soup, and a wide array of spicy sauces that I have been eagerly experimenting with).

It was a nice lesson in American multilingualism.

Monday, March 7, 2011

turning gaga into water = 200 terabytes

How much storage would it take to store the first 5 years of a child's linguistic environment? Apparently, 200 terabytes. From Fast Company:

...cognitive scientist Deb Roy Wednesday shared a remarkable experiment that hearkens back to an earlier era of science using brand-new technology. From the day he and his wife brought their son home five years ago, the family's every movement and word was captured and tracked with a series of fisheye lenses in every room in their house. The purpose was to understand how we learn language, in context, through the words we hear. A combination of new software and human transcription called Blitzscribe allowed them to parse 200 terabytes of data to capture the emergence and refinement of specific words in Roy’s son’s vocabulary.

The data visualization techniques he uses are pretty cutting edge ... and awesome! I love the fact that he is trying to use visualization techniques to help us understand something beyond raw statistics (which is where most graphs and pie charts die miserable deaths). Statistics are like molecules. Visualize them one by one and it's difficult for the average person to conceptualize the big picture of how they work together to create a grander whole. Roy appears to be trying to get beyond the yawn-inducing graphs that plague modern science. I mean, he uses freaky-deaky time-worms! How cool is that!

Roy talks about feedback loops as well:

..."Caregiver speech dipped to a minimum and slowly ascended back out in complexity.” In other words, when mom and dad and nanny first hear a child speaking a word, they unconsciously stress it by repeating it back to him all by itself or in very short sentences. Then as he gets the word, the sentences lengthen again. The infant shapes the caregivers’ behavior, the better to learn.

He gave a TED talk recently, but the video is not yet available.

Thursday, March 3, 2011

Hosni prefers "Hosny" in transliterated attire

Rachel Maddow et al. discovered a delicious gem fit for the annals of transliteration. Namely, how to write a specific Arabic name in the Roman alphabet (what we English speakers like to call "regular spelling"). She (and her staff) reported that Hosni Mubarak attended a head-of-state meeting in Albania a couple years ago wearing the world's most narcissistic pinstriped suit*, where the pin stripes were actually composed of lines of his name written in Roman alphabetic transliteration (this man really knows how to live the life of a tyrant, am I right?):

It is a troublesome fact of human language that writing the damned thing down is never easy. It's difficult enough to construct a writing system that is consistent for a single language, more difficult still to take a linguistic term (like a person's name) and write it down in a script which was not designed for that particular language. So when English language writers (like journalists) have to write down Arabic names in "regular spelling" they inevitably face difficult choices about which letters to use to represent particular sounds. Vowels are particularly difficult creatures to pin down with alphabetic rope (e.g., the whole "and sometimes y" fiasco).

The act of writing a linguistic term in a foreign script is called transliteration, and it's troublesome enough to have spawned a cottage industry sub-field within computational linguistics. For example, if you wanted to Google information about the currently exiled president of Egypt, you would be wise to Google the term "Hosni Mubarak." That is by far the most common spelling of the man's name on the internet (by a better than 20-1 margin, at least according to Google hit counts). Even if you choose the "Hosny" variant, you're basically just redirected to the "Hosni" results anyway. Yet the tyrant himself, ever the maverick, prefers the road less traveled.
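For the computationally curious, here is a toy Python sketch of how a search tool might treat "Hosni" and "Hosny" as the same name. To be clear, this is my own illustration and not how Google (or anyone else) actually does it: the function names `variant_key` and `group_variants` and the two normalization rules (treat y as i, collapse doubled letters) are assumptions made up for this example. Real transliteration matching is far messier.

```python
import re
from collections import defaultdict


def variant_key(name: str) -> str:
    """Reduce a romanized name to a crude key so that spelling variants collide.

    Hypothetical rules, chosen only for illustration:
      1. lowercase and keep only the letters a-z and spaces
      2. treat 'y' and 'i' as the same letter
      3. collapse doubled letters (e.g. 'mm' becomes 'm')
    """
    key = name.lower()
    key = re.sub(r"[^a-z ]", "", key)      # rule 1: strip everything else
    key = key.replace("y", "i")            # rule 2: y/i equivalence
    key = re.sub(r"(.)\1+", r"\1", key)    # rule 3: collapse repeated characters
    return key


def group_variants(names):
    """Group spellings that share the same crude key."""
    groups = defaultdict(list)
    for name in names:
        groups[variant_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    spellings = ["Hosni Mubarak", "Hosny Mubarak", "Anwar Sadat"]
    for key, members in group_variants(spellings).items():
        print(key, "<-", members)
```

The y-to-i rule is deliberately blunt: it would also mangle names where y is a consonant, which is a fair reminder of why this stuff is harder than it looks.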

Sadly, there's not much more to say about this than to emphasize the simple fact that transliteration is largely arbitrary and disputes about guidelines are largely trivial. Just flip a coin and move on ... (I just seriously pissed off the world's four transliteration experts).

...and in closing I'd like to repeat my assertion that Hosni/y Mubarak looks suspiciously like The Face of Boe**:

*FYI, I have no independent verification of the truth of this story. If Maddow's staff got punk'd, their bad.
**Damn you Captain Jack!!

Wednesday, February 23, 2011

the linguistics of 404 FILE NOT FOUND

A cute site providing humorous translations of the world's most frustrating search result. Personal favs:

American South - Ah cain't find th' page yer lookin' fer.
Australia - Strewth mate yer bloody page has shot through.
Blond - like omg! ur file has not been found, go paint ur nails and try back later, lol^^....I FOUND A QUARTER!
Cockney - No chance luv, carrnt find it neever.
Pirate - Haaarr, Lubber! I've sailed yon seas with toil and trial, and yet I cannot find ye file!
Pittsburghese - This page needs fixed n'at... it's all caddywhompus! Yinz needs look somewheres else.
Zombie - Arrgrg 404 BrAiNs aAAArrggh No ggrrgrh page brAiNz heRe BrAAAAIIINNSSSS!