The paper is titled *Watson Discovery Advisor: Question-answering in an industrial setting*.
The Abstract

> This work discusses a mix of challenges arising from Watson Discovery Advisor (WDA), an industrial-strength descendant of the Watson Jeopardy! Question Answering system currently used in production in industry settings. Typical challenges include generation of appropriate training questions, adaptation to new industry domains, and iterative improvement of the system through manual error analyses.

The paper's topic is not surprising given that four of the authors are PhDs (Charley, Graham, Allen, and Kristen). Hence, it was largely a group of fish out of water: they have an academic bent, but wrestle daily with the real-world challenges of paying customers and very messy data.
Here are five takeaways:
- Real-world questions and answers are far more ambiguous and domain-specific than those in academic training sets.
- Domain tuning involves far more than just retraining ML models.
- Useful error analysis requires deep dives into specific QA failures (as opposed to broad statistical generalizations).
- Defining what counts as an error is itself bound up with the customer's needs and the domain data: what is an error for one customer may be acceptable to another.
- Quiz-Bowl evaluations are highly constrained special cases of general QA, a point I made in 2014 here (pats self on back). Their lessons learned are of little value to the industry QA world (for now, at least).
I do hope you will read the brief paper in full (as well as the other excellent papers in the workshop).