We are proud to offer a two-day tutorial on the Watson DeepQA architecture, given by a member of the original IBM research team that won the Jeopardy! challenge.
The tutorial was open to all participants, including but not limited to students and researchers of TU Darmstadt and other universities.
For students of Computer Science at TU Darmstadt: this tutorial is part of an extended seminar (4CP) about Watson, see http://www.ke.tu-darmstadt.de/lehre/ss13/ml-sem for details.
Download the complete deck of slides (28MB).
Name: Alfio Massimiliano Gliozzo
Affiliation: Research Staff Member at IBM T.J. Watson Research Center
Contact Information: 19 Skyline Drive, Hawthorne, NY 10532, gliozzo-at-us.ibm-dot-com
Bio: Dr. Alfio Gliozzo is a Research Staff Member at IBM Watson, where he is part of the DeepQA team. His main research focus is Textual Entailment and Domain Adaptation of Question Answering systems using Distributional Semantics. Before joining IBM, Dr. Gliozzo worked as a researcher for 11 years in both academia and the semantic technology industry.
He is the author of 40+ scientific publications in the areas of Computational Linguistics, Information Retrieval, and Semantic Web. He has built a significant track record of delivering competitive Semantic Technology systems by conducting state-of-the-art applied research and successfully coordinating R&D teams. His solutions and technologies have been applied to develop production-level systems for Question Answering, Semantic Advertising, and Multimedia Retrieval.
Prof. Dr. Chris Biemann, biem(at)cs(dot)tu-darmstadt(dot)de.
The course is structured in 4 modules (2h each), described below. Videos, demos, and other high-quality educational material developed by IBM will be presented during the sessions, together with technical content describing details of the DeepQA architecture.
All sessions will take place at Altes Maschinenhaus, S01|05, Lecture Hall 122, Magdalenenstr. 12, 64285 Darmstadt, close to the main building of TU Darmstadt.
Please find directions below.
Open domain Question Answering (QA) is a long-standing research problem. Recently, IBM took on this challenge in the context of Jeopardy!, a well-known TV quiz show that has been airing on television in the United States for more than 25 years. It pits three human contestants against one another in a competition that requires answering rich natural language questions over a very broad domain of topics. The development of a system able to compete with grand champions in the Jeopardy! challenge led to the design of the DeepQA architecture and the implementation of Watson.
The DeepQA project shapes a grand challenge in Computer Science that aims to illustrate how the wide and growing accessibility of natural language content and the integration and advancement of Natural Language Processing, Information Retrieval, Machine Learning, Knowledge Representation and Reasoning, and massively parallel computation can drive open-domain automatic Question Answering technology to a point where it clearly and consistently rivals the best human performance.
Natural Language Processing (NLP) plays a crucial role in the overall DeepQA architecture. It makes it possible to “make sense” of both the question and the unstructured knowledge contained in the large corpora where most of the answers are located. Semantic Web Technology, enhanced by a massive use of open linked data, is another key component of Watson. Linked data and triple stores have been used to generate candidate answers and to score them from multiple points of view, such as type coercion and geographic proximity. In addition, the connection between linked data and natural language text offered by Wikipedia has been very useful for generating open domain training data for relation detection and entity recognition systems, substantially improving the NLP capabilities of the system and therefore allowing the development of a truly open domain QA system. Distributional Semantics, finally, allows fast adaptation of the system to new domains by computing semantic similarity from the application domain’s data and automatically linking terms in context to domain-specific ontologies.
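As a rough illustration of the idea of computing semantic similarity from a domain's own data, the following minimal sketch builds co-occurrence vectors from a tiny corpus and compares terms by cosine similarity. It is a textbook-style toy, not Watson's implementation: the corpus, the window size, and the function names `cooccurrence_vectors` and `cosine` are assumptions made up for this example.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each term to a Counter of the terms seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, term in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical three-sentence "domain corpus", invented for illustration.
corpus = [
    "the doctor prescribed aspirin for the headache",
    "the doctor prescribed ibuprofen for the headache",
    "the pilot landed the plane at the airport",
]

vectors = cooccurrence_vectors(corpus)
sim_drugs = cosine(vectors["aspirin"], vectors["ibuprofen"])
sim_mixed = cosine(vectors["aspirin"], vectors["plane"])
print(f"aspirin vs. ibuprofen: {sim_drugs:.3f}")
print(f"aspirin vs. plane:     {sim_mixed:.3f}")
```

Terms that occur in similar contexts ("aspirin", "ibuprofen") receive a higher score than unrelated terms, even though no hand-built ontology was consulted. A real system would use far larger corpora and weighted or dimensionality-reduced vectors.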
The successful performance of Watson in playing Jeopardy! is stimulating a huge debate around semantic technology and its possible applications. At the same time, little effort has been spent on explaining how Watson works from a technical perspective, generating a gap between the “external” perception of the Watson technology and the actual “state of the art”. The goal of this tutorial is to fill this gap by providing the technical background required to understand Watson and its components. The tutorial will be presented in an extended version, covering NLP and Semantic Web topics. Open domain Question Answering is extremely interesting for web mining and knowledge engineering, including topics like search over both text and linked data, Information Extraction, and NLP.
The educational activity around Watson is an established series at top conferences worldwide. Below is a list of selected previous venues. The first three sessions have been partially covered by previous tutorials; the fourth session runs for the first time.
Individual traffic:
A 5 / A 67 Exit "Darmstadt Stadtmitte"
B 26 "Rheinstraße" direction Stadtmitte
B 26 "Cityring" section S1 is directly adjacent to Cityring
Public transport:
Bus lines F and H from Hauptbahnhof (Darmstadt central train station)
via central transfer point "Luisenplatz"
to bus stop "Alexanderstraße/TU".
Alternatively, you can take tram no. 3 to "Willy-Brandt-Platz" and walk through the park.