Lynne J. Cahill | Gerald Gazdar
Information Technology Research Institute | School of Cognitive and Computing Sciences
University of Brighton | University of Sussex
Brighton, United Kingdom
(Lynne.Cahill@itri.bton.ac.uk, geraldg@cogs.susx.ac.uk)
August 1998
Computational linguists have made significant advances over the last dozen years in developing theoretically motivated techniques for representing the lexicons of individual languages. By contrast, little progress has yet been made in the design of lexicons for two or more related languages. However, such multilingual lexicons will be central to the operation of many of the products of the natural language processing industry that will appear in the next two decades.
In the PolyLex project, we are developing a trilingual computer lexicon for the core vocabulary of Dutch, English and German. From a linguistic perspective, we are ascertaining the extent to which these Germanic languages can be lexically related, examining formal ways of expressing linguistic generalizations that hold across two or more languages, and assessing the degree to which the historical links between languages can be exploited in descriptions of the languages as they are now. From a computational perspective, we are evaluating how well existing techniques for representing monolingual lexicons generalize to the multilingual case and investigating the extent to which multilingual lexical representation techniques may be applicable within monolingual lexicons.

