Carroll, J., E. Briscoe and A. Sanfilippo (1999) `Parser evaluation: current practice'. In Evaluation of Natural Language Processing Systems: Final Report, EC DG-XIII LRE EAGLES Document EAG-II-EWG-PR.1. 140-150.

A wide variety of parser and/or grammar evaluation methods have been used (and sometimes justified) in the literature. A recent workshop focussed on this topic (The Evaluation of Parsing Systems at the 1st LREC, Granada, Spain).

In this paper (a revised and extended version of Carroll, Briscoe and Sanfilippo's 1998 paper `Parser evaluation: a survey and a new proposal') we present a critical overview of the state of the art in parser evaluation. We go on to argue that no extant method is entirely satisfactory, particularly in view of the wide range of parsing technology currently in use in the natural language processing research community. With this in mind, we motivate and present a higher-level, more task-oriented scheme for representing the information a parser should extract from a sentence, together with suitable evaluation measures. This approach is based on EAGLES Computational Lexicons Working Group standards and has been developed and used for four European languages in the CEC Language Engineering project `SPARKLE'.

