Carroll, J., E. Briscoe and A. Sanfilippo (1999) `Parser evaluation: current practice'. In Evaluation of Natural Language Processing Systems: Final Report, EC DG-XIII LRE EAGLES Document EAG-II-EWG-PR.1. 140-150.
A wide variety of parser and/or grammar evaluation methods have been used
(and sometimes justified) in the literature. A recent workshop focussed on
this topic (The Evaluation of Parsing Systems at the 1st LREC, Granada, Spain).
In this paper (a revised and extended version of Carroll, Briscoe and
Sanfilippo's 1998 paper `Parser evaluation: a
survey and a new proposal') we present a critical overview of the state of the art in
parser evaluation. We go on to argue
that no extant method is entirely satisfactory,
particularly in view of the wide range of parsing technology currently in
use in the natural language processing research community. With this in
mind, we motivate and present a higher-level
and more task-orientated scheme to represent the information a parser should
extract from a sentence, together with suitable evaluation measures. This
approach is based on EAGLES Computational Lexicons Working Group standards
and has been developed and used for four European languages in the CEC
Language Engineering project `SPARKLE'.