Briscoe, E. and J. Carroll (1997) 'Automatic extraction of subcategorization from corpora'. In Proceedings of the 5th ACL Conference on Applied Natural Language Processing, Washington, DC, 356-363.

We describe a novel technique and implemented system for constructing a subcategorization dictionary from textual corpora. Each dictionary entry encodes the relative frequency of occurrence of a comprehensive set of subcategorization classes for English. An initial experiment, on a sample of 14 verbs which exhibit multiple complementation patterns, demonstrates that the technique achieves accuracy comparable to previous approaches, which are all limited to a highly restricted set of subcategorization classes. We also demonstrate that a subcategorization dictionary built with the system improves the accuracy of a parser by an appreciable amount.
