Subcategorization and the use of features

As the reader may already have noticed, the ungrammatical strings '*Dr. Chan died patients' and '*MediCenter employed' are among those that Grammar1 claims to be sentences (linguists use "*" to mark strings that are intuitively ungrammatical in the language under consideration, in this case English). The problem here involves what linguists call "subcategorization". Although 'died' and 'employed' are both verbs, as Grammar1 claims, they belong in different subcategories of the class of verbs. Specifically, 'employed' is transitive and requires a following NP, whereas 'died' is intransitive and cannot tolerate a following NP. We could patch up Grammar1 by replacing V with two categories, IV (intransitive verb) and TV (transitive verb), and revising the VP rules and lexicon accordingly, but such an approach rapidly proliferates a host of distinct parts of speech once a larger class of sentences is considered.

Instead of pursuing the ad hoc solution just mentioned, we will turn to a more principled approach: one that employs syntactic features. Most, if not all, contemporary theories of grammar employ features, and the extent and sophistication of their use has grown massively in the 1980s, although their use in computational linguistics goes back to the late 1950s. In a modern feature-theoretic syntax, atomic categories such as NP and V are replaced by sets of feature specifications. Each feature specification consists of a feature, say CASE, and a value for that feature, say ACCUSATIVE. The familiar names for categories such as NP and V can then be reintroduced as the value of a particular feature, which we shall call CAT. In fact, we have already made this move in the lexical entries we have shown.

In the light of this, what we will do now is to rebuild Grammar1 using features, thus creating Grammar2, and solving the problem noted earlier with subcategorization. To make Grammar2 a little more interesting than it would otherwise be, we will also introduce a rule for coordination, which will allow us to illustrate the role of recursion in grammars.

Having done that, we will then extend Grammar2 to Grammar3, and in so doing illustrate the descriptive power of feature-theoretic techniques on a range of syntactic phenomena.

We will employ just two features in Grammar2, namely cat(egory) and arg1(ument). cat will have as its values the labels that were used for categories themselves in Grammar1, but with one addition, C, which we will discuss later. arg1 will, for the moment, have just two values: 0 (that is, nothing) and NP. How are we to interpret these features and their values? Well, consider a category that has V as the value of its cat feature, which means that it is a verb, and NP as the value of its arg1 feature. We will interpret the latter to mean that it is the kind of verb that requires a following NP; that is, it is a transitive verb. A verb with 0 as its arg1 value will be a verb that requires nothing to follow; that is, it is an intransitive verb.

We now have the problem of how to write rules, given that we have moved away from unanalysed (monadic) categories to bundles of featural information. To keep things clear, we will continue to write, for example:




Rule {simple sentence formation} S -> NP VP.

But this must now be construed as elliptical for something that would be more verbosely expressed as follows:




Rule {simple sentence formation}
    X0 -> X1 X2:
        <X0 cat> = S
        <X1 cat> = NP
        <X2 cat> = VP.

So, in this verbose version of our rule notation, we have place holders X0, X1 and X2 in the rule itself, and then a collection of equations such as <X0 cat> = S, which is to be read as saying 'the value of X0's category feature is S'. In the abbreviated rule notation, we simply substitute the value of cat for the place holder in the rule and in accompanying equations. Given these remarks, we can now exhibit the rules employed by Grammar2.
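The relation between the abbreviated and the verbose notation can be sketched in a few lines of Python. This is only an illustration: the function names `expand_rule` and `show`, and the dictionary layout, are assumptions made for the example and are not part of the rule formalism itself.

```python
def expand_rule(name, mother, daughters):
    """Turn an abbreviated rule (category labels) into place holders plus cat equations."""
    labels = [mother] + list(daughters)
    placeholders = [f"X{i}" for i in range(len(labels))]
    # One equation <Xi cat> = label for each position in the rule.
    equations = [((ph, "cat"), label) for ph, label in zip(placeholders, labels)]
    return {
        "name": name,
        "mother": placeholders[0],
        "daughters": placeholders[1:],
        "equations": equations,
    }


def show(rule):
    """Print a rule in the verbose notation used in the text."""
    eqs = " ".join(f"<{ph} {feat}> = {val}" for (ph, feat), val in rule["equations"])
    return f"Rule {{{rule['name']}}} {rule['mother']} -> {' '.join(rule['daughters'])}: {eqs}."


print(show(expand_rule("simple sentence formation", "S", ["NP", "VP"])))
```

Going in the other direction, from verbose to abbreviated, amounts to substituting each place holder's cat value back into the rule.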



Grammar2

Rule {simple sentence formation}
    S -> NP VP.
Rule {intransitive verb}
    VP -> V:
        <V arg1> = 0.
Rule {single complement verbs}
    VP -> V X:
        <V arg1> = <X cat>.
Rule {coordination of identical categories}
    X0 -> X1 C X2:
        <X0 cat> = <X1 cat>
        <X0 cat> = <X2 cat>
        <X0 arg1> = <X1 arg1>
        <X0 arg1> = <X2 arg1>.

Word died: <cat> = V <arg1> = 0.
Word recovered: <cat> = V <arg1> = 0.
Word slept: <cat> = V <arg1> = 0.
Word employed: <cat> = V <arg1> = NP.
Word paid: <cat> = V <arg1> = NP.
Word nursed: <cat> = V <arg1> = NP.
Word and: <cat> = C.
Word or: <cat> = C.



The first rule looks the same as its counterpart in Grammar1 and performs exactly the same function. The second and third rules in Grammar2 perform the desired functions of the second and third rules in Grammar1, but, together with the lexical entries shown, they ensure that a transitive verb will be followed by a noun phrase, whereas an intransitive verb will be followed by nothing. Grammar2 thus makes the correct claims about the grammaticality of the following examples:


Nurses died.
*Nurses died patients.
*MediCenter employed.
MediCenter employed nurses.

The grammar will assign structures to the first and last examples, but no structures are available for the second and third. The verbs 'die' and 'employ' belong to two featurally distinct categories in Grammar2: the former may not be followed by a noun phrase, the latter must be followed by a noun phrase. Hence, as far as Grammar2 is concerned, the second and third examples are ungrammatical.
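The way the two VP rules separate these four examples can be checked with a small Python sketch. It is a simplification under stated assumptions: the NP entries are assumed (they are carried over from Grammar1 and not listed here), every phrase is a single word, and the coordination rule is left out. The function names are hypothetical.

```python
# Feature bundles for a fragment of the Grammar2 lexicon.
# The NP entries are assumed; the text carries them over from Grammar1.
LEXICON = {
    "nurses": {"cat": "NP"},
    "patients": {"cat": "NP"},
    "medicenter": {"cat": "NP"},
    "died": {"cat": "V", "arg1": "0"},
    "employed": {"cat": "V", "arg1": "NP"},
}


def intransitive_vp(daughters):
    """VP -> V: <V arg1> = 0."""
    return (len(daughters) == 1
            and daughters[0].get("cat") == "V"
            and daughters[0].get("arg1") == "0")


def single_complement_vp(daughters):
    """VP -> V X: <V arg1> = <X cat>."""
    return (len(daughters) == 2
            and daughters[0].get("cat") == "V"
            and daughters[0].get("arg1") == daughters[1].get("cat"))


def is_sentence(sentence):
    """S -> NP VP, with single-word NPs only (a simplification)."""
    daughters = [LEXICON.get(word, {}) for word in sentence.lower().split()]
    if len(daughters) < 2 or daughters[0].get("cat") != "NP":
        return False
    vp = daughters[1:]
    return intransitive_vp(vp) or single_complement_vp(vp)


for example in ["Nurses died", "Nurses died patients",
                "MediCenter employed", "MediCenter employed nurses"]:
    print(example, "->", is_sentence(example))
```

Only the first and last examples satisfy one of the two VP rules, which matches the grammaticality judgements above.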

The fourth and final rule is a schema, which takes us beyond the domain covered by Grammar1. This schema introduces the coordinate construction and says that a given category can consist of two further instances of the same category separated by an item of category C, which will turn out to be realized as 'and' or 'or'.

Turning now to the lexicon for Grammar2, while we can simply carry over the NP entries from Grammar1, we do need to revise the V entries, and add a couple of C entries. Notice that a few extra verbs have been added to enhance the plausibility of the examples given.

Apart from the introduction of features, which at this stage may appear to have been much more trouble than it is worth, Grammar2 appears very little different from Grammar1. But there is one very fundamental difference between them. Grammar1 claimed that exactly 40 strings of words were grammatical instances of the category S, whereas Grammar2 admits infinitely many strings of words as grammatical instances of S. How can this be? The answer lies in the coordination schema we have introduced into Grammar2. This allows any category to split into two instances of the same category. These new instances may themselves split, and so on. This aspect of the grammar provides an example of recursion in syntactic rules (see the examples following Grammar3 for a rather different example of this phenomenon). Grammar2 permits recursion, whereas Grammar1 did not. This should become clearer from the example in Figure 4.2, which exhibits one of the infinitely many trees that Grammar2 admits as grammatical. As can be seen, we have two sentences coordinated, the second sentence itself containing a coordinate verb phrase. Thus, this example illustrates how the S category can reintroduce the S category, and how the VP category can reintroduce the VP category. The grammar imposes no limit on how many times categories can be reintroduced by the coordination rule, and so there is no limit on the number of expressions that Grammar2 can admit. Hence, Grammar2 admits the following examples:


Nurses died.
Nurses died and patients recovered.
Nurses died and patients recovered and Dr. Chan slept.
Nurses died and patients recovered and Dr. Chan slept and MediCenter employed nurses.
Nurses died and patients recovered and Dr. Chan slept and MediCenter employed nurses and ... .
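The recursion introduced by the coordination schema can be made concrete with a short Python sketch in which the test for S calls itself. It rests on several assumptions: the NP entries are assumed rather than taken from this section, phrases other than S are single words, and coordination is applied only at the S level (the schema in Grammar2 applies to any category). The function names are hypothetical.

```python
# A fragment of the Grammar2 lexicon; the NP entries are assumed.
LEXICON = {
    "nurses": {"cat": "NP"},
    "patients": {"cat": "NP"},
    "medicenter": {"cat": "NP"},
    "died": {"cat": "V", "arg1": "0"},
    "recovered": {"cat": "V", "arg1": "0"},
    "slept": {"cat": "V", "arg1": "0"},
    "employed": {"cat": "V", "arg1": "NP"},
    "and": {"cat": "C"},
    "or": {"cat": "C"},
}


def feature(word, name):
    """Look up one feature value of a word, or None if it is absent."""
    return LEXICON.get(word, {}).get(name)


def is_simple_s(words):
    """S -> NP VP, with VP -> V or VP -> V X, single-word phrases only."""
    if (len(words) < 2
            or feature(words[0], "cat") != "NP"
            or feature(words[1], "cat") != "V"):
        return False
    complement = words[2:]
    if feature(words[1], "arg1") == "0":
        return not complement
    return (len(complement) == 1
            and feature(complement[0], "cat") == feature(words[1], "arg1"))


def is_s(words):
    """S is a simple sentence, or S C S; the second case calls is_s again."""
    if is_simple_s(words):
        return True
    return any(
        feature(word, "cat") == "C" and is_s(words[:i]) and is_s(words[i + 1:])
        for i, word in enumerate(words)
    )


def grammatical(sentence):
    return is_s(sentence.lower().split())


print(grammatical("Nurses died and patients recovered and MediCenter employed nurses"))
```

Because `is_s` places no bound on how many conjunctions it will split on, there is no longest string that it accepts, which is the point made above about Grammar2.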

Although Grammar2 allows for the grammaticality of infinitely many sentences, we would rapidly grow bored with an enumeration of them. This is partly because the lexicon for Grammar2 is very small and so lengthy examples will force us into the repetition of lexical items. Let us therefore expand Grammar2 to Grammar3 and, in so doing, enlarge our lexicon by exploiting more fully the technique of using a feature (arg1) to encode the category of complements that we employed for subcategorization in Grammar2. The rules for Grammar3 are those employed in Grammar2 with one addition, a rule for expanding prepositional phrases:




Rule {prepositional phrases} PP -> P X: <P arg1> = <X cat>.

It is only when we come to the lexicon of Grammar3 that differences really become apparent. Grammar3 allows us to employ a much wider range of distinct subtypes of lexical item, including six different kinds of verb, by augmenting the lexicon of Grammar2.




Lexicon3

Word approved: <cat> = V <arg1> = PP.
Word disapproved: <cat> = V <arg1> = PP.
Word appeared: <cat> = V <arg1> = AP.
Word seemed: <cat> = V <arg1> = AP.
Word had: <cat> = V <arg1> = VP.
Word believed: <cat> = V <arg1> = S.
Word thought: <cat> = V <arg1> = S.
Word of: <cat> = P <arg1> = NP.
Word fit: <cat> = AP.
Word competent: <cat> = AP.
Word well-qualified: <cat> = AP.



Grammar3 provides us with structures for all the following examples, as well as an infinity of others:


Nurses thought Dr. Chan seemed competent.
Dr. Chan appeared well-qualified and disapproved of MediCenter.
Patients had believed nurses thought Dr. Chan had slept.

This last example illustrates recursion through the sentential (clausal) or verb phrase complements of a verb - we saw an RTN treatment of this phenomenon in Chapter 3. This is a very common form of recursion found in natural languages.

Our final example grammar will maintain the overall structure employed in Grammar2 and Grammar3. As Lexicon4 will be identical in every respect to Lexicon3, we shall not repeat it here. In fact, the only real change we will make in Grammar4 is the addition of a new feature, which we will call slash in virtue of a notational convention whereby we will use X/Y to represent a category X0 whose cat is X and whose slash is Y; that is, <X0 cat> = X and <X0 slash> = Y. The addition of this feature to the system requires certain consequential additions to the rules.

Expressions like S/NP and VP/PP, then, stand for a particular kind of category, but what kind? What is the intuitive content of the notation? The answer is quite straightforward: an expression of category X/Y is an expression of category X from which an expression of category Y is missing. Thus, an S/NP (read S-slash-NP) is a sentence (or clause) that has a noun phrase missing, whereas a VP/PP is a verb phrase that is missing a prepositional phrase. To make use of, or even to make sense of, these slash categories, we will need to make a number of additions to the rules that Grammar3 came equipped with.


Grammar4

Rule {simple sentence formation}
    S -> NP VP:
        <S slash> = <VP slash>
        <NP slash> = 0.
Rule {intransitive verb}
    VP -> V:
        <V arg1> = 0
        <V slash> = 0
        <VP slash> = 0.
Rule {single complement verbs}
    VP -> V X:
        <V arg1> = <X cat>
        <V slash> = 0
        <VP slash> = <X slash>.
Rule {prepositional phrases}
    PP -> P X:
        <P arg1> = <X cat>
        <P slash> = 0
        <PP slash> = <X slash>.
Rule {coordination}
    X0 -> X1 C X2:
        <X0 cat> = <X1 cat>
        <X0 cat> = <X2 cat>
        <C slash> = 0
        <X0 slash> = <X1 slash>
        <X0 slash> = <X2 slash>
        <X0 arg1> = <X1 arg1>
        <X0 arg1> = <X2 arg1>.
Rule {topicalization}
    X0 -> X1 X2:
        <X0 cat> = S
        <X1 empty> = no
        <X2 cat> = S
        <X2 slash> = <X1 cat>
        <X2 empty> = no
        <X0 slash> = <X1 slash>.
Rule {slash elimination}
    X0 -> :
        <X0 cat> = <X0 slash>
        <X0 empty> = yes.



The first five rules of Grammar4 are simply revisions of those in Grammar3 to allow for our new feature: in each case, the value of slash (if any) on the mother is equated with the value of slash (if any) on the complement daughter, or all the daughters, in the case of the coordination rule. The sixth and seventh rules are wholly new. The essence of the topicalization rule can be more perspicuously expressed using our slash notation:




S -> X S/X.

This rule says that a sentence can consist of some category followed by a sentence which is missing an expression of that category. This rule, taken together with the rest of Grammar4, will permit such sentences as the following (without the commas):


MediCenter, nurses disapproved of _.
Of MediCenter, nurses disapproved _.
Well-qualified, Dr. Chan had seemed _.

As can readily be seen, the sentences that follow the comma in these examples all have something missing. The rule that is responsible for this missing item in Grammar4 is the seventh one (slash elimination), which can also be notated as follows:




X/X -> .

This just says that an X missing an X can be realized as nothing.

Thus, NP/NP, PP/PP and AP/AP are all allowed to appear as nothing in the structures defined by Grammar4.

Note that our modifications to the rules in Grammar3 mean, for example, that a sentence missing a Y can consist of a noun phrase followed by a verb phrase missing a Y (where Y might be NP, say). So, what the modified rules now permit is a transfer of information about a missing category from mother to daughter. The way this works should become clearer by looking at the relevant tree exhibited in Figure 4.3.
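The passing of slash information from mother to daughter can be traced in a small Python recognizer. This is a sketch under several assumptions, not an implementation of the full grammar: the NP entries are assumed, the coordination rule is omitted, the lexicon is stored as (cat, arg1) pairs with slash taken to be 0 for every word, and the names `derives` and `topicalized` are hypothetical. The call `derives(cat, slash, words)` asks whether the words form a category cat/slash, with "0" meaning that nothing is missing.

```python
# word -> (cat, arg1); verb, preposition and adjective entries follow Lexicon3
# and Grammar2, the NP entries are assumed. Every word has slash = 0.
LEXICON = {
    "nurses": ("NP", None),
    "MediCenter": ("NP", None),
    "Dr. Chan": ("NP", None),
    "well-qualified": ("AP", None),
    "disapproved": ("V", "PP"),
    "seemed": ("V", "AP"),
    "had": ("V", "VP"),
    "slept": ("V", "0"),
    "of": ("P", "NP"),
}
CATEGORIES = ("NP", "PP", "AP", "VP", "S")


def derives(cat, slash, words):
    """True if words can be analysed as cat/slash ("0" = nothing missing)."""
    if not words:
        # Slash elimination: X/X can be realized as nothing.
        return cat == slash
    head_cat, head_arg = LEXICON.get(words[0], (None, None))
    if len(words) == 1 and slash == "0" and head_cat == cat:
        return True  # a single word with nothing missing
    if cat == "S":
        # S -> NP VP: the NP has slash 0, the VP carries the mother's slash.
        return any(derives("NP", "0", words[:i]) and derives("VP", slash, words[i:])
                   for i in range(1, len(words) + 1))
    if cat == "VP" and head_cat == "V":
        if head_arg == "0":
            return len(words) == 1 and slash == "0"  # intransitive verb
        return derives(head_arg, slash, words[1:])   # slash goes to the complement
    if cat == "PP" and head_cat == "P":
        return derives(head_arg, slash, words[1:])   # slash goes to the complement
    return False


def topicalized(words):
    """S -> X S/X: a non-empty X followed by a non-empty sentence missing an X."""
    return any(derives(x, "0", words[:i]) and derives("S", x, words[i:])
               for i in range(1, len(words)) for x in CATEGORIES)


print(topicalized(["MediCenter", "nurses", "disapproved", "of"]))
print(topicalized(["of", "MediCenter", "nurses", "disapproved"]))
print(topicalized(["well-qualified", "Dr. Chan", "had", "seemed"]))
```

In each call the slash value is handed unchanged from S to VP and from VP or PP to its complement, until it reaches a position where the category sought equals the slash value and the remaining string is empty.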


There is, in principle, no limit on the amount of intervening material that can occur between the displaced constituent at the front of the sentence and the empty constituent that corresponds to it and which occurs within the sentence. Such constructions manifest what are known as unbounded dependencies and are surprisingly common in the world's languages. English, for example, makes extensive use of this type of construction, most notably in questions and relative clauses. In Chapter 3 we indicated how an ATN treatment might tackle such constructions using a global register HOLD to remember the displaced material.

Such a procedural approach, which is specialized to parsing, differs substantially from the declarative description of the slash feature given in Grammar4. Although the analysis of topicalization given here is rudimentary, it does serve to illustrate the sort of feature-passing technique that is very common in current computational linguistic work on syntax.

The lexicon of Grammar3 is almost adequate for Grammar4, except for one problem. The grammar rules explicitly require that words like verbs and prepositions have 0 as the value of their slash feature. This is to prevent such words being extraposed from their normal position and untopicalized constructions having gaps. As a result, lexical entries need to state their value of slash explicitly, as in:




Word died: <cat> = V <slash> = 0 <arg1> = 0.

It is straightforward, but laborious, to make the necessary alterations to all the lexical entries for Grammar3. The use of macros in lexical entries, an idea to be introduced in Chapter 7, is one possible way of making such global assignments of features to lexical items.

Notice that our use of features in Grammar2, Grammar3 and Grammar4 has been very restricted. We have simply replaced monadic category names like S and NP with small finite sets of attribute-value pairs, where both attribute and value are drawn from small finite sets of atomic elements. Consequently, all these grammars could be converted into equivalent grammars with monadic categories quite unproblematically, although there are no obvious computational reasons for doing so. For instance, the rule:




Rule {simple sentence formation} S -> NP VP: <S slash> = <VP slash>.

could be replaced by a finite number of rules like:




Rule S/S -> NP VP/S.
Rule S/NP -> NP VP/NP.
Rule S/VP -> NP VP/VP.
Rule S/AP -> NP VP/AP.

where S/S, VP/S and so on, are to be read as single symbols, just like NP and S. Since the number of categories in these grammars is finite, and the rules are context free and finite in number, we have not left the domain of CF-PSGs, even though the descriptive apparatus used is much more sophisticated than that employed in Grammar1. We consider a mathematically more far-reaching use of the feature system offered by the PATR formalism at the end of this chapter.
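The conversion to monadic categories is mechanical, as the following Python sketch suggests. It is an illustration only: the function name `expand_slash` is hypothetical, and the list of slash values is limited to the four shown in the rules above, whereas a full conversion would cover every value that slash can take.

```python
# The four slash values used in the rules shown in the text; a complete
# conversion would enumerate every possible value of slash.
SLASH_VALUES = ["S", "NP", "VP", "AP"]


def expand_slash(mother, daughters, carrier):
    """Replace one feature-based rule by one monadic rule per slash value.

    `carrier` is the daughter whose slash is equated with the mother's slash.
    """
    rules = []
    for value in SLASH_VALUES:
        rhs = [f"{d}/{value}" if d == carrier else d for d in daughters]
        rules.append(f"Rule {mother}/{value} -> {' '.join(rhs)}.")
    return rules


for rule in expand_slash("S", ["NP", "VP"], "VP"):
    print(rule)
```

Each string such as `S/NP` is treated here as a single unanalysed symbol, so the output is an ordinary context-free rule set.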

The grammars we have been elaborating are what linguists would call 'toy grammars'. They serve to illustrate something of the way contemporary feature-theoretic grammars work, but they should not be taken too seriously as analyses of English. The fact that Grammar3 and Grammar4 work as well as they do owes a lot to the particular lexicon that they employ. The reader who has mastered the mechanics of at least Grammar3 might usefully consider the questions that follow.

Exercise 4.4

This document was translated by troff2html v0.21 on October 22, 1996.