Pushdown transducers

We saw in the last chapter how an FST is simply an FSA that deals with two tapes. Similarly, we can construct a pushdown transducer (PT) as a pushdown automaton that deals with two tapes. To specify a PT, it suffices to augment the RTN notation so that labels on arcs can denote pairs of symbols, just as in the finite-state case. In general, a label in a PT (or, to be precise, an RTN that is interpreted as a PT) may specify any of the following:


a symbol to be found on each tape - for example, fish_poisson - where either or both symbols may be #. an abbreviation for a set of such symbol pairs. the name of another PT to be traversed.

With the extra power of RTNs, we can write interesting transducers quite easily. For instance, an ELIZA PT can be constructed from the previous RTN translated into a transducer.


BOX:


Name ELIZA: Initial 1 Final 2 From 1 to 2 by EVERYONE-ME From 1 to 2 by ... ...many more arcs...

Name EVERYONE-ME: Initial 1 Final 4 From 1 to 2 by everyone_'who do you think' From 2 to 3 by VERB-GROUP From 3 to 4 by me_you From 4 to 4 by EQUAL.

Name VERB-GROUP: Initial 1 Final 2 From 1 to 11 by has_has From 11 to 2 by V-PERF. ...

EQUAL abbreviates: a_a, about_about, acorn_acorn, ...

V-PERF abbreviates: forgotten_forgotten, punished_punished, hated_hated, loved_loved, ...



It is similarly easy to produce a simple PT for limited English-French translation. Apart from the alterations to deal with gender in noun phrases, this PT is a direct translation of our previous RTN for English sentences.


BOX:


Name S: Initial 0 Final 2 From 0 to 1 by NP From 1 to 2 by VP.

Name NP: Initial 0 Final 2 From 0 to 1 by DET-FEMN From 1 to 2 by N-FEMN From 0 to 4 by DET-MASC From 4 to 2 by N-MASC From 2 to 3 by WH From 3 to 2 by VP.

Name VP: Initial 0 Final 1,2 From 0 to 1 by V From 1 to 2 by NP From 1 to 3 by that_que From 3 to 2 by S.

N-MASC abbreviates: man_homme, horse_cheval, ...

N-FEMN abbreviates: house_maison, table_table, ...

NP abbreviates: John_Jean, Mary_Marie, Jean_Jeanne, ...

DET-MASC abbreviates: a_un, the_le, this_ce, ...

DET-FEMN abbreviates: a_une, the_la, this_cette, ...

V abbreviates: sees_voit, hits_frappe, sings_chante, lacks_manque, ...

WH abbreviates: who_qui, which_qui, that_qui.



Note that, when used to produce French output from English input, this system is doing little more than word-by-word translation, and it is easy to find examples where this will not work for French or other languages. On the other hand, the fact that the transducer incorporates a notion of what a grammatical sentence is in each language means that it could be used for other purposes - for instance, generating random pairs of legal and equivalent sentences.

Given that an RTN has successfully recognized a sentence as grammatical in some language, we might be interested to see which route the RTN interpreter took through the network for this sentence. That is, we might be interested to see a trace of which networks have been entered and exited, which word categories have been found when, and so on.

One possibility would be to augment our RTN interpreter to keep track of this information and display it at the end of recognition. Alternatively, we could construct a transducer that produces this information as its output.


BOX:




Name S: Initial 100 Final 200 From 100 to 0 by #_(S From 0 to 1 by NP From 1 to 2 by VP From 2 to 200 by #_).

Name NP: Initial 100 Final 200 From 100 to 0 by #_(NP From 0 to 1 by DET From 1 to 2 by N From 2 to 3 by WH From 3 to 2 by VP From 2 to 200 by #_).

Name VP: Initial 300 Final 100, 200 From 300 to 0 by #_(VP From 0 to 1 by V From 1 to 2 by NP From 1 to 3 by that_that. From 3 to 2 by S From 1 to 100 by #_) From 2 to 200 by #_).

N abbreviates: man_N, house_N, table_N, ...

NP abbreviates: Mayumi_NP, Maria_NP, Gilbert_NP, ...

DET abbreviates: a_DET, the_DET, that_DET, ...

V abbreviates: sees_V, hits_V, sings_V, lacks_V, ...

WH abbreviates: who_WH, which_WH, that_WH.



This transducer, which is again derived straightforwardly from our English sentence RTN, can be used to transduce a sentence into a trace showing the networks that the RTN would traverse in accepting it. Each network has been augmented with extra arcs at the beginning and end (one for each initial and final state), which produce output on the second tape recording entry to and exit from that network. A symbol like '(S' on the output indicates entry to the S network; the symbol ')' on the output indicates exit from the network last entered. In addition, we have each instance of an abbreviation or particular word encountered leave behind a trace of its occurrence. Thus, for instance, the input sentence 'Mayumi sees that the man sings' would produce essentially the following output on the second tape:


(S (NP NP) (VP V that (S (NP DET N) (VP V))))

At present, our networks are so simple that it is not possible for a sentence to be recognized in more than one way. Hence, this transducer will always generate exactly one trace for each legal sentence. We can change this by allowing prepositional phrases in our sentences. At its simplest, a prepositional phrase consists of a preposition followed by a noun phrase - for instance, 'with the funny hat' and 'behind Washington'. A noun phrase can have any number of prepositional phrases at its end, as in:


The man with the funny hat behind Washington

In this case, the prepositional phrases are telling us more about whoever it is being described by the noun phrase. Prepositional phrases can also occur at the end of verb phrases to indicate where, when or how the action takes place, as in:


Maria sang with the choir in the large room.

Sometimes it is not clear whether a given prepositional phrase belongs to a noun phrase or to a verb phrase, as in:


Maria saw the woman with the telescope.

This ambiguity corresponds to different ways in which this could be recognized as a sentence.

Exercise 3.4

Code:rtrans.pl

Send us a comment.



[Contents] [Previous] [Next]
This document was translated by troff2html v0.21 on October 22, 1996.