We will now present a modified version for RTNs (appearing as "rtnrecg.pl" in the appendix) of the FSTN recognizer program. The program harnesses Prolog's backtracking ability to perform a depth-first search of the search space, as before. In addition, it uses recursion in Prolog to implement recursive network traversal and is hence able to dispense with an explicit push-down stack.
The RTN recognizer several times uses pairs of arguments as difference lists to represent strings. Though initially somewhat unintuitive, difference lists are an efficient way of manipulating lists of symbols in Prolog, and they play a crucial role in the interpretation of the Definite Clause Grammar formalism that we shall discuss in detail in the next chapter. Readers unfamiliar with difference lists should probably take time out to review the coverage of the topic in a Prolog text at this point.
Each main predicate responsible for recognition is given two arguments concerned with sequences of words (lists). The first of the pair of arguments represents an initial list, some initial portion of which is to be recognized. The second of the pair represents the end portion of the first list which is "left over" after the recognition. When such a predicate is invoked during recognition, its task is to find possible initial portions of the first list which can be recognized, "consume" the relevant words from the list and return what remains as the "left over" list.
As you would expect, implementing RTN traversal is slightly more complex than implementing FSTN traversal since, in the former, we need to keep track of which network we are in, and we need some way of dropping down into subnetworks and of climbing back out again. As before, we shall split the recognition task beween two predicates, traverse/3 and recognize/4.
The former is a suitably modified version of our previous FSTN traverse predicate and the latter is similar to the two argument recognize predicate, but with the addition of an extra argument to keep track of the name of the current network and an extra argument to keep track of any string "left over" after the traversal:
% recognize(Net,Node,WordList,Left) % Net: the name of the network to be traversed % Node: the name of the node you wish to start from % WordList: list of words which you want to test % Left: the list of words left over after the traversal
Notice that we have chosen to use two arguments to represent the initial location - the network and the node within it - rather than combining them into a single object, as would be directly suggested by our "Net/Node" notation. As with the FSTN recognize, the predicate recognize has two clauses. Firstly, if we have reached the final node of the (sub)net, we can successfully recognize without consuming any of the string:
recognize(Net,Node,X,X) :- final(Node,Net).
Alternatively, we want to recognize a string X if (i) we can consume part of the string traversing an arc from the current node, leaving string Y over and (ii) we can recognize the remainder Y, continuing from where that arc ended up. The string "left over" Z from such a recognition is then whatever was left over by that.
recognize(Net,Node_1,X,Z) :- arc(Node_1,Node_2,Label,Net), traverse(Label,X,Y), recognize(Net,Node_2,Y,Z).
Thus recognize delegates part of the job to traverse, and it is to the latter that we turn next:
% traverse(Label,WordList,Left) % Label: an arc label specifying a kind of test % WordList: list of words which you want to test % Left: the list of words left over after the traversal of the arc
There are four possibilities for success, depending on what kind of test on the input string the arc label specifies. First of all (SYM'), if the label is not a special symbol, we can consume one word from the string if it is the same as the arc label, returning the rest of the string.
traverse(Word,[Word|X],X) :- not(special(Word)).
Second (ABB'), if the label specifies an abbreviation, we can consume one word from the string if it is the an instance of the appropriate category, returning the rest of the string.
traverse(Category,[Word|X],X) :- word(Category,Word).
Third (PUSH/POP), if the arc label names a subnet we can traverse the arc by traversing the named network:
traverse(Net,String,Left) :- initial(Node,Net), recognize(Net,Node,String,Left).
Notice that when we recursively traverse a subnetwork we do not need explicitly to remember where to return to afterwards. This is because if the recursive call to recognize succeeds Prolog will automatically continue with whatever remains to be done after that (succeeding the traverse goal and then continuing in the recognize clause that called it). So the PUSH/POP operations are conveniently implemented already by Prolog's call/return mechanism. The final clause for traverse (JMP') handles the case where the label on the arc is the jump symbol. In this case we can succeed without consuming any string:
traverse('#',X,X).
Between them, the mutually recursive definitions of predicates traverse and recognize handle the problem of RTN traversal in a compact and elegant manner.
Code:rtnpars.pl
Send us a comment.