So you've stayed up until the small hours, night after night, and you finally have yourself a parser. Players can type in complicated imperative sentences and your program will accept them or generate a stream of expletives. How do you use it, though?
Well, what does it produce? It's fairly trivial to get it to record what parts of speech were assigned to a token: you just write the successful ones into a separate array during the parse() stack unwinding. You can do more, though: you can create a parse tree.
A parse tree is a data structure that represents not only what the tokens in an input line mean, but how they were generated. It's where the earlier decision of what to make into the rules of your BNF grammar finally becomes important.
Here's an outline example of what a parse tree looks like:
                          <input line>
                                |
                           <sentence>
                                |
                            <command>
                                |
                           <command 2>
                                |
     +-----------------+--------+--------+-----------------+
     |                 |                 |                 |
     |           <noun phrase>           |           <noun phrase>
     |                 |                 |                 |
     |           <noun group>            |           <noun group>
     |                 |                 |                 |
     |           +-----+-----+           |           +-----+-----+
     |           |           |           |           |           |
    WITH        THE       DIAMOND       RING        THE         BELL
preposition   article       noun        verb      article       noun
From this data structure (which you can build up piecemeal on the unwind of the recursion, a tiresome but not time-consuming exercise) it's possible to identify not only what the individual words refer to, but what the individual rules do. Thus, if you want to find out what the second noun group of a command is, you really can just read it off the data structure.
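To make "reading it off" concrete, here is a minimal Python sketch of one way such a tree might be represented. The Node class and the find_all and words helpers are hypothetical names invented for illustration, and the tree is assembled by hand in the order the unwinding recursion would produce it, rather than by a real parser.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a parse tree: a grammar rule, or a token at a leaf."""
    rule: str                      # rule name, or part of speech at a leaf
    word: Optional[str] = None     # set only on leaves
    children: List["Node"] = field(default_factory=list)


def find_all(node, rule):
    """Return every subtree produced by `rule`, in left-to-right order."""
    found = [node] if node.rule == rule else []
    for child in node.children:
        found.extend(find_all(child, rule))
    return found


def words(node):
    """Flatten a subtree back into the words it covers."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in words(child)]


def noun_phrase(article, noun):
    """Hand-build the <noun phrase> -> <noun group> -> article noun subtree."""
    group = Node("noun group", children=[Node("article", article),
                                         Node("noun", noun)])
    return Node("noun phrase", children=[group])


# The tree from the diagram: WITH THE DIAMOND RING THE BELL
command2 = Node("command 2", children=[
    Node("preposition", "WITH"),
    noun_phrase("THE", "DIAMOND"),
    Node("verb", "RING"),
    noun_phrase("THE", "BELL"),
])
tree = Node("input line", children=[
    Node("sentence", children=[Node("command", children=[command2])])])

second_noun_group = find_all(tree, "noun group")[1]
print(words(second_noun_group))   # ['THE', 'BELL']
```

Because each node records the rule that produced it, asking for "the second noun group" is just a tree search for that rule name.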
If you prefer a brute-force approach, then you don't strictly need the data structure: you can rip elements out of the token array by relying on what you know about your grammar (e.g. anything between a preposition and either a verb, preposition or end of sentence is a noun phrase, if you ignore the adverbs and hack your way round conjunctions). In the case where your parser is only partially backtracking, like MUD2's, you'll pretty well have to do it this way.
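A rough Python sketch of the brute-force idea follows. It assumes tokens arrive as (word, part of speech) pairs and treats every verb or preposition as a boundary, which stretches the rule of thumb above slightly; adverbs are skipped and conjunctions are not handled at all. The function name noun_phrases is made up for this example.

```python
def noun_phrases(tokens):
    """Split one command's (word, part_of_speech) tokens into noun phrases.

    Verbs and prepositions act as boundaries, adverbs are ignored, and
    whatever lies between two boundaries is taken to be a noun phrase.
    """
    phrases, current = [], []
    for word, pos in tokens:
        if pos == "adverb":
            continue                      # adverbs never belong to a noun phrase
        if pos in ("verb", "preposition"):
            if current:                   # a boundary closes the open phrase
                phrases.append(current)
                current = []
        else:
            current.append(word)
    if current:                           # end of sentence closes the last phrase
        phrases.append(current)
    return phrases


tokens = [("WITH", "preposition"), ("THE", "article"), ("DIAMOND", "noun"),
          ("RING", "verb"), ("THE", "article"), ("BELL", "noun")]
print(noun_phrases(tokens))   # [['THE', 'DIAMOND'], ['THE', 'BELL']]
```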
Making meaning
So it's possible to group tokens together into meaningful grammatical units. Now you have to figure out what they mean in game terms.
This is where the vocabulary comes in. I described the vocabulary way back in column 11 as a set of triples: <word, part of speech, atom>. Tokenisation used the first two of these: it converted the words that players entered into tokens. Now we use the second two: we convert the tokens into atoms. Atoms are things the game can use; they refer unambiguously to a single concept (although not necessarily to a single referent of that concept, as we shall see when we discuss the binder).
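As a small illustration, the triples could be held in a table like the one in this Python sketch. The entries, the atom names (such as ring_item) and both helper functions are invented for the example, and it assumes a token is simply a (word, part of speech) pair.

```python
# Hypothetical vocabulary of <word, part of speech, atom> triples.
VOCABULARY = [
    ("take", "verb", "take"),
    ("get", "verb", "take"),        # synonym: same atom as "take"
    ("ring", "verb", "ring"),
    ("ring", "noun", "ring_item"),  # same word, different part of speech
    ("bell", "noun", "bell"),
    ("the", "article", "the"),
]

# Index on the (word, part of speech) pair for the token-to-atom step.
ATOM_FOR = {(word, pos): atom for word, pos, atom in VOCABULARY}


def parts_of_speech(word):
    """First two fields: what a typed word could be (used when tokenising)."""
    return [pos for w, pos, _ in VOCABULARY if w == word.lower()]


def to_atoms(tokens):
    """Last two fields: turn parsed (word, part_of_speech) tokens into atoms."""
    return [ATOM_FOR[(word.lower(), pos)] for word, pos in tokens]


print(parts_of_speech("RING"))
# ['verb', 'noun']
print(to_atoms([("GET", "verb"), ("THE", "article"), ("RING", "noun")]))
# ['take', 'the', 'ring_item']
```

Note how GET and TAKE collapse to one atom, while RING maps to different atoms depending on the part of speech the parser settled on.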
MUDs are focussed on commands. These are unitary instructions telling the game to do something. The parse tree lets us see how the input line was split into sentences, and how sentences were split into commands, so we know what each individual command consists of in terms of tokens (and therefore atoms). How does that give us a handle on what the sentence means, though?
OK, this next bit comes from AI computational linguistics. It's pretty obvious anyway, so I'm just going to say it and not explain the rationale...
The verb of a command is the function in a function call. The first noun group (if it has one) is the call's first parameter; the second noun group (if it has one) is its second parameter. Adverbs and prepositions are functions that modify the verb; adjectives are functions that modify what the noun refers to; pronouns are dynamically assigned nouns.
So what you need to produce from your parse tree is a data structure of atoms that reflects this information, and which is implemented in a form directly acceptable by your game engine. How you decide to do this is up to you, but for the sake of simplicity I'll assume it to be a list with nested sublists:
[ verb [verb modifiers] [noun phrases] ]
The verb modifiers are just the adverbs and prepositions that the command contains. The noun phrases themselves have their own particular format:
[ noun [noun modifiers] ]
where the noun modifiers are the adjectives and qualifying noun phrases associated with the noun.
Examples
Here are a couple of examples I've used before in this series of articles:
WITH THE DIAMOND RING THE BELL
[ ring, [with], [ [diamond, [the]], [bell, [the]] ] ]
TAKE THE GREEN APPLE FROM THE BOX THEN HIT IT WITH MY SWORD
[take, [from], [ [apple, [the, green]], [box, [the]] ] ]
[hit, [with], [ [it, []], [sword, [my]] ] ]
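For simple commands like these, the list can be assembled in a single pass over the tokens. The Python sketch below assumes each token has already been turned into an (atom, part of speech) pair, and that articles and adjectives always precede the noun they modify; build_command is a hypothetical name, and conjunctions and qualifying noun phrases (EXCEPT, AND) are deliberately left out.

```python
def build_command(tokens):
    """Turn one command's (atom, part_of_speech) tokens into
    [verb, [verb modifiers], [noun phrases]]."""
    verb = None
    verb_modifiers = []
    noun_phrases = []
    pending = []                           # noun modifiers awaiting their noun
    for atom, pos in tokens:
        if pos == "verb":
            verb = atom
        elif pos in ("adverb", "preposition"):
            verb_modifiers.append(atom)
        elif pos in ("noun", "pronoun"):
            noun_phrases.append([atom, pending])
            pending = []
        else:                              # articles, adjectives and the like
            pending.append(atom)
    return [verb, verb_modifiers, noun_phrases]


# WITH THE DIAMOND RING THE BELL
print(build_command([("with", "preposition"), ("the", "article"),
                     ("diamond", "noun"), ("ring", "verb"),
                     ("the", "article"), ("bell", "noun")]))
# ['ring', ['with'], [['diamond', ['the']], ['bell', ['the']]]]

# HIT IT WITH MY SWORD
print(build_command([("hit", "verb"), ("it", "pronoun"),
                     ("with", "preposition"), ("my", "adjective"),
                     ("sword", "noun")]))
# ['hit', ['with'], [['it', []], ['sword', ['my']]]]
```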
Here's a big, meaty example to show you the kind of thing you'll be able to boast your parser can handle. I'll write out its list in a more structured fashion so you can see how it's made up without having to count brackets:
VERY QUICKLY DROP THE TREASURE EXCEPT THE EMERALD AND THE SMALL WHITE STONE INSIDE THE OPEN MUSIC BOX SECRETIVELY
[ drop,
  [inside, very, quickly, secretively],
  [ [treasure,
      [the,
        [except,
          [and,
            [ [emerald,
                [the]
              ],
              [stone,
                [the, small, white]
              ]
            ]
          ]
        ]
      ]
    ],
    [box,
      [the, open, music]
    ]
  ]
]
This list is the result of the parsing module, and it's passed on to the binder. The binder has to take the list, and use its own knowledge of the game world's state to derive individual function calls. You don't have to write the binder in your game world's definition language, but it's a darned sight easier if you do (MUD2's is written in MUDDLE like the rest of the game now; I learned the hard way...). The binder is the final step of the whole parsing process, but it can be among the most tortuous and frustrating of them all because there are so many special cases involved.
I'll begin my discussion of the binder next time...