Notes from the Dawn of Time

Notes from the Dawn of Time #20:

Binding Noun Phrases

by Richard Bartle
June 12, 2002

The issue with binding noun phrases is how to get from something like

[treasure, [the, [except, [and, [[emerald, [the]], [stone, [the, small, white]]]]]]]

to something like

[ruby2, goblet0, tapestry1].

The former is a binding for a noun phrase, whereas the latter is a list of those objects that satisfy the noun phrase.

Well, it’s all done using sets. Sets are mathematical entities that are either empty or consist of a finite (in our case) number of objects with no repetitions. What we want is a set of those objects that satisfy the noun phrase.

To get a set from an atom representing a noun is no problem – I explained how to do it last time. Briefly, you regard the “contains” hierarchy of objects in the vicinity as if it were a tree, and you search through it (in an order determined by the verb) to pull out all those objects that are “a kind of” the atom that the noun represents. If you find none, it’s an error – “There is no elephant here”. If you find one or more, it’s a set.
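As a rough illustration of that search, here is a minimal Python sketch. The `Obj` class, its `kinds` and `contents` fields and the `bind_noun` function are all names invented for this example, and the fixed depth-first order is an assumption: in a real binder the order would depend on the verb.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare objects by identity, not by field values
class Obj:
    name: str
    kinds: frozenset                       # every class this object is "a kind of"
    contents: list = field(default_factory=list)


def bind_noun(root, noun):
    """Walk the "contains" tree depth-first and collect objects that are a kind of noun."""
    found = []
    stack = [root]
    while stack:
        obj = stack.pop()
        if noun in obj.kinds and obj not in found:
            found.append(obj)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(obj.contents))
    if not found:
        raise LookupError(f"There is no {noun} here.")
    return found


room = Obj("room0", frozenset({"room"}), [
    Obj("bag0", frozenset({"container"}), [
        Obj("ruby2", frozenset({"gem", "treasure"})),
    ]),
    Obj("goblet0", frozenset({"treasure"})),
])

print([o.name for o in bind_noun(room, "treasure")])  # ['ruby2', 'goblet0']
```

The result is kept as an ordered list with no repeats rather than a Python `set`, so that the search order survives for later steps that pick "the first" element.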

Adjectives

Adjectives act as filters on a set. If binding the noun had given the set [obj1, obj2, obj3], of which obj2 and obj3 were small, then applying the adjective [small] to the set would result in [obj2, obj3] – because they’re the only small ones. This new set would then be filtered by later adjectives to pare it down further.

Articles are linked to the idea of plurals. THE SMALL WHITE STONE means one such stone, whereas THE SMALL WHITE STONES means all of them. A SMALL WHITE STONE also means just one, and A SMALL WHITE STONES will either be flagged as an error because it doesn’t fit your grammar or be let through as an act of compassion for illiterate players. Players tend to miss out articles anyway for reasons of speed, so most of the time you’ll be relying on some default (e.g. assume THE is present).

In pseudocode, adjective processing would be:

{ adjective list }:
begin
    S is an empty list
    for each element E in the list,
        if adjective(E) is true then
            if E isn't a member of S then
                insert E into S
    return S
end
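The same filter might look like this in Python. This is only a sketch: it assumes each adjective is available as a predicate function, and the names `apply_adjective`, `apply_adjectives`, `is_small` and `is_white` are made up for the example.

```python
def apply_adjective(candidates, adjective):
    """Keep the candidates for which the adjective predicate holds, without repeats."""
    result = []
    for element in candidates:
        if adjective(element) and element not in result:
            result.append(element)
    return result


def apply_adjectives(candidates, adjectives):
    """Apply each adjective in turn, paring the set down further each time."""
    for adjective in adjectives:
        candidates = apply_adjective(candidates, adjective)
    return candidates


# Hypothetical object properties, just for the demonstration.
sizes = {"obj1": "large", "obj2": "small", "obj3": "small"}
colours = {"obj1": "white", "obj2": "white", "obj3": "grey"}


def is_small(obj):
    return sizes[obj] == "small"


def is_white(obj):
    return colours[obj] == "white"


print(apply_adjectives(["obj1", "obj2", "obj3"], [is_small]))            # ['obj2', 'obj3']
print(apply_adjectives(["obj1", "obj2", "obj3"], [is_small, is_white]))  # ['obj2']
```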

Articles

To apply the binding for an article, all the remaining noun modifiers must be applied first and the atom representing the noun must be recognisable as being singular or plural (words like EVERYTHING and TREASURE count as plural). For THE with a plural noun, the set produced is untouched; in all other cases, the set is reduced to one in length. You could be picky and in the case of THE complain if the set contained more than one element – GET THE NAIL when there are 50 present doesn’t actually make sense, but treating it like GET A NAIL and choosing one at random is fine (and kind to your players). Numerals work the same way, but they select more than one: GET 20 NAILS truncates the set of possible nails to 20 in length.

The code for articles and numerals (which I’m not going to give here – write it yourself!) is messier than for adjectives because it needs to know not only the binding achieved for the noun phrase thus far, but also the status of the governing noun (plural or singular) and the remaining work to be done on this noun phrase (so it can do it before truncating the set).
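For what it's worth, the final truncation step on its own is small. The sketch below covers only that step, on the assumption that the caller has already applied every other modifier and knows whether the noun is plural; the function name and the way determiners are encoded ("the", "a", or an integer) are inventions for this example. It also takes the first element rather than a random one.

```python
def apply_determiner(candidates, determiner, plural):
    """Truncate a fully filtered set according to an article or a numeral.

    determiner is "the", "a", or an int (a numeral such as 20).
    plural says whether the governing noun is plural.
    """
    if isinstance(determiner, int):
        return candidates[:determiner]   # GET 20 NAILS: at most 20 of them
    if determiner == "the" and plural:
        return list(candidates)          # THE + plural noun: set is untouched
    return candidates[:1]                # every other case: reduce to one


nails = [f"nail{i}" for i in range(50)]
print(len(apply_determiner(nails, "the", plural=True)))   # 50
print(apply_determiner(nails, "the", plural=False))       # ['nail0']
print(len(apply_determiner(nails, 20, plural=True)))      # 20
```

The messy part described above, working out what else still has to be applied before truncating, is deliberately left out.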

Conjunctions

There are two basic kinds of conjunction: inclusive and exclusive.

AND is an inclusive conjunction. Normally in set theory, AND would equate to the intersection of two sets, but here it means their union. If you say DROP BATS AND BALLS you mean drop the bats and drop the balls, not drop those things that are both a bat and a ball.

BUT is an exclusive conjunction. It works as a filter: anything that appears in the second set is removed from the first. DROP TREASURE BUT GEMS takes the set of objects that are bound to TREASURE, the set of objects that are bound to GEMS, and removes all elements present in the latter from the former.

OF is another exclusive conjunction. This one works as a straight intersection. GET CROCK OF GOLD means get the intersection of all those items present that are crocks and all those that are gold. It’s good for implementing playing cards: GET QUEEN OF SPADES.

You can have other conjunctions; in particular, players may try OR. Although DROP TREASURE OR GEMS really ought to choose randomly between the two lists (i.e. just the treasure or just the gems), you can make OR a synonym of AND and get away with it.
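The conjunctions described in this section map onto three set operations. Here is a sketch using ordered lists without repeats; the function names and the `CONJUNCTIONS` table are assumptions made for the example, and OR is simply treated as a synonym of AND.

```python
def union(first, second):
    """AND: everything in either set, keeping first-seen order."""
    result = list(first)
    for item in second:
        if item not in result:
            result.append(item)
    return result


def difference(first, second):
    """BUT: remove from the first set anything that appears in the second."""
    return [item for item in first if item not in second]


def intersection(first, second):
    """OF: keep only the items that appear in both sets."""
    return [item for item in first if item in second]


CONJUNCTIONS = {"and": union, "or": union, "but": difference, "of": intersection}

treasure = ["ruby2", "goblet0", "emerald1"]
gems = ["ruby2", "emerald1"]
crocks = ["crock0", "crock1"]
gold = ["crock1", "coin3"]

print(CONJUNCTIONS["but"](treasure, gems))  # ['goblet0']
print(CONJUNCTIONS["of"](crocks, gold))     # ['crock1']
print(CONJUNCTIONS["and"](gems, crocks))    # ['ruby2', 'emerald1', 'crock0', 'crock1']
```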

Pronouns

Pronouns are notoriously tricky to get right in English, so the solution is to accept this and to be methodical about them instead. That way, the players will know what to expect even if it’s not the full extent of what English allows.

In MUD terms, pronouns are of two types: bound and unbound. A bound pronoun is one which is defined earlier in the input line, as in PICK UP THE ARROW AND PUT IT IN THE QUIVER. An unbound pronoun is one that isn’t, as in GET IT. For the former, you have to keep track of the last possible binding for every pronoun (IT, HIM, HER, THEM, ME etc.), which is yet another layer of tiresome complexity for the binder. For the latter, well, you could do it by tracking objects mentioned in room descriptions if your system is good enough. Alternatively, you can simply bind it to a general class, e.g. IT is treated like it was EVERYTHING and THEM is treated like it was EVERYONE. This isn’t so accurate, but it’s a reasonable hack that players will come to know and love, such that if you ever decide to change it they’ll moan at you...
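One way to picture the two cases is a small table, created afresh for each input line, that remembers the last binding for each pronoun and falls back to a general class when there isn't one. Everything here (`PronounTable`, `DEFAULT_CLASS`, `CLASSES`, `bind_class`) is a hypothetical name for the sake of the sketch.

```python
# Fallback classes for unbound pronouns, as described above.
DEFAULT_CLASS = {"it": "everything", "them": "everyone"}

# Stand-in for whatever binds a class name to the objects present.
CLASSES = {"everything": ["arrow0", "quiver0"], "everyone": ["goblin0"]}


def bind_class(name):
    return list(CLASSES[name])


class PronounTable:
    """Tracks the last possible binding for each pronoun within one input line."""

    def __init__(self):
        self.last = {}

    def remember(self, pronoun, objects):
        self.last[pronoun] = list(objects)

    def resolve(self, pronoun):
        if pronoun in self.last:                    # bound: defined earlier in the line
            return self.last[pronoun]
        return bind_class(DEFAULT_CLASS[pronoun])   # unbound: general class


table = PronounTable()
print(table.resolve("it"))          # ['arrow0', 'quiver0']  (GET IT)
table.remember("it", ["arrow0"])    # PICK UP THE ARROW ...
print(table.resolve("it"))          # ['arrow0']             (... AND PUT IT IN THE QUIVER)
```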

HIM and HER border on being dynamically assigned. If you have one class for all the male characters and another for all the female characters, that’s fine. If not, though, then you need to use the class of all characters (EVERYONE), then apply male or female as an adjective to it. This would mean the binding involved some functionality, hence the term dynamically assigned.

In practice, any part of speech can be dynamically assigned if its meaning changes depending on who uses it. Examples: ME (pronoun), MY (adjective), FOE (noun – the person you’re fighting), SPEAKER (noun – the person who last spoke to you). Again, these are not difficult to implement in themselves, but they add further special cases to the binding process that complicate it still further.
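A dynamically assigned word can be thought of as a function of whoever is using it. The sketch below assumes a `Character` record with `sex`, `foe` and `last_speaker` fields, all of which are illustrative names rather than anything from a real MUD. HIM and HER are done the second way described above, as EVERYONE filtered by an adjective.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # characters are compared by identity
class Character:
    name: str
    sex: str
    foe: Optional["Character"] = None            # who this character is fighting
    last_speaker: Optional["Character"] = None   # who last spoke to this character


def bind_dynamic(word, user, everyone):
    """Bind a word whose meaning depends on who uses it."""
    if word == "me":
        return [user]
    if word == "foe":
        return [user.foe] if user.foe else []
    if word == "speaker":
        return [user.last_speaker] if user.last_speaker else []
    if word == "him":
        return [c for c in everyone if c.sex == "male"]
    if word == "her":
        return [c for c in everyone if c.sex == "female"]
    raise KeyError(word)


alice = Character("alice", "female")
bob = Character("bob", "male", foe=alice)
everyone = [alice, bob]

print([c.name for c in bind_dynamic("foe", bob, everyone)])  # ['alice']
print([c.name for c in bind_dynamic("her", bob, everyone)])  # ['alice']
```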

Other Parts of Speech

How far you want to go with this is up to you. In MUD2 I handle comparatives and superlatives, so you can GET THE BIGGEST BOX. It does this by having an adjective associated with the superlative (SIZE, in this example) which produces an integer as an answer. The list is ordered by the integer results, and the first one is taken. GET THE BIGGEST 2 BOXES works too, as does GET THE LEAST BIGGEST BOX.
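The ordering trick is easy to sketch. This is not MUD2's code: the function name, the `size` measure and the `largest` flag (used here as a guess at how LEAST might invert the order) are all assumptions for the example.

```python
def apply_superlative(candidates, measure, count=1, largest=True):
    """Order candidates by an integer measure and take the first count of them."""
    ranked = sorted(candidates, key=measure, reverse=largest)
    return ranked[:count]


# Hypothetical SIZE values for three boxes.
box_sizes = {"box0": 3, "box1": 9, "box2": 5}


def size(obj):
    return box_sizes[obj]


boxes = ["box0", "box1", "box2"]
print(apply_superlative(boxes, size))                 # ['box1']          GET THE BIGGEST BOX
print(apply_superlative(boxes, size, count=2))        # ['box1', 'box2']  GET THE BIGGEST 2 BOXES
print(apply_superlative(boxes, size, largest=False))  # ['box0']          GET THE LEAST BIGGEST BOX
```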

The job of the binder is, when it comes down to it, fairly simple. It has to find a functor (from the verb plus its modifiers) and a set of parameters (from the noun phrases plus their modifiers) and apply the functor to each parameter in turn. There are many special cases, which can complicate things hideously, but overall the core algorithm isn’t all that hard.
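Stripped of the special cases, that core loop might be sketched like this. The `VERBS` table, the functor signatures and `run_command` are invented for the example, and the parameters are assumed to have been bound already.

```python
# Hypothetical functors: each takes the actor and one bound object.
def do_get(actor, obj):
    return f"{actor} takes {obj}."


def do_drop(actor, obj):
    return f"{actor} drops {obj}."


VERBS = {"get": do_get, "drop": do_drop}


def run_command(actor, verb, parameters):
    """Find the functor for the verb and apply it to each parameter in turn."""
    functor = VERBS[verb]
    return [functor(actor, obj) for obj in parameters]


for line in run_command("Bob", "get", ["ruby2", "goblet0", "tapestry1"]):
    print(line)
# Bob takes ruby2.
# Bob takes goblet0.
# Bob takes tapestry1.
```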

Next time, I’ll talk about how to make things a little simpler than they might otherwise become.
