Notes from the Dawn of Time

Notes from the Dawn of Time #9:

Introduction to Parsing

by Richard Bartle
January 2, 2002

In the context of MUDs, parsing is the process by which players’ input is converted into action.

Well, OK, there is a second use — talking to mobiles — but that’s a separate topic. I’m going to be talking about command parsing: how your typing KILL OGRE eventually gets to initiate a fight between your character and some nearby ogre.

Natural Language Understanding is a very large and difficult field of Artificial Intelligence. It isn’t made any easier by the fact that people and computers are good at different things. People excel at bringing all kinds of knowledge to bear in understanding an utterance, and all at the same time; computers do well at following fixed grammatical rules.

Good for people: "The judge jailed the politician because he was corrupt". A human usually knows automatically from the context whether "he" means the judge or the politician.

Good for computers: "The mouse the cat the dog chased bit died". Er, poor mouse?

MUDs have to run on computers. They must parse input as quickly as possible, and do so without the aid of parallel processors. This will doubtless change as AI techniques advance, but it hasn’t changed for the past 15 years and might not for the next 15 either. Under these conditions, how can a simple MUD programmer hope to make a reasonable job of getting their game to accept anything other than the most basic of sentences?

The Problem Space

Fortunately, MUDs have a number of things going for them (at least if the mother tongue of their players is English) that general-purpose Natural Language Understanding systems don’t.

Input is mainly in the form of imperative commands (OPEN THE DOOR). These have a very predictable format in most major human languages. There is no requirement to parse indicatives (THE DOOR IS OPEN) or subjunctives (IF THE DOOR IS LOCKED THEN OPEN IT).
All commands are in the present simple tense. Don’t worry about STOLEN THE BAG, STEALING THE BAG, HAD BEEN STEALING THE BAG or anything else bizarre: just STEAL THE BAG.
All commands are in the active voice: WALK WEST rather than BE WALKED WEST.
Because they want to do things quickly, and because they can’t all type very well, players tend to use very short sentences. In my own MUD2, over 60% of input consists of simple one-letter or two-letter movement commands (N, NE, E etc.).
Players want to be understood. If DRINK POTION works and DRINK MY POTION doesn’t work, then they’ll quickly learn to omit the MY.
Games are allowed to understand nonsense. It doesn’t matter if their world reacts when players type COIN BOX IN PUT because no-one is going to type that anyway unless they’re deliberately testing the parser.

The only other type of input likely to be presented to a MUD is the question. In English, some of these can be can be mapped with minor differences into the same format as commands (WHERE IS THE SWORD), but there can be deeper problems for queries such as IS THE SWORD IN THE SCABBARD. I won’t be going into these in detail, but suffice to say they’re doable.

Even given the above constraints, however, MUDs still don’t have a hope of being able to make sense of every command given to them — they can never understand all of English (or whatever). What they can do is understand enough of a subset of the language that players can express their desire for action without having to think too hard about it.

In other words, the parser accepts commands in a language which, although not actually English, is enough like it that it doesn’t give players a problem. The more this language resembles English, the more an English-speaker will be able to use it without being inconvenienced.

Partitioning

Although human beings can and do perform the entire parsing process in real time on-the-fly, computer games don’t get to do that (not just yet!). Instead, the overall parsing process is partitioned into a number of sub-processes (passes), generally performed sequentially in order:

Command line input.
Pre-processing.
Tokenisation.
Parsing.
Binding.

(Execution follows binding).

Command line input delivers an ordered set of ASCII characters constructed at the whim of the player. Basically, people type things, they backspace, retype, delete some more, type some more, and so on until they hit the return key. The entire line (which may equate to several commands) is then made available for pre-processing. Pre-processing tidies up what the player typed so that it fits some standard form. This normally involves compacting white space, converting everything to same-case, performing any macro expansions and stripping anything that’s out of range (e.g. control characters). It makes the step that follows run a lot faster. Tokenisation takes the now standard-form input line and splits it into individual units (symbols). These units will typically concern integers, punctuation, quoted strings and words (separated from each other by white space). For the integers and strings, tokenisation creates a record to represent them; for words or punctuation, it looks them up in a table and extracts a predefined record from there. These records represent "tokens" — semantic entities with potentially several meanings, waiting to be attributed to particular parts of speech. The system reports an "unknown word" error if it can’t find a definition for a word (i.e. can’t tokenise it).

Pre-processing and tokenisation together comprise the lexical analyser module of the system. It sometimes happens that you need to run them in tandem, rather than in sequence. This is particularly the case if you have verbs that imply that everything following them ought to be a string (e.g. some players prefer SAY HI THERE PEEPS! to SAY "HI THERE PEEPS!").

Parsing, although also applied to the entire process, properly refers only to the specific task of taking a list of tokens and from it creating a meaningful data structure. To do this, it follows a grammar — a set of recursive rules that defines all possible valid input sentences. If the parse fails, the input could not be understood and the player who typed it should be given some indication of where the problem lies; if it succeeds, the parser’s output is a series of individual commands each of which is represented as a tree of elements (in MUD2’s case, from the hierarchy that I described in my opening articles to this series).

The final step is binding. This takes the parsed data structure and applies it to the player’s situation in the game world to generate actual commands that can be executed. In many ways, it’s the most challenging part of the whole process. If a player types HIT ELF and there are 20 elves in the game, which one does the player mean? The one in the same room. What if there are several in the same room? What if the input was SUMMON ELF, or some other spell that could apply to any elf wherever it was? Or SUMMON DARK ELF? It’s the binder’s job to decide all this. Once it has done so, parsing is formally over and the resulting world-changing functions are invoked: [SUMMON ELF21]() or whatever.

Note that although I’m talking about parsing primarily in the context of text MUDs, the same principles apply to graphical MUDs. The interface differs and the text generated by the interface differs (it’ll be in the more rigid format of a packet rather than a stream of freeform ASCII characters) but the general process is pretty much the same: read data; clean up data; formalise data; structure data; map data to code; execute code.

OK, well that’s given you a brief overview of the architecture of a parser. Next time, I’ll begin explaining how the individual components work.

Recent Discussions on Notes from the Dawn of Time: