Building Working Models of Full Natural-Language Understanding
in Limited Pragmatic Domains
James A. Mason - 2010 May 17, 20, 26; June 8; 2012 Aug 16; 2012 Sep
Keywords: English language understanding, natural-language
processing, NLP, NLU, computational linguistics, dialog system,
Java, Augmented Syntax Diagram, ASD, playing card
My long-term research project is to build working models that
understand English as an English speaker does, in realistic
pragmatic domains. For the foreseeable future, such models will be
restricted to pragmatic domains that can be modeled completely on a
computer. Nevertheless,
limited pragmatic domains can permit us to explore and model
thoroughly many detailed syntactic and semantic structures of
English, most of which should generalize well to
less-limited pragmatic domains. In particular, we should be
able to model most of the syntax and semantics of the so-called
"function words" of English -- articles and other determiners in
noun phrases, conjunctions, and prepositions -- as contrasted with
the "content words" -- nouns, adjectives, verbs and adverbs.
Function words are sometimes called "closed class" words because
they belong to syntactic classes to which new words are almost
never added. Content words are sometimes called "open class" words
because they belong to syntactic classes to which new words are
frequently added.
I have chosen ordinary playing cards as the basis for a first
pragmatic domain for which to build models of English-language
understanding. That domain
is simple enough to be modeled fairly easily in computer software,
yet it is rich enough to allow exploration of many syntactic and
semantic features of English. I am building a succession of
models of English-language understanding for that domain, which I
call CardWorld. The first two implementations are CardWorld1 and
CardWorld2, which are available from this web site in both compiled
and open-source form. The latest model can also be run as a Java
Web Start applet created by Roxanne Parent. The
CardWorld models can be used with various kinds of input, including
stylus and touch-screen pointing, and English input by keyboard as
well as spoken English input using a program such as Dragon
NaturallySpeaking as a front end.
Documentation for the first two versions of CardWorld is provided in
CardWorld1Documentation.html and CardWorld2Documentation.html.
Note that, for setting and getting values of semantic feature
variables, CardWorld1 and CardWorld2 use only the basic tools
provided by ASDParser and ASDDecider. They do not require the
SemanticValue class hierarchy.
CardWorld models can be extended in many directions, including:
Permit other operations on playing cards and collections of cards:
- accepting pointing gestures to more than one location per
input utterance
- moving cards into various kinds of collections -- e.g. hands,
drawing piles, discard piles
- finding specific cards by extrinsic description -- e.g. by
position in a collection
- counting cards [with specific descriptions] in given
- sorting cards according to specific descriptions
- selecting cards at random from a collection
- rotating card images
- following rules of various card games
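To make a few of the operations listed above concrete, here is a minimal Java sketch of cards and card collections. It is not taken from the CardWorld source: every name in it (Card, CardCollection, moveTopTo, drawRandom, BY_RANK_THEN_SUIT, and so on) is hypothetical, and it covers only moving, finding by position, counting, sorting, and random selection.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;

// All names here are hypothetical; none come from the CardWorld source.
enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

enum Rank { TWO, THREE, FOUR, FIVE, SIX, SEVEN, EIGHT, NINE, TEN, JACK, QUEEN, KING, ACE }

final class Card {
    final Rank rank;
    final Suit suit;
    Card(Rank rank, Suit suit) { this.rank = rank; this.suit = suit; }
    @Override public String toString() { return rank + " of " + suit; }
}

final class CardCollection {
    // One possible sorting order: by rank, then by suit (enum declaration order).
    static final Comparator<Card> BY_RANK_THEN_SUIT =
        Comparator.comparing((Card c) -> c.rank).thenComparing(c -> c.suit);

    private final List<Card> cards = new ArrayList<>();

    // Builds an unshuffled 52-card deck, clubs first, twos first within a suit.
    static CardCollection fullDeck() {
        CardCollection deck = new CardCollection();
        for (Suit s : Suit.values())
            for (Rank r : Rank.values())
                deck.cards.add(new Card(r, s));
        return deck;
    }

    int size() { return cards.size(); }

    // Extrinsic description: position counted from the top (index 0).
    Card at(int positionFromTop) { return cards.get(positionFromTop); }

    // Counts the cards that match an intrinsic description.
    long count(Predicate<Card> description) {
        return cards.stream().filter(description).count();
    }

    void sort(Comparator<Card> order) { cards.sort(order); }

    // Removes and returns a card chosen at random.
    Card drawRandom(Random rng) { return cards.remove(rng.nextInt(cards.size())); }

    // Moves this collection's top card onto the top of another collection.
    void moveTopTo(CardCollection target) { target.cards.add(0, cards.remove(0)); }
}

class CardWorldSketch {
    public static void main(String[] args) {
        CardCollection deck = CardCollection.fullDeck();
        CardCollection hand = new CardCollection();

        deck.moveTopTo(hand);                       // "move the top card to the hand"
        System.out.println(hand.at(0));             // TWO of CLUBS
        System.out.println(deck.count(c -> c.suit == Suit.HEARTS)); // 13

        deck.sort(CardCollection.BY_RANK_THEN_SUIT);
        System.out.println(deck.at(0));             // TWO of DIAMONDS

        System.out.println(deck.drawRandom(new Random())); // some card, chosen at random
    }
}
```

Passing the description as a Predicate and the order as a Comparator is only one design choice; it keeps the collection independent of whatever semantic structures a parser would produce for phrases such as "the red cards" or "by rank".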
Introduce additional agents into a card world, give them different
views of that world, and allow various kinds of communication among
Add semantic structures required for the extended pragmatics,
- structures to represent extrinsic and intrinsic description
- structures to represent various kinds of collections
- structures to represent various sorting orders for cards
- structures to represent exact and vague quantities
- structures to represent random selection
- structures to represent angles of rotation of cards, in
addition to position
- structures to represent viewpoints of cards and piles from
various agents
- structures to represent actions in games
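As one illustration of the list above, the following Java sketch shows how exact and vague quantities might be represented. It does not use ASDParser or the SemanticValue hierarchy. The names Quantity, ExactQuantity, VagueQuantity, and isSatisfiedBy are hypothetical, and the range chosen for "a few" (3 to 5) is an arbitrary assumption made only for the example.

```java
// Hypothetical structures; not part of CardWorld or the SemanticValue hierarchy.
interface Quantity {
    // True if a collection of the given size fits this quantity.
    boolean isSatisfiedBy(int count);
}

// An exact quantity such as "five".
final class ExactQuantity implements Quantity {
    private final int n;
    ExactQuantity(int n) { this.n = n; }
    @Override public boolean isSatisfiedBy(int count) { return count == n; }
}

// A vague quantity such as "a few", modeled here as an inclusive range.
final class VagueQuantity implements Quantity {
    private final int low;
    private final int high;
    VagueQuantity(int low, int high) { this.low = low; this.high = high; }
    @Override public boolean isSatisfiedBy(int count) { return count >= low && count <= high; }
}

class QuantitySketch {
    static final Quantity FIVE = new ExactQuantity(5);
    // Assumption: "a few" means 3 to 5 cards; the real bounds depend on context.
    static final Quantity A_FEW = new VagueQuantity(3, 5);

    public static void main(String[] args) {
        System.out.println(FIVE.isSatisfiedBy(5));   // true
        System.out.println(FIVE.isSatisfiedBy(4));   // false
        System.out.println(A_FEW.isSatisfiedBy(4));  // true
        System.out.println(A_FEW.isSatisfiedBy(12)); // false
    }
}
```

A fixed range is the simplest possible model of vagueness. A fuller treatment would probably make the bounds depend on the size of the collection being described.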
Extensions like those may need the SemanticValue class hierarchy and
Add vocabulary and grammar structures required for the extended
pragmatics and semantics, including
- the words "rank", "suit", "deck", "hand", "draw[ing]",
"discard", "deal[er]" and others required for games
- the quantifier "some", and vocabulary for exact (e.g. "five")
and vague (e.g. "a few") quantities
- words "top", "bottom", and ordinals "first", "second", etc.
- words for asking questions -- "how many", "where", "which",
- syntactic structures for prepositional phrases, relative
clauses, conjunctions of noun phrases, and conjunctions of more
than two clauses
All such syntactic extensions can be accomplished with ASDEditor and