ASD grammar files are plain text files. An ASD grammar file lists words and phrase types in the form of a lexicon, in alphabetical order, with each word or phrase type followed by a list of its instances. Each instance is represented in a way that is useful to the ASD software tools. ASD grammar files can be printed and edited (carefully!) with any text editor. However, the ASDEditor tool is recommended for editing them, to avoid creating corrupted grammar files that the ASD tools cannot read. By convention, ASD grammar files use the file extension .grm or .asd.
Details of the format for ASD grammar files are provided below. Earlier prototype versions of the ASDParser and ASDEditor were implemented in Smalltalk, and a still earlier version of the parser was implemented in Lisp. The same ASD grammar file format is used by the Java, Lisp, and Smalltalk implementations.
ASD grammar files can be saved by the ASDEditor in two ways: (1) optimized for parsing and (2) unoptimized for parsing. The ASDParser can use grammar files in either format, but it generally requires more parsing steps to parse phrases when it uses an unoptimized grammar file. Grammars saved in unoptimized file format are generally smaller than the same grammars saved in optimized file format. The format of optimized grammar files is described first below. Differences in the format of unoptimized grammar files are described after that.
A grammar for English cardinal numbers is shown in cardinal.jpg in the graphical form in which it is displayed by ASDEditor. The contents of the file cardinal.grm below show how the same grammar is saved by ASDEditor as a file which can be used by ASDParser. It provides an example of the format of an optimized ASD grammar file. The file is organized as a lexicon, with entries for each word, phrase type, and punctuation mark in the grammar. The entries are listed in alphabetical order beginning with the entry for $$, which represents a null or dummy node. Each entry in the file is enclosed in parentheses, with the word or phrase type for the entry appearing immediately after the opening parenthesis. The list of instances for the entry is also enclosed in parentheses, and each instance is itself enclosed in parentheses. The layout produced by the ASDEditor puts the instances for an entry on separate lines, indented two spaces from the left margin, but that is for human readability only. When an ASD grammar file is read by the ASDEditor or ASDParser, line breaks and indenting (except in quoted strings) are ignored; the structure of the file is indicated entirely by the nested parentheses.
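The nesting described above can be read with a very small parenthesis-driven reader. The following Python sketch is only an illustration and is not part of the ASD tools: the names Quoted, tokenize, and parse_all are hypothetical, and it assumes that quoted strings are delimited by single quotes and contain no embedded single quote.

```python
class Quoted(str):
    """A single-quoted string from the grammar file, with the quotes removed."""


def tokenize(text):
    """Split grammar text into '(' and ')' tokens, plain atoms, and Quoted strings."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1                          # line breaks and indenting are ignored
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == "'":
            end = text.index("'", i + 1)    # assumption: no embedded single quotes
            tokens.append(Quoted(text[i + 1:end]))
            i = end + 1
        else:
            j = i
            while j < n and not text[j].isspace() and text[j] not in "()":
                j += 1
            tokens.append(text[i:j])        # a word, phrase type, number, or nil
            i = j
    return tokens


def parse_all(tokens):
    """Build nested Python lists from the tokens, one list per parenthesized group."""
    stack = [[]]
    for tok in tokens:
        if isinstance(tok, Quoted):
            stack[-1].append(tok)           # a quoted string is never a parenthesis
        elif tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("missing ')'")
    return stack[0]


sample = """
(and (
(1 nil ((CARDINAL 2 455 321)) (CARDINAL) '' 409 287)
))
(eight (
(1 (UNIT CARDINAL) UNIT '8' '' 10 220)
))
"""

entries = parse_all(tokenize(sample))
for word, instances in entries:
    print(word, len(instances), [len(inst) for inst in instances])
# and 1 [7]
# eight 1 [7]
```

Because only the parentheses carry structure, the same text collapsed onto a single line yields the same nested lists.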
Each instance of a word, phrase type, or punctuation mark is represented by a parenthesized list of seven items.
The contents of the cardinal.grm file are as follows:
($$ (
(1 nil CARDINAL 'valueOfV' '' 586 200)
(2 nil CARDINAL 'valueOfVTimesM' '' 342 388)
))
(, (
(1 nil ((CARDINAL 2 401 321) (and 1 368 294)) (CARDINAL) '' 319 287)
))
(- (
(1 nil ((UNIT 2 565 143)) (UNIT) '' 535 116)
))
(and (
(1 nil ((CARDINAL 2 455 321)) (CARDINAL) '' 409 287)
))
(CARDINAL (
(1 (CARDINAL) ((MULTIPLIER 1 157 349)) (MULTIPLIER) 'setVNodeValue' 57 342)
(2 nil CARDINAL 'valueOfVTimesMPlusV2' 'cardinal_2_action' 474 342)
))
(DECADE (
(1 (CARDINAL) ((UNIT 2 542 163) ($$ 1 542 185) (- 1 517 143)) (UNIT) 'setVNodeValue' 447 156)
))
(eight (
(1 (UNIT CARDINAL) UNIT '8' '' 10 220)
))
(eighteen (
(1 (CARDINAL) CARDINAL '18' '' 123 244)
))
(eighty (
(1 (DECADE CARDINAL) DECADE '80' '' 300 179)
))
(eleven (
(1 (CARDINAL) CARDINAL '11' '' 128 37)
))
(fifteen (
(1 (CARDINAL) CARDINAL '15' '' 130 156)
))
(fifty (
(1 (DECADE CARDINAL) DECADE '50' '' 299 94)
))
(five (
(1 (UNIT CARDINAL) UNIT '5' '' 17 126)
))
(forty (
(1 (DECADE CARDINAL) DECADE '40' '' 295 65)
))
(four (
(1 (UNIT CARDINAL) UNIT '4' '' 15 97)
))
(fourteen (
(1 (CARDINAL) CARDINAL '14' '' 122 127)
))
(hundred (
(1 (MULTIPLIER) MULTIPLIER '100' '' 439 7)
))
(million (
(1 (MULTIPLIER) MULTIPLIER '1000000' '' 440 64)
))
(MULTIPLIER (
(1 nil ((and 1 338 321) (CARDINAL 2 371 349) ($$ 2 305 372) (, 1 293 321)) (CARDINAL) 'multiplier_1_action' 194 342)
))
(nine (
(1 (UNIT CARDINAL) UNIT '9' '' 12 253)
))
(nineteen (
(1 (CARDINAL) CARDINAL '19' '' 124 271)
))
(ninety (
(1 (DECADE CARDINAL) DECADE '90' '' 300 207)
))
(one (
(1 (UNIT CARDINAL) UNIT '1' '' 13 5)
))
(seven (
(1 (UNIT CARDINAL) UNIT '7' '' 10 187)
))
(seventeen (
(1 (CARDINAL) CARDINAL '17' '' 114 213)
))
(seventy (
(1 (DECADE CARDINAL) DECADE '70' '' 299 151)
))
(six (
(1 (UNIT CARDINAL) UNIT '6' '' 20 155)
))
(sixteen (
(1 (CARDINAL) CARDINAL '16' '' 129 186)
))
(sixty (
(1 (DECADE CARDINAL) DECADE '60' '' 300 122)
))
(ten (
(1 (CARDINAL) CARDINAL '10' '' 127 7)
))
(thirteen (
(1 (CARDINAL) CARDINAL '13' '' 121 96)
))
(thirty (
(1 (DECADE CARDINAL) DECADE '30' '' 294 35)
))
(thousand (
(1 (MULTIPLIER) MULTIPLIER '1000' '' 439 34)
))
(three (
(1 (UNIT CARDINAL) UNIT '3' '' 8 65)
))
(twelve (
(1 (CARDINAL) CARDINAL '12' '' 128 67)
))
(twenty (
(1 (DECADE CARDINAL) DECADE '20' '' 293 6)
))
(two (
(1 (UNIT CARDINAL) UNIT '2' '' 14 36)
))
(UNIT (
(1 (CARDINAL) CARDINAL 'nodeValue' '' 450 221)
(2 nil CARDINAL 'valueOfV' 'unit_2_action' 586 156)
))
(UNKNOWN (
(1 (UNKNOWNWORD CARDINAL) UNKNOWNWORD '' '' 30 434)
))
(UNKNOWNWORD (
(1 (CARDINAL) CARDINAL 'valueOfV' 'UNKNOWNCARDINAL_action' 252 434)
))
The unoptimized format for ASD grammar files differs from the optimized format in just two ways, involving the second and fourth items in the list of seven items that represents an instance in the grammar.
For the second item:
If the instance corresponds to an initial node in the grammar, the second item is just the letter T (as "true" is represented in the Lisp language), rather than a parenthesized list of the phrase types which can begin, directly or indirectly, at that initial node. Otherwise, if the instance does not correspond to an initial node in the grammar, the second item is nil (as "false" is represented in the Lisp language). That is, the second item simply indicates whether or not the node is an initial node, without telling what phrase types can begin at that node.
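The rule for the second item can be illustrated with a short Python sketch. It is not taken from the ASD tools: the function names are hypothetical, and an instance is assumed to be already available as a nested Python list of strings, with quoted strings kept verbatim (quotes included). Only the second item is changed; the fourth item is left untouched here.

```python
def to_unoptimized_second_item(instance):
    """Return a copy of a parsed instance with its second item reduced to T or nil."""
    second = instance[1]
    # In the optimized format a non-empty list of phrase types marks an initial node;
    # an instance that already carries T is left as an initial node.
    is_initial = second == "T" or (isinstance(second, list) and len(second) > 0)
    return [instance[0], "T" if is_initial else "nil", *instance[2:]]


def show(item):
    """Render a nested list back into parenthesized grammar-file text."""
    if isinstance(item, list):
        return "(" + " ".join(show(x) for x in item) + ")"
    return item


# Instances copied from the optimized cardinal.grm listing above.
eight = ["1", ["UNIT", "CARDINAL"], "UNIT", "'8'", "''", "10", "220"]
dummy = ["1", "nil", "CARDINAL", "'valueOfV'", "''", "586", "200"]

print(show(to_unoptimized_second_item(eight)))   # (1 T UNIT '8' '' 10 220)
print(show(to_unoptimized_second_item(dummy)))   # (1 nil CARDINAL 'valueOfV' '' 586 200)
```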
The cardinal grammar above, re-saved in unoptimized form, is shown below. (This example does not include any instances for which the fourth item is nil.)
($$ (
(1 nil CARDINAL 'valueOfV' '' 586 200)
(2 nil CARDINAL 'valueOfVTimesM' '' 342 388)
))
(, (
(1 nil ((CARDINAL 2 401 321) (and 1 368 294)) T '' 319 287)
))
(- (
(1 nil ((UNIT 2 565 143)) T '' 535 116)
))
(and (
(1 nil ((CARDINAL 2 455 321)) T '' 409 287)
))
(CARDINAL (
(1 T ((MULTIPLIER 1 157 349)) T 'setVNodeValue' 57 342)
(2 nil CARDINAL 'valueOfVTimesMPlusV2' 'cardinal_2_action' 474 342)
))
(DECADE (
(1 T ((UNIT 2 542 163) ($$ 1 542 185) (- 1 517 143)) T 'setVNodeValue' 447 156)
))
(eight (
(1 T UNIT '8' '' 10 220)
))
(eighteen (
(1 T CARDINAL '18' '' 123 244)
))
(eighty (
(1 T DECADE '80' '' 300 179)
))
(eleven (
(1 T CARDINAL '11' '' 128 37)
))
(fifteen (
(1 T CARDINAL '15' '' 130 156)
))
(fifty (
(1 T DECADE '50' '' 299 94)
))
(five (
(1 T UNIT '5' '' 17 126)
))
(forty (
(1 T DECADE '40' '' 295 65)
))
(four (
(1 T UNIT '4' '' 15 97)
))
(fourteen (
(1 T CARDINAL '14' '' 122 127)
))
(hundred (
(1 T MULTIPLIER '100' '' 439 7)
))
(million (
(1 T MULTIPLIER '1000000' '' 440 64)
))
(MULTIPLIER (
(1 nil ((and 1 338 321) (CARDINAL 2 371 349) ($$ 2 305 372) (, 1 293 321)) T 'multiplier_1_action' 194 342)
))
(nine (
(1 T UNIT '9' '' 12 253)
))
(nineteen (
(1 T CARDINAL '19' '' 124 271)
))
(ninety (
(1 T DECADE '90' '' 300 207)
))
(one (
(1 T UNIT '1' '' 13 5)
))
(seven (
(1 T UNIT '7' '' 10 187)
))
(seventeen (
(1 T CARDINAL '17' '' 114 213)
))
(seventy (
(1 T DECADE '70' '' 299 151)
))
(six (
(1 T UNIT '6' '' 20 155)
))
(sixteen (
(1 T CARDINAL '16' '' 129 186)
))
(sixty (
(1 T DECADE '60' '' 300 122)
))
(ten (
(1 T CARDINAL '10' '' 127 7)
))
(thirteen (
(1 T CARDINAL '13' '' 121 96)
))
(thirty (
(1 T DECADE '30' '' 294 35)
))
(thousand (
(1 T MULTIPLIER '1000' '' 439 34)
))
(three (
(1 T UNIT '3' '' 8 65)
))
(twelve (
(1 T CARDINAL '12' '' 128 67)
))
(twenty (
(1 T DECADE '20' '' 293 6)
))
(two (
(1 T UNIT '2' '' 14 36)
))
(UNIT (
(1 T CARDINAL 'nodeValue' '' 450 221)
(2 nil CARDINAL 'valueOfV' 'unit_2_action' 586 156)
))
(UNKNOWN (
(1 T UNKNOWNWORD '' '' 30 434)
))
(UNKNOWNWORD (
(1 T CARDINAL 'valueOfV' 'UNKNOWNCARDINAL_action' 252 434)
))
The grammar expression.grm for arithmetic expressions, saved in optimized file format, is:
($$ (
(1 nil TERM 'expression_$$_1_v' '' 186 453)
(2 nil EXPRESSION 'expression_$$_2_v' '' 586 105)
(3 nil FACTOR 'expression_$$_3_v' '' 272 181)
))
(* (
(1 (x) x '' '' 297 392)
))
(+ (
(1 nil ((TERM 1 503 81)) (TERM) 'expression_plus_1' 489 44)
))
(- (
(1 nil ((TERM 1 528 81)) (TERM) 'expression_minus_1' 542 43)
(2 (EXPRESSION FACTOR TERM) ((FACTOR 2 105 231)) (FACTOR) '' 75 224)
))
(. (
(1 nil ((DIGITSTRING 2 242 124) ($$ 3 243 156)) (DIGITSTRING) '' 206 117)
(2 (EXPRESSION FACTOR TERM) ((DIGITSTRING 2 242 103)) (DIGITSTRING) 'expression_DECIMALPOINT_2' 206 76)
))
(/ (
(1 nil ((FACTOR 1 126 431)) (FACTOR) 'expression_divided_by_1' 148 395)
))
([ (
(1 (EXPRESSION FACTOR TERM) ((EXPRESSION 2 107 287)) (EXPRESSION) '' 77 280)
))
(] (
(1 nil FACTOR 'expression_RIGHT_BRACKET_v' '' 249 280)
))
(DIGITSTRING (
(1 (EXPRESSION FACTOR TERM) ((. 1 181 124) ($$ 3 214 156)) () 'expression_DIGITSTRING_1' 79 117)
(2 nil FACTOR 'expression_DIGITSTRING_2_v' '' 270 117)
))
(EXPRESSION (
(1 nil ((RPAREN 1 273 340)) () 'expression_EXPRESSION' 173 333)
(2 nil ((] 1 228 287)) () 'expression_EXPRESSION' 128 280)
))
(FACTOR (
(1 (EXPRESSION TERM) ((x 1 112 431) (/ 1 147 431) ($$ 1 166 460)) (x) 'expression_FACTOR_1' 95 453)
(2 nil FACTOR 'expression_FACTOR_2_v' '' 126 224)
))
(LPAREN (
(1 (EXPRESSION FACTOR TERM) ((EXPRESSION 1 152 340)) (EXPRESSION) '' 80 333)
))
(NUMBER (
(1 (EXPRESSION DIGITSTRING FACTOR TERM) DIGITSTRING 'expression_NUMBER_1_v' '' 77 35)
))
(RPAREN (
(1 nil FACTOR 'expression_RIGHT_BRACKET_v' '' 293 333)
))
(TERM (
(1 (EXPRESSION) ((+ 1 515 81) (- 1 542 81) ($$ 2 564 112)) () 'expression_TERM_1' 504 105)
))
(x (
(1 nil ((FACTOR 1 93 431)) (FACTOR) 'expression_times_1' 78 395)
))
The same grammar saved in unoptimized file format is shown below. Notice, for example, that the fourth item is nil in the first (and only) instance of the phrase type TERM, indicating that no new subphrase can begin immediately after a part of a parsed phrase that corresponds to node (TERM 1) in the grammar. Such a nil has exactly the same meaning as an empty list () in the optimized file format. In fact, nil and () can be used interchangeably in grammar files which are used as input to ASDEditor and ASDParser.
($$ (
(1 nil TERM 'expression_$$_1_v' '' 186 453)
(2 nil EXPRESSION 'expression_$$_2_v' '' 586 105)
(3 nil FACTOR 'expression_$$_3_v' '' 272 181)
))
(* (
(1 T x '' '' 297 392)
))
(+ (
(1 nil ((TERM 1 503 81)) T 'expression_plus_1' 489 44)
))
(- (
(1 nil ((TERM 1 528 81)) T 'expression_minus_1' 542 43)
(2 T ((FACTOR 2 105 231)) T '' 75 224)
))
(. (
(1 nil ((DIGITSTRING 2 242 124) ($$ 3 243 156)) T '' 206 117)
(2 T ((DIGITSTRING 2 242 103)) T 'expression_DECIMALPOINT_2' 206 76)
))
(/ (
(1 nil ((FACTOR 1 126 431)) T 'expression_divided_by_1' 148 395)
))
([ (
(1 T ((EXPRESSION 2 107 287)) T '' 77 280)
))
(] (
(1 nil FACTOR 'expression_RIGHT_BRACKET_v' '' 249 280)
))
(DIGITSTRING (
(1 T ((. 1 181 124) ($$ 3 214 156)) nil 'expression_DIGITSTRING_1' 79 117)
(2 nil FACTOR 'expression_DIGITSTRING_2_v' '' 270 117)
))
(EXPRESSION (
(1 nil ((RPAREN 1 273 340)) nil 'expression_EXPRESSION' 173 333)
(2 nil ((] 1 228 287)) nil 'expression_EXPRESSION' 128 280)
))
(FACTOR (
(1 T ((x 1 112 431) (/ 1 147 431) ($$ 1 166 460)) T 'expression_FACTOR_1' 95 453)
(2 nil FACTOR 'expression_FACTOR_2_v' '' 126 224)
))
(LPAREN (
(1 T ((EXPRESSION 1 152 340)) T '' 80 333)
))
(NUMBER (
(1 T DIGITSTRING 'expression_NUMBER_1_v' '' 77 35)
))
(RPAREN (
(1 nil FACTOR 'expression_RIGHT_BRACKET_v' '' 293 333)
))
(TERM (
(1 T ((+ 1 515 81) (- 1 542 81) ($$ 2 564 112)) nil 'expression_TERM_1' 504 105)
))
(x (
(1 nil ((FACTOR 1 93 431)) T 'expression_times_1' 78 395)
))
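As noted before the listing, nil and () are interchangeable in grammar files. A program that reads both file formats might therefore treat the two spellings as the same value. The Python sketch below shows one way to do that, using the TERM instance from both expression.grm listings. The function names are hypothetical and not part of the ASD tools; instances are assumed to be nested Python lists of strings with quoted strings kept verbatim, and it is further assumed that nil never occurs as an ordinary lexicon word.

```python
def is_empty_list(item):
    """True for either spelling of the empty list: the atom nil or ()."""
    return item == "nil" or item == []


def normalize_nil(item):
    """Return a copy of a parsed item in which every nil atom becomes an empty list."""
    if isinstance(item, list):
        return [normalize_nil(x) for x in item]
    return [] if item == "nil" else item


# The (TERM 1) instance as it appears in the optimized and unoptimized listings.
successors = [["+", "1", "515", "81"], ["-", "1", "542", "81"], ["$$", "2", "564", "112"]]
term_optimized = ["1", ["EXPRESSION"], successors, [], "'expression_TERM_1'", "504", "105"]
term_unoptimized = ["1", "T", successors, "nil", "'expression_TERM_1'", "504", "105"]

# The fourth item is empty in both formats, so no new subphrase can begin after (TERM 1).
print(is_empty_list(term_optimized[3]), is_empty_list(term_unoptimized[3]))    # True True
print(normalize_nil(term_optimized)[3] == normalize_nil(term_unoptimized)[3])  # True
```

In these listings the fourth item of the other non-final instances is T in the unoptimized format wherever the optimized format has a non-empty list, so is_empty_list gives the same answer for an instance in either format.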