ASD example: a grammar and evaluator for arithmetic expressions

A grammar and evaluator for arithmetic expressions

This example involves an ASD grammar, expression.grm, for arithmetic expressions like
2.5x[100. - (.3 + 27 / (4-1))]
and a Java application, Evaluator.java, which uses the grammar and associated semantic computations to evaluate such expressions.

The expression grammar is best examined with ASDEditor, but here is an image of the grammar::
ASD arithmetic expression grammar image

NUMBER is a phrase type reserved by ASDParser. The parser accepts any string of contiguous digits as a NUMBER [but see Endnote 1 below].   ASDParser permits NUMBER to occur only in initial nodes in an ASD grammar. So the grammar redefines NUMBER here as a DIGITSTRING . [See also Endnote 2 below.] Consequently, the component of the grammar that includes node (DIGITSTRING 1) defines any string of digits with an optional leading, intermediate, or trailing decimal point as a FACTOR [but see Endnote 3 below]. That is on the assumption that an application like Evaluator.java which uses the grammar will first scan an utterance and construct a new utterance in which the decimal points (as well as minus signs) have been separated by spaces from the digits surrounding them to become separate lexical items.

The grammar also defines three other kinds of subexpressions as FACTORs: (1) a negated FACTOR, (2) an EXPRESSION in square brackets, and (3) an EXPRESSION surrounded by left and right parentheses. Because of the way in which ASD grammar files are constructed, using parentheses to delimit items in a file, parentheses cannot be used directly as lexical items in an ASD grammar. So this grammar assumes that the application, in its initial scan of a given utterance, will replace each left parenthesis by LPAREN and each right parenthesis by RPAREN. That same initial scan will separate digit strings, square brackets, and special characters (like those that represent the arithmetic operators) from one another by spaces, prior to the utterance being parsed by ASDParser.

Other components of the grammar define a sequence of one or more FACTORs, separated by operators x or /, as a TERM, and a sequence of one or more TERMs, separated by operators + or -, as an EXPRESSION. Notice how those components of the grammar express repetition and how the dummy nodes ($$ 1) and ($$ 2) are used as terminating nodes.

The isolated node (* 1) illustrates how easily another symbol like the asterisk can be made synonymous with the x that is used as the multiplication symbol elsewhere in the grammar. That purpose does not even require a semantic action or semantic value to be associated with the node.

Most of the nodes in the grammar have been augmented with semantic action or semantic value expressions which correspond to member functions of the Java application, Evaluator.java, that will use the grammar. Those can be seen by examining the grammar with ASDEditor, or by looking at the appropriate fields of the grammar in its text file, expression.grm. The following table lists nodes in the grammar and the semantic actions or semantic values associated with them. The nodes which have no associated semantic actions or values are not listed in the table.

       Grammar Node      Semantic Action            Semantic Value

       ($$ 1)                                       expression_$$_1_v
       ($$ 2)                                       expression_$$_2_v
       ($$ 3)                                       expression_$$_3_v
       (+ 1)             expression_plus_1
       (- 1)             expression_minus_1
       (. 2)             expression_DECIMALPOINT_2
       (/ 1)             expression_divided_by_1
       (] 1)                                        expression_RIGHT_BRACKET_v
       (DIGITSTRING 1)   expression_DIGITSTRING_1
       (DIGITSTRING 2)                            expression_DIGITSTRING_2_v
       (EXPRESSION 1)    expression_EXPRESSION
       (EXPRESSION 2)    expression_EXPRESSION
       (FACTOR 1)      expression_FACTOR_1
       (FACTOR 2)                                   expression_FACTOR_2_v
       (NUMBER 1)                                   expression_NUMBER_1_v
       (RPAREN 1)                                   expression_RIGHT_BRACKET_v
       (TERM 1)          expression_TERM_1
       (x 1)             expression_times_1

For example, the node (DIGITSTRING 1) has an associated semantic action field, 'expression_DIGITSTRING_1', which corresponds to the member function with the header
   public String expression_DIGITSTRING_1()
in Evaluator.java. Similarly, the node ($$ 2) has an associated semantic value field, 'expression_$$_2_v', which corresponds to the member function with header
   public Object expression_$$_2_v()
in Evaluator.java. Note: ASDParser requires semantic action functions to have return type String, while semantic value functions must have return type Object.

Notice the naming conventions that have been used in the example. Each semantic action or semantic value function begins with the name of the grammar file, in this case expression (without the .grm extension), in which the corresponding node occurs. That is done to distinguish functions for similar nodes which may occur in different grammar files of a modularly constructed grammar. Underscores in the function names join the file name to a word or phrase type like DIGITSTRING and to the instance number of its node (1 or 2) Another convention is that the names of semantic value functions end with "_v" to distinguish them from semantic action functions which may be associated with the same grammar node. Remember, these are only conventions. Nothing about ASD grammars or ASDParser requires them to be followed, but they are useful conventions for building large and modular grammars. You will notice that some of the semantic actions and semantic values in the table violate the naming conventions a bit. For instance, nodes (EXPRESSION 1) and (EXPRESSION 2) both happen to invoke the same semantic action expression_EXPRESSION. Similarly, nodes (] 1) and (RPAREN 1) both invoke the same semantic value function expression_RIGHT_BRACKET_v. Also, nodes like (+ 1) and (] 1) that have special characters in them have spelled-out semantic action and value names like expression_plus_1 and expression_RIGHT_BRACKET_v, because the special characters can't be used in Java function names.

The actual functions from Evaluator.java that correspond to the semantic action and semantic value fields in the expression.grm grammar are all listed below. [See also Endnote 4.]. These functions, invoked by an ASDParser as it parses an expression by following the grammar, do the work of calculating the numerical value of an expression. Most of the rest of the source code in Evaluator.java is for managing the graphical user interface, in addition to some for invoking, driving, and accessing an instance of ASDParser. You will notice that several of the functions have bodies identical with those of other functions. The duplication could be eliminated by appropriately renaming one of the functions and changing the semantic actions or semantic values in the relevant nodes of the grammar.

   public Object expression_$$_1_v()
   { return get("previous");
   }

   public Object expression_$$_2_v()
   { return get("previous");
   }

   public Object expression_$$_3_v()
   { String integerPart = (String) get("integerPart");
      return new Double(integerPart);
   }

   public String expression_DECIMALPOINT_2()
   { set("integerPart", "0");
      return null;
   }

   public String expression_DIGITSTRING_1()
   { set("integerPart", nodeValue());
      return null;
   }

   public Object expression_DIGITSTRING_2_v()
   { String integerPart = (String) get("integerPart");
      String fractionPart = (String) nodeValue();
      return new Double(integerPart + "." + fractionPart);
   }

   public String expression_divided_by_1()
   { set("operator", currentWord());
      return null;
   }

   public String expression_EXPRESSION()
   { set("value", nodeValue());
      return null;
   }

   public String expression_FACTOR_1()
   { Double previous = (Double) get("previous");
      String operator = (String) get("operator");
      Double current = (Double) nodeValue();
      if (previous == null) // first factor
         previous = current;
      else
      { double resultDouble = 0;
         if ("*".equals(operator))
            resultDouble = previous.doubleValue() * current.doubleValue();
         else if ("/".equals(operator))
         { if (current.doubleValue() == 0)
            { window.getOutputPane().append(
                  "\n*** Attempt to divide by 0 ***\n");
               window.setValueField("*** Attempt to divide by 0 ***");
               return parser.QUIT;
            }
            resultDouble = previous.doubleValue() / current.doubleValue();
         }
         else // shouldn't happen
            System.out.println("Invalid operator " + operator);
         previous = new Double(resultDouble);
      }
      set("previous", previous);
      return null;
   }

   public Object expression_FACTOR_2_v()
   { Double factor = (Double) nodeValue();
      return new Double(- factor.doubleValue());
   }

   public Object expression_NUMBER_1_v()
   { return currentWord();
   }

   public String expression_minus_1()
   { set("operator", currentWord());
      return null;
   }

   public String expression_plus_1()
   { set("operator", currentWord());
      return null;
   }

   public Object expression_RIGHT_BRACKET_v()
   { return get("value");
   }

   public String expression_TERM_1()
   { Double previous = (Double) get("previous");
      String operator = (String) get("operator");
      Double current = (Double) nodeValue();
      if (previous == null) // first term
         previous = current;
      else
      { double resultDouble = 0;
         if ("+".equals(operator))
            resultDouble = previous.doubleValue() + current.doubleValue();
         else if ("-".equals(operator))
            resultDouble = previous.doubleValue() - current.doubleValue();
         else // shouldn't happen
            System.out.println("Invalid operator " + operator);
         previous = new Double(resultDouble);
      }
      set("previous", previous);
      return null;
   }

   public String expression_times_1()
   { set("operator", "*");
      return null;
   }

Endnotes:

Actually ASDParser will accept any string of contiguous digits (limited as described in Endnote 3 below), including an optional initial minus sign as a NUMBER, but the Evaluator.java application separates minus signs from strings of following digits before it passes an expression to the parser. That is done so the grammar will not incorrectly accept expressions like 23.-45, with a minus sign after a decimal point, ambiguously as single numbers as well as subtractions.
You will notice that the keyword NUMBER does not appear in phrase structure trees that result from parsing with the example grammar. In place of NUMBER, the parser ensures that the actual numbers found in the given utterance appear in their proper places in the phrase structure tree.
Actually, the maximum length accepted by Java for integer constants is 9 digits, or in limited cases 10 digits. So Evaluator.java will accept at most nine or ten digits before a decimal point and at most nine or ten digits after a decimal point. In any case, you will notice that long numbers result in small round-off errors when Evaluator converts numbers to Java double form.
The Java functions for computing semantic actions and semantic values call on several other functions, with names set, get, currentWord, and nodeValue, which are short functions defined in Evaluator.java to abbreviate calls to functions that are defined in ASDParser. For example, the function call get("previous") abbreviates the direct call parser.get("previous"), and the call currentWord() abbreviates parser.currentNode().word().

[link to ASD home page]

page last modified 2005 Aug 23