The operator precedence parsing is a parsing technique for solving numerical expressions.

It begins with the declaration of two stacks: one for the operators (like +, -) and one for numbers.

In my code, an enumeration describes the operators (even T_VAL, which actually stays for a number), while the numbers are stored in double variables.

getNextOperator() scans the input and recognizes the next token or possibly invalid characters.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

/* * An enumeration describing the operators * T_LPR and T_RPR represent the brackets * T_ERROR is a special value for error conditions */ typedef enum { T_ADD, T_SUB, T_MUL, T_DIV, T_POW, T_MINUS, T_LPR, T_EOF, T_RPR, T_VAL, T_ERROR } Operators; /* * Base structs for the two stacks */ struct operator { Operators sOp; struct operator *next; }; struct value { double n; struct value *next; }; typedef struct operator SOP_t; typedef struct value value_t;

A brief note: most operations are left-associative. If expression is 2 + 1 + 1, first you compute 2 + 1 and then you add 1 to the result. Few are right-associative: in our case an example is the power operator. 2^3^2 = 2^9, because (3^2) is computed first. Of course this behaviour can be modified with the brackets.

At this point, if the current input is a number, it is shifted (that is, pushed) on its stack.

If it's an operator:

- if the current operator has more precedence than the one on the stack's top, it is pushed on the stack.

- Otherwise (less or same precedence), reduce() is called. The operator on top is taken, the right amount (one or two) operands are taken too, the result is computed and pushed again. The current operator is shifted too on its stack.

You should remember than the second number taken from the value stack is actually the first to consider while doing the operation: it's important to note that because some operations (like the division) are not commutative!

- When we have an open bracket, it is shifted on the stack (note than the precedence table always reports a shift as the action).

- When we have a closed bracket, we invoke a reduce() on everything found on the operator stack until the opened bracket. If it's not present, there's an error.

The brackets are always discarded after this.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46

do { curOp = getNextOperator(&equation, num, sHead, vHead); if (curOp == T_VAL) /* values are always shifted on the stack */ { shiftOnValues(&vHead, atof(num)); } else if (curOp < T_EOF) /* we have an operator */ { if (!sHead || precedence[sHead->sOp][curOp] == 'S') { shiftOnOperators(&sHead, curOp); } else { res = reduce(&sHead, &vHead); shiftOnValues(&vHead, res); /* shifts the result of the operation */ shiftOnOperators(&sHead, curOp); } } else if (curOp == T_RPR) { /* First checks if the bracket has been opened */ if (lookFor(sHead, T_LPR) == -1) { exitOnError(STRANGE_BRACKETS, sHead, vHead); } /* * Reduces everything until the bracket */ while (sHead->sOp != T_LPR) { res = reduce(&sHead, &vHead); shiftOnValues(&vHead, res); } curOp = popOperator(&sHead); /* discards the bracket in question */ } if (errorCode != 0) /* an error occured! */ { exitOnError(errorCode, sHead, vHead); } } while (curOp != T_EOF);

When the string has been parsed, we execute a reduce() on everything remained. If everything went fine, the operator stack will be empty and on the value stack only the final result is present.

It's very useful to define a precedence table (a matrix) which help to quickly decide the next action (shift or reduce). The row index are the code for the operator currently on the stack, and the column index is the current operator returned by getNextOperator();

1 2 3 4 5 6 7 8

char precedence[PR_SIZE][PR_SIZE] = {{'R', 'R', 'S', 'S', 'S', 'S', 'S'}, {'R', 'R', 'S', 'S', 'S', 'S', 'S'}, {'R', 'R', 'R', 'R', 'S', 'S', 'S'}, {'R', 'R', 'R', 'R', 'S', 'S', 'S'}, {'R', 'R', 'R', 'R', 'S', 'R', 'S'}, {'R', 'R', 'R', 'R', 'R', 'R', 'S'}, {'S', 'S', 'S', 'S', 'S', 'S', 'S'}};

The whole code is in the Codes section.