This is the final post in this three-part series. It’s been fun revisiting these concepts so they could be presented in a manner comprehensible enough to anyone with an interest in parsers. If you’ve read this far, I hope you’ve found value in my writing.
Let’s close this out.
This post discusses how to build the parse tables from the canonical collection of item sets.
1 Building the parse tables
Given a canonical collection of sets of items, with each set representing a parser state, the parser generator iterates through the items in each set and generates entries for two tables: Action tables and Goto tables.
Action table entries are of three kinds, each kind corresponding to a particular form of item.
[A > B *t D, e] indicates that encountering the terminal symbol ‘t’ would be a valid next step towards recognizing the non-terminal ‘A’. This generates a SHIFT action entry on ‘t’ in the current parser state.
[A > B t D*, e] indicates that the parser has recognized ‘B’, ‘t’ and ‘D’, in that order, and should (depending on precedence rules, more on that later) generate a REDUCE action entry for the production “A > B t D” on the terminal ‘e’ in the current state.
[A > B t D*, EOF] where A is the goal symbol, and EOF is the goal lookahead symbol, indicates the accepting state for the parser. This generates an ACCEPT action entry on EOF in the current state.
Note that in filling the action table, items where ‘*’ precedes a non-terminal symbol are ignored. SHIFT actions are generated when ‘*’ precedes a terminal and REDUCE actions generated when ‘*’ is at the right end of the production.
Goto table entries record the transitions between states, on a REDUCE action. That is, the parser state transition after REDUCE-ing a production RHS to it’s LHS.
Here’s the table-filling algorithm, where ‘i’, ‘j’ are subscripts representing the state index:
for each canonical set 'Ci' in the canonical collection 'CC':
for each item 'I' in 'Ci':
if I is [A > B *C, a] and goto(Ci, C) returns a set in CC, Cj:
generate a SHIFT action; Action[i, C] = SHIFT j
else if I is [A > B C*, a] then:
generate a REDUCE action; Action[i, C] = REDUCE "A > B C"
else if I is [A > B C*, eof]
where A is the goal symbol, eof is the goal lookahead symbol:
generate an ACCEPT action; Action[i, eof] = ACCEPT
for each non-terminal, n:
if goto(Ci, n) returns a set in CC, Cj:
Goto[i, n] = j
1.1 Function Overview: build_tables()
The build_tables() member function implements the table-filling algorithm above. It iterates over the sets of items in canonicalCollection. For each set, it generates entries to gotoTable for each non-terminal, and generates entries to actionTable for each item in the set.
Filling gotoTable is trivial:
Filling actionTable however, is a little more complicated, particularly with resolving conflicts in action entries.
REDUCE-REDUCE conflicts —— resolved in check_symbols_and_productions() —— occur when more than one production share the same RHS. SHIFT-REDUCE conflicts —— resolved here —— occur when the table generation algorithm can generate equally valid SHIFT and REDUCE entries given a particular state and terminal symbol.
When an item in a set is of form #1, the function checks that actionTable doesn’t already contain an entry for the current state and current terminal symbol. If it doesn’t, it generates a SHIFT entry to the next state returned on invoking goto_function() with the current state and current terminal symbol.

If an entry already exists *and* this existing entry is of type REDUCE, it invokes set_last_terminal() to initialize local variables with the last terminal precedence and last terminal associativity values. It then checks the following conditions in order:
If current item production precedence is greater than the current terminal precedence or last terminal precedence is 0, the existing REDUCE entry is valid.
If the last terminal precedence is greater than the current terminal precedence, the existing REDUCE entry is valid.
If the last terminal precedence is equal to the current terminal precedence *and* last terminal is left-associative, the existing REDUCE entry is valid.
Otherwise, the existing REDUCE entry is invalid and should be updated with a SHIFT entry to the next state returned on invoking goto_function() with the current state and current terminal symbol.
Here’s what that looks like in actual code:

When, instead, an item is of form #2 (that is ‘*’ is at the end of a production), the function checks that index 0 into the item production (a std::vector<std::string_view>) is equal to the goal production LHS symbol *and* the item lookahead is equal to the goal production lookahead symbol/terminal. If it is, an ACCEPT entry is generated and added to actionTable.

Otherwise, the function has to decide based on precedence and associativity rules, whether it should REDUCE the item production RHS to the item production LHS *or* it should SHIFT to the item lookahead symbol/terminal. It checks the following conditions in order:
If current item production precedence is greater than the current terminal precedence or last terminal precedence is 0, generate a REDUCE entry.
If the last terminal precedence is greater than the current terminal precedence, generate a REDUCE entry.
If the last terminal precedence is equal to the current terminal precedence *and* last terminal is left-associative, generate a REDUCE entry.
Otherwise, generate a SHIFT entry to the next state returned on invoking goto_function() with the current state and current item lookahead symbol/terminal.
Here’s what that looks like in actual code:

It should be noted that the value to SHIFT entries in actionTable is the parser next state index, while the value to REDUCE entries in actionTable is an index to reduce_info. This index, when accessed, contains the LHS symbol to be reduced to and how many RHS symbols to be popped off the parser stack.
And that’s it. You now have a working parser generator for an LR(1) grammar in 878 lines of intermediate C++ code.
Thanks for reading. Leave a star on the repo if you’d like!
https://github.com/wldfngrs/parser-generator