Faster Shift-Reduce Constituent Parsing with a Non-Binary, Bottom-Up Strategy
We propose a novel non-binary shift-reduce algorithm for constituent parsing. Our parser follows a classical bottom-up strategy but, unlike other bottom-up parsers, it creates non-binary branchings directly with a single Reduce transition, rather than requiring prior binarization or a sequence of binary transitions. As a result, it uses fewer transitions per sentence than existing transition-based constituent parsers, which makes it the fastest such system. With static oracle training and greedy search, the accuracy of this approach is on par with state-of-the-art transition-based constituent parsers, and it outperforms all top-down and bottom-up greedy shift-reduce systems on WSJ and CTB. Additionally, we develop a dynamic oracle for training the proposed transition-based algorithm, which yields further improvements on both benchmarks and the best accuracy to date on CTB.
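The core idea, a single Reduce that builds a constituent with any number of children, can be illustrated with a minimal sketch. The abstract does not specify how transitions are encoded, so everything here is an assumption made for illustration: the `Node` class, the `parse` function, and the encoding of Reduce as a `("reduce", label, k)` tuple that pops the top `k` stack items. It replays a given transition sequence only; it includes no scoring model, oracle, or search.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Node:
    """A tree node: a word (no children) or a labelled constituent."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def bracketed(self) -> str:
        """Render the subtree in bracketed notation."""
        if not self.children:
            return self.label
        inner = " ".join(child.bracketed() for child in self.children)
        return f"({self.label} {inner})"


def parse(words: Sequence[str], transitions: Sequence[Tuple]) -> Node:
    """Replay a transition sequence bottom-up (hypothetical encoding).

    ("shift",)           moves the next word from the buffer to the stack.
    ("reduce", label, k) pops the top k stack items and pushes one node
                         labelled `label` with those k items as children.
    """
    stack: List[Node] = []
    buffer = list(words)
    for transition in transitions:
        if transition[0] == "shift":
            if not buffer:
                raise ValueError("shift with an empty buffer")
            stack.append(Node(buffer.pop(0)))
        elif transition[0] == "reduce":
            _, label, k = transition
            if k < 1 or k > len(stack):
                raise ValueError(f"cannot reduce {k} items")
            children = stack[-k:]  # the k topmost items, left to right
            del stack[-k:]
            stack.append(Node(label, children))
        else:
            raise ValueError(f"unknown transition: {transition!r}")
    if buffer or len(stack) != 1:
        raise ValueError("transition sequence did not yield a single tree")
    return stack[0]


words = ["She", "gave", "him", "books"]
transitions = [
    ("shift",), ("shift",), ("shift",), ("shift",),
    ("reduce", "VP", 3),  # one Reduce creates the ternary VP
    ("reduce", "S", 2),
]
tree = parse(words, transitions)
print(tree.bracketed())  # (S She (VP gave him books))
```

In this toy example the ternary VP is built by one transition, whereas a scheme limited to binary reductions would need two steps (and an intermediate node) for the same constituent. That is the source of the shorter transition sequences the abstract refers to.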