Scaling Up Toward Automated Black-box Reverse Engineering of Context-Free Grammars

08/11/2023
by   Mohammad Rifat Arefin, et al.
Black-box context-free grammar inference is a hard problem because, in many practical settings, the inference tool has access to only a limited number of example programs. The state-of-the-art approach, Arvada, heuristically generalizes grammar rules starting from flat parse trees and is non-deterministic so that it can explore different generalization sequences. We observe that many of Arvada's generalization steps violate common language concept nesting rules. We thus propose to pre-structure input programs along these nesting rules, apply learnt rules recursively, and make black-box context-free grammar inference deterministic. In an empirical comparison, the resulting tool, TreeVada, ran faster and produced higher-quality grammars.
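
As a rough illustration of what pre-structuring an input program could look like, the sketch below groups the characters of an example program into a nested tree instead of a flat sequence. It assumes that nesting is signalled by matched bracket pairs, which the abstract does not state; the function name `prestructure` and the `PAIRS` table are hypothetical and are not taken from TreeVada.

```python
from typing import List, Union

# A tree is a list whose items are single characters or nested subtrees.
Tree = List[Union[str, "Tree"]]

# Assumption: these bracket pairs mark nesting in the example programs.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def prestructure(text: str) -> Tree:
    """Group characters into a nested list that follows matched brackets."""
    root: Tree = []
    stack: List[Tree] = [root]   # subtrees that are currently open
    expected: List[str] = []     # closing brackets still awaited
    for ch in text:
        if ch in PAIRS:
            # An opening bracket starts a new subtree under the current one.
            node: Tree = [ch]
            stack[-1].append(node)
            stack.append(node)
            expected.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not expected or expected[-1] != ch:
                raise ValueError(f"unbalanced bracket {ch!r}")
            expected.pop()
            # The closing bracket ends the current subtree.
            stack[-1].append(ch)
            stack.pop()
        else:
            stack[-1].append(ch)
    if expected:
        raise ValueError("unclosed bracket")
    return root


print(prestructure("a(b[c])d"))
# ['a', ['(', 'b', ['[', 'c', ']'], ')'], 'd']
```

Starting from such a tree rather than a flat one means that any later generalization step works on groups that already respect the bracket structure. How TreeVada actually derives and uses its nesting rules is described in the full text, not here.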
