MERF: Morphology-based Entity and Relational Entity Extraction Framework for Arabic

09/17/2017
by   Amin A. Jaber, et al.
0

Rule-based techniques and tools to extract entities and relational entities from documents allow users to specify desired entities using natural language questions, finite state automata, regular expressions, structured query language statements, or proprietary scripts. These techniques and tools require expertise in linguistics and programming and lack support of Arabic morphological analysis which is key to process Arabic text. In this work, we present MERF; a morphology-based entity and relational entity extraction framework for Arabic text. MERF provides a user-friendly interface where the user, with basic knowledge of linguistic features and regular expressions, defines tag types and interactively associates them with regular expressions defined over Boolean formulae. Boolean formulae range over matches of Arabic morphological features, and synonymity features. Users define user defined relations with tuples of subexpression matches and can associate code actions with subexpressions. MERF computes feature matches, regular expression matches, and constructs entities and relational entities from user defined relations. We evaluated our work with several case studies and compared with existing application-specific techniques. The results show that MERF requires shorter development time and effort compared to existing techniques and produces reasonably accurate results within a reasonable overhead in run time.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset