Natural Language Aggregate Query over RDF Data
Natural language question-answering over RDF data has received widespread attention. Although there have been several studies that have dealt with a small number of aggregate queries, they have many restrictions (i.e., interactive information, controlled question or query template). Thus far, there has been no natural language querying mechanism that can process general aggregate queries over RDF data. Therefore, we propose a framework called NLAQ (Natural Language Aggregate Query). First, we propose a novel algorithm to automatically understand a users query intention, which mainly contains semantic relations and aggregations. Second, to build a better bridge between the query intention and RDF data, we propose an extended paraphrase dictionary ED to obtain more candidate mappings for semantic relations, and we introduce a predicate-type adjacent set PT to filter out inappropriate candidate mapping combinations in semantic relations and basic graph patterns. Third, we design a suitable translation plan for each aggregate category and effectively distinguish whether an aggregate item is numeric or not, which will greatly affect the aggregate result. Finally, we conduct extensive experiments over real datasets (QALD benchmark and DBpedia), and the experimental results demonstrate that our solution is effective.
READ FULL TEXT