Weisfeiler-Leman in the BAMBOO: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity

08/26/2021
by Juri Opitz, et al.

Several metrics have been proposed for assessing the similarity of (abstract) meaning representations (AMRs), but little is known about how they relate to human similarity ratings. Moreover, the current metrics have complementary strengths and weaknesses: some emphasize speed, while others make the alignment of graph structures explicit, at the price of a costly alignment step. In this work we propose new Weisfeiler-Leman AMR similarity metrics that unify the strengths of previous metrics, while mitigating their weaknesses. Specifically, our new metrics are able to match contextualized substructures and induce n:m alignments between their nodes. Furthermore, we introduce a Benchmark for AMR Metrics based on Overt Objectives (BAMBOO), the first benchmark to support empirical assessment of graph-based MR similarity metrics. BAMBOO maximizes the interpretability of results by defining multiple overt objectives that range from sentence similarity objectives to stress tests that probe a metric's robustness against meaning-altering and meaning-preserving graph transformations. We show the benefits of BAMBOO by profiling previous metrics and our own metrics. Results indicate that our novel metrics may serve as a strong baseline for future work.
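
To make the idea of matching contextualized substructures more concrete, the following is a minimal sketch of a generic Weisfeiler-Leman subtree comparison over small labeled graphs. It is not the authors' implementation: the graph encoding (a dict of node labels plus a list of labeled edges), the function names `wl_features`, `cosine`, and `wl_similarity`, the `-of` suffix for inverse edges, and the use of cosine similarity over feature counts are all assumptions made for illustration. The sketch also does not induce n:m node alignments.

```python
from collections import Counter
from math import sqrt


def wl_features(nodes, edges, iterations=2):
    """Collect Weisfeiler-Leman subtree features for one labeled graph.

    nodes: dict mapping node id -> label (hypothetical encoding)
    edges: list of (source id, edge label, target id) triples
    """
    # Neighbor lists; each edge is also visible from its target,
    # marked with an "-of" suffix (an illustrative convention).
    neighbors = {n: [] for n in nodes}
    for src, rel, tgt in edges:
        neighbors[src].append((rel, tgt))
        neighbors[tgt].append((rel + "-of", src))

    labels = dict(nodes)
    features = Counter(labels.values())  # iteration 0: plain node labels
    for _ in range(iterations):
        new_labels = {}
        for n in nodes:
            # Contextualize a node with the sorted labels of its neighbors.
            context = sorted(rel + ":" + labels[m] for rel, m in neighbors[n])
            new_labels[n] = labels[n] + "(" + ",".join(context) + ")"
        labels = new_labels
        features.update(labels.values())
    return features


def cosine(a, b):
    """Cosine similarity between two feature-count dictionaries."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def wl_similarity(g1, g2, iterations=2):
    """Compare two graphs, each given as a (nodes, edges) pair."""
    return cosine(wl_features(*g1, iterations), wl_features(*g2, iterations))


# Toy graphs, loosely in the spirit of "the cat sleeps" vs. "the dog sleeps".
g_cat = ({"s": "sleep-01", "c": "cat"}, [("s", ":ARG0", "c")])
g_dog = ({"s": "sleep-01", "d": "dog"}, [("s", ":ARG0", "d")])
score = wl_similarity(g_cat, g_dog)
```

In this toy setup, identical graphs score (up to floating-point rounding) 1, while the two graphs above share only the uncontextualized `sleep-01` label and therefore score well below 1. Raising `iterations` lets each node label absorb a larger neighborhood, which is the sense in which the compared substructures are contextualized.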


