News from Sapienza NLP

Sapienza NLP @ EMNLP 2022

3 papers at EMNLP!

Sapienza NLP has 3 papers accepted at EMNLP 2022! Check out our work on Semantic Role Labeling with Definition Modeling, Machine Translation Evaluation, and Euphemism Detection! Here's the list of the accepted papers:

  • Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures
  • MATESE: Machine Translation Evaluation as a Sequence Tagging Problem
  • EUREKA: EUphemism Recognition Enhanced through Knn-based methods and Augmentation


Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures

by S. Conia, E. Barba, A. Scirè, and R. Navigli

One of the common traits of past and present approaches for Semantic Role Labeling (SRL) is that they rely upon discrete labels drawn from a predefined linguistic inventory to classify predicate senses and their arguments. However, we argue this need not be the case. In this paper, we present an approach that leverages Definition Modeling to introduce a generalized formulation of SRL as the task of describing predicate-argument structures using natural language definitions instead of discrete labels. Our novel formulation takes a first step towards placing interpretability and flexibility foremost, and yet our experiments and analyses on PropBank-style and FrameNet-style, dependency-based and span-based SRL also demonstrate that a flexible model with an interpretable output does not necessarily come at the expense of performance. We release our software for research purposes at https://github.com/SapienzaNLP/dsrl.


MATESE: Machine Translation Evaluation as a Sequence Tagging Problem

by S. Perrella, L. Proietti, A. Scirè, N. Campolungo, and R. Navigli

Starting from last year, WMT human evaluation has been performed within the Multidimensional Quality Metrics (MQM) framework, where human annotators are asked to identify error spans in translations, alongside an error category and a severity. In this paper, we describe our submission to the WMT 2022 Metrics Shared Task, where we propose using the same paradigm for automatic evaluation: we present the MATESE metrics, which reframe machine translation evaluation as a sequence tagging problem. Our submission also includes a reference-free metric, denominated MATESE-QE. Despite the paucity of the openly available MQM data, our metrics obtain promising results, showing high levels of correlation with human judgements, while also enabling an evaluation that is interpretable. Moreover, MATESE-QE can also be employed in settings where it is infeasible to curate reference translations manually.
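To make the sequence tagging framing concrete, here is a minimal sketch of how per-token severity tags could be grouped into error spans and turned into a sentence-level score. It is an illustration only, not the MATESE implementation: the tag names, the `SEVERITY_WEIGHTS` values, and the helper functions `tags_to_spans` and `score_from_spans` are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical severity weights, chosen for illustration only.
SEVERITY_WEIGHTS = {"minor": 1.0, "major": 5.0}


def tags_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group consecutive identical non-"O" tags into (start, end, severity) spans.

    `end` is exclusive. "O" marks a token that is not part of any error span.
    """
    spans = []
    start = None
    # A trailing "O" sentinel closes a span that reaches the end of the sentence.
    for i, tag in enumerate(tags + ["O"]):
        if start is not None and tag != tags[start]:
            spans.append((start, i, tags[start]))
            start = None
        if start is None and tag != "O":
            start = i
    return spans


def score_from_spans(spans: List[Tuple[int, int, str]]) -> float:
    """Sum one penalty per error span; a higher (less negative) score is better."""
    return -sum(SEVERITY_WEIGHTS[severity] for _, _, severity in spans)


# One tag per token of a hypothetical translation.
predicted_tags = ["O", "minor", "minor", "O", "major"]
spans = tags_to_spans(predicted_tags)
score = score_from_spans(spans)
```

In this toy example the tagger output contains one two-token minor span and one single-token major span, so the score is the negated sum of one minor and one major penalty. The spans themselves remain available for inspection, which is what makes this style of evaluation interpretable.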


EUREKA: EUphemism Recognition Enhanced through Knn-based methods and Augmentation

by S. S. Keh, R. Bharadwaj, E. Liu, S. Tedeschi, V. Gangal, and R. Navigli

We introduce EUREKA, an ensemble-based approach for performing automatic euphemism detection. We (1) identify and correct potentially mislabelled rows in the dataset, (2) curate an expanded corpus called EuphAug, (3) leverage model representations of Potentially Euphemistic Terms (PETs), and (4) explore using representations of semantically close sentences to aid in classification. Using our augmented dataset and kNN-based methods, EUREKA was able to achieve state-of-the-art results on the public leaderboard of the Euphemism Detection Shared Task, ranking first with a macro F1 score of 0.881.
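As a rough illustration of the kNN idea in step (4), the sketch below classifies a sentence by majority vote over its most similar labelled neighbours. It is not the EUREKA system: the two-dimensional vectors stand in for real model representations, and the function names `cosine` and `knn_predict`, the label encoding (1 = euphemistic, 0 = literal), and the choice of k are assumptions made for this example.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(
    query: Sequence[float],
    examples: List[Tuple[Sequence[float], int]],
    k: int = 3,
) -> int:
    """Return the majority label among the k examples most similar to `query`."""
    ranked = sorted(examples, key=lambda ex: cosine(query, ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy labelled "sentence representations" (hypothetical values).
train = [
    ([1.0, 0.1], 1),
    ([0.9, 0.2], 1),
    ([0.8, 0.3], 1),
    ([0.1, 1.0], 0),
    ([0.2, 0.9], 0),
]

euphemistic_like = knn_predict([1.0, 0.0], train, k=3)
literal_like = knn_predict([0.0, 1.0], train, k=3)
```

With real data, the vectors would come from a trained model rather than being written by hand, and the neighbour vote would typically be combined with other signals rather than used on its own.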