WHOSe Heritage: Classification of UNESCO World Heritage "Outstanding Universal Value" Documents with Smoothed Labels

04/12/2021
by Nan Bai, et al.

The UNESCO World Heritage List (WHL) aims to identify exceptionally valuable cultural and natural heritage to be preserved for mankind as a whole. Evaluating and justifying the Outstanding Universal Value (OUV) of each nomination to the WHL is essential for a property to be inscribed, yet it is a complex task even for experts because the selection criteria are not mutually exclusive. Furthermore, manual annotation of heritage values, currently the dominant practice in the field, is knowledge-demanding and time-consuming, which impedes systematic analysis of such authoritative documents in terms of their implications for heritage management. This study applies state-of-the-art NLP models to build a classifier on a new real-world dataset containing official OUV justification statements, seeking an explainable, scalable, and less biased automation tool to facilitate the nomination, evaluation, and monitoring processes of World Heritage properties. Label smoothing is innovatively adapted to move the task smoothly between multi-class and multi-label classification by adding prior knowledge of inter-class relationships to the labels, which improves the performance of most baselines. The study shows that the best models fine-tuned from BERT and ULMFiT can reach 94.3, which is promising for further development and application in heritage research and practice.
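The idea of smoothing labels with prior inter-class knowledge can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not spell out; it assumes the common form in which a one-hot label is mixed with a prior distribution over classes. The function name `smooth_labels`, the parameter `alpha`, and the example prior weights are all hypothetical.

```python
def smooth_labels(primary, num_classes, alpha, prior=None):
    """Return a soft label vector for one sample (illustrative sketch only).

    primary     -- index of the annotated class
    num_classes -- total number of classes
    alpha       -- smoothing strength in [0, 1]; 0 keeps the one-hot label
    prior       -- optional non-negative weights describing how related each
                   class is to `primary` (assumed input; uniform if omitted)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if not 0 <= primary < num_classes:
        raise ValueError("primary must be a valid class index")

    one_hot = [1.0 if k == primary else 0.0 for k in range(num_classes)]

    if prior is None:
        prior = [1.0] * num_classes  # uniform prior: classic label smoothing
    if len(prior) != num_classes or min(prior) < 0 or sum(prior) <= 0:
        raise ValueError("prior must hold num_classes non-negative weights")

    total = sum(prior)
    prior = [p / total for p in prior]  # normalise the prior to sum to 1

    # Convex mix: the result still sums to 1, but related classes gain mass.
    return [(1.0 - alpha) * h + alpha * p for h, p in zip(one_hot, prior)]


# Uniform smoothing versus a prior that favours two related classes.
uniform_soft = smooth_labels(primary=0, num_classes=4, alpha=0.2)
related_soft = smooth_labels(primary=0, num_classes=4, alpha=0.2,
                             prior=[0, 1, 1, 0])
```

With `alpha = 0` the label stays one-hot, which corresponds to plain multi-class training; as `alpha` grows, more probability mass moves to the classes the prior marks as related, so the target starts to resemble a multi-label one.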

