A Joint Model for Multimodal Document Quality Assessment

01/04/2019
by Aili Shen, et al.

The quality of a document is affected by various factors, including grammaticality, readability, stylistics, and expertise depth, which makes document quality assessment a complex task. In this paper, we explore this task in the context of assessing the quality of Wikipedia articles and academic papers. Observing that the visual rendering of a document can capture implicit quality indicators that are not present in the document text, such as images, font choices, and visual layout, we propose a joint model that combines the text content with a visual rendering of the document for document quality assessment. Experimental results over two datasets show that textual and visual features are complementary and that the joint model achieves state-of-the-art results.

research
06/05/2019

Towards Document Image Quality Assessment: A Text Line Based Framework and A Synthetic Text Line Image Dataset

Since the low quality of document images will greatly undermine the chan...
research
10/24/2019

Comparison of Quality Indicators in User-generated Content Using Social Media and Scholarly Text

Predicting the quality of a text document is a critical task when presen...
research
08/13/2020

Cognitive Representation Learning of Self-Media Online Article Quality

The automatic quality assessment of self-media online articles is an urg...
research
01/24/2022

Cross-Domain Document Layout Analysis via Unsupervised Document Style Guide

The document layout analysis (DLA) aims to decompose document images int...
research
08/15/2023

MultiSChuBERT: Effective Multimodal Fusion for Scholarly Document Quality Prediction

Automatic assessment of the quality of scholarly documents is a difficul...
research
01/28/2023

Layout-aware Webpage Quality Assessment

Identifying high-quality webpages is fundamental for real-world search e...
research
09/19/2019

An Edit-centric Approach for Wikipedia Article Quality Assessment

We propose an edit-centric approach to assess Wikipedia article quality ...
