Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora

07/26/2018 · by Sajawel Ahmed, et al. · IG Farben Haus

This study improves the performance of neural named entity recognition (NER) for German by a margin of up to 11% F-score, thereby outperforming existing baselines and establishing a new state-of-the-art on every open-source dataset. Rather than designing deeper and wider hybrid neural architectures, we gather all available resources and perform detailed optimization and grammar-dependent morphological processing, consisting of lemmatization and part-of-speech tagging, before exposing the raw data to any training process. We test our approach in a threefold monolingual experimental setup of a) single, b) joint, and c) optimized training, and shed light on the dependency of downstream tasks on the size of the corpora used to compute word embeddings.


I Introduction

Named Entity Recognition (NER) is a crucial part of various Natural Language Processing (NLP) tasks like entity linking, relation extraction, machine reading and, ultimately, Question Answering (QA). With the recent rise of neural networks, much emphasis has been put on high-resource languages like English or Chinese, leading to fast advancements on many foundational tasks, in particular NER, which in many areas reaches near-human performance for these languages [1, 2]. However, for other, lower-resource languages like German, neural NER did not attract similar attention from the deep learning community, leading to lower performance by a margin of up to 11% F-score.

In this paper, we look for the reasons and take steps towards addressing them. Taking German as an example, we bridge the current gap between the performance of neural NER for different languages and push the performance to a new state-of-the-art. We report evidence that the inferior quality of German text data and its small size are the major reasons for the observed lack of progress.

To tackle this problem, we use a larger corpus for training the foundational word embeddings, namely Leipzig40 [3] (including the whole German Wikipedia up to 2016) combined with the WMT 2010 German monolingual training data [4], and contrast its use with the COW corpus [5], the largest collection of German texts extracted from web documents, with over 617 million sentences. Moreover, we bring together all openly available annotated NER datasets for German, and prepare and merge them to increase the amount of final training data. This includes the major NER datasets of CoNLL-2003 [6] and GermEval-2014 [7], and the smaller datasets of Europarl-2010 [8] and EuropeanaNewspapers-2016 [9]. To this collection, we add the dataset of the Tübingen Treebank (TüBa-D/Z) [10], which, to the knowledge of the authors, is utilized for the first time for the task of neural NER.

Making models openly accessible is an increasingly common scientific practice. New models appear almost daily, for example in the Deep Learning (DL) community. As a consequence, modifying existing models and trying out different hybrid setups is becoming a scientific practice involving more and more scientists. This is advantageous, since attempts to improve existing models can contribute to their validation. However, it is often forgotten that data is the gold of science: it is the availability of resources such as CoNLL, SNLI [11] and SQuAD [12] for the tasks of NER, natural language inference and QA that leads to significant improvements and stands behind the recent success of neural networks in NLP. Therefore, it is important to take all available resources into account, to annotate them according to the task, and to optimize them where necessary. This task is often time-consuming and costly.

The present paper assesses the impact of resources on NER, taking the rather low-resource language German as an example. We show the influence of different training sets on the performance of neural NER, of different combinations of these datasets, and, above all, of different levels of their preprocessing. We address the aspect of resource optimization with regard to lemmatization and Part-of-Speech (POS) tagging and analyze their influence alongside the training of word embeddings and task-specific neural networks. Our main finding is: an increase in size and quality of the (task-independent) word embedding corpus and of the (task-specific) training dataset leads to a significant improvement on sequence labeling tasks like NER, which can be larger than a mere amendment of the underlying neural architecture. For the future of neural NER for less- or low-resource languages this means: collecting unlabeled corpora for training morphology-aware, high-quality embeddings is a good alternative for increasing the performance of downstream tasks.

The remainder of the paper is organized as follows: Section 2 reviews related work, Section 3 presents a sketch of the underlying model, Section 4 describes our threefold experimental setup of a) single, b) joint, and c) resource optimized training, Section 5 reports and discusses our results, and, finally, Section 6 draws a conclusion.

II Related Work

Compared to high-resource languages, less emphasis has been put on the task of neural NER for German. Noteworthy work has so far been done only by [13] on GermEval and by [1] on CoNLL; both will be used as baselines here. Reimers et al. [13] were among the first to apply neural networks to German NER. However, they did not consider GermEval in combination with CoNLL. Apart from them, the remaining studies (predominantly conducted by non-native speakers) treat this task as a side product of dealing with various other languages. In this way, the state-of-the-art on German neural NER was established by [1] in 2016.

Gillick et al. [14] consider German as one variant in a multilingual training setup, additionally using the datasets of two Germanic languages (English and Dutch) and one Romance language (Spanish) from the CoNLL shared task; as a result, they reach 76.22% F-score. However, for single training on the German part of CoNLL they stay below the results of [13].

From the point of view of resource optimization, the recent work of [15] is worth mentioning. Klimek et al. also observe the gap between languages and therefore carry out a detailed analysis of the difficulties of the German NER task, using the GermEval dataset as an example. They come to the conclusion that “the task of German NER could benefit from integrating morphological processing” [15]. This is the starting point of our analysis: we apply our morphological processing approach to all text corpora and NER datasets.

III Model

Our neural model consists of two separately trained components: a) foundational word embeddings, modeling general knowledge from large unlabeled text corpora, and b) a task-specific neural network, modeling domain knowledge from the labeled training data. In this section, both components are presented briefly.

Word Embeddings

The language model of continuous-space word representations (word2vec) [16] and its variations [17, 18] are the foundation of most ongoing research in NLP with neural networks. Based on the context, the model embeds words, phrases or sentences into high-dimensional vector spaces. In such a space, the semantics of associations of words and phrases are captured to such an extent that algebraic operations lead to meaningful relationships (e.g. vec(king) − vec(man) + vec(woman) ≈ vec(queen) [16]). This property is immensely useful for our application. We use the word2vec model and its extension wang2vec [19], which exploits syntactic information and thus better suits the task of NER.

Neural Model

We give a brief sketch of the neural LSTM-CRF model which we use throughout this paper. The model is similar to the one used in [1], which goes back to the works of [20, 21, 22]. It consists of stacked LSTM and CRF layers. The base layer is made of two parts: (i) a preprocessing sublayer generating the character-based embeddings with a cell of forward and backward LSTMs (biLSTM) [23], and the word embeddings from the input sentence, (ii) followed by an encoding sublayer, again with a biLSTM cell, extracting features and generating compressed hidden representations. The prediction layer is made of a CRF and takes the previous hidden representations to finally produce the Named Entity (NE) tag predictions.

Let $s = (w_1, \ldots, w_n)$ be the list of words of a sentence from the input corpus of texts. Furthermore, let $c^{(i)} = (c^{(i)}_1, \ldots, c^{(i)}_{m_i})$ be the list of characters of the word $w_i$, consisting of $m_i$ characters with $c^{(i)}_j$ being its $j$-th character. For a given word $w_i$ and its NE-tag (gold label) $y_i \in$ {PER, LOC, ORG, MISC, O}, the data flow within the neural network is as follows:

$e^{(i)}_j = \mathrm{char2vec}(c^{(i)}_j)$   (1)
$h^{\mathrm{char}}_i = \mathrm{biLSTM}(e^{(i)}_1, \ldots, e^{(i)}_{m_i})$   (2)
$x_i = [\mathrm{word2vec}(w_i) ; h^{\mathrm{char}}_i]$   (3)
$(h_1, \ldots, h_n) = \mathrm{biLSTM}(x_1, \ldots, x_n)$   (4)
$(\hat{y}_1, \ldots, \hat{y}_n) = \mathrm{CRF}(h_1, \ldots, h_n)$   (5)

where char2vec is a (randomly initialized) lookup table embedding all characters into a corresponding vector space, and $x_i$ is the concatenation of the embedding vector of word $w_i$ and its character-based hidden representation $h^{\mathrm{char}}_i$. The model is trained to predict the NE-tag of each word after seeing the whole input sentence at once.
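For illustration, the following is a minimal PyTorch sketch of this data flow; it is not the exact implementation of [1], all names and dimensions are illustrative, and the CRF layer is abstracted as a per-token linear scoring layer (a full CRF adds transition scores and Viterbi decoding):

import torch
import torch.nn as nn

class LstmCrfSketch(nn.Module):
    """Char-biLSTM + word-biLSTM tagger; the CRF on top is replaced
    by a linear scoring layer for brevity."""
    def __init__(self, n_chars, n_words, n_tags,
                 char_dim=25, word_dim=100, lstm_dim=100):
        super().__init__()
        self.char2vec = nn.Embedding(n_chars, char_dim)   # randomly initialized lookup table
        self.word2vec = nn.Embedding(n_words, word_dim)   # initialized from pretrained vectors
        self.char_lstm = nn.LSTM(char_dim, char_dim,
                                 bidirectional=True, batch_first=True)
        self.word_lstm = nn.LSTM(word_dim + 2 * char_dim, lstm_dim,
                                 bidirectional=True, batch_first=True)
        self.scores = nn.Linear(2 * lstm_dim, n_tags)     # stand-in for the CRF layer

    def forward(self, word_ids, char_ids):
        # char_ids: (n_words, max_word_len) -> one char-based vector per word, Eq. (1)-(2)
        _, (h_char, _) = self.char_lstm(self.char2vec(char_ids))
        char_repr = torch.cat([h_char[0], h_char[1]], dim=-1)        # fwd+bwd final states
        x = torch.cat([self.word2vec(word_ids), char_repr], dim=-1)  # concatenation, Eq. (3)
        h, _ = self.word_lstm(x.unsqueeze(0))                        # sentence encoding, Eq. (4)
        return self.scores(h.squeeze(0))                             # tag scores, cf. Eq. (5)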

IV Experimental Setup

IV-A Datasets

In order to evaluate the model of Section III for neural NER on German data, we put emphasis on the major datasets of CoNLL (German part) and GermEval. However, further German resources are available that have so far gone unnoticed in the DL community. In Table I, we gather all of these NER datasets that are freely accessible to date and list them along with their numbers of sentences. Additionally, for each dataset the total number of NE tokens is provided across the four categories defined in the CoNLL-2003 shared task (CoNLL format). Table I shows that the TüBa-D/Z dataset is by far the largest, both in terms of the number of sentences and of tokens, ideally fitting the needs of deep neural networks.

Corpus               Sent.     PER      LOC      ORG      MISC
CoNLL-2003           18,024    8,309    7,864    7,621    4,748
Europarl-2010        4,395     514      724      874      966
GermEval-2014        31,300    16,204   16,675   12,885   9,254
Europ.Newsp.-2016    8,879     7,914    6,143    2,784    3
TüBa-D/Z-2018        104,787   55,746   28,582   32,224   12,865
TABLE I: NER Datasets

Preprocessing of Training Data

Apart from CoNLL, most corpora had to be further processed to fit the CoNLL format. For GermEval, we consider only the top-level NEs, disregarding nested NEs to stay in line with the remaining datasets. As the tagging scheme, we prefer the BIO (IOB2) scheme, as it has been shown to perform better [24]. All datasets are given in the BIO scheme, except CoNLL (IOB1) and Europarl (IOB1), which we converted into the target scheme.
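The IOB1-to-BIO conversion can be sketched as follows (a minimal Python version operating on the tag column alone):

def iob1_to_bio(tags):
    """Convert an IOB1 tag sequence to BIO (IOB2): every entity-initial
    token gets a B- prefix, not only tokens that split two adjacent
    entities of the same type."""
    bio = []
    for i, tag in enumerate(tags):
        if tag.startswith("I-"):
            prev = tags[i - 1] if i > 0 else "O"
            # entity start: previous token is outside or of a different type
            if prev == "O" or prev[2:] != tag[2:]:
                tag = "B-" + tag[2:]
        bio.append(tag)
    return bio

assert iob1_to_bio(["I-PER", "I-PER", "O", "I-LOC"]) == ["B-PER", "I-PER", "O", "B-LOC"]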

For EuropeanaNewspapers, we take the two datasets written in standard German orthography, namely enp_DE.lft.bio and enp_DE.sbb.bio, based on historic newspapers from the Dr. Friedrich Tessmann Library and the Berlin State Library, respectively, and omit the Austrian historic newspapers, whose orthography differs heavily from the former samples. The original dataset is not provided in the 4-column CoNLL format, which lists each word of a sentence on a separate line along with its lemma, POS tag and NE label, and separates sentences by an empty line. Therefore, we convert the data into our target format using spaCy v2.0 (http://spacy.io), which in its recent release supports preprocessing German texts by providing language models for sentence boundary detection, lemmatization and POS tagging.
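A sketch of this conversion with spaCy (the model name and the ne_labels mapping from token index to NE tag are placeholders; the exact lemmas and tags depend on the loaded pipeline):

import spacy

nlp = spacy.load("de_core_news_sm")  # placeholder German pipeline

def to_conll(text, ne_labels):
    """Emit 4-column CoNLL lines (word, lemma, POS tag, NE label),
    separating sentences by an empty line."""
    lines = []
    for sent in nlp(text).sents:
        for tok in sent:
            lines.append(f"{tok.text}\t{tok.lemma_}\t{tok.tag_}\t{ne_labels.get(tok.i, 'O')}")
        lines.append("")  # sentence boundary
    return "\n".join(lines)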

For TüBa-D/Z, we extracted the NE tags from the tuebadz-11.0-conll2010 version. In the case of nested NEs, we use a filtering heuristic to extract the longest spanning NE, which gives us more robust training data by not splitting well-known entities into parts (e.g. [Goethe Universität Frankfurt]_ORG vs. [Goethe]_PER Universität [Frankfurt]_LOC). We converted the tagging scheme of TüBa-D/Z to our target format. Lastly, to allow comparisons with the other NER datasets, we mapped the NE category Geo-Political Entity (GPE) to LOC.
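A sketch of such a filtering heuristic over (start, end, label) token spans (names are illustrative):

def longest_spans(spans):
    """Keep only the longest NE span among nested annotations by
    dropping every span contained in a longer one."""
    spans = sorted(spans, key=lambda s: s[0] - s[1])  # longest first
    kept = []
    for start, end, label in spans:
        if not any(ks <= start and end <= ke for ks, ke, _ in kept):
            kept.append((start, end, label))
    return kept

# [Goethe Universität Frankfurt]_ORG wins over the nested PER/LOC spans
assert longest_spans([(0, 3, "ORG"), (0, 1, "PER"), (2, 3, "LOC")]) == [(0, 3, "ORG")]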

Data Splitting & Merging

For CoNLL and GermEval, we use the splits as provided in the original datasets. Further, we split TüBa-D/Z into train/dev/test sets according to the common ratio of 80/10/10 percent. Due to the smaller size of the Europarl and EuropeanaNewspapers datasets, we did not consider them for the first experimental setup of single training; rather, we merged them with the training data for the second experimental setup of joint training. For this setup, we aligned all datasets by mapping the NE category OTH to MISC to fit the CoNLL format. In this way, we generated the currently largest training dataset for German NER, with a size of 133,258 sentences (CoNLL (12,152) + GermEval (24,000) + Europarl (4,395) + EuropeanaNewspapers (8,879) + TüBa-D/Z (83,832)).
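A sketch of the splitting and alignment steps, assuming sentences are lists of (token, tag) pairs:

import random

def split_80_10_10(sentences, seed=42):
    """Shuffle and split a dataset into train/dev/test at 80/10/10."""
    random.Random(seed).shuffle(sentences)
    a, b = int(0.8 * len(sentences)), int(0.9 * len(sentences))
    return sentences[:a], sentences[a:b], sentences[b:]

def align_to_conll(sentence):
    """Map the NE category OTH to MISC to fit the CoNLL format."""
    return [(tok, tag.replace("OTH", "MISC")) for tok, tag in sentence]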

IV-B Word Embeddings

German is a highly inflected language compared to English or Chinese, whose syntax is more analytic. For languages like German, the embedding of a single word (e.g. klein) is dispersed across its various morphological and spelling variants (stem klein: kleiner, kleinste, kleine, kleines, kleinen, kleinem, Klein, etc.), thereby reducing the number of its samples and weakening its information value if it is not lemmatized appropriately. Languages with a rather analytic syntax, on the other hand, show such morphological variants to a lesser extent, if at all. We assume that this difference is the reason why their embeddings are of higher quality and, therefore, their performance in downstream tasks is considerably higher than in less analytic languages.

Corpus             Sentences
Leipzig40-2018     40.00 million
WMT-2010-German    19.36 million
COW-2016           617.28 million
TABLE II: Text Corpora

In order to mitigate the negative effect of this factor for German, we need embeddings of higher quality. In the experimental setup of single training, we tackle this by using more text data. Table II lists the corpora we use for training our word embeddings. Leipzig40-2018 contains the largest possible extract from the so-called Leipzig Corpora Collection in 2018, which was generated by its maintainers on demand for our study, omitting any duplicate sentences. To increase the corpus size, we combine this extract with WMT-2010-German, forming our so-called LeipzigMT corpus. Besides, we consider the COW-2016 corpus, arguably the largest text collection for German. This corpus is not limited to textbook-like language as found, for example, in Wikipedia. Therefore, we assume that it fits well with the NER datasets used here, which in turn come from various sources (news, web, wikis, etc.). Both corpora are already preprocessed and split into sentences containing words, numbers and punctuation. We do not remove punctuation marks, but separate them from words and numbers by surrounding them with spaces, to avoid introducing token variants that differ only in attached punctuation. In addition, as a preprocessing step, we write all words in lowercase to account for spelling and morphological variations.
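A minimal sketch of this normalization step:

import re

def normalize(sentence):
    """Separate punctuation from words and numbers by surrounding it
    with spaces, then lowercase everything to merge spelling variants."""
    sentence = re.sub(r"([^\w\s])", r" \1 ", sentence)
    return " ".join(sentence.lower().split())

assert normalize("Kleine Kinder sind mutiger.") == "kleine kinder sind mutiger ."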

In the third variant of our experiment, we deepen the optimization of resources by taking lemmatization and POS tagging into account, in combination with lowercasing. While lemmatization increases the observation frequency of words, POS tagging allows a more precise specification of their syntactic roles in sentences and consequently differentiates the individual observations that enter the calculation of the embeddings. Lowercasing, in turn, removes ambiguities, as they are induced in German especially by capitalization at the beginning of sentences. Table III shows the variants we use for this setup.

We apply lemmatization and POS tagging in combination with lowercasing to all resources before they are used in training. Exactly the same conversions are applied to the NER datasets of the respective experiment, to avoid mismatches and to increase the overlap with the trained embeddings. Again, we use spaCy and its language models for lemmatization and POS tagging. Listing 1 shows an example of this approach.

raw sentence  : Kleine Kinder sind mutiger.
lemma         : Klein Kind sein mutig .
lemmapos      : Klein_ADJA Kind_NN sein_VAFIN mutig_ADJD ._$.
lemmapos_lower: klein_ADJA kind_NN sein_VAFIN mutig_ADJD ._$.
Listing 1: Example for Lemma & POS

These conversions are intended to standardize any text input and thus to resolve the above-mentioned problems with morphological variation.
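The variants of Listing 1 can be generated along the following lines (a sketch; the spaCy model name is a placeholder, and the exact lemmas and STTS tags depend on the loaded pipeline):

import spacy

nlp = spacy.load("de_core_news_sm")  # placeholder German pipeline

def embedding_variant(text, pos=False, lower=False):
    """Produce the lemma / lemmapos / *_lower input variants of Listing 1."""
    tokens = []
    for tok in nlp(text):
        lemma = tok.lemma_.lower() if lower else tok.lemma_
        tokens.append(f"{lemma}_{tok.tag_}" if pos else lemma)
    return " ".join(tokens)

# expected shape of the output, cf. Listing 1:
# embedding_variant("Kleine Kinder sind mutiger.", pos=True, lower=True)
#   -> "klein_ADJA kind_NN sein_VAFIN mutig_ADJD ._$."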

Experimental Setup    Variant   Features
Single Training       1         lower
Joint Training        1         lower
                      2         lemma
Optimized Training    3         lemma_lower
                      4         lemmapos
                      5         lemmapos_lower
TABLE III: Embedding Variants per Experimental Setup

IV-C Training Parameters

To remain comparable with the baseline models on CoNLL [1] and GermEval [13], we train the word embeddings with dimension 100 (Lample et al. [1] use dimension 100 for English, but 64 for German; we increase this dimension to close the gap), a window size of 8 and a minimum word count threshold of 4, consequently setting the LSTM dimension to 100 as well (for word2vec, we performed an extensive search over numerous embeddings with varying dimension, minimum word count threshold and window size values; however, no major differences were observed in the final results). We choose dimension 25 for the character-based embeddings and the final CRF layer, and train the network for 100 epochs with a batch size of 1 and a dropout rate of 0.5. As optimization method, we use stochastic gradient descent with a learning rate of 0.005. Apart from setting the LSTM dimension to 300 when using the 300-dimensional pretrained German fastText embeddings [25], the model is fixed to these settings throughout our experiments. Any further sophisticated hyperparameter tuning (e.g. Population Based Training) is left for future work.
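As an illustrative sketch, these embedding hyperparameters map onto gensim's word2vec implementation as follows (the experiments themselves use the original word2vec and wang2vec tools; the toy corpus and the skip-gram choice are assumptions):

from gensim.models import Word2Vec

# toy stand-in for the preprocessed corpus (iterable of token lists)
corpus_sentences = [["kleine", "kinder", "sind", "mutiger", "."]] * 10

model = Word2Vec(
    sentences=corpus_sentences,
    vector_size=100,  # embedding dimension
    window=8,         # context window size
    min_count=4,      # minimum word count threshold
    sg=1,             # skip-gram variant (assumption)
    workers=4,
)
model.wv.save_word2vec_format("embeddings.100d.txt")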

V Results

In this section, we present the results obtained for our three experimental setups. As described in [24], we run every experiment up to 6 times, starting from different random seeds, in order to arrive at significant final values on the respective test dataset. We evaluate the NER results using the official evaluation script of the CoNLL-2003 shared task. All our experiments were run on Nvidia GTX 1080 Ti GPUs.
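A sketch of this evaluation protocol (train_and_score is a hypothetical callable that trains one model with the given seed and returns its test F-score):

import statistics

def repeated_fscore(train_and_score, n_runs=6):
    """Repeat the experiment from different random seeds and report the
    mean and standard deviation of the test F-score, following [24]."""
    scores = [train_and_score(seed=s) for s in range(n_runs)]
    return statistics.mean(scores), statistics.stdev(scores)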

V-A Single Training

We compare our results with the current top performing models on CoNLL and GermEval. Table IV shows the highest results we achieve on the single training setup (first experimental setting).

Data       Embeddings               Features   F-score [%]
CoNLL      pre-trained Leipzig      wang2vec   78.76 [1]
GermEval   pre-trained UKP2014      word2vec   75.90 [13]
CoNLL      self-trained LeipzigMT   wang2vec   80.81
CoNLL      self-trained COW         wang2vec   83.29
GermEval   self-trained LeipzigMT   wang2vec   81.97
GermEval   self-trained COW         wang2vec   83.14
TüBa-D/Z   self-trained LeipzigMT   wang2vec   88.95
TüBa-D/Z   self-trained COW         wang2vec   89.26
TABLE IV: Single Training

We achieve an improvement across all datasets, outperforming all previous results on German neural NER and establishing a new state-of-the-art on each of them. Increasing the corpus size by means of the LeipzigMT corpus yields a corresponding performance increase over the CoNLL baseline. Increasing the corpus size further through the COW corpus finally gives us the best results on CoNLL. From this perspective, looking at the three data points for CoNLL (or GermEval), we observe a logarithmic growth of the F-score as a function of the size of the underlying embedding corpus. Corpora even larger than COW would be needed to further support this observation.

On the side of the training data, we observe a similar but stronger effect. On LeipzigMT, increasing the training data size from CoNLL to GermEval, and then to TüBa-D/Z, leads to improvements of +1.16% and +6.98% in F-score, respectively. For COW, this behavior re-emerges for TüBa-D/Z, closing the gap to high-resource languages like English and almost crossing the 90% barrier. Besides, we see that performance on the larger training dataset TüBa-D/Z does not depend heavily on the embedding corpus size, implying that it is beneficial to invest in annotation efforts.

We also find that wang2vec generally performs better than word2vec. This shows that a task-specific embedding algorithm is important (in our case, one taking syntax into account for NER).

Last but not least, our experiments show that keeping capitalization information can even degrade the quality of the word embeddings. Likewise, we observe that integrating capitalization information as an additional input feature of our neural network does not lead to better results. We assume that this is due to the inflectional morphology of German, according to which all nouns are capitalized, in contrast to English, where mainly proper names (named entities) are written this way.

V-B Joint Training

As a first step towards joint training, we report the best results for fastText embeddings and compare them to the UKP2014 embeddings, using only the two datasets from the baseline models. Next, we approach the full joint setup and perform training on all German NER datasets. Building on the results of the last section, we consider only COW for this setup. Table V shows the top results for this setup.

For fastText, we get the best results among all settings we examined (the results in the single training setup were worse). However, they are still below those obtained with UKP2014, which themselves were trained with the original word2vec model back in 2014. This shows that the fastText algorithm, although a promising extension of word2vec, does not suit our NER task well, even though it uses a more informative 300-dimensional vector space. Hence, we discard it for further experiments.

For COW, transfer learning on a single task works well, and the performance for CoNLL and GermEval improves further, lying slightly above the single training values. It can be noted that the final performance leans towards the lower-performing values. We assume that it depends more on the datasets with the lower single training performance (which make up a large part of the joint training dataset), since the data merging introduces additional variety into the final training dataset. This makes the task more difficult and brings it closer to a real-world scenario. Still, the slightly improved performance indicates that the neural network is generalizing and successfully performing task-related transfer learning across datasets, i.e. the model improves the same task on a heterogeneous dataset, given that it performs well on a single large homogeneous dataset.

Overall, the results are promising; they indicate that we have a good candidate for applying a jointly trained tagger to large resources where labeled data is scarce.

Data             Embeddings             Features   F-score [%]
CoNLL+GermEval   pre-trained UKP2014    word2vec   78.06
CoNLL+GermEval   pre-trained fastText   300dim     77.00
all              self-trained COW       wang2vec   83.47
TABLE V: Joint Training

V-C Resource Optimization via Lemmatization & POS tagging

In this final setup of resource optimization, we examine various constellations. Table VI reports the corresponding list of results.

Data       Embeddings   Features         F-score [%]
CoNLL      LeipzigMT    lemma            82.57
CoNLL      LeipzigMT    lemma_lower      82.94
CoNLL      LeipzigMT    lemmapos         81.22
CoNLL      LeipzigMT    lemmapos_lower   81.20
CoNLL      COW          lemma            83.64
CoNLL      COW          lemma_lower      83.14
CoNLL      COW          lemmapos         82.38
CoNLL      COW          lemmapos_lower   82.47
GermEval   LeipzigMT    lemma            82.53
GermEval   LeipzigMT    lemma_lower      82.47
GermEval   LeipzigMT    lemmapos         81.46
GermEval   LeipzigMT    lemmapos_lower   81.05
GermEval   COW          lemma            82.87
GermEval   COW          lemma_lower      82.53
GermEval   COW          lemmapos         81.96
GermEval   COW          lemmapos_lower   81.38
TüBa-D/Z   LeipzigMT    lemma            88.50
TüBa-D/Z   LeipzigMT    lemma_lower      88.27
TüBa-D/Z   LeipzigMT    lemmapos         87.85
TüBa-D/Z   LeipzigMT    lemmapos_lower   87.83
TüBa-D/Z   COW          lemma            89.08
TüBa-D/Z   COW          lemma_lower      89.24
TüBa-D/Z   COW          lemmapos         88.43
TüBa-D/Z   COW          lemmapos_lower   88.02
TABLE VI: Optimized Training via Lemma & POS

Intuitively, using POS-tagged sentences for training word embeddings may appear unusual; however, the results show a different picture. We obtain results very close to the top performances of the previous sections, and a common pattern emerges across all experiments. The lemmatization variant on COW consistently delivers top scores for the three major datasets, and even produces the highest value for CoNLL across all setups. Lemmatization alone performs better than lemmatization combined with POS tagging. This shows that dispersing the semantics of a given word across the various roles it can take does not improve the quality of the final embeddings. Rather, it is better to reduce the (redundant) variety in the vector space by first assembling all morphological variants into a common base form, which is then mapped to a common semantic vector. Once lemmatization is performed, lowercasing does not lead to a notable further improvement. We assume that lemmatization already performs a good normalization of the raw text, making lowercasing almost ineffective.

Regarding the size of the corpus used for generating the word embeddings, we come to the conclusion that lemmatization and POS tagging reduce the performance differences of the previous sections, which so far depended on that size. This confirms our assumption that the word2vec algorithm in its original form is not well suited to morphologically rich languages. The results of this setup show that the values for LeipzigMT and COW now lie closer to each other, making the performance to some extent independent of the size of the embedding corpus. This is an important finding, opening up promising opportunities and applications for low-resource languages.

VI Conclusion & Future Work

In this paper, we performed a far-reaching study of neural NER, taking the low-resource language German as an example. The study focused on a monolingual experimental setup. Nevertheless, the improved results pave the way for related languages with characteristics similar to German.

There are various ways to improve existing neural models. Instead of merely designing deeper and wider hybrid models, we showed the high importance of gathering and merging resources and how their careful optimization can overcome the lack of progress. In particular, we found that increasing the size and improving the quality of the raw corpora for word embeddings by applying morphological processing such as lemmatization and POS tagging leads to meaningful improvements. In addition, we demonstrated the effect of transfer learning by merging datasets for a joint training setup, which also produced good results and makes this approach a promising candidate for NER applications in areas where annotated datasets are scarce.

Overall, we conducted the first comprehensive study of German NER on all existing training datasets and resources, including common pre-trained embeddings such as fastText. In this context, we established a new state-of-the-art on all open-source datasets for German NER, which exceeds the 80% F-score barrier and closes the gap to high-resource languages such as English.

For future work, we plan to further refine the training process of the word embeddings and, in particular, to investigate how the performance of downstream tasks can become more independent of the size of the embedding corpora by using linguistic methods such as lemmatization and POS tagging. To this end, we intend to examine the recently published ELMo embeddings [26] for German. Finally, we will examine the role of the multilingual COW corpus for word embeddings in other languages such as Dutch, French, Spanish and English.

Acknowledgment

This work was funded by the German Research Foundation (DFG) as part of the BIOfid project (DFG-326061700). We plan to upload our source code and the trained embeddings to GitHub for the research community. Special thanks go to G. Lample for his directions on the procedure for training the embeddings, and to Prof. G. Heyer and F. Helfer for providing the extract of the Leipzig40-2018 corpus.

References

  • [1] G. Lample, M. Ballesteros, S. Subramanian, K. Kawakami, and C. Dyer, “Neural Architectures for Named Entity Recognition,” in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.   Association for Computational Linguistics, 2016, pp. 260–270.
  • [2] L. Ouyang, Y. Tian, H. Tang, and B. Zhang, “Chinese Named Entity Recognition Based on B-LSTM Neural Network with Additional Features,” in International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage.   Springer, 2017, pp. 269–279.
  • [3] D. Goldhahn, T. Eckart, and U. Quasthoff, “Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages,” in LREC, 2012.
  • [4] C. Callison-Burch, P. Koehn, C. Monz, K. Peterson, M. Przybocki, and O. F. Zaidan, “Findings of the 2010 joint workshop on statistical machine translation and metrics for machine translation,” in Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR.   Association for Computational Linguistics, 2010, pp. 17–53.
  • [5] R. Schäfer, “Processing and querying large web corpora with the COW14 architecture,” in Proceedings of Challenges in the Management of Large Corpora 3 (CMLC-3), P. Bański, H. Biber, E. Breiteneder, M. Kupietz, H. Lüngen, and A. Witt, Eds., UCREL.   Lancaster: IDS, 2015.
  • [6] E. F. Tjong Kim Sang and F. De Meulder, “Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition,” in Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 - Volume 4.   Association for Computational Linguistics, 2003, pp. 142–147.
  • [7] D. Benikova, C. Biemann, and M. Reznicek, “Nosta-d named entity annotation for german: Guidelines and dataset,” in LREC, 2014.
  • [8] M. Faruqui and S. Padó, “Training and Evaluating a German Named Entity Recognizer with Semantic Generalization,” in Proceedings of KONVENS 2010, Saarbrücken, Germany, 2010.
  • [9] C. Neudecker, “An Open Corpus for Named Entity Recognition in Historic Newspapers,” in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), N. Calzolari (Conference Chair), K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, and S. Piperidis, Eds.   Paris, France: European Language Resources Association (ELRA), May 2016.
  • [10] H. Telljohann, E. W. Hinrichs, S. Kübler, H. Zinsmeister, and K. Beck, “Stylebook for the Tübingen treebank of written German (TüBa-D/Z).”
  • [11] S. R. Bowman, G. Angeli, C. Potts, and C. D. Manning, “A large annotated corpus for learning natural language inference,” Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015.
  • [12] P. Rajpurkar, J. Zhang, K. Lopyrev, and P. Liang, “SQuAD: 100,000+ Questions for Machine Comprehension of Text,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016, pp. 2383–2392.
  • [13] N. Reimers, J. Eckle-Kohler, C. Schnober, J. Kim, and I. Gurevych, “GermEval-2014: Nested Named Entity Recognition with Neural Networks,” in Workshop Proceedings of the 12th Edition of the KONVENS Conference, G. Faaß and J. Ruppenhofer, Eds.   Universitätsverlag Hildesheim, October 2014, pp. 117–120.
  • [14] D. Gillick, C. Brunk, O. Vinyals, and A. Subramanya, “Multilingual language processing from bytes,” in HLT-NAACL, 2016.
  • [15] B. Klimek, M. Ackermann, A. Kirschenbaum, and S. Hellmann, “Investigating the Morphological Complexity of German Named Entities: The Case of the GermEval NER Challenge,” in Language Technologies for the Challenges of the Digital Age, G. Rehm and T. Declerck, Eds.   Cham: Springer International Publishing, 2018, pp. 130–145.
  • [16] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in Neural Information Processing Systems, 2013, pp. 3111–3119.
  • [17] O. Levy and Y. Goldberg, “Dependency-Based Word Embeddings.” in ACL (2), 2014, pp. 302–308.
  • [18] A. Komninos and S. Manandhar, “Dependency Based Embeddings for Sentence Classification Tasks.” in HLT-NAACL, 2016, pp. 1490–1500.
  • [19] W. Ling, C. Dyer, A. Black, and I. Trancoso, “Two/Too Simple Adaptations of word2vec for Syntax Problems,” in Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.   Association for Computational Linguistics, 2015.
  • [20] J. P. C. Chiu and E. Nichols, “Named Entity Recognition with Bidirectional LSTM-CNNs,” TACL, vol. 4, pp. 357–370, 2016.
  • [21] Z. Huang, W. Xu, and K. Yu, “Bidirectional LSTM-CRF Models for Sequence Tagging,” CoRR, vol. abs/1508.01991, 2015.
  • [22] R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa, “Natural language processing (almost) from scratch,” Journal of Machine Learning Research, vol. 12, no. Aug, pp. 2493–2537, 2011.
  • [23] A. Graves, A.-r. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).   IEEE, 2013, pp. 6645–6649.
  • [24] N. Reimers and I. Gurevych, “Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging,” in EMNLP, 2017.
  • [25] P. Bojanowski, E. Grave, A. Joulin, and T. Mikolov, “Enriching Word Vectors with Subword Information,” Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, 2017.
  • [26] M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in Proc. of NAACL, 2018.