08/12/2020

A Semi-Supervised BERT Approach for Arabic Named Entity Recognition

Chadi Helwe, Ghassan Dib, Mohsen Shamas, Shady Elbassuoni

Keywords:

Abstract: Named entity recognition (NER) plays a significant role in many applications such as information extraction, information retrieval, question answering, and even machine translation. Most of the work on NER using deep learning was done for non-Arabic languages like English and French, and only few studies focused on Arabic. This paper proposes a semi-supervised learning approach to train a BERT-based NER model using labeled and semi-labeled datasets. We compared our approach against various baselines, and state-of-the-art Arabic NER tools on three datasets: AQMAR, NEWS, and TWEETS. We report a significant improvement in F-measure for the AQMAR and the NEWS datasets, which are written in Modern Standard Arabic (MSA), and competitive results for the TWEETS dataset, which contains tweets that are mostly in the Egyptian dialect and contain many mistakes or misspellings.

The video of this talk cannot be embedded. You can watch it here:
https://underline.io/lecture/6532-a-semi-supervised-bert-approach-for-arabic-named-entity-recognition
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at COLING Workshops 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers