Classifier Ensembles for Dialect and Language Variety Identification

08/14/2018
by   Liviu P. Dinu, et al.

In this paper we present ensemble-based systems for dialect and language variety identification using the datasets made available by the organizers of the VarDial Evaluation Campaign 2018. We present a system developed to discriminate between Flemish and Dutch in subtitles, and a system trained to discriminate between four Arabic dialects (Egyptian, Levantine, Gulf, and North African) and Modern Standard Arabic in speech broadcasts. Finally, we compare the performance of these two systems with the other systems submitted to the Discriminating between Dutch and Flemish in Subtitles (DFS) and the Arabic Dialect Identification (ADI) shared tasks at VarDial 2018.

1 Introduction

Discriminating between national language varieties and dialects is an important task that is often integrated in natural language processing pipelines and applications. The problem has attracted more attention in recent years, as evidenced by a number of research papers published on languages such as English [Lui and Cook2013], Portuguese [Zampieri et al.2016], and Romanian [Ciobanu and Dinu2016], and by shared tasks such as TweetLID [Zubiaga et al.2016], the PAN lab on author profiling [Rangel et al.2017], and the DSL shared task [Zampieri et al.2014].

Due to its considerable dialectal variation, Arabic, one of the languages we work on in this paper, has been widely studied in the context of automatic dialect identification. A number of studies have been published on the identification of Arabic dialects and Modern Standard Arabic using user-generated content (e.g. microblog and social media posts), speech transcripts, and other corpora [Elfardy and Diab2013, Zaidan and Callison-Burch2014, Malmasi et al.2015, Tillmann et al.2014].

The other language pair we investigate in this paper is Dutch and Flemish. To the best of our knowledge, methods to discriminate between these two languages have not been substantially investigated; a notable exception is the work by van der Lee and van den Bosch [van der Lee and van den Bosch2017]. In the scientific literature, these two are widely considered to be national varieties of the same language, one spoken in the Netherlands and the other in Belgium. In this paper, we refer to these two varieties as Flemish and Dutch, but we acknowledge that related work (e.g. [Peirsman et al.2010]) uses other terms for these language varieties, such as Belgian Dutch for Flemish and Netherlandic Dutch for Dutch. (For a comprehensive survey on language and dialect identification, see [Jauhiainen et al.2018b].)

In this paper we present ensemble-based machine learning systems to discriminate between four Arabic dialects and Modern Standard Arabic in speech broadcasts, and to discriminate between Dutch and Flemish in subtitles. We build on our previous work by improving a system that we have applied to similar text classification tasks such as author profiling [Ciobanu et al.2017] and native language identification [Zampieri et al.2017a]. In our experiments, we used the datasets made available by the organizers of the Arabic and the Dutch and Flemish shared tasks of the VarDial Evaluation Campaign 2018 [Zampieri et al.2018] (http://alt.qcri.org/vardial2018/index.php?id=campaign). Finally, we compare the performance of our methods with the performance obtained by the other teams who participated in the two shared tasks.

2 Methodology and Data

2.1 Data

We used the data released by the organizers of two shared tasks of the VarDial Evaluation Campaign 2018, namely the third edition of the Arabic Dialect Identification (ADI) shared task and the first edition of the Discriminating between Dutch and Flemish in Subtitles (DFS) shared task.

The Arabic dataset made available by the organizers of the ADI shared task [Ali et al.2016] included four Arabic dialects: Egyptian (EGY), Levantine (LEV), Gulf (GLF), and North African (NOR), along with Modern Standard Arabic (MSA). The data released for training and development was the same as the data released in the 2017 edition of the VarDial Evaluation Campaign [Zampieri et al.2017b]. For testing, two new datasets were prepared: an in-domain test set and an out-of-domain test set. The two test sets were merged, and the organizers did not inform the participants that out-of-domain test data was included. For the training, development, and test sets, we were provided with acoustic features, ASR output, and phonetic features.

The Dutch and Flemish data come from the SUBTIEL corpus [van der Lee and van den Bosch2017]. It consists of short excerpts from subtitles of documentaries, films, and TV shows created by a localization company that produces content for television channels in Belgium and the Netherlands. The dataset made available by the organizers of the DFS shared task consists of 320,500 instances, split into 300,000 instances for training, 20,000 for testing, and 500 for development. This amounts to a total of a little over 11 million tokens.

2.2 Systems and Features

We developed ensemble-based systems for dialect and language variety identification, following the methodology proposed by Malmasi and Dras [Malmasi and Dras2015]. The system that we propose uses multiple SVM classifiers, each using a different type of features, and combines their output to provide predictions. Such ensembles have proven to be useful for a number of related classification tasks [Malmasi et al.2016b, Malmasi et al.2016a, Malmasi and Zampieri2017b].

We used the Scikit-learn [Pedregosa et al.2011] machine learning library to implement our system. For the individual classifiers, we employed the SVM implementation based on the Liblinear library [Fan et al.2008], LinearSVC (http://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html), with a linear kernel. We further employed the majority-rule VotingClassifier (http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html) to combine the output of the SVM systems. This ensemble chooses the label that is predicted by the majority of the classifiers (which were assigned uniform weights in the ensemble). In case of ties, the ensemble chooses the label based on the ascending sort order of all labels.
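
To make the setup concrete, the ensemble described above can be sketched with Scikit-learn roughly as follows. The toy texts, labels, and feature configurations are hypothetical stand-ins for the shared-task data, not the actual system configuration:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data: hypothetical stand-ins for the DFS subtitle excerpts.
texts = ["een goede film", "nen goeie film", "de trein is laat", "den trein is te laat"] * 5
labels = ["DUT", "BEL", "DUT", "BEL"] * 5

# One linear SVM per feature type, each with its own TF-IDF weighted features.
char_clf = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 4)), LinearSVC())
word_clf = make_pipeline(TfidfVectorizer(analyzer="word", ngram_range=(1, 1)), LinearSVC())

# Majority-rule (hard) voting with uniform weights; on a tie, VotingClassifier
# falls back to the first label in ascending sort order.
ensemble = VotingClassifier([("char", char_clf), ("word", word_clf)], voting="hard")
ensemble.fit(texts, labels)
print(ensemble.predict(["nen goeie film"]))
```

Because voting is hard, each member contributes only its predicted label, so LinearSVC can be used directly without probability calibration.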

We experimented with the following features, using TF-IDF weighting:

  • Character n-grams, with n in {1, ..., 8};

  • Word n-grams, with n in {1, 2, 3};

  • Word k-skip bigrams, with k in {1, 2, 3}.

First, we trained a classifier for each type of feature. We report the individual performance of each classifier in Table 1. For DFS, the best performing classifier obtained an F1 score of 0.701 on the development dataset, using word 3-grams as features. For ADI, the best performing classifier obtained an F1 score of 0.486 on the development dataset, using character 4-grams as features.

Second, we trained multiple ensembles using various combinations of features, and performed a grid search to determine the optimal value of the SVM regularization parameter C. For DFS, the optimal feature combination was character n-grams together with word 3-grams; the best performing ensemble obtained an F1 score of 0.687 on the development dataset. For ADI, the optimal value of C turned out to be 1 and the optimal feature combination was character n-grams; the best performing ensemble obtained an F1 score of 0.482 on the development dataset.
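
This kind of grid search over C can be reproduced with GridSearchCV; since the text does not state the search grid, the candidate values below are assumptions, as are the toy data and feature settings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

pipe = Pipeline([
    ("tfidf", TfidfVectorizer(analyzer="char", ngram_range=(1, 4))),
    ("svm", LinearSVC()),
])

# Hypothetical grid of C values; scored with macro-averaged F1 as in the paper.
grid = GridSearchCV(pipe, {"svm__C": [0.01, 0.1, 1, 10]}, scoring="f1_macro", cv=2)

texts = ["een goede film", "nen goeie film", "de trein is laat", "den trein komt"] * 4
labels = ["DUT", "BEL", "DUT", "BEL"] * 4
grid.fit(texts, labels)
print(grid.best_params_["svm__C"])
```

The "svm__C" naming addresses the C parameter of the pipeline step named "svm", so the vectorizer is refit inside each cross-validation fold.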

Feature                  F1 (macro)
                        DFS     ADI
Character 1-grams       0.575   0.268
Character 2-grams       0.585   0.394
Character 3-grams       0.591   0.456
Character 4-grams       0.625   0.486
Character 5-grams       0.627   0.466
Character 6-grams       0.653   0.449
Character 7-grams       0.649   0.433
Character 8-grams       0.645   0.405
Word 1-grams            0.639   0.451
Word 2-grams            0.663   0.392
Word 3-grams            0.701   0.292
Word 1-skip bigrams     0.645   0.397
Word 2-skip bigrams     0.669   0.391
Word 3-skip bigrams     0.660   0.385
Table 1: Classification F1 score for individual classifiers on the development dataset.

3 Results

3.1 Arabic Dialect Identification

The ADI shared task 2018 is the third edition of the competition. Previous iterations were organized in 2016 [Malmasi et al.2016c] and in 2017 [Zampieri et al.2017b]. Related shared tasks include last year’s PAN lab on author profiling which included Arabic dialects [Rangel et al.2017], and the MGB-3 challenge on Arabic dialect identification [Ali et al.2017].

In the 2016 edition of the ADI shared task, 18 teams submitted results in the closed submission track and four teams were ranked in the first position, taking statistical significance tests into account. The teams ranked first in the ADI 2016, ordered by absolute performance, were: MAZA [Malmasi and Zampieri2016], which used an SVM ensemble system similar to the one we apply in this paper; UnibucKernel [Ionescu and Popescu2016], which submitted a system based on string kernels; and QCRI [Eldesouki et al.2016] and ASIREM [Adouane et al.2016], which submitted systems based on single SVM classifiers.

In the ADI 2017, six teams submitted their system outputs and no statistical significance tests were carried out. The two best entries in the closed submission track of the ADI 2017 came from two recurring teams, UnibucKernel [Ionescu and Butnaru2017] and MAZA [Malmasi and Zampieri2017a], which applied adaptations of their systems that had performed well in 2016.

Next, we report the results obtained on the official test set provided by the ADI 2018 organizers. We trained our system using only the training data provided by the organizers, and we compare our best system with the entries submitted to the ADI 2018. Results are presented in Table 2.

Rank  Team             F1 (Macro)  System Description
1     UnibucKernel     0.589       [Butnaru and Ionescu2018]
2     safina           0.576       [Ali2018]
3     BZU              0.534       [Naser and Hanani2018]
3     SYSTRAN          0.529       [Michon et al.2018]
3     Tübingen-Oslo    0.514       [Çöltekin et al.2018]
4     Best Ensemble    0.500
      Random Baseline  0.200
Table 2: ADI results. Teams were ranked taking statistical significance into account.

The best performing system was submitted by the UnibucKernel team [Butnaru and Ionescu2018], building on the experience of their previous submissions to the ADI 2016 and 2017. Our system achieved an F1 score of 0.500, which significantly outperforms the baseline. It also outperforms our best individual SVM classifier (using character 4-grams as features), which achieved an F1 score of 0.493 on the test set. Even though our system's performance was not much lower than that of the other five teams in the ADI 2018, we expected better performance from our ensemble-based system, which is very similar to the entries submitted by the MAZA team that performed well in the ADI 2016 and ADI 2017.

We did not observe the expected influence of the out-of-domain data included in the test set. Including out-of-domain data typically makes a task more challenging than using only in-domain data. However, the best performance of our system on the development set was actually lower than the performance obtained on the test set: an F1 score of 0.482 against 0.500.

For a better understanding of our system's performance on the test set, we present the confusion matrix in Figure 1.

Figure 1: Confusion Matrix on the ADI shared task 2018 test set.

We observed that our system achieved its best results identifying the Egyptian and North African dialects and the worst results identifying Gulf Arabic. The biggest confusion occurred between Levantine and North African.
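
A confusion matrix such as the one in Figure 1 can be computed directly from gold and predicted labels; the short label vectors below are hypothetical, standing in for the actual test-set outputs:

```python
from sklearn.metrics import confusion_matrix

dialects = ["EGY", "GLF", "LEV", "MSA", "NOR"]

# Hypothetical gold and predicted labels standing in for the test-set outputs.
y_true = ["EGY", "EGY", "GLF", "LEV", "MSA", "NOR"]
y_pred = ["EGY", "EGY", "LEV", "NOR", "MSA", "NOR"]

# Rows are gold labels, columns are predicted labels, in the order of `dialects`.
cm = confusion_matrix(y_true, y_pred, labels=dialects)
print(cm)
```

Passing the explicit label list fixes the row and column order, which is what allows reading confusions such as "Levantine predicted as North African" off a single cell.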

3.2 Discriminating between Dutch and Flemish

The DFS shared task was organized for the first time in 2018. Dutch and Flemish had not been included in the multilingual DSL shared task [Malmasi et al.2016c] and, to the best of our knowledge, no study has been published on discriminating between Dutch and Flemish with the exception of van der Lee and van den Bosch [van der Lee and van den Bosch2017]. This makes the results of this shared task, described in [Zampieri et al.2018], very relevant for future research.

In this section, we report the results obtained on the official test set provided by the DFS shared task organizers. Our system was trained using only the training data provided by the organizers, which makes our results comparable to the results obtained by the teams who submitted their system outputs to the closed submission track. Therefore, we compare our best system with the entries submitted to the DFS shared task and present the results of the twelve systems plus the random baseline in Table 3.

Rank  Team             F1 (Macro)  System Description
1     Tübingen-Oslo    0.660       [Çöltekin et al.2018]
2     Taurus           0.646       [van Halteren and Oostdijk2018]
3     CLiPS            0.636       [Kreutz and Daelemans2018]
3     LaMa             0.633
3     XAC              0.632       [Barbaresi2018]
3     safina           0.631
4     STEVENDU2018     0.623       [Du and Wang2018]
4     mmb_lct          0.620       [Kroon et al.2018]
5     SUKI             0.613       [Jauhiainen et al.2018a]
6     Best Ensemble    0.596
7     dkosmajac        0.567
7     benf             0.558
      Random Baseline  0.500
Table 3: DFS results. Teams were ranked taking statistical significance into account.

Our system achieved an F1 score of 0.596 and was ranked sixth in the competition, taking statistical significance into account. It outperforms the 0.500 baseline by a large margin, as well as our best individual SVM classifier (which achieved an F1 score of 0.576 on the test set, using word 3-grams as features), but it falls short of the performance obtained by the top-ranked teams in this competition.

For a better understanding of our system's performance in discriminating between Dutch and Flemish on the test set, we report the confusion matrix in Figure 2.

Figure 2: Confusion Matrix on the DFS shared task 2018 test set.

We observed that our system was slightly better at identifying Flemish (BEL) than Dutch (DUT). Overall, the performance of our ensemble system is, just as for ADI, below what we would expect from an SVM ensemble-based system which, as previously stated, has performed well in similar shared tasks. We discuss this further in the next section.

4 Conclusion and Future Work

In this paper, we presented the results obtained by our ensemble-based system when discriminating between Dutch and Flemish in subtitles and when identifying dialects of Arabic, using the datasets made available by the organizers of the VarDial Evaluation Campaign. We report an F1 score of 0.500 in discriminating between four Arabic dialects and MSA, and an F1 score of 0.596 in discriminating between Dutch and Flemish.

The results obtained by our method outperform the baselines of both tasks, but we see room for improvement. For example, variations of the systems presented in this paper were submitted to other shared tasks at VarDial 2018 and achieved more competitive performance. One of these was the Indo-Aryan Language Identification (ILI) shared task, in which our system [Ciobanu et al.2018b] was trained to discriminate between five closely-related languages spoken in India: Awadhi, Bhojpuri, Braj Bhasha, Hindi, and Magahi; it was ranked third among the eight systems that competed in the task. The other was the second iteration of the German Dialect Identification (GDI) shared task, in which our system [Ciobanu et al.2018a] was trained to discriminate between four Swiss German dialects, from Basel, Bern, Lucerne, and Zurich. In the GDI shared task our system also ranked third among eight systems.

We are currently carrying out an analysis of the most informative features learned by the classifiers and an error analysis to improve the performance of our system for future shared tasks.

Acknowledgements

We would like to thank the organizers of the ADI shared task and the DFS shared task for making available the datasets used in this paper.

References

  • [Adouane et al.2016] Wafia Adouane, Nasredine Semmar, and Richard Johansson. 2016. ASIREM Participation at the Discriminating Similar Languages Shared Task 2016. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 163–169, Osaka, Japan.
  • [Ali et al.2016] Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James Glass, Peter Bell, and Steve Renals. 2016. Automatic Dialect Detection in Arabic Broadcast Speech. In Proceedings of INTERSPEECH, pages 2934–2938.
  • [Ali et al.2017] Ahmed Ali, Stephan Vogel, and Steve Renals. 2017. Speech Recognition Challenge in the Wild: Arabic MGB-3. arXiv preprint arXiv:1709.07276.
  • [Ali2018] Mohamed Ali. 2018. Character Level Convolutional Neural Network for Arabic Dialect Identification. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Barbaresi2018] Adrien Barbaresi. 2018. Computationally Efficient Discrimination Between Language Varieties with Large Feature Vectors and Regularized Classifiers. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Butnaru and Ionescu2018] Andrei M. Butnaru and Radu Ionescu. 2018. UnibucKernel Reloaded: First Place in Arabic Dialect Identification for the Second Year in a Row. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Çöltekin et al.2018] Çağrı Çöltekin, Taraka Rama, and Verena Blaschke. 2018. Tübingen-Oslo Team at the VarDial 2018 Evaluation Campaign: An Analysis of N-gram Features in Language Variety Identification. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Ciobanu and Dinu2016] Alina Maria Ciobanu and Liviu P Dinu. 2016. A Computational Perspective on the Romanian Dialects. In Proceedings of Language Resources and Evalution (LREC).
  • [Ciobanu et al.2017] Alina Maria Ciobanu, Marcos Zampieri, Shervin Malmasi, and Liviu P Dinu. 2017. Including Dialects and Language Varieties in Author Profiling. Working Notes of CLEF.
  • [Ciobanu et al.2018a] Alina Maria Ciobanu, Shervin Malmasi, and Liviu P. Dinu. 2018a. German Dialect Identification Using Classifier Ensembles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Ciobanu et al.2018b] Alina Maria Ciobanu, Marcos Zampieri, Shervin Malmasi, Santanu Pal, and Liviu P. Dinu. 2018b. Discriminating between Indo-Aryan Languages Using SVM Ensembles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Du and Wang2018] Steven Du and Yuan Yuan Wang. 2018. STEVENDU2018’s System in VarDial 2018: Discriminating between Dutch and Flemish in Subtitles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Eldesouki et al.2016] Mohamed Eldesouki, Fahim Dalvi, Hassan Sajjad, and Kareem Darwish. 2016. QCRI DSL 2016: Spoken Arabic Dialect Identification Using Textual Features. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 221–226, Osaka, Japan.
  • [Elfardy and Diab2013] Heba Elfardy and Mona T Diab. 2013. Sentence Level Dialect Identification in Arabic. In Proceedings of ACL.
  • [Fan et al.2008] Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. 2008. LIBLINEAR: A Library for Large Linear Classification. Journal of Machine Learning Research, 9:1871–1874.
  • [Ionescu and Butnaru2017] Radu Tudor Ionescu and Andrei Butnaru. 2017. Learning to Identify Arabic and German Dialects using Multiple Kernels. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 200–209, Valencia, Spain, April.
  • [Ionescu and Popescu2016] Radu Tudor Ionescu and Marius Popescu. 2016. UnibucKernel: An Approach for Arabic Dialect Identification Based on Multiple String Kernels. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 135–144, Osaka, Japan.
  • [Jauhiainen et al.2018a] Tommi Jauhiainen, Heidi Jauhiainen, and Krister Lindén. 2018a. HeLI-based Experiments in Discriminating Between Dutch and Flemish Subtitles. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Jauhiainen et al.2018b] Tommi Jauhiainen, Marco Lui, Marcos Zampieri, Timothy Baldwin, and Krister Lindén. 2018b. Automatic Language Identification in Texts: A Survey. arXiv preprint arXiv:1804.08186.
  • [Kreutz and Daelemans2018] Tim Kreutz and Walter Daelemans. 2018. Exploring Classifier Combinations for Language Variety Identification. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Kroon et al.2018] Martin Kroon, Maria Medvedeva, and Barbara Plank. 2018. When Simple n-gram Models Outperform Syntactic Approaches: Discriminating between Dutch and Flemish. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Lui and Cook2013] Marco Lui and Paul Cook. 2013. Classifying English Documents by National Dialect. In Proceedings of ALTA.
  • [Malmasi and Dras2015] Shervin Malmasi and Mark Dras. 2015. Language Identification using Classifier Ensembles. In Proceedings of the Joint Workshop on Language Technology for Closely Related Languages, Varieties and Dialects (LT4VarDial), pages 35–43, Hissar, Bulgaria.
  • [Malmasi and Zampieri2016] Shervin Malmasi and Marcos Zampieri. 2016. Arabic Dialect Identification in Speech Transcripts. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), pages 106–113, Osaka, Japan.
  • [Malmasi and Zampieri2017a] Shervin Malmasi and Marcos Zampieri. 2017a. Arabic Dialect Identification Using iVectors and ASR Transcripts. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 178–183, Valencia, Spain, April.
  • [Malmasi and Zampieri2017b] Shervin Malmasi and Marcos Zampieri. 2017b. German Dialect Identification in Interview Transcriptions. In Proceedings of the Fourth VarDial Workshop, pages 164–169, Valencia, Spain.
  • [Malmasi et al.2015] Shervin Malmasi, Eshrag Refaee, and Mark Dras. 2015. Arabic Dialect Identification using a Parallel Multidialectal Corpus. In Proceedings of the 14th Conference of the Pacific Association for Computational Linguistics (PACLING 2015), pages 209–217, Bali, Indonesia, May.
  • [Malmasi et al.2016a] Shervin Malmasi, Mark Dras, and Marcos Zampieri. 2016a. LTG at SemEval-2016 Task 11: Complex Word Identification with Classifier Ensembles. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval 2016).
  • [Malmasi et al.2016b] Shervin Malmasi, Marcos Zampieri, and Mark Dras. 2016b. Predicting Post Severity in Mental Health Forums. In Proceedings of the Third Computational Linguistics and Clinical Psychology Workshop (CLPsych), pages 133–137, San Diego, California, USA.
  • [Malmasi et al.2016c] Shervin Malmasi, Marcos Zampieri, Nikola Ljubešić, Preslav Nakov, Ahmed Ali, and Jörg Tiedemann. 2016c. Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task. In Proceedings of the 3rd Workshop on Language Technology for Closely Related Languages, Varieties and Dialects (VarDial), Osaka, Japan.
  • [Michon et al.2018] Elise Michon, Minh Quang Pham, Josep Crego, and Jean Senellart. 2018. Neural Network Architectures for Arabic Dialect Identification. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Naser and Hanani2018] Rabee Naser and Abualsoud Hanani. 2018. Birzeit Arabic Dialect Identification System for the 2018 VarDial Challenge. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Pedregosa et al.2011] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12:2825–2830.
  • [Peirsman et al.2010] Yves Peirsman, Dirk Geeraerts, and Dirk Speelman. 2010. The Automatic Identification of Lexical Variation Between Language Varieties. Natural Language Engineering, 16:469–491.
  • [Rangel et al.2017] Francisco Rangel, Paolo Rosso, Martin Potthast, and Benno Stein. 2017. Overview of the 5th Author Profiling Task at PAN 2017: Gender and Language Variety Identification in Twitter. Working Notes Papers of the CLEF.
  • [Tillmann et al.2014] Christoph Tillmann, Saab Mansour, and Yaser Al-Onaizan. 2014. Improved Sentence-Level Arabic Dialect Classification. In Proceedings of the VarDial Workshop (VarDial).
  • [van der Lee and van den Bosch2017] Chris van der Lee and Antal van den Bosch. 2017. Exploring Lexical and Syntactic Features for Language Variety Identification. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 190–199, Valencia, Spain.
  • [van Halteren and Oostdijk2018] Hans van Halteren and Nelleke Oostdijk. 2018. Identification of Differences between Dutch Language Varieties with the VarDial2018 Dutch-Flemish Subtitle Data. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial).
  • [Zaidan and Callison-Burch2014] Omar F Zaidan and Chris Callison-Burch. 2014. Arabic Dialect Identification. Computational Linguistics, 40(1):171–202.
  • [Zampieri et al.2014] Marcos Zampieri, Liling Tan, Nikola Ljubešić, and Jörg Tiedemann. 2014. A Report on the DSL Shared Task 2014. In Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (VarDial), pages 58–67, Dublin, Ireland.
  • [Zampieri et al.2016] Marcos Zampieri, Shervin Malmasi, Octavia-Maria Sulea, and Liviu P Dinu. 2016. A Computational Approach to the Study of Portuguese Newspapers Published in Macau. In Proceedings of Workshop on Natural Language Processing Meets Journalism (NLPMJ).
  • [Zampieri et al.2017a] Marcos Zampieri, Alina Maria Ciobanu, and Liviu P. Dinu. 2017a. Native Language Identification on Text and Speech. In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pages 398–404, Copenhagen, Denmark, September. Association for Computational Linguistics.
  • [Zampieri et al.2017b] Marcos Zampieri, Shervin Malmasi, Nikola Ljubešić, Preslav Nakov, Ahmed Ali, Jörg Tiedemann, Yves Scherrer, and Noëmi Aepli. 2017b. Findings of the VarDial Evaluation Campaign 2017. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), Valencia, Spain.
  • [Zampieri et al.2018] Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Ahmed Ali, Suwon Shon, James Glass, Yves Scherrer, Tanja Samardžić, Nikola Ljubešić, Jörg Tiedemann, Chris van der Lee, Stefan Grondelaers, Nelleke Oostdijk, Antal van den Bosch, Ritesh Kumar, Bornini Lahiri, and Mayank Jain. 2018. Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign. In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), Santa Fe, USA.
  • [Zubiaga et al.2016] Arkaitz Zubiaga, Inaki San Vicente, Pablo Gamallo, José Ramom Pichel, Inaki Alegria, Nora Aranberri, Aitzol Ezeiza, and Víctor Fresno. 2016. TweetLID: A Benchmark for Tweet Language Identification. Language Resources and Evaluation, 50(4):729–766.