Abstract: Attention-based models have proved their superiority on many NLP tasks, especially for English. Despite the great potential and importance of language models, little attention has been paid to attention-based language modeling for Persian. In this paper, we fine-tuned two language models, namely BERT and Persian GPT-2, on the Persica corpus. We then evaluated these models by computing their perplexity on a 5-million-word dataset. Both models outperform previous SOTA results on the measure of perplexity. Our results indicate that GPT-2 performs slightly better, with an approximately 10 percent improvement in perplexity, and seems to be a better fit for language modeling. We have proposed a modified version of perplexity, bi-perplexity, which can serve as a measure for comparing language models trained with a masked language modeling objective. We have also introduced an innovative way of using BERT as a language model by devising a new strategy for sampling.
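The abstract compares models by perplexity, i.e., the exponential of the average negative log-likelihood a model assigns to the evaluation tokens. As a minimal sketch (not the paper's implementation, and assuming per-token probabilities are already available from some model), the computation looks like:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood
    of the tokens under the model. Lower is better."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Sanity check: a uniform model over a 4-word vocabulary assigns
# p = 0.25 to every token, so its perplexity equals the vocabulary size.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # → 4.0
```

For a causal model like GPT-2 the probabilities are the left-to-right next-token predictions; for BERT, whose masked objective conditions on both sides, a standard perplexity is not directly defined, which is presumably what motivates the paper's bi-perplexity variant.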
Publication Year: 2022
Publication Date: 2022-02-23
Language: en
Type: article
Indexed In: Crossref