Title: Dynamic Topic Model for Short Texts about Hot Issues on Microblog Based on TFIDF
Abstract: Short texts such as posts on Microblog carry rich semantic information, including time, attitude, and topics, and extracting topics from these texts has attracted wide attention recently. However, most topic models are designed for long texts and perform poorly on short texts because of their semantic sparsity. Traditional topic models also ignore the time factor of documents. Hot issues on the Internet have an opinion life cycle: topics around an issue tend to emerge, develop, and vanish within a short period of time, so ignoring the time factor is inconsistent with reality. In addition, traditional topic models are likely to extract topics with unnecessarily repeated keywords, because pronouns referring to the same issue occur with high frequency across different posts. To address these problems, we propose a topic model for short texts about hot issues on Microblog based on the Dynamic Topic Model (DTM). By introducing Term Frequency and Inverse Document Frequency (TF-IDF) to extract more characteristic words, TFIDF-DTM reduces the input dimension for DTM and makes the extracted topics more representative. The model is tested on a self-crawled dataset. Compared with standard DTM, the extraction process of TFIDF-DTM takes less time, and the results show higher topic discrimination and more explicit semantics.
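The TF-IDF filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting or selection rule, so everything here is an assumption made for illustration: the function name `tfidf_filter`, the `top_k` parameter, the per-post selection of the highest-scoring terms, and the toy posts. The sketch only shows how words that appear in every post (such as a pronoun referring to the issue) score zero and drop out of the vocabulary before it would be passed to a DTM.

```python
import math
from collections import Counter


def tfidf_filter(docs, top_k):
    """Keep, for each tokenized post, its top_k terms by TF-IDF score.

    Hypothetical helper: the selection rule is an assumption, not the
    procedure used in the paper.
    """
    n_docs = len(docs)
    # Document frequency: number of posts that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    filtered = []
    for doc in docs:
        if not doc:
            filtered.append([])
            continue
        tf = Counter(doc)
        # Plain TF-IDF: relative term frequency times log inverse document frequency.
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        keep = set(sorted(scores, key=lambda t: (-scores[t], t))[:top_k])
        # Preserve the original token order within the post.
        filtered.append([t for t in doc if t in keep])
    return filtered


# Toy posts (invented for illustration); "it" occurs in every post.
posts = [
    ["it", "earthquake", "rescue", "team"],
    ["it", "earthquake", "donation"],
    ["it", "rumor", "denied"],
]
reduced = tfidf_filter(posts, top_k=2)
vocab_before = {t for d in posts for t in d}
vocab_after = {t for d in reduced for t in d}
print(len(vocab_before), len(vocab_after))
print(reduced)
```

In this toy run the term "it" has an inverse document frequency of log(3/3) = 0, so it is removed from every post and the vocabulary shrinks. The reduced posts, grouped by time slice, would then serve as the input to a DTM implementation.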
Publication Year: 2022
Publication Date: 2022-03-01
Language: en
Type: article
Indexed In: ['crossref']