Title: A Marine Literature Classification Method Based on Co-training
Abstract: Labeling marine papers for supervised machine learning requires a large amount of manual work. To address this issue, we use Co-training, a semi-supervised learning method, to build a marine paper classifier. We train two different classifiers on two views of each paper. One view consists of the feature set of the abstract, and the other consists of the feature sets of the title, subject, major and class code. On this basis, we start from a small initial labeled set, extract useful information from a large set of unlabeled documents, and boost the performance of both classifiers through Co-training. Experiments show that even with only 2 labeled samples in the training set, the F1 value and error rate of the classification system reach about 85.88% and 14.35%, respectively. These figures are close to the performance of a supervised classifier (90.20% and 9.13%) trained on more than 1,500 labeled samples. The results show that applying Co-training to marine paper classification significantly reduces manual labeling work while still performing well, which makes the method well suited to practical applications.
Publication Year: 2010
Publication Date: 2010-01-01
Language: en
Type: article
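
The two-view Co-training loop described in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the base classifier or the rule for selecting unlabeled documents, so the naive Bayes model (`TinyNB`), the "most confident prediction" rule, the names `co_train`, `fit_views` and `predict`, the parameters `rounds` and `per_round`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial naive Bayes with add-one smoothing (assumed base learner)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for tokens, y in zip(docs, labels):
            self.counts[y].update(tokens)
        self.vocab = {t for c in self.classes for t in self.counts[c]}
        return self

    def predict_proba(self, tokens):
        n = sum(self.prior.values())
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            lp = math.log(self.prior[c] / n)
            for t in tokens:
                lp += math.log((self.counts[c][t] + 1) / (total + v))
            logp[c] = lp
        m = max(logp.values())
        z = sum(math.exp(x - m) for x in logp.values())
        return {c: math.exp(x - m) / z for c, x in logp.items()}


def fit_views(labeled):
    """Train one classifier per view on the current labeled set."""
    ys = [y for _, y in labeled]
    clf_a = TinyNB().fit([x[0] for x, _ in labeled], ys)
    clf_b = TinyNB().fit([x[1] for x, _ in labeled], ys)
    return clf_a, clf_b


def co_train(labeled, unlabeled, rounds=3, per_round=1):
    """labeled: [((view_a, view_b), label)]; unlabeled: [(view_a, view_b)]."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        clf_a, clf_b = fit_views(labeled)
        for clf, view in ((clf_a, 0), (clf_b, 1)):
            # Each classifier labels the pool documents it is most sure about.
            scored = []
            for i, x in enumerate(pool):
                proba = clf.predict_proba(x[view])
                label = max(proba, key=proba.get)
                scored.append((proba[label], i, label))
            scored.sort(reverse=True)
            picked = scored[:per_round]
            labeled.extend((pool[i], label) for _, i, label in picked)
            taken = {i for _, i, _ in picked}
            pool = [x for i, x in enumerate(pool) if i not in taken]
    return fit_views(labeled)  # final models see every pseudo-labeled document


def predict(clf_a, clf_b, x):
    """Combine both views by multiplying their class probabilities."""
    pa, pb = clf_a.predict_proba(x[0]), clf_b.predict_proba(x[1])
    return max(pa, key=lambda c: pa[c] * pb[c])


# Hypothetical toy data: view A = abstract tokens, view B = subject tokens.
labeled = [
    ((["ocean", "current", "salinity"], ["marine", "science"]), "marine"),
    ((["neural", "network", "training"], ["computer", "science"]), "other"),
]
unlabeled = [
    (["salinity", "tide", "estuary"], ["marine", "ecology"]),
    (["network", "gradient", "layer"], ["computer", "engineering"]),
    (["tide", "estuary", "plankton"], ["marine", "ecology"]),
    (["gradient", "layer", "kernel"], ["computer", "engineering"]),
]
clf_a, clf_b = co_train(labeled, unlabeled, rounds=3, per_round=1)
print(predict(clf_a, clf_b, (["plankton", "estuary"], ["ecology"])))
```

In this toy run the query tokens never occur in the two initially labeled documents, so any signal the final classifiers have about them comes from documents pseudo-labeled during Co-training. Multiplying the two views' probabilities in `predict` is one simple way to combine the classifiers; the abstract does not say how the paper's system forms its final decision.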