Title: A data cleaning solution by Perl scripts for the KDD Cup 2003 task 2
Abstract: In this paper, we present our solution for the KDD CUP 2003 task 2 competition. Our approach is based on a data cleaning methodology using Perl scripts. These scripts contain regular expression for automatically extracting relevant information from the 35472 LaTeX texts. These expressions were optimized by statistical investigations on the texts. Our solution has permitted us to obtain 144,087 associations.
Publication Year: 2003
Publication Date: 2003-12-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 6
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot