Title: Lung Cancer Survivability prediction with Recursive Feature Elimination using Random Forest and Ensemble Classifiers
Abstract: Accurate prediction of the survival rates of cancer patients is often crucial to stratifying patients for prognosis and treatment. This study presents a detailed methodology for predicting lung cancer five-year survivability using the SEER 2020 database, the most comprehensive cancer incidence database available. Recursive feature elimination using the random forest technique was employed for dimensionality reduction. The classifiers were then trained on the reduced dataset using the Synthetic Minority Over-sampling Technique (SMOTE) and 5-fold cross-validation. The predictive effectiveness of boosting ensemble techniques was also investigated, in addition to the performance of popular classifiers. Experimental results indicate that boosted ensemble classifiers offer improved accuracy over other applied classifiers, with the Light Gradient Boosting Machine (Light GBM) yielding the highest accuracy of 87.45% and an AUC score of 0.91.
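The abstract names SMOTE as the class-balancing step but does not describe it. As a rough illustration only, the sketch below shows the core idea of SMOTE in plain Python: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for this example; they are not taken from the paper, which most likely used a library implementation.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; not the authors' code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (assumed): 4 minority samples to be balanced against 10 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10
new_points = smote(minority, majority_count - len(minority), k=3)
balanced_minority = minority + new_points
```

When oversampling is combined with cross-validation, a common precaution is to apply it only to the training portion of each fold, so that synthetic points derived from held-out samples do not leak into evaluation. The abstract does not say how the two steps were ordered in the study.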
Publication Year: 2022
Publication Date: 2022-04-15
Language: en
Type: article
Indexed In: Crossref
Access and Citation
Cited By Count: 2