Application of Feature Selection and Random Forest Classifier on a Medical Dataset
Author(s):
Aayush Kamath, Sardar Patel Institute of Technology, Mumbai; Gokul Nambiar, Sardar Patel Institute of Technology, Mumbai; Mohammad Izhan, Sardar Patel Institute of Technology, Mumbai; Radha Shankarmani, Sardar Patel Institute of Technology, Mumbai
Keywords:
Feature Selection, Random Forest, Overfitting, Medical Dataset, Classification
Abstract:
A medical data set may well include hundreds of features, each representing a symptom or a parameter on which a diagnosis can be based. While many of these features contribute to the result, quite a few often turn out to be irrelevant or to have very little bearing on it, and only end up crowding the data set. Feature selection addresses this problem: the features that contribute most to predicting the output are retained, and the irrelevant ones are identified and eliminated. This lets the model train faster and makes it easier to interpret, which in turn supports better diagnosis of the disease. Alongside feature selection, a random forest classifier is used to predict the outcomes. Because a random forest combines the predictions of many decision trees, it typically classifies more reliably than a single tree and is less prone to overfitting.
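The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the feature selection method, so ranking by the forest's impurity-based importances is an assumption here, as are the scikit-learn library, the synthetic data standing in for a medical data set, the hypothetical helper name select_and_classify, and every parameter value.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def select_and_classify(X, y, n_keep=10, seed=0):
    """Keep the n_keep highest-ranked features, then train a random forest on them.

    Hypothetical helper for illustration only; the ranking criterion
    (impurity-based importance) is an assumption, not taken from the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    # Step 1: fit a forest on all features, using training data only,
    # to obtain one importance score per feature.
    ranker = RandomForestClassifier(n_estimators=100, random_state=seed)
    ranker.fit(X_train, y_train)

    # Step 2: keep the column indices of the n_keep most important features.
    order = ranker.feature_importances_.argsort()[::-1]
    kept = sorted(order[:n_keep].tolist())

    # Step 3: retrain on the reduced feature set and score on held-out data.
    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_train[:, kept], y_train)
    acc = accuracy_score(y_test, clf.predict(X_test[:, kept]))
    return kept, acc


# Synthetic stand-in for a medical data set: 50 features, only a few informative.
X, y = make_classification(
    n_samples=400, n_features=50, n_informative=8, n_redundant=2, random_state=0
)
kept, acc = select_and_classify(X, y, n_keep=10)
print("kept feature indices:", kept)
print("held-out accuracy:", round(acc, 3))
```

The features are ranked on the training split only, so the held-out accuracy is not influenced by the selection step.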
Other Details:
| Manuscript Id | IJSTEV6I10015 |
| Published in | Volume 6, Issue 10 |
| Publication Date | 01/05/2020 |
| Page(s) | 29-32 |