Feature selection is a widely studied technique whose goal is to reduce the dimensionality of a problem by removing irrelevant features. It has multiple benefits, such as improved efficacy, efficiency and interpretability for almost any type of machine learning model. Feature selection techniques fall into three main categories, Filter, Wrapper and Embedded, according to the process used to remove features. Embedded methods are usually preferred because they efficiently obtain a selection of the most relevant features for the model. However, not all models support embedded feature selection, which forces the use of a different method and reduces the efficiency and reliability of the selection. Neural networks are an example of a model that does not support embedded feature selection. Because neural networks have been shown to provide remarkable results in multiple scenarios such as classification and regression, sometimes in an ensemble with a model that includes embedded feature selection, we attempt to embed a feature selection process in them with a general-purpose methodology. In this work, we propose a novel general-purpose layer for neural networks that removes the influence of irrelevant features. The Feature-Aware Drop Layer is included at the top of the neural network and trained during the backpropagation process without any additional parameters. Our methodology is tested on 17 datasets for classification and regression tasks, including data from different fields such as health, economics and the environment, among others. The results show remarkable improvements over three different feature selection approaches, with reliable, efficient and effective results.
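The abstract does not spell out how the Feature-Aware Drop Layer works internally, so the following is only an illustrative sketch of the general idea of embedding feature selection in a network trained by gradient descent. It is not the authors' layer. It assumes a generic one-to-one gating layer (one multiplicative weight per input feature) placed in front of a linear model and pushed toward zero by an L1 penalty. The function name `train_gated_linear`, the penalty strength `lam`, the learning rate, the synthetic data and the 0.1 selection threshold are all hypothetical choices made for this example; in particular, `lam` is an extra hyperparameter that the proposed layer is said not to need.

```python
import random


def train_gated_linear(X, y, lam=0.05, lr=0.05, steps=2000):
    """Fit y ~ sum_j w[j] * g[j] * x[j] by gradient descent.

    g holds one gate per input feature (the hypothetical gating layer),
    w holds the downstream linear weights. Both carry an L1 penalty.
    """
    n, d = len(X), len(X[0])
    g = [1.0] * d  # gates start fully open
    w = [0.0] * d  # downstream weights start at zero

    def sign(v):
        return (v > 0) - (v < 0)

    for _ in range(steps):
        # Gradient of the mean squared error w.r.t. each product w[j] * g[j].
        grad_c = [0.0] * d
        for xi, yi in zip(X, y):
            err = sum(w[j] * g[j] * xi[j] for j in range(d)) - yi
            for j in range(d):
                grad_c[j] += 2.0 * err * xi[j] / n
        # Chain rule through the gate, plus the L1 subgradient on g and w.
        new_g = [g[j] - lr * (grad_c[j] * w[j] + lam * sign(g[j])) for j in range(d)]
        new_w = [w[j] - lr * (grad_c[j] * g[j] + lam * sign(w[j])) for j in range(d)]
        g, w = new_g, new_w
    return g, w


# Synthetic data (an assumption for illustration): feature 2 is irrelevant.
rng = random.Random(0)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
y = [2.0 * x[0] - 3.0 * x[1] for x in X]

gates, weights = train_gated_linear(X, y)
# Keep the features whose gate magnitude stays above the chosen threshold.
selected = [j for j, gj in enumerate(gates) if abs(gj) > 0.1]
print("gates:", [round(gj, 3) for gj in gates])
print("selected features:", selected)
```

In this sketch the gate of the irrelevant feature receives almost no error gradient, so the penalty shrinks it toward zero, while the gates of the two informative features stay open. The selection therefore falls out of ordinary training rather than from a separate filter or wrapper step.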