Machine Learning Pitfalls: 12 Features That Can Undermine Model Performance

btd
6 min read · Nov 16, 2023

Whether a feature harms a machine learning model depends on context: the dataset, the problem, and the modeling technique all matter. Still, certain types of features cause trouble often enough to be worth checking for in almost any project. Here is a list of features that can hurt your models:

1. Irrelevant Features:

  • Irrelevant features have no meaningful relationship to the target variable, so they add noise to the learning process. Left unchecked, that noise encourages overfitting: the model homes in on spurious patterns in the training data and generalizes poorly to new, unseen data.
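
To make the effect concrete, here is a minimal sketch, assuming scikit-learn and NumPy are available. The synthetic dataset, the 200 added noise columns, and the choice of a k-nearest-neighbors classifier (a distance-based model, which tends to be sensitive to irrelevant columns) are all assumptions made for illustration; the size of the accuracy drop will differ on real data and with other models.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data (illustrative assumption): 5 columns, all informative.
X_clean, y = make_classification(
    n_samples=400, n_features=5, n_informative=5,
    n_redundant=0, n_repeated=0, random_state=0,
)

# Append 200 columns of pure Gaussian noise that carry no information about y.
rng = np.random.default_rng(0)
noise = rng.normal(size=(X_clean.shape[0], 200))
X_noisy = np.hstack([X_clean, noise])

# Compare 5-fold cross-validated accuracy with and without the noise columns.
model = KNeighborsClassifier(n_neighbors=5)
acc_clean = cross_val_score(model, X_clean, y, cv=5).mean()
acc_noisy = cross_val_score(model, X_noisy, y, cv=5).mean()

print(f"accuracy, informative features only: {acc_clean:.3f}")
print(f"accuracy, with 200 noise features:   {acc_noisy:.3f}")
```

Cross-validation is used here rather than training accuracy because the damage from irrelevant features shows up on held-out data.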

Mitigation Strategies:

  • Feature Selection: Remove irrelevant features with techniques such as backward elimination or recursive feature elimination.
  • Regularization Techniques: Penalize model complexity so the model is discouraged from relying on unnecessary features (an L1 penalty, for instance, can drive their coefficients to exactly zero).
  • Domain Expertise
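
The first two strategies above can be sketched as follows, assuming scikit-learn is available. The synthetic regression data, the target of 5 selected features, and the Lasso strength `alpha=1.0` are illustrative assumptions, not recommendations; in practice both would be tuned, typically with cross-validation.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data (illustrative assumption): with shuffle=False the first
# 5 columns are informative and the remaining 15 are irrelevant.
X, y = make_regression(
    n_samples=500, n_features=20, n_informative=5,
    noise=10.0, shuffle=False, random_state=0,
)

# Feature selection via recursive feature elimination: fit the model,
# drop the feature with the smallest coefficient magnitude, and repeat
# until 5 features remain.
rfe = RFE(LinearRegression(), n_features_to_select=5)
rfe.fit(X, y)
selected = [i for i, keep in enumerate(rfe.support_) if keep]
print("columns kept by RFE:", selected)

# Regularization: an L1 penalty (Lasso) shrinks coefficients toward zero
# and can set the coefficients of unhelpful features to exactly zero.
lasso = Lasso(alpha=1.0)
lasso.fit(X, y)
n_nonzero = int((lasso.coef_ != 0).sum())
print("non-zero Lasso coefficients:", n_nonzero, "of", X.shape[1])
```

RFE needs the number of features to keep up front, while Lasso decides how many survive through the penalty strength, so the two are often tried side by side.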
