Building machine learning models is a core part of data science: you apply algorithms to data to make predictions or uncover patterns. This list gives concise code snippets for each stage of the machine learning process, from data preparation and model selection to evaluation and hyperparameter tuning. The one-liners use libraries such as Scikit-Learn, XGBoost, CatBoost, and LightGBM, and also cover more advanced techniques such as hyperparameter optimization with Hyperopt and model interpretation with SHAP values. Use them as a quick reference, whether you are a beginner or an experienced practitioner.
Data Handling and Exploration
- Load dataset:
import pandas as pd
data = pd.read_csv('dataset.csv')
- Explore data:
data.head(), data.info(), data.describe()
- Handle missing values:
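data.dropna(), data.fillna(0)  # fillna needs a value; 0 is only a placeholder
# Both return a new DataFrame, so assign the result to keep it.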
- Encode categorical variables:
pd.get_dummies(data)
- Split data into train/test sets:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
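
The one-liners above can be chained into a single runnable sketch. This is only an illustration under stated assumptions: the small in-memory DataFrame stands in for 'dataset.csv', and the column names (age, city, target) and the fill strategies (median for numbers, an "unknown" label for categories) are hypothetical choices, not part of the original list.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for pd.read_csv('dataset.csv').
data = pd.DataFrame({
    "age": [25, 32, None, 41, 29, 37, 52, 46, 33, 28],
    "city": ["Paris", "Lyon", "Paris", None, "Nice",
             "Lyon", "Paris", "Nice", "Lyon", "Paris"],
    "target": [0, 1, 0, 1, 0, 1, 1, 1, 0, 0],
})

# Explore: first rows and summary statistics of the numeric columns.
print(data.head())
print(data.describe())

# Handle missing values: median for the numeric column,
# a placeholder label for the categorical one (assumed strategies).
data["age"] = data["age"].fillna(data["age"].median())
data["city"] = data["city"].fillna("unknown")

# Separate features from the label first, so encoding leaves the label alone.
X = pd.get_dummies(data.drop(columns="target"), columns=["city"])
y = data["target"]

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)
```

Splitting off the target before calling pd.get_dummies keeps the label column from being one-hot encoded by accident, which can happen when the whole DataFrame is passed in and the label is stored as text.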