Mastering Data Science: 100 Quick Python One-liner Codes for Data Cleaning

5 min read · Nov 28, 2023

Data cleaning is a crucial step in the data preprocessing pipeline: it makes sure the dataset is accurate and ready for analysis or modeling. Here’s a list of 100 concise Python one-liners for common data cleaning tasks, such as handling missing values, removing duplicates, converting data types, extracting information from datetime columns, and transforming text data. The snippets assume df is a pandas DataFrame. Most pandas methods return a new object rather than modifying df in place, so assign the result back if you want to keep it.

Remove duplicate rows: df = df.drop_duplicates()
Fill missing values with the column mean (numeric columns only): df = df.fillna(df.mean(numeric_only=True))
Remove rows with missing values: df = df.dropna()
Replace missing values with zero: df = df.fillna(0)
Rename columns: df = df.rename(columns={'old_name': 'new_name'})
Strip leading and trailing whitespace from column names: df.columns = df.columns.str.strip()
Convert the data type of a column (replace 'new_type' with a real dtype such as 'int64' or 'float64'): df['column'] = df['column'].astype('new_type')
Remove special characters from strings (everything except letters and digits, including spaces): df['column'] = df['column'].str.replace(r'[^a-zA-Z0-9]', '', regex=True)
Convert string to lowercase: df['column'] = df['column'].str.lower()
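
To see how these one-liners fit together, here is a minimal sketch that chains several of them on a tiny DataFrame. The sample data and the column names (name, age, score, user) are made up for illustration and are not from any real dataset. The order of the steps is just one reasonable choice, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
df = pd.DataFrame({
    " name ": ["Alice!", "Bob#", "Alice!", None],
    "age ": [30.0, np.nan, 30.0, 41.0],
    "score": ["1", "2", "1", "3"],
})

# Strip stray whitespace from column names, then rename one column.
df.columns = df.columns.str.strip()
df = df.rename(columns={"name": "user"})

# Drop the exact duplicate row (the second "Alice!" row).
df = df.drop_duplicates()

# Fill the missing age with the mean of the remaining ages.
df["age"] = df["age"].fillna(df["age"].mean())

# Drop rows that still have missing values (the row with no user).
df = df.dropna()

# Convert the string scores to integers.
df["score"] = df["score"].astype(int)

# Remove non-alphanumeric characters and lowercase the text.
# regex=True is passed explicitly so the pattern is treated as a regex.
df["user"] = (
    df["user"]
    .str.replace(r"[^a-zA-Z0-9]", "", regex=True)
    .str.lower()
)

df = df.reset_index(drop=True)
print(df)
```

Note that the order matters: filling the missing age before calling dropna() keeps the Bob row, whereas calling dropna() first would have discarded it.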

Written by btd
