Machine learning work usually starts with messy, heterogeneous datasets, so effective data preprocessing is crucial. In this curated list, you’ll find 100 concise R snippets to streamline data cleaning, exploratory analysis, and statistical operations.
Data Loading and Inspection
- Load Data:
data <- read.csv('file.csv')
- View Data Structure:
str(data)
- Summary Statistics:
summary(data)
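To try the three calls above without a file on disk, here is a minimal sketch that writes a tiny CSV to a temporary location first. The column names (id, score, group) and the values are made up for illustration, and dim() is an extra base R call not in the list above.

```r
# Hypothetical sample data, written to a temp file so the example runs anywhere
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,score,group", "1,10.5,a", "2,NA,b", "3,7.25,a"), tmp)

# stringsAsFactors = FALSE keeps text columns as character on every R version
data <- read.csv(tmp, stringsAsFactors = FALSE)

str(data)      # column types plus a preview of the values
summary(data)  # per-column summaries; numeric columns also report NA counts
dim(data)      # number of rows and columns
```

Setting stringsAsFactors explicitly is a design choice here: the default changed in R 4.0.0, so spelling it out makes the result the same across versions.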
Handling Missing Values
- Remove Rows with Missing Values:
complete_data <- na.omit(data)
- Impute Missing Values:
data$variable[is.na(data$variable)] <- mean(data$variable, na.rm = TRUE)
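The two approaches behave quite differently, so it can help to see them side by side. Below is a rough sketch on a made-up data frame; the names id, complete_data, and imputed are assumptions for illustration, and colSums(is.na(...)) is an extra base R check not in the list above.

```r
# Hypothetical data: 'variable' has two missing values
data <- data.frame(id = 1:5, variable = c(2, NA, 4, NA, 6))

# Count missing values per column before deciding how to handle them
colSums(is.na(data))

# Option 1: drop every row that contains at least one NA
complete_data <- na.omit(data)

# Option 2: keep all rows and fill the gaps with the column mean
imputed <- data
imputed$variable[is.na(imputed$variable)] <- mean(imputed$variable, na.rm = TRUE)
```

Dropping rows shrinks the dataset, while mean imputation keeps every row but reduces the variance of the column, so which one fits depends on how much data is missing and why.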
Data Cleaning and Transformation
- Remove Duplicates:
unique_data <- unique(data)
- Subset Data (which() drops rows where the condition is NA):
subset_data <- data[which(data$condition == 'value'), ]
- Rename Columns (supply one name per column, in order):
colnames(data) <- c('new_name', 'new_name2')
- Convert Factor to Numeric:
data$variable <- as.numeric(as.character(data$variable))
- Convert Date:
data$date <- as.Date(data$date, format = '%Y-%m-%d')
- Convert Character to Factor:
data$variable <- as.factor(data$variable)
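The cleaning steps above can be chained on one small example. This is only a sketch on invented data: the values, and the new column name amount, are assumptions for illustration. The last line renames a single column by name, which is an alternative to overwriting all column names at once.

```r
# Hypothetical data: rows 3 and 4 are exact duplicates of row 1
data <- data.frame(
  condition = c("value", "other", "value", "value"),
  variable  = factor(c("10", "20", "10", "10")),
  date      = c("2024-01-01", "2024-02-15", "2024-01-01", "2024-01-01"),
  stringsAsFactors = FALSE
)

# Remove duplicate rows
unique_data <- unique(data)

# Keep only the rows matching a condition
subset_data <- unique_data[unique_data$condition == "value", ]

# Factor to numeric: as.character() recovers the labels first;
# as.numeric() alone would return the internal level codes instead
unique_data$variable <- as.numeric(as.character(unique_data$variable))

# Character to Date, then character to factor
unique_data$date <- as.Date(unique_data$date, format = "%Y-%m-%d")
unique_data$condition <- as.factor(unique_data$condition)

# Rename one column by name, leaving the others untouched
names(unique_data)[names(unique_data) == "variable"] <- "amount"
```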