Data transformation in R is the process of modifying or restructuring a dataset to make it more suitable for analysis, modeling, or visualization. It typically involves a series of operations, such as cleaning, aggregating, merging, and reshaping, applied before further exploration or statistical analysis. The snippets below cover seven common operations; `old_data`, `column_name`, `value`, and similar names are placeholders for your own data frame, columns, and values.
1. Subset Data:
# Keep the rows where column_name equals value; the trailing comma keeps all columns
new_data <- old_data[old_data$column_name == value, ]
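As a minimal sketch of the subsetting pattern above, the example below uses a made-up data frame (the name `scores` and its columns are assumptions for illustration). It also shows a behavior worth knowing: when the compared column contains `NA`, plain logical indexing returns an all-`NA` row for it, while wrapping the condition in `which()` drops such rows.

```r
# Hypothetical example data; the names are illustrative only
scores <- data.frame(
  name  = c("Ana", "Ben", "Cy", "Dee"),
  group = c("a", "b", NA, "a"),
  stringsAsFactors = FALSE
)

# Plain logical indexing: the NA comparison produces an all-NA row
with_na <- scores[scores$group == "a", ]

# which() returns only the positions where the condition is TRUE
without_na <- scores[which(scores$group == "a"), ]

nrow(with_na)     # 3
nrow(without_na)  # 2
```

If the column is known to have no missing values, both forms give the same result.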
2. Filter with dplyr:
library(dplyr)
new_data <- filter(old_data, column_name == value)
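To make the `filter()` call concrete, here is a small sketch on a hypothetical `orders` data frame (the data and column names are assumptions). It shows that comma-separated conditions are combined with AND, and that rows where a condition evaluates to `NA` are dropped.

```r
library(dplyr)

# Hypothetical example data
orders <- data.frame(
  id     = 1:5,
  region = c("north", "south", "north", NA, "north"),
  amount = c(120, 80, 45, 200, 300)
)

# Conditions separated by commas must all be TRUE for a row to be kept;
# the row with a missing region is dropped because its condition is NA
big_north <- filter(orders, region == "north", amount > 100)

big_north$id  # 1 5
```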
3. Sort Data:
# arrange() comes from dplyr (loaded in step 2); it sorts in ascending order by default
new_data <- arrange(old_data, column_name)
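Sorting often involves more than one column or a descending order. The sketch below, on a hypothetical `sales` data frame (an assumption for illustration), sorts by one column ascending and another descending with `arrange()` and `desc()`, and shows a base R equivalent using `order()`.

```r
library(dplyr)

# Hypothetical example data
sales <- data.frame(
  region = c("south", "north", "north", "south"),
  amount = c(80, 120, 300, 45)
)

# Sort by region (ascending), then by amount (descending) within each region
sorted <- arrange(sales, region, desc(amount))

# Base R equivalent; the minus sign reverses the order of a numeric column
sorted_base <- sales[order(sales$region, -sales$amount), ]

sorted$amount  # 300 120 80 45
```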
4. Select Columns:
new_data <- old_data[, c("column1", "column2")]
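One detail of bracket-based column selection is easy to miss: selecting a single column returns a plain vector unless `drop = FALSE` is given. A minimal sketch with a made-up data frame `df` (an assumption for illustration):

```r
# Hypothetical example data
df <- data.frame(
  a = 1:3,
  b = c("x", "y", "z"),
  c = c(TRUE, FALSE, TRUE)
)

two_cols    <- df[, c("a", "b")]         # two columns: still a data frame
one_col_vec <- df[, "a"]                 # one column: simplified to a vector
one_col_df  <- df[, "a", drop = FALSE]   # one column: kept as a data frame

is.data.frame(two_cols)     # TRUE
is.data.frame(one_col_vec)  # FALSE
is.data.frame(one_col_df)   # TRUE
```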
5. Rename Columns:
names(new_data)[names(new_data) == "old_name"] <- "new_name"
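Since dplyr is already loaded for the filtering step, selection and renaming can also be written with `select()` and `rename()`. The sketch below uses a hypothetical `people` data frame (the data is an assumption); note that `rename()` takes its arguments as `new_name = old_name`.

```r
library(dplyr)

# Hypothetical example data
people <- data.frame(
  id       = 1:3,
  old_name = c("x", "y", "z"),
  extra    = c(TRUE, FALSE, TRUE)
)

# Keep two columns, then rename one of them
trimmed <- people %>%
  select(id, old_name) %>%
  rename(new_name = old_name)

names(trimmed)  # "id" "new_name"
```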
6. Create New Variables:
new_data$sum_column <- new_data$column1 + new_data$column2
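When either input column contains missing values, `+` propagates the `NA` into the new column. The sketch below, on a made-up data frame `d` (an assumption for illustration), contrasts this with `rowSums(..., na.rm = TRUE)`, which treats missing values as zero. Which behavior is appropriate depends on the analysis.

```r
# Hypothetical example data with one missing value
d <- data.frame(
  column1 = c(1, 2, NA),
  column2 = c(10, 20, 30)
)

# Element-wise addition: NA in either column gives NA in the result
d$sum_column <- d$column1 + d$column2

# rowSums with na.rm = TRUE ignores missing values in each row
d$sum_na_rm <- rowSums(d[, c("column1", "column2")], na.rm = TRUE)

d$sum_column  # 11 22 NA
d$sum_na_rm   # 11 22 30
```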
7. Convert Data Types:
# Values that cannot be parsed as numbers become NA (with a warning)
new_data$numeric_column <- as.numeric(new_data$character_column)
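Two pitfalls are worth illustrating for type conversion. If the column is a factor rather than a character vector, `as.numeric()` returns the internal level codes, not the printed values; converting through `as.character()` first avoids this. Strings that are not valid numbers become `NA`. The vectors below are made up for the example.

```r
# A factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

codes  <- as.numeric(f)                # level codes, not the labels
values <- as.numeric(as.character(f))  # the numbers shown in the labels

codes   # 1 2 1
values  # 10 20 10

# Non-numeric strings become NA; suppressWarnings() hides the coercion warning
x <- c("1.5", "abc", "3")
nums <- suppressWarnings(as.numeric(x))

nums  # 1.5 NA 3.0
```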