Here are 100 one-liners in R grouped by various concepts in data science:
Data Manipulation and Cleaning
- Subset a dataframe:
subset(df, column == value)
- Remove rows with missing values:
na.omit(df)
- Rename a column:
names(df)[names(df) == "old_name"] <- "new_name"
- Convert factor to numeric (via character, so you get the labels rather than the internal level codes):
as.numeric(as.character(factor_column))
- Create a new variable:
df$new_var <- df$var1 + df$var2
- Filter rows based on condition:
df[df$column > 10,]
- Remove duplicate rows:
unique(df)
- Convert character to date:
as.Date(character_column, format="%Y-%m-%d")
- Pivot data from wide to long (requires the reshape2 package):
reshape2::melt(df, id.vars="id", measure.vars=c("var1", "var2"))
- Aggregate data:
aggregate(value ~ group, data=df, FUN=mean)
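To see how several of these one-liners behave together, here is a minimal sketch on a toy data frame. The column names (id, group, value, score) and all the values are made up for illustration; only base R is used.

```r
# Toy data frame; every name and value here is hypothetical.
df <- data.frame(
  id    = c(1, 2, 2, 3, 4),
  group = c("a", "b", "b", "a", NA),
  value = c(10, 25, 25, NA, 40),
  stringsAsFactors = FALSE
)

deduped  <- unique(df)        # drops the repeated id == 2 row
complete <- na.omit(deduped)  # drops every row that contains at least one NA

# Rename a column in place
names(complete)[names(complete) == "value"] <- "score"

# Filter rows on a condition, then compute the mean score per group
big   <- complete[complete$score > 10, ]
means <- aggregate(score ~ group, data = complete, FUN = mean)

# Factor to numeric: convert through character first,
# otherwise you get the internal level codes.
f <- factor(c("10", "20", "10"))
as.numeric(as.character(f))  # 10 20 10
as.numeric(f)                # 1 2 1
```

Note that na.omit() drops a row if any column is NA, so it can remove more rows than expected on wide data frames.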
Data Visualization
- Plot a histogram:
hist(df$column)
- Scatter plot:
plot(df$var1, df$var2)
- Boxplot:
boxplot(df$column)
- Line chart:
plot(df$time, df$value, type="l")
- Bar chart:
barplot(df$counts, names.arg=df$categories)
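The plotting calls above can be tried on simulated data. This is a minimal sketch using base graphics only; the data frames df and tally, their column names, and the titles are all assumptions made for the example. Note that barplot() expects counts that are already aggregated, which is why a separate summary table is used here.

```r
set.seed(42)  # make the simulated data reproducible

# Hypothetical data: two random variables and a cumulative series over time
df <- data.frame(time = 1:20, var1 = rnorm(20), var2 = rnorm(20))
df$value <- cumsum(df$var1)

# Hypothetical pre-aggregated counts per category, as barplot() expects
tally <- data.frame(categories = c("a", "b", "c"), counts = c(3, 7, 5))

h <- hist(df$var1, main = "Histogram of var1", xlab = "var1")
plot(df$var1, df$var2, main = "var1 vs var2", xlab = "var1", ylab = "var2")
boxplot(df$var1, main = "Boxplot of var1")
plot(df$time, df$value, type = "l", main = "Value over time",
     xlab = "time", ylab = "value")
bp <- barplot(tally$counts, names.arg = tally$categories,
              main = "Counts by category")
```

If you start from raw categorical data instead of a summary table, table(x) produces counts that can be passed straight to barplot().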