The Tidyverse is a collection of R packages designed for data science that share a common philosophy and grammar, making it easier to learn and use them together. Key packages in the Tidyverse include dplyr, ggplot2, tidyr, purrr, and others.
I. Advanced Data Cleaning with tidyr:
1. Handling missing data with complete() and drop_na().
library(tidyr)
# Complete missing combinations of variables
complete(df, group_var1, group_var2)
# Drop rows with missing values
df %>%
  drop_na()
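To make the two calls above concrete, here is a minimal sketch on a small made-up data frame. The column names group_var1 and group_var2 follow the snippet above; the value column and all the data are assumptions for illustration only.

```r
library(tidyr)

# Hypothetical data: the ("b", 2021) combination is missing entirely,
# and the ("a", 2021) row exists but has an NA value.
df <- data.frame(
  group_var1 = c("a", "a", "b"),
  group_var2 = c(2020, 2021, 2020),
  value      = c(1.5, NA, 3.0)
)

# complete() adds a row for every missing combination, filled with NA
completed <- complete(df, group_var1, group_var2)

# The fill argument supplies a default instead of NA; by default it also
# replaces NAs that were already present in the listed columns
filled <- complete(df, group_var1, group_var2, fill = list(value = 0))

# drop_na() with a column name drops only rows where that column is NA
cleaned <- drop_na(completed, value)
```

Passing column names to drop_na() is usually safer than calling it bare, since the bare form removes any row with an NA in any column.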
2. Advanced pivoting and gathering with pivot_longer() and pivot_wider().
library(tidyr)
# Pivot longer with multiple columns
pivot_longer(df, cols = starts_with("value"), names_to = "variable", values_to = "value")
# Pivot wider: spread the "variable" names back into separate columns
pivot_wider(df_long, names_from = "variable", values_from = "value")
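The round trip between the two shapes can be sketched end to end as follows. The data frame, its id column, and the value_a / value_b column names are hypothetical; names_prefix is used here only to strip and restore the shared "value_" prefix.

```r
library(tidyr)

# Hypothetical wide data: one row per id, one column per measurement
df <- data.frame(
  id      = c(1, 2),
  value_a = c(10, 20),
  value_b = c(30, 40)
)

# Wide to long: one row per (id, variable) pair;
# names_prefix strips "value_" so variable holds just "a" or "b"
df_long <- pivot_longer(
  df,
  cols = starts_with("value"),
  names_to = "variable",
  names_prefix = "value_",
  values_to = "value"
)

# Long to wide: rebuild the original columns, restoring the prefix
df_wide <- pivot_wider(
  df_long,
  names_from = variable,
  values_from = value,
  names_prefix = "value_"
)
```

Here df_long has four rows (two ids times two variables), and df_wide has the same columns as the original df.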
II. Advanced Data Transformation with dplyr:
3. Window functions for advanced grouping and summarization.
library(dplyr)
# Calculate rolling mean using window…
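As one illustration of dplyr window functions, here is a minimal sketch on a hypothetical sales data frame (the store, day, and units columns are assumptions). The 3-row rolling mean is built by hand from lag(), which is just one possible approach.

```r
library(dplyr)

# Hypothetical data: daily units sold per store
sales <- data.frame(
  store = c("a", "a", "a", "a", "b", "b", "b"),
  day   = c(1, 2, 3, 4, 1, 2, 3),
  units = c(10, 20, 30, 40, 5, 15, 25)
)

result <- sales %>%
  group_by(store) %>%
  arrange(day, .by_group = TRUE) %>%
  mutate(
    prev_units    = lag(units),      # previous row's value within the store
    running_total = cumsum(units),   # cumulative sum within the store
    running_mean  = cummean(units),  # cumulative mean within the store
    # Mean of the current and two previous rows; NA until 3 rows exist
    rolling_mean3 = (units + lag(units) + lag(units, 2)) / 3
  ) %>%
  ungroup()
```

Because the data is grouped, every window function restarts at each store. For longer or irregular windows, a dedicated package such as slider or zoo is probably a better fit than chaining lag() calls.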