Often the data you’re working with has abstract column names, such as x1, x2, x3 and so on. Typically, the first step I take when renaming columns in R is opening my web browser.
The cars dataset contains data from the 1920s on “Speed and Stopping Distances of Cars”. It has only two columns, shown below.
colnames(datasets::cars)
[1] "speed" "dist"
If we want to rename the column “dist” so that it is clearer what the data means, we can do so in a few different ways.
Using base R:
colnames(cars)[2] <- "Stopping Distance (ft)"
colnames(cars)
[1] "speed" "Stopping Distance (ft)"
colnames(cars)[1:2] <- c("Speed (mph)", "Stopping Distance (ft)")
colnames(cars)
[1] "Speed (mph)" "Stopping Distance (ft)"
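# Subset the untouched built-in data, since the local copy of cars was renamed above
Stopping_Over_60 <- subset(datasets::cars, dist > 60)
Stopping_Under_61 <- subset(datasets::cars, dist <= 60)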
names(Stopping_Over_60)[2] <- names(Stopping_Under_61)[2] <- "Stopping Distance (ft)"
names(Stopping_Over_60)
[1] "speed" "Stopping Distance (ft)"
names(Stopping_Under_61)
[1] "speed" "Stopping Distance (ft)"
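Positional indexes such as [2] keep working only while the column order stays the same. As a minimal sketch, assuming a fresh copy of the built-in data (the object name cars_copy is only for illustration), the column can be picked out by its current name instead:

```r
# Work on a copy so the built-in dataset is left untouched
cars_copy <- datasets::cars

# Select the column by its current name rather than by its position
names(cars_copy)[names(cars_copy) == "dist"] <- "Stopping Distance (ft)"

names(cars_copy)
# [1] "speed" "Stopping Distance (ft)"
```

If no column is called "dist", the assignment simply does nothing, so it is worth checking the names afterwards.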
Using dplyr:
library(dplyr)
datasets::cars %>%
  rename("Stopping Distance (ft)" = dist) %>%
  colnames()
[1] "speed" "Stopping Distance (ft)"
datasets::cars %>%
  rename("Stopping Distance (ft)" = dist, "Speed (mph)" = speed) %>%
  colnames()
[1] "Speed (mph)" "Stopping Distance (ft)"
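When there are many columns, typing each pair into rename() gets tedious. Two alternatives are sketched here, assuming dplyr 1.0.0 or later: rename_with() applies a function to the names, and a named lookup vector can be passed through all_of(). The names lookup, upper_names and renamed are made up for this example.

```r
library(dplyr)

# rename_with() applies a function to every selected column name
upper_names <- datasets::cars %>%
  rename_with(toupper) %>%
  colnames()
upper_names
# [1] "SPEED" "DIST"

# A named vector of the form new name = old name
lookup <- c("Speed (mph)" = "speed", "Stopping Distance (ft)" = "dist")

# all_of() errors if any of the old names is missing from the data
renamed <- datasets::cars %>%
  rename(all_of(lookup))
colnames(renamed)
# [1] "Speed (mph)" "Stopping Distance (ft)"
```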
Using grep:
# Start from a fresh local copy of the built-in data
cars <- datasets::cars
colnames(cars)[grep("dist", colnames(cars))] <- "Stopping Distance (ft)"
colnames(cars)
[1] "speed" "Stopping Distance (ft)"
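One thing to watch with grep is that the pattern "dist" matches any name that contains it, so a column called "distance_total" would be renamed as well. Below is a rough sketch of a stricter pattern, applied to the kind of abstract names mentioned at the start. The data frame df and the prefix "measure_" are invented for illustration.

```r
# Hypothetical data frame with abstract column names
df <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9)

# The ^ and $ anchors restrict the match to names that are exactly "x" plus digits
is_x <- grepl("^x[0-9]+$", names(df))

# sub() replaces the leading "x" and keeps the number
names(df)[is_x] <- sub("^x", "measure_", names(df)[is_x])

names(df)
# [1] "measure_1" "measure_2" "measure_3"
```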
Using data.table:
As mentioned in the comments below, data.table is a more memory-efficient way of renaming columns in R, because setnames() changes the names by reference instead of copying the data. Memory use matters more as the data you work with grows in size.
library(data.table)
# Note: cars is locked because it is a dataset built into a package,
# so we create a data.table copy rather than converting it in place with setDT()
cars_DT <- as.data.table(datasets::cars)
setnames(cars_DT, c("speed","dist"), c("Speed (mph)","Stopping Distance (ft)"))
names(cars_DT)
[1] "Speed (mph)" "Stopping Distance (ft)"
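To see the by-reference behaviour for yourself, one option is to compare the object's memory address before and after the rename using data.table's address() function. This is only a sketch; the variable names address_before and address_after are made up here.

```r
library(data.table)

# Fresh data.table copy of the built-in data
cars_DT <- as.data.table(datasets::cars)

# Record where the table lives in memory, rename, then check again
address_before <- address(cars_DT)
setnames(cars_DT, "dist", "Stopping Distance (ft)")
address_after <- address(cars_DT)

# TRUE means the table was modified in place rather than copied
identical(address_before, address_after)

names(cars_DT)
# [1] "speed" "Stopping Distance (ft)"
```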