Manipulating Data Frames in R

Learn To Manipulate Data Frames Using The “mtcars” Dataset

Task 1: Create a new column to find Displacement per Cylinder 

Create a new variable, DisplacementPerCylinder, that holds each vehicle's displacement per cylinder in cubic inches, computed from the disp and cyl columns of the mtcars dataset.

# "str" allows you to display the internal structure of an R object
str(mtcars) 
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# As a backup, copy the original data frame into a new one to work with
# That way, if there are any issues, we can go back to the original

my_mtcars <- mtcars
# Calculate displacement per cylinder by dividing displacement (disp) by the number of cylinders (cyl)

my_mtcars$DisplacementPerCylinder <- my_mtcars$disp / my_mtcars$cyl

# Report a summary of the variable
summary(my_mtcars$DisplacementPerCylinder)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   17.77   26.92   34.48   35.03   43.19   59.00
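
The same column can also be added in a single step with base R's transform(). The following is a minimal sketch rather than the approach used above, and the name cars_with_dpc is chosen here only for illustration.

```r
# Illustrative name (not from the original text): a separate copy of mtcars
cars_with_dpc <- mtcars

# transform() evaluates disp / cyl inside the data frame and returns a
# data frame with the new column appended
cars_with_dpc <- transform(cars_with_dpc, DisplacementPerCylinder = disp / cyl)

# Show the new column next to its inputs for the first few vehicles
head(cars_with_dpc[, c("disp", "cyl", "DisplacementPerCylinder")])
```

Because transform() returns a new data frame, the result has to be assigned back to a name; the built-in mtcars object is left unchanged.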

Task 2: Create your own data frame

Gather data from family and friends on three things: the number of pets each person has, their birth order in their family, and their number of siblings.

# Family/Friends ID
friendID  <- c(1, 2, 3, 4, 5)

# Number of pets they have
Pets <- c(4, 4, 2, 3, 1)

# The birth order they are in their family
Order <- c(1, 2, 2, 1, 1)

# Number of Siblings 
Siblings <- c(2, 2, 1, 2, 0)

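# Binding the vectors into a data frame called myFriends
# Note: the "+" before Pets, Order and Siblings is read as a unary plus, so
# data.frame() derives the column names from the expressions "+Pets", "+Order"
# and "+Siblings" and turns them into X.Pets, X.Order and X.Siblings
myFriends <- data.frame(friendID, + Pets, + Order, + Siblings)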

# Command to report the structure of the data frame myFriends
str(myFriends)

## 'data.frame':    5 obs. of  4 variables:
##  $ friendID  : num  1 2 3 4 5
##  $ X.Pets    : num  4 4 2 3 1
##  $ X.Order   : num  1 2 2 1 1
##  $ X.Siblings: num  2 2 1 2 0
# Rename the columns to get rid of the "X." prefix and capitalize FriendID
colnames(myFriends) <- c("FriendID", "Pets", "Order", "Siblings")
str(myFriends)
## 'data.frame':    5 obs. of  4 variables:
##  $ FriendID: num  1 2 3 4 5
##  $ Pets    : num  4 4 2 3 1
##  $ Order   : num  1 2 2 1 1
##  $ Siblings: num  2 2 1 2 0
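
The renaming step can be avoided by naming the columns when the data frame is built. This is a small sketch under that assumption; the object name friends_named is hypothetical and is used only so that myFriends is not overwritten.

```r
# Hypothetical object name; the values are the same as in the vectors above
friends_named <- data.frame(
  FriendID = c(1, 2, 3, 4, 5),  # family/friend ID
  Pets     = c(4, 4, 2, 3, 1),  # number of pets
  Order    = c(1, 2, 2, 1, 1),  # birth order in their family
  Siblings = c(2, 2, 1, 2, 0)   # number of siblings
)

# The column names come straight from the argument names
names(friends_named)
str(friends_named)
```

Passing name = value pairs to data.frame() sets the column names directly, so no call to colnames() is needed afterwards.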
# Listing the values of the vector FriendID from the data frame myFriends
myFriends$FriendID 
## [1] 1 2 3 4 5
# Listing the values of the vector Pets from the data frame myFriends
myFriends$Pets
## [1] 4 4 2 3 1
# Listing the values of the vector Order from the data frame myFriends
myFriends$Order
## [1] 1 2 2 1 1
# Listing the values of the vector Siblings from the data frame myFriends
myFriends$Siblings
## [1] 2 2 1 2 0
# Report a summary of the data frame
summary(myFriends)
##     FriendID      Pets         Order        Siblings  
##  Min.   :1   Min.   :1.0   Min.   :1.0   Min.   :0.0  
##  1st Qu.:2   1st Qu.:2.0   1st Qu.:1.0   1st Qu.:1.0  
##  Median :3   Median :3.0   Median :1.0   Median :2.0  
##  Mean   :3   Mean   :2.8   Mean   :1.4   Mean   :1.4  
##  3rd Qu.:4   3rd Qu.:4.0   3rd Qu.:2.0   3rd Qu.:2.0  
##  Max.   :5   Max.   :4.0   Max.   :2.0   Max.   :2.0
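
As a quick cross-check of the summary, the column means can be computed on their own with colMeans(). The sketch below rebuilds the data frame so that it runs by itself; the name col_means is illustrative.

```r
# Rebuild the data frame from Task 2 so this example is self-contained
myFriends <- data.frame(
  FriendID = c(1, 2, 3, 4, 5),
  Pets     = c(4, 4, 2, 3, 1),
  Order    = c(1, 2, 2, 1, 1),
  Siblings = c(2, 2, 1, 2, 0)
)

# colMeans() returns one mean per numeric column, as a named vector
col_means <- colMeans(myFriends)
col_means
```

The values should agree with the Mean row reported by summary(myFriends).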