Learn To Manipulate Data Frames Using The “mtcars” Dataset
Below is an introduction to programming with R. All code in this exercise uses base R only; no other libraries are needed.
Task 1: Create a new column to find Displacement per Cylinder
Create a new variable (DisplacementPerCylinder), to calculate the total displacement per cylinder in cubic inches for each vehicle from the
# "str" allows you to display the internal structure of an R object
str(mtcars)

## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# As a backup we can copy the original data frame into a new one to work with
# That way, if there are any issues, we can go back to the original
my_mtcars <- mtcars
# Calculate Displacement Per Cylinder by dividing the values (disp) by (cyl)
my_mtcars$DisplacementPerCylinder <- my_mtcars$disp / my_mtcars$cyl

# Report a summary of the variable
summary(my_mtcars$DisplacementPerCylinder)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   17.77   26.92   34.48   35.03   43.19   59.00
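As a cross-check on the step above, the same column can be built with base R's transform(), which is one possible alternative to the `$` assignment and not necessarily the form this exercise expects. The sketch also confirms that the built-in mtcars is left untouched when you work on a copy.

```r
# Work on a copy so the built-in mtcars data frame stays unchanged
my_mtcars <- transform(mtcars, DisplacementPerCylinder = disp / cyl)

# Spot-check the first few vehicles against the raw columns
head(my_mtcars[, c("disp", "cyl", "DisplacementPerCylinder")], 3)

# The first vehicle (Mazda RX4) has disp = 160 and cyl = 6,
# so its value should equal 160 / 6
rx4_value <- my_mtcars["Mazda RX4", "DisplacementPerCylinder"]
all.equal(rx4_value, 160 / 6)

# The original data frame does not gain the new column
"DisplacementPerCylinder" %in% names(mtcars)
```

Both approaches give the same numbers; transform() simply lets you refer to disp and cyl without repeating the data frame name.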
Task 2: Create your own data frame
Gather data from family & friends on the number of pets they have, the birth order they are in their
To store the results you can create vectors in R that hold multiple data points by using c("Data", "Data").
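A minimal sketch of how c() behaves may help here. The names pet_names, pet_ages and mixed, and the values in them, are hypothetical and used only for illustration.

```r
# Hypothetical example values, not taken from the survey in this task
pet_names <- c("Rex", "Milo", "Luna")   # character vector
pet_ages  <- c(3, 5, 2)                 # numeric vector

length(pet_names)   # number of elements in the vector
class(pet_names)    # "character"
class(pet_ages)     # "numeric"

# A vector holds one type only: mixing numbers and text
# coerces every element to character
mixed <- c(1, "two", 3)
class(mixed)        # "character"
```

This is why each survey question below gets its own vector: every vector holds values of a single type.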
# Family/Friends ID
friendID <- c(1, 2, 3, 4, 5)

# Number of pets they have
Pets <- c(4, 4, 2, 3, 1)

# The birth order they are in their family
Order <- c(1, 2, 2, 1, 1)

# Number of Siblings
Siblings <- c(2, 2, 1, 2, 0)

# Binding the vectors into a data frame called myFriends
# Note: the "+" signs are kept as in the original listing. R reads "+ Pets"
# as an expression rather than a plain name, so those columns come out
# named X.Pets, X.Order and X.Siblings in the output below
myFriends <- data.frame(friendID,
                        + Pets,
                        + Order,
                        + Siblings)

# Command to report the structure of the data frame myFriends
str(myFriends)

## 'data.frame':    5 obs. of  4 variables:
##  $ friendID  : num  1 2 3 4 5
##  $ X.Pets    : num  4 4 2 3 1
##  $ X.Order   : num  1 2 2 1 1
##  $ X.Siblings: num  2 2 1 2 0
There are a couple of different ways to rename the columns of a data frame in R. Here, colnames() is used to assign a complete new set of names.
# Rename the columns to get rid of the "X." in front of the names
colnames(myFriends) <- c("FriendID", "Pets", "Order", "Siblings")
str(myFriends)

## 'data.frame':    5 obs. of  4 variables:
##  $ FriendID: num  1 2 3 4 5
##  $ Pets    : num  4 4 2 3 1
##  $ Order   : num  1 2 2 1 1
##  $ Siblings: num  2 2 1 2 0
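For comparison, here is a sketch of two other base R ways to get the same column names. The data frame is rebuilt with the pre-rename names copied from the first str() output, so the block runs on its own; these are illustrative alternatives, not the method used in this exercise.

```r
# Rebuild the data frame with the names it had before renaming
myFriends <- data.frame(friendID   = c(1, 2, 3, 4, 5),
                        X.Pets     = c(4, 4, 2, 3, 1),
                        X.Order    = c(1, 2, 2, 1, 1),
                        X.Siblings = c(2, 2, 1, 2, 0))

# Way 1: rename a single column by matching its current name
names(myFriends)[names(myFriends) == "friendID"] <- "FriendID"

# Way 2: strip the "X." prefix from every name that starts with it
names(myFriends) <- sub("^X\\.", "", names(myFriends))

names(myFriends)
## [1] "FriendID" "Pets"     "Order"    "Siblings"
```

Passing names directly, as in data.frame(FriendID = friendID, Pets = Pets), would avoid the rename step altogether.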
# Listing the values of the column FriendID from the data frame myFriends
myFriends$FriendID

## [1] 1 2 3 4 5

# Listing the values of the column Pets from the data frame myFriends
myFriends$Pets

## [1] 4 4 2 3 1

# Listing the values of the column Order from the data frame myFriends
myFriends$Order

## [1] 1 2 2 1 1

# Listing the values of the column Siblings from the data frame myFriends
myFriends$Siblings

## [1] 2 2 1 2 0
# Report a summary of the data frame
summary(myFriends)

##     FriendID      Pets         Order        Siblings
##  Min.   :1   Min.   :1.0   Min.   :1.0   Min.   :0.0
##  1st Qu.:2   1st Qu.:2.0   1st Qu.:1.0   1st Qu.:1.0
##  Median :3   Median :3.0   Median :1.0   Median :2.0
##  Mean   :3   Mean   :2.8   Mean   :1.4   Mean   :1.4
##  3rd Qu.:4   3rd Qu.:4.0   3rd Qu.:2.0   3rd Qu.:2.0
##  Max.   :5   Max.   :4.0   Max.   :2.0   Max.   :2.0
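To see where the figures in the summary come from, the individual statistics can be computed one column at a time. This is a small sketch that rebuilds the data frame so it runs on its own; the variable names pets_mean, pets_median, pets_quartiles and column_means are introduced here for illustration only.

```r
# Rebuild the renamed data frame from the values used in this task
myFriends <- data.frame(FriendID = c(1, 2, 3, 4, 5),
                        Pets     = c(4, 4, 2, 3, 1),
                        Order    = c(1, 2, 2, 1, 1),
                        Siblings = c(2, 2, 1, 2, 0))

# Single statistics for one column
pets_mean   <- mean(myFriends$Pets)     # 2.8
pets_median <- median(myFriends$Pets)   # 3

# First and third quartiles, as reported by summary()
pets_quartiles <- quantile(myFriends$Pets, c(0.25, 0.75))

# The mean of every column in one call
column_means <- sapply(myFriends, mean)
column_means
```

The values match the Mean, Median, 1st Qu. and 3rd Qu. rows that summary() prints for the Pets column.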