Data Science Using R

R Shiny Application Split Into Multiple Files


Blog, Data Science, Data Science Using R, Recent Posts
An R Shiny application can consist of just one file that includes both your UI and server code. Depending on the application you're creating, this can get messy as the number of lines grows. Splitting the app across multiple files also helps with debugging and makes it easier to reuse code in another project. This post won't cover the actual code or what it does; the code is more of a placeholder, and the goal is a project framework for starting Shiny applications.

Creating a new project

The best practice is to create a new project for each application. This is simple with RStudio:

1. Use the "File" dropdown to select "New Project"
2. Select "New Directory"
3. Choose "Shiny Web Application"
4. Give your directory a name and the location you want to save i...
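As a rough sketch of what the split looks like, the block below shows the conventional three-file layout (the file names ui.R, server.R, and app.R and the placeholder UI are assumptions, not the post's actual code):

```r
# Sketch of a multi-file Shiny layout. In practice each section below
# lives in its own file; they are shown together here for illustration.

# ui.R ------------------------------------------------------------
# Placeholder UI, kept in its own file for readability.
ui <- shiny::fluidPage(
  shiny::titlePanel("Example app"),
  shiny::textOutput("greeting")
)

# server.R --------------------------------------------------------
# Placeholder server logic, also in its own file.
server <- function(input, output, session) {
  output$greeting <- shiny::renderText("Hello from a split app")
}

# app.R -----------------------------------------------------------
# The entry point just wires the pieces together:
# library(shiny)
# source("ui.R")      # defines `ui`
# source("server.R")  # defines `server`
# shinyApp(ui = ui, server = server)
```

Keeping app.R down to a few `source()` calls means each file stays focused on one job, which is what makes debugging and reuse easier.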
Renaming Columns with R


Recent Posts, Data Science Using R
Often the data you're working with has abstract column names, such as (x1, x2, x3…). Typically, the first step I take when renaming columns with R is opening my web browser. For some reason, no matter how many times I've done this, it's just one of those things. (Hoping that writing about it will change that.)

The dataset cars contains data from the 1920s on "Speed and Stopping Distances of Cars". There are only 2 columns, shown below.

colnames(datasets::cars)
[1] "speed" "dist"

If we wanted to rename the column "dist" to make it easier to know what the data is/means, we can do so in a few different ways.

Using dplyr:

cars %>% rename("Stopping Distance (ft)" = dist) %>% colnames()
[1] "speed" "Stopping Distance (ft)"

cars %>% rename("Stopping Di
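For comparison, here is a base-R sketch of the same rename, which needs no extra packages (the working copy `df` is mine, not from the post):

```r
# Copy the built-in cars dataset so the original stays untouched.
df <- datasets::cars

# Rename "dist" by matching on its current name, so the code still
# works even if the column order changes.
names(df)[names(df) == "dist"] <- "Stopping Distance (ft)"

names(df)
#> [1] "speed"                  "Stopping Distance (ft)"
```

Matching on the name rather than indexing by position (`names(df)[2] <- ...`) is the safer habit, since positional renames silently break when columns are added or reordered.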

How To Select Multiple Columns Using Grep & R

Data Science Using R, Recent Posts
Why you need to be using grep when programming with R: there's a reason grep is still included in most, if not all, programming languages 44 years after its creation. It's useful and simple to use. Below is an example of using grep to make selecting multiple columns in R simple and easy to read.

The dataset below has the following column names.

names(data) # Column Names
[1] "fips" "state" "county" "metro_area"
[5] "population" "med_hh_income" "poverty_rate" "population_lowaccess"
[9] "lowincome_lowaccess" "no_vehicle_lowaccess" "s_grocery" "s_supermarket"
[13] "s_convenience" "s_specialty" "s_farmers_market" "r_fastfood"
[17] "r_full_servi...
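As a rough sketch of the idea, the toy data frame below reuses a few of the column names above (it is not the post's actual dataset):

```r
# Small stand-in with a subset of the columns listed above.
data <- data.frame(
  fips = 1, state = "A", s_grocery = 5,
  s_supermarket = 2, s_convenience = 9, r_fastfood = 12
)

# grep() with value = TRUE returns the matching names themselves;
# the pattern "^s_" picks up every store-count column in one pass.
store_cols <- grep("^s_", names(data), value = TRUE)
store_cols
#> [1] "s_grocery"     "s_supermarket" "s_convenience"

# Use the matched names to subset the data frame.
data[, store_cols]
```

The payoff is that `data[, grep("^s_", names(data), value = TRUE)]` keeps working as matching columns are added or removed, unlike a hand-typed list of names.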

Creating Excel Workbooks with Multiple Sheets in R

Data Science Using R, Data Science, Recent Posts
Create Excel Workbooks

Generally, when doing anything in R, I typically work with .csv files; they're fast and straightforward to use. However, there are times when I need to create a bunch of them as output, and having to go and open each one individually can be a pain for anyone. In this case, it's much better to create a workbook where each of the .csv files you would have created becomes a separate sheet. Below is a simple script I use frequently that gets the job done. Also included is the initial process of creating dummy data to outline the process.

EXAMPLE CODE:

Libraries used
library(tidyverse)
library(openxlsx)

Creating example files to work with
products <- c("Monitor", "Laptop", "Keyboards", "Mice")
Stock <- c(20,10,25,50)
Computer_Supplies <...
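Since the post's full script is truncated above, here is a minimal sketch of the multi-sheet idea with openxlsx (the sheet names and dummy data frames are mine):

```r
library(openxlsx)

# Dummy data standing in for what would otherwise be separate .csv files.
products <- data.frame(Product = c("Monitor", "Laptop"), Stock = c(20, 10))
orders   <- data.frame(Order = 1:2, Product = c("Laptop", "Monitor"))

# One workbook, one worksheet per data frame.
wb <- createWorkbook()
addWorksheet(wb, "Products")
writeData(wb, "Products", products)
addWorksheet(wb, "Orders")
writeData(wb, "Orders", orders)

# Write everything to a single .xlsx file.
saveWorkbook(wb, "inventory.xlsx", overwrite = TRUE)
```

The same addWorksheet/writeData pair can sit inside a loop over a named list of data frames, which is how a pile of would-be .csv outputs collapses into one workbook.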

Exploring Employee Attrition and Performance with R

Data Science Using R, Data Science
Based on IBM's fictional data set created by their data scientists.

Introduction: Employee attrition is when an employee leaves a company through normal means (loss of customers, retirement, or resignation) and there is no one to fill the vacancy. Can a company identify employees who are likely to leave? A high employee attrition rate is often a sign of underlying problems and can affect a company in very negative ways. One such way is the cost of finding and training a replacement, as well as the possible strain it can put on other workers who have to cover in the meantime.

Preprocessing: This dataset was produced by IBM and has just under 1500 observations of 31 different variables, including attrition. 4 of the variables (EmployeeNumber, Over18