Data Science

What is Data Science?

Recent Posts, Data Science
Data science covers a wide scope, from collecting data to data management, analysis, and visualization. Drawing on all of these areas, a data scientist can extract information from data and create visualizations to communicate the results.

Collect and organize data

The collection and organization of data is arguably the most important part of data science. You cannot do anything without data to work with, so you need a method of collecting it. You can collect it yourself, for example by scraping the web or applications, or by conducting a survey for respondents to take. You may also have access to data that has already been collected, either in open source repositories or on sites such as Kaggle. You may get the d...

Turing In-Complete (part 1)

Recent Posts, Data Science, Hardware, Software
Before humans built machines that could carry out the mathematical calculations we now regard as computation, we humans were regarded as the “computers”, not the artificial machines. This explains the label “manually” calculated. Man built the machines. Those machines have existed for only a relatively short time compared with how long humans have existed in their current evolutionary state. Even so, this technology goes back much further than our most popular desktop PCs, laptops, tablets, or smartphones. Major developments in the twentieth century progressed at a very rapid pace, not with the help of extraterrestrial beings, but through the work of some very brilliant humans. Maybe you could make a case for “math” from outer space in ancient history, and you’d be technic

Creating Excel Workbooks with multiple sheets in R

Data Science Using R, Data Science, Recent Posts
Create Excel Workbooks

When working in R, I typically use .csv files; they're fast and straightforward to use. However, there are times when I need to output a bunch of them, and having to open each one individually can be a pain. In that case, it's much better to create a workbook in which each of the .csv files you would have created becomes a separate sheet. Below is a simple script I use frequently that gets the job done. It also includes the initial step of creating dummy data to illustrate the process.

EXAMPLE CODE:

Libraries used

    library(tidyverse)
    library(openxlsx)

Creating example files to work with

    products <- c("Monitor", "Laptop", "Keyboards", "Mice")
    Stock <- c(20,10,25,50)
    Computer_Supplies <...

Sort

Recent Posts, Data Science, Linux
The sort command sorts a file line by line. Lines starting with a number go first. The remaining lines follow in alphabetical order, with uppercase letters appearing before lowercase ones. The exact order depends on the locale's collation rules. The example uses a file named "testsort", displayed here with cat.

    ~/Test>cat testsort
    A line 1
    a line 2
    8 line 3
    line 4
    5 line 5
    ~/Test>sort testsort
    5 line 5
    8 line 3
    A line 1
    a line 2
    line 4

The -R option sorts by a random hash of the keys, so the order changes from run to run.

    ~/Test>sort -R testsort
    a line 2
    5 line 5
    A line 1
    8 line 3
    line 4
    ~/Test>sort -R testsort
    5 line 5
    A line 1
    a line 2
    line 4
    8 line 3
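
The ordering described above can be reproduced with a short sketch. It assumes GNU coreutils sort (the -R option is an extension, not part of POSIX), and it sets LC_ALL=C to force plain byte-order comparison. That setting is an addition made here, not something the original example uses.

```shell
# Recreate the example file from the post.
printf '%s\n' 'A line 1' 'a line 2' '8 line 3' 'line 4' '5 line 5' > testsort

# LC_ALL=C compares raw bytes: digits first, then uppercase, then lowercase.
LC_ALL=C sort testsort

# -R shuffles by a random hash of the keys; the order varies between runs,
# but the set of lines stays the same.
sort -R testsort
```

Without LC_ALL=C, the relative order of uppercase and lowercase lines may differ from one system to another, because sort follows the collation rules of the current locale.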

Egrep & Fgrep

Recent Posts, Data Science, Linux
EGREP:

The command egrep is the same as running grep -E. egrep searches for a pattern using extended regular expressions.

    Terry@f:~/FinderDing>cat testsort
    A line 1
    a line 2
    8 line 3
    line 4
    5 line 5
    Terry@f:~/FinderDing>egrep '^[a-zA-Z]' testsort
    A line 1
    a line 2
    line 4

*Shows lines that start with a letter of the alphabet

    Terry@f:~/FinderDing>cat html
    <!DOCTYPE html>
    <html>
    <body>
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
    </body>
    </html>
    Terry@f:~/FinderDing>egrep "My|first" html
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>

*Finds lines in the html file that contain My or first

FGREP:

The command fgrep is the same as running grep -F. The command searches for fixed character strings in a
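
To make the difference between the two modes concrete, here is a minimal sketch using grep -E and grep -F directly. Recent GNU grep releases print an obsolescence warning for the egrep and fgrep names, so the flag forms are used instead. The file "dots" and its contents are hypothetical sample data, not taken from the original post.

```shell
# Recreate the example file from the post.
printf '%s\n' 'A line 1' 'a line 2' '8 line 3' 'line 4' '5 line 5' > testsort

# Extended regular expression: lines that start with a letter.
grep -E '^[a-zA-Z]' testsort

# Hypothetical sample file to contrast the two modes.
printf '%s\n' 'a.b' 'axb' > dots

# Fixed string: the dot is a literal character, so only "a.b" matches.
grep -F 'a.b' dots

# Extended regex: the dot matches any character, so both lines match.
grep -E 'a.b' dots
```

The fixed-string mode is handy when the search text contains characters such as . or * that would otherwise be interpreted as regular expression syntax.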