Fall 2022
September 14, 2022
To introduce R and RStudio, as well as basic data terms.
Data in R comes in several types, namely:
logical - TRUE or FALSE
integer - whole numbers like 1, 5, 100
numeric - numbers with decimal places like 5.25
character - anything with letters
factor - for categorical data, numbers with descriptive labels
NA - a missing value
We can ask R to do things using the language it understands: R code.
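As a small illustration of the types listed above, the sketch below asks R what type a few values are. The example values are made up for illustration; class() and is.na() are base R functions.

```r
# Ask R for the type of a few example values with class()
class(TRUE)            # "logical"
class(5L)              # "integer" (the L suffix marks a whole number)
class(5.25)            # "numeric"
class("purple")        # "character"
class(factor("low"))   # "factor"

# NA marks a missing value; is.na() tests for it
is.na(NA)              # TRUE
```

Note that a plain 5 typed without the L suffix is stored as numeric, not integer.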
Say we want to ask R to:
Take the sum of 5, 5, and 10, and put the results in a box called “total.”
R would understand
total <- sum(5, 5, 10)
This follows the general pattern:
object <- function(arguments)
You can always learn more about a function using ?, for example ?sum.
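The pattern above can be tried directly in the console. This is a minimal sketch: total comes from the example above, while total_no_na is a hypothetical object name added here to show a named argument.

```r
# Take the sum of 5, 5, and 10 and store the result in an object called total
total <- sum(5, 5, 10)
total                      # prints the stored value: 20

# Arguments can also be named; na.rm = TRUE tells sum() to skip missing values
total_no_na <- sum(5, 5, NA, na.rm = TRUE)
total_no_na                # 10

# To open the help page for sum(), run this in the console:
# ?sum
```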
A vector is an organized arrangement of data.
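A vector holds a single type of data and is created with c().
A dataframe is a rectangular organization of vectors: each column is a vector, and all columns have the same length.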
| X. | example_vector | example_colors |
|---|---|---|
| 1 | 5 | purple |
| 2 | 5 | orange |
| 3 | 10 | periwinkle |
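One way to build the vectors and dataframe shown in the table is sketched below. The object name example_df is an assumption made for this sketch; the column names and values come from the table.

```r
# Two vectors, each holding a single type of data
example_vector <- c(5, 5, 10)                          # numeric
example_colors <- c("purple", "orange", "periwinkle")  # character

# Combine the vectors into a dataframe; each vector becomes a column
example_df <- data.frame(example_vector, example_colors)
example_df

# Check the shape: 3 rows and 2 columns
nrow(example_df)
ncol(example_df)
```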
A subset of data is a smaller selection of the total data set.
Learning how to effectively subset is one of the most foundational skills in data science.
You can ask for data in a specific position in a vector by giving it the number of that position. For example:
vector <- c(1, 3, 5, 7, 9, 11)
Ask for a subset of a vector using the following format. In English:
Give me vector, such that position is equal to X.
In R Code:
vector[position]
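Applied to the example vector, the format above might look like the sketch below. The specific positions are chosen only for illustration.

```r
vector <- c(1, 3, 5, 7, 9, 11)

# Give me vector, such that position is equal to 3
vector[3]          # 5

# Several positions can be requested at once by passing a vector of positions
vector[c(1, 6)]    # 1 11
vector[2:4]        # 3 5 7

# A negative position drops that element instead of selecting it
vector[-1]         # 3 5 7 9 11
```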
Ask for a subset of a dataframe using the following format. In English:
Give me dataframe, such that rows are equal to X, and columns are equal to Y.
In R Code:
dataframe[rows, columns] OR dataframe$column
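A minimal sketch of both formats, using the names and numbers from the example_dataframe that appears in these slides. The particular rows and columns requested are chosen only for illustration.

```r
example_dataframe <- data.frame(
  name = c("Sam", "Frodo", "Pippin", "Merry"),
  number = c(7, 8, 3, 6),
  stringsAsFactors = FALSE  # keep names as character, also in older R versions
)

# Give me example_dataframe, such that rows are equal to 2 and columns are equal to "name"
example_dataframe[2, "name"]     # "Frodo"

# Leaving rows or columns blank means "all of them"
example_dataframe[1:2, ]         # first two rows, every column
example_dataframe[, "number"]    # every row of the number column: 7 8 3 6

# The $ shortcut pulls out one whole column as a vector
example_dataframe$number         # 7 8 3 6
```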
You can use the same tools to take parts from a dataframe or to add new parts to it.
example_dataframe
name number
1 Sam 7
2 Frodo 8
3 Pippin 3
4 Merry 6
Combine them!
name number new_column
1 Sam 7 blue
2 Frodo 8 green
3 Pippin 3 yellow
4 Merry 6 red
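One possible way to get the combined result above is to assign a new vector to a column with $. This is a sketch, not necessarily the method used in class; the values are taken from the printed output.

```r
example_dataframe <- data.frame(
  name = c("Sam", "Frodo", "Pippin", "Merry"),
  number = c(7, 8, 3, 6),
  stringsAsFactors = FALSE
)

# A new vector with one value per row of the dataframe
new_column <- c("blue", "green", "yellow", "red")

# Assigning to a column name that does not exist yet creates that column
example_dataframe$new_column <- new_column
example_dataframe
```

The new vector needs one value per row (four here), otherwise R will report an error.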
Conditionals help you ask for things when a condition is TRUE.
== - Equal to
!= - Not equal to
> - Greater than
>= - Greater than or equal to
< - Less than
<= - Less than or equal to
For example:
vector <- c(1, 3, 5, 7, 9, 11)
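A short sketch of how the operators above behave on this vector. The comparison value 5 is chosen only for illustration.

```r
vector <- c(1, 3, 5, 7, 9, 11)

# A conditional checks every element and returns TRUE or FALSE for each one
vector > 5             # FALSE FALSE FALSE TRUE TRUE TRUE
vector == 5            # FALSE FALSE TRUE FALSE FALSE FALSE

# Put the conditional inside [ ] to keep only the elements where it is TRUE
vector[vector > 5]     # 7 9 11
vector[vector != 5]    # 1 3 7 9 11
```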
Lab 1: Working with R
Finish problem set
SDS 192-03: Intro to Data Science