Fall 2022
September 14, 2022
To introduce R and RStudio, as well as basic data terms.
Data types, namely:
logical - TRUE or FALSE
integer - whole numbers like 1, 5, 100
numeric - numbers with decimal places like 5.25
character - anything with letters
factors - for categorical data, numbers with descriptive labels
NA - NAs are missing values
We can ask R to do things using the language it understands, R code.
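As a minimal sketch of the data types above, class() reports the type R has assigned to a value. The values and the object name sizes are illustrative assumptions, not taken from the lecture.

```r
# class() reports the data type of a value
class(TRUE)        # "logical"
class(5L)          # "integer" (the L suffix marks a whole number)
class(5.25)        # "numeric"
class("purple")    # "character"

# A factor stores categorical data as numbers with descriptive labels
sizes <- factor(c("small", "large", "small"))  # hypothetical example data
class(sizes)       # "factor"
levels(sizes)      # "large" "small"

# NA marks a missing value
is.na(NA)          # TRUE
```

Note that a whole number typed without the L suffix, such as 5, is stored as numeric rather than integer.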
Say we want to ask R to:
Take the sum of 5, 5, and 10, and put the results in a box called “total.”
R would understand
total <- sum(5, 5, 10)
object <- function(arguments)
You can always learn more about a function using ?, for example ?sum.
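As a small sketch of the "total" example, the code below stores the sum in an object and then prints it. It uses only the values from the slides.

```r
# Store the sum of 5, 5, and 10 in an object (a "box") called total
total <- sum(5, 5, 10)

# Typing the object's name prints its contents
total   # [1] 20

# In an interactive session, typing ?sum opens the help page for sum()
```

Here total is the object, sum is the function, and 5, 5, 10 are the arguments.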
A vector is an organized arrangement of data.
A vector holds one type of data and is created with c().
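A brief sketch of building vectors with c() follows. The names example_vector and example_colors are borrowed from the table in these notes; mixed is a hypothetical name added only to show what happens when types are combined.

```r
# c() combines values into a vector
example_vector <- c(5, 5, 10)
example_colors <- c("purple", "orange", "periwinkle")

example_vector          # [1] 5 5 10
class(example_vector)   # "numeric"
class(example_colors)   # "character"

# A vector holds only one type, so mixing types converts everything
mixed <- c(5, "purple")  # hypothetical example
class(mixed)             # "character"
```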
A dataframe is a rectangular organization of vectors: each column is a vector, and all columns have the same length.
X. | example_vector | example_colors |
---|---|---|
1 | 5 | purple |
2 | 5 | orange |
3 | 10 | periwinkle |
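One way to build a dataframe like the table above is to pass vectors to data.frame(). This is a sketch rather than the code used in the lecture, and the object name colors_dataframe is a hypothetical choice.

```r
# Two vectors of the same length become the columns
example_vector <- c(5, 5, 10)
example_colors <- c("purple", "orange", "periwinkle")

# data.frame() names each column after the vector passed in
colors_dataframe <- data.frame(example_vector, example_colors)

colors_dataframe        # prints 3 rows and 2 columns
names(colors_dataframe) # "example_vector" "example_colors"
nrow(colors_dataframe)  # 3
```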
A subset of data is a smaller selection of the total data set.
Learning how to effectively subset is one of the most foundational skills in data science.
You can ask for data in a specific position in a vector by giving it the number of that position. For example:
vector <- c(1, 3, 5, 7, 9, 11)
Ask for a subset of a vector using the following format. In English:
Give me vector, such that position is equal to X.
In R Code:
vector[position]
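Applied to the example vector, the vector[position] format might look like the sketch below. The positions chosen are illustrative.

```r
vector <- c(1, 3, 5, 7, 9, 11)

# One position
vector[3]         # [1] 5

# Several positions at once, given as a vector of positions
vector[c(1, 2)]   # [1] 1 3

# A run of neighboring positions
vector[2:4]       # [1] 3 5 7
```

R counts positions starting from 1, so vector[1] is the first value.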
Ask for a subset of a dataframe using the following format. In English:
Give me dataframe, such that rows are equal to X, and columns are equal to Y.
In R Code:
dataframe[rows, columns] OR dataframe$column
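The sketch below tries both formats on a small dataframe. It rebuilds the example_dataframe values that appear in these notes; the data.frame() call itself is an assumption about how that object was created.

```r
# Rebuild the example dataframe (construction code is assumed)
example_dataframe <- data.frame(
  name = c("Sam", "Frodo", "Pippin", "Merry"),
  number = c(7, 8, 3, 6),
  stringsAsFactors = FALSE
)

# Row 2, all columns (leave the column slot empty)
example_dataframe[2, ]

# All rows, one column
example_dataframe[, "number"]   # [1] 7 8 3 6

# Row 2 of the name column
example_dataframe[2, "name"]    # [1] "Frodo"

# The $ format returns one whole column as a vector
example_dataframe$name
```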
The same tools that take parts from a dataframe can also add to it.
example_dataframe
name number
1 Sam 7
2 Frodo 8
3 Pippin 3
4 Merry 6
Combine them!
name number new_column
1 Sam 7 blue
2 Frodo 8 green
3 Pippin 3 yellow
4 Merry 6 red
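The notes do not show the code behind this step, so the following is one plausible sketch: assigning a vector to a new name with $ adds it as a column. The construction of example_dataframe is assumed; the values match the printed output above.

```r
# Rebuild the starting dataframe (construction code is assumed)
example_dataframe <- data.frame(
  name = c("Sam", "Frodo", "Pippin", "Merry"),
  number = c(7, 8, 3, 6),
  stringsAsFactors = FALSE
)

# A vector with one value per row
new_column <- c("blue", "green", "yellow", "red")

# Assigning to a column name that does not exist yet creates it
example_dataframe$new_column <- new_column

example_dataframe   # now has name, number, and new_column
```

The new vector needs one value for each row, here four.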
Conditionals help you ask for things when a condition is TRUE.
== - Equal to
!= - Not equal to
> - Greater than
>= - Greater than or equal to
< - Less than
<= - Less than or equal to
For example:
vector <- c(1, 3, 5, 7, 9, 11)
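A possible continuation of this example is sketched below: a conditional returns TRUE or FALSE for each value, and placing it inside the square brackets keeps only the values where it is TRUE. The specific conditions are illustrative choices.

```r
vector <- c(1, 3, 5, 7, 9, 11)

# A conditional checks every value and answers TRUE or FALSE
vector > 5            # FALSE FALSE FALSE TRUE TRUE TRUE

# Inside [ ], it keeps only the values where the answer is TRUE
vector[vector > 5]    # [1] 7 9 11
vector[vector != 5]   # [1] 1 3 7 9 11
vector[vector <= 3]   # [1] 1 3
vector[vector == 7]   # [1] 7
```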
Lab 1: Working with R
Finish problem set
SDS 192-03: Intro to Data Science