Fall 2022
October 26, 2022
To learn the differences and use cases for lists and the apply family of functions.
In R, iterating on something is working through a vector one element at a time.
Vector = c(2, 4, 6, 8, 10)
for(X in Y) { Do Z }
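As a minimal sketch of the pattern above, the loop below works through the example vector one element at a time. The names `numbers`, `doubled`, and `i` are illustrative, and doubling each value is only an assumed stand-in for "Do Z".

```r
# the example vector from above (name chosen for illustration)
numbers = c(2, 4, 6, 8, 10)

# pre-allocate a result vector of the same length
doubled = numeric(length(numbers))

# visit each position in turn and store the doubled value
for (i in seq_along(numbers)) {
  doubled[i] = numbers[i] * 2
}

doubled  # 4 8 12 16 20
```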
Useful when:
Lists are kinda like super-vectors (JSON-like).
They can contain anything in their elements. You could have:
Getting the content of lists requires special syntax!
Each list element is accessed using double square brackets [[ ]]
```{r}
test_list = list("num_vec" = c(1, 2, 3, 4, 5),
"let_vec" = c("a", "b", "c", "c"),
"df" = head(mtcars))
test_list
```
$num_vec
[1] 1 2 3 4 5

$let_vec
[1] "a" "b" "c" "c"

$df
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
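A short sketch of getting content back out of `test_list`. The list is rebuilt here so the block runs on its own. The `$` shortcut and the single-bracket comparison are standard base R behavior, added for illustration rather than taken from the slide.

```r
# rebuild the list from the example above so this block is self-contained
test_list = list("num_vec" = c(1, 2, 3, 4, 5),
                 "let_vec" = c("a", "b", "c", "c"),
                 "df" = head(mtcars))

# double brackets return the element itself (here, a numeric vector)
test_list[["num_vec"]]

# the $ shortcut does the same for named elements
test_list$let_vec

# single brackets return a smaller list that still wraps the element
class(test_list["num_vec"])    # "list"
class(test_list[["num_vec"]])  # "numeric"
```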
IF
example_vec_1 = c(1, 2, 3)
example_vec_2 = c("a", "b", "c")
AND
example_list = list(example_vec_1, example_vec_2)
THEN
example_list[[1]] == example_vec_1 == c(1, 2, 3)
example_list[[2]][3] == example_vec_2[3] == "c"
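The IF / AND / THEN statements above are shorthand rather than runnable R. One way to check them, sketched here with `identical()` as an assumed substitute for the chained `==` notation:

```r
example_vec_1 = c(1, 2, 3)
example_vec_2 = c("a", "b", "c")
example_list = list(example_vec_1, example_vec_2)

# the first list element is the first vector
identical(example_list[[1]], example_vec_1)        # TRUE

# chain [[ ]] and [ ] to reach a single value inside an element
identical(example_list[[2]][3], example_vec_2[3])  # TRUE
example_list[[2]][3]                               # "c"
```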
The apply family of functions takes every element of a sequence and does the same thing to each one.
apply(X, FUN = function)
Apply does the same thing to each element (roughly) all at once.
Apply FUN to element 1 in X.
Apply FUN to element 2 in X.
Apply FUN to element 3 in X.
Apply FUN to element 4 in X.
Apply FUN to element 5 in X.
Apply FUN to element 6 in X.
Apply FUN to element 7 in X.
…
Loops iterate through every element of a sequence one element at a time.
This allows dependence.
Apply functions apply the given function to every element (roughly) at the same time.
This does not allow dependence.
c( 2, 4, 6, 8, 10 )
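A sketch of what "dependence" means, using the vector above. The running total is an assumed example: each step of the loop needs the result of the step before it, while the apply call treats every element in isolation. All variable names are illustrative.

```r
values = c(2, 4, 6, 8, 10)

# loop: each step depends on the result of the previous step
running_total = numeric(length(values))
for (i in seq_along(values)) {
  previous = if (i == 1) 0 else running_total[i - 1]
  running_total[i] = previous + values[i]
}
running_total  # 2 6 12 20 30

# apply: each element is handled on its own, with no view of its neighbors
squared = sapply(X = values, FUN = function(v) v^2)
squared        # 4 16 36 64 100
```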
`lapply` returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
For every column in mtcars, apply the mean() function.
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
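A minimal sketch of the call described above. It relies on the fact that a data frame is a list of columns, so `lapply` visits one column at a time. The name `column_means` is illustrative.

```r
# mtcars ships with base R
column_means = lapply(X = mtcars, FUN = mean)

# the result is a list with one element per column
class(column_means)    # "list"
length(column_means)   # 11
column_means$mpg       # about 20.09
```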
`sapply` is similar to `lapply`, but it returns a vector if it can. Be careful, as its results can surprise you!
For every column in mtcars, apply the mean() function.
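A sketch of the same call with `sapply`, followed by two assumed examples of the surprises mentioned above: the return type changes with the shape of the results. The names `uneven` and `ranges` are illustrative.

```r
# same call as with lapply(), but the result is simplified
column_means = sapply(X = mtcars, FUN = mean)
class(column_means)   # "numeric": a named vector, not a list

# surprise 1: if the results differ in length,
# sapply() cannot simplify and returns a list instead
uneven = sapply(X = c(1, 2, 3), FUN = function(n) seq_len(n))
class(uneven)         # "list"

# surprise 2: results of equal length greater than one become a matrix
ranges = sapply(X = mtcars, FUN = range)
dim(ranges)           # 2 11
```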
`apply` is used for matrices or data frames. Supply the `MARGIN` argument to make it work over rows (`MARGIN = 1`) or columns (`MARGIN = 2`).
For every column and then every row in mtcars, apply the mean() function.
       mpg        cyl       disp         hp       drat         wt       qsec 
 20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750 
        vs         am       gear       carb 
  0.437500   0.406250   3.687500   2.812500 
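Column means like those above, and the matching row means, could be computed with a sketch like this one. The variable names are illustrative. Averaging across a row mixes different units, so the row result is only a demonstration of `MARGIN = 1`.

```r
# MARGIN = 2 works over columns: one mean per variable
column_means = apply(X = mtcars, MARGIN = 2, FUN = mean)

# MARGIN = 1 works over rows: one mean per car
row_means = apply(X = mtcars, MARGIN = 1, FUN = mean)

length(column_means)  # 11, one per column
length(row_means)     # 32, one per row
head(row_means, 3)
```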
You can pass any function to FUN, including one you write!
This means you can run your own custom logic over every element of a large collection of data.
```{r}
lapply(X = mtcars, FUN = function(car){
  # note: each `car` is one column of mtcars, not one row
  # get the largest value
  largest = max(car)
  # get the smallest value
  smallest = min(car)
  # get the difference
  result = largest - smallest
  # return the difference
  return(result)
})
```
$mpg
[1] 23.5
$cyl
[1] 4
$disp
[1] 400.9
$hp
[1] 283
$drat
[1] 2.17
$wt
[1] 3.911
$qsec
[1] 8.4
$vs
[1] 1
$am
[1] 1
$gear
[1] 2
$carb
[1] 7
The built-in parallel package in R offers several tools to run code in parallel.
Mostly, these take the form of apply family functions.
There can be no dependence between elements.
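A minimal sketch of a parallel apply, using `parLapply` from the parallel package. The cluster size of 2 is an arbitrary assumption for illustration, and the range calculation reuses the custom function idea from earlier. It works in parallel only because no column's result depends on another's.

```r
library(parallel)

# start a small cluster; 2 workers is an arbitrary choice for illustration
cl = makeCluster(2)

# same idea as lapply(), but the columns are handed out to the workers
par_ranges = parLapply(cl = cl, X = as.list(mtcars), fun = function(column) {
  max(column) - min(column)
})

# always shut the workers down when finished
stopCluster(cl)

par_ranges$mpg  # 23.5
```

For a task this small, starting the workers likely costs more time than it saves. The pattern pays off when each element takes a long time to process.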
Lab 6 & Quiz 2 Open
SDS 192-03: Intro to Data Science