Fall 2022
October 26, 2022
To learn the differences and use cases for lists and the apply family of functions.
In R, iterating on something is working through a vector one element at a time.
Vector = c(2, 4, 6, 8, 10)
for(X in Y) { Do Z }
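As a minimal sketch of the `for(X in Y) { Do Z }` pattern, the loop below walks through the vector from the slide one position at a time. The names `values` and `doubled` are made up for illustration, and doubling each element is just a stand-in for "Do Z".

```r
# A vector to iterate over (same values as the slide)
values = c(2, 4, 6, 8, 10)

# Pre-allocate a vector to hold one result per element
doubled = numeric(length(values))

# seq_along(values) gives the positions 1, 2, ..., 5
for (i in seq_along(values)) {
  # "Do Z": here Z is doubling the current element
  doubled[i] = values[i] * 2
}

doubled
```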
 Useful when:
Lists are kinda like super-vectors (JSON-like).
They can contain anything in their elements. You could have:
Getting the content of lists requires special syntax!
Each list element is accessed using double square brackets [[ ]]
```{r}
test_list = list("num_vec" = c(1, 2, 3, 4, 5),
                 "let_vec" = c("a", "b", "c", "c"),
                 "df" = head(mtcars))
test_list
```
$num_vec
[1] 1 2 3 4 5
$let_vec
[1] "a" "b" "c" "c"
$df
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
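To make the bracket syntax concrete, here is a small sketch that pulls elements out of the same `test_list`. The variable names on the left-hand side are illustrative only.

```r
test_list = list("num_vec" = c(1, 2, 3, 4, 5),
                 "let_vec" = c("a", "b", "c", "c"),
                 "df" = head(mtcars))

# Double brackets return the element itself (here, a numeric vector)
by_position = test_list[[1]]
by_name = test_list[["num_vec"]]

# The $ shortcut works for named elements
by_dollar = test_list$num_vec

# Single brackets return a smaller list, not the element inside it
still_a_list = test_list[1]

class(by_position)   # "numeric"
class(still_a_list)  # "list"
```

Single brackets are a common stumbling block: `test_list[1]` looks similar but hands back a list of length one rather than the vector inside it.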
IF
example_vec_1 = c(1, 2, 3)
example_vec_2 = c("a", "b", "c")
AND
example_list = list(example_vec_1, example_vec_2)
THEN
example_list[[1]] is the same as example_vec_1, which is c(1, 2, 3)
example_list[[2]][3] is the same as example_vec_2[3], which is "c"
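The IF / AND / THEN statement above can be checked directly in R. This sketch uses `identical()`, which compares two whole objects and returns a single TRUE or FALSE.

```r
example_vec_1 = c(1, 2, 3)
example_vec_2 = c("a", "b", "c")
example_list = list(example_vec_1, example_vec_2)

# The first list element is the first vector
identical(example_list[[1]], example_vec_1)

# [[2]] pulls out the second vector, then [3] picks its third element
identical(example_list[[2]][3], example_vec_2[3])
example_list[[2]][3]
```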
The apply family of functions takes every element of a sequence and does the same thing to each one.
 apply(X, FUN = function)
Apply does the same thing to each element (roughly) all at once.
Apply FUN to element 1 in X.
Apply FUN to element 2 in X.
Apply FUN to element 3 in X.
Apply FUN to element 4 in X.
Apply FUN to element 5 in X.
Apply FUN to element 6 in X.
Apply FUN to element 7 in X.
…
Loops iterate through every element of a sequence one element at a time.
This allows dependence.
Apply functions apply the given function to every element (roughly) at the same time.
This does not allow dependence.
c( 2, 4, 6, 8, 10 )
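One way to see the difference is to compare a task that needs dependence with one that does not. The sketch below is illustrative: a running total needs the previous step's result, so it suits a loop, while squaring each element needs nothing from its neighbours, so it suits an apply function.

```r
values = c(2, 4, 6, 8, 10)

# A loop can depend on earlier iterations: each step uses the previous total
running_total = numeric(length(values))
total = 0
for (i in seq_along(values)) {
  total = total + values[i]
  running_total[i] = total
}
running_total

# An apply call treats each element on its own: no step sees another's result
squared = sapply(values, function(v) v^2)
squared
```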
lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
For every column in mtcars, apply the mean() function.
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
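A minimal sketch of that call, assuming the built-in `mtcars` data set. The name `column_means` is illustrative.

```r
# A data frame is a list of columns, so lapply() visits one column at a time
column_means = lapply(X = mtcars, FUN = mean)

class(column_means)    # a list
length(column_means)   # one element per column
column_means$mpg       # pull a single result out of the list
```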
sapply is similar to lapply, but it returns a vector if it can. Be careful, as its results can surprise you!
For every column in mtcars, apply the mean() function.
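The sketch below shows the same call with sapply, plus two illustrative cases of the "surprise": the shape of the result depends on what FUN returns. The variable names are made up for this example.

```r
# Every result has length 1, so sapply() simplifies to a named numeric vector
col_means = sapply(X = mtcars, FUN = mean)
is.vector(col_means)

# Every result has length 2, so sapply() simplifies to a matrix instead
col_ranges = sapply(X = mtcars, FUN = range)
is.matrix(col_ranges)

# Results have different lengths, so sapply() falls back to a plain list
col_uniques = sapply(X = mtcars, FUN = unique)
is.list(col_uniques)
```

If later code depends on getting a particular shape back, lapply is the more predictable choice because it always returns a list.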
apply is used for matrices or data frames. You can supply the MARGIN argument to make it work over rows (MARGIN = 1) or columns (MARGIN = 2).
For every column and then every row in mtcars, apply the mean() function.
       mpg        cyl       disp         hp       drat         wt       qsec 
 20.090625   6.187500 230.721875 146.687500   3.596563   3.217250  17.848750 
        vs         am       gear       carb 
  0.437500   0.406250   3.687500   2.812500 
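A short sketch of both directions, assuming the built-in `mtcars` data set. Averaging across a row mixes unrelated units (miles per gallon, horsepower, and so on), so the row means here only illustrate the mechanics.

```r
# MARGIN = 2 works over columns: one mean per variable
column_means = apply(X = mtcars, MARGIN = 2, FUN = mean)

# MARGIN = 1 works over rows: one mean per car
row_means = apply(X = mtcars, MARGIN = 1, FUN = mean)

length(column_means)  # number of columns
length(row_means)     # number of rows
head(row_means, 3)
```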
You can pass any function to FUN, including one you write!
This means you can do anything over a large collection of data.
```{r}
lapply(X = mtcars, FUN = function(column){
  
  # get the largest value in this column
  largest = max(column)
  
  # get the smallest value in this column
  smallest = min(column)
  
  # get the difference
  result = largest - smallest
  
  # return the difference
  return(result)
})
```
$mpg
[1] 23.5
$cyl
[1] 4
$disp
[1] 400.9
$hp
[1] 283
$drat
[1] 2.17
$wt
[1] 3.911
$qsec
[1] 8.4
$vs
[1] 1
$am
[1] 1
$gear
[1] 2
$carb
[1] 7
The built-in parallel package in R offers several tools to run code in parallel.
Mostly, these take the form of apply family functions.
There can be no dependence between elements.
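As a rough sketch of what this looks like, `parLapply()` from the parallel package mirrors `lapply()` but sends the work to a cluster of worker processes. The choice of two workers and the variable names are assumptions made for illustration.

```r
library(parallel)

# Assumption: two workers is a safe illustrative choice on most machines
cluster = makeCluster(2)

# parLapply() has the same shape as lapply(), plus a cluster to run on
parallel_means = parLapply(cl = cluster, X = as.list(mtcars), fun = mean)

# Always release the workers when finished
stopCluster(cluster)

# The ordinary lapply() call, for comparison
sequential_means = lapply(X = as.list(mtcars), FUN = mean)
all.equal(parallel_means, sequential_means)
```

For a task this small, starting the workers probably costs more time than it saves. Parallel versions tend to pay off when each element takes a noticeable amount of time to process.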
Lab 6 & Quiz 2 Open
SDS 192-03: Intro to Data Science