#<REPLACE THIS COMMENT WITH YOUR ANSWER>
Lab 1. Fundamental R Coding
Introduction
Access the lab on GitHub Classroom: GitHub Classroom Assignment for Lab 1: R Coding
This lab will provide an opportunity to practice fundamental R coding skills with real-world data. We will be looking at data from the United Nations (UN) World Population Prospects 2022¹, the 27th edition of the UN's official population estimates and projections for countries around the world. The 2022 estimates span 237 countries or areas, with historical data from 1950 to 2022 and projections out to 2100. Such population estimates play a crucial role in many long-term planning activities, including predicting and preparing for climate crises, anticipating global health needs, and even projecting targets for business expansion.
This lab marks your first step into understanding the world through your own data analyses. The first step is often the hardest. But by the end of this lab you will already have more data skills than the vast majority of the population. With this, take your first step towards understanding, and through understanding towards change.
Understanding the Data Source
Before we start to work with the data, we need to understand it. What do all of the variables (columns) mean? Let’s start by looking at the data itself. In this project repo (repository/folder), navigate to the “data” folder, and open the file named “WPP2022_Demographic_Indicators_Medium.csv”. This is the raw data file from the United Nations. There are a lot of numbers here, and some of the variable names are nonsense. What does “LE15” even mean?!
This is where data documentation comes in. Documentation is the metadata about the data you are looking at. It often explains how the data was made and what each part of the data means. Open the file named “WPP2022_Demographic_Indicators_notes.csv” to start looking at some documentation. This file explains what each of the main data columns means: what is measured, and the units it is measured in. For example, the aforementioned “LE15” stands for “Life Expectancy at Age 15, both sexes,” and is measured in years. That means that in the data file, row 2 and column LE15 tell us that, for the whole world in 1950, a person at 15 years old had a life expectancy of 47.0395 years.
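Once the data is in R, the same lookup can be written as a row-and-column index. The sketch below uses a tiny stand-in data frame rather than the real file: only the LE15 value for the world in 1950 comes from the text above, while the name `wpp_toy`, the second row, and the column names `Location` and `Time` are assumptions made for illustration.

```r
# Toy stand-in for the UN file. Only the LE15 value for World/1950 comes from
# the lab text; the second row and the Location/Time column names are assumed.
wpp_toy <- data.frame(
  Location = c("World", "World"),
  Time     = c(1950, 1951),
  LE15     = c(47.0395, NA)  # NA: value not given in the lab text
)

# Row 2 of the CSV file is the first data row in R, because row 1 of the file
# holds the column names.
le15_world_1950 <- wpp_toy[1, "LE15"]

# The same cell, selected by meaning rather than by position.
le15_by_filter <- wpp_toy$LE15[wpp_toy$Location == "World" & wpp_toy$Time == 1950]

print(le15_world_1950)
print(le15_by_filter)
```

Selecting by condition is usually more robust than selecting by position, since it still works if the rows are reordered.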
We can also look into the sources of the data compiled by the UN. You can find them in the file called “WPP2022_F02_METADATA.XLSX.” The UN relies on a lot of different sources. This can produce impressive-looking results, but it may also hide some issues.
Loading in the Data
Now that we have an idea of what the data is, let’s work on getting it into R. Using a function we learned this week, read the raw data file (“WPP2022_Demographic_Indicators_Medium.csv”) from the data folder of this project into R as a data frame.
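As a rough sketch of what reading a CSV into a data frame looks like, the example below uses base R's `read.csv()`. This is only one candidate: the function taught in class may be a different reader, such as `readr::read_csv()`. So that the sketch runs anywhere, it writes a two-line stand-in file first; the name `un_data`, the relative path in the comment, and the stand-in column names are assumptions.

```r
# Stand-in file so the sketch runs anywhere. In the lab, the path would instead
# point at the CSV in the project's data folder (relative path is an assumption):
# csv_path <- "data/WPP2022_Demographic_Indicators_Medium.csv"
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("Location,Time,LE15",
    "World,1950,47.0395"),
  csv_path
)

# read.csv() parses a comma-separated file into a data frame,
# using the first line of the file as column names by default.
un_data <- read.csv(csv_path)

# Confirm that the result is a data frame.
print(class(un_data))
```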
Understanding the Data
Now that we have our UN data in R, let’s start to explore it a bit.
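A few base R functions are commonly used for a first look at a data frame. The sketch below runs them on a small hypothetical stand-in (the name `un_data`, the column names `Location` and `Time`, and the rows with missing values are assumptions); on the real UN data the same calls would report far more rows and columns.

```r
# Tiny hypothetical stand-in for the loaded UN data. Only the 1950 LE15 value
# comes from the lab text; the other LE15 entries are left as NA.
un_data <- data.frame(
  Location = c("World", "World", "World"),
  Time     = c(1950, 1951, 1952),
  LE15     = c(47.0395, NA, NA)
)

dim(un_data)           # number of rows, then number of columns
names(un_data)         # column (variable) names
str(un_data)           # type and first few values of each column
head(un_data, 2)       # first two rows
summary(un_data$LE15)  # summary statistics; missing values are counted separately
```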
#<REPLACE THIS COMMENT WITH YOUR ANSWER>
Learning from the Data
Now that we have the data loaded into R, let’s see what we can learn from it.
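Learning from a data frame usually starts with selecting the rows of interest and then summarizing a column. The sketch below shows that pattern in base R on a hypothetical stand-in: the names `un_data`, `world`, `world_known`, and `mean_le15`, the made-up location "Examplia", and the column names `Location` and `Time` are all assumptions, and only the World/1950 LE15 value comes from the lab text.

```r
# Hypothetical stand-in data. "Examplia" is a made-up location, and every LE15
# value except World/1950 is left as NA rather than invented.
un_data <- data.frame(
  Location = c("World", "World", "Examplia", "Examplia"),
  Time     = c(1950, 2022, 1950, 2022),
  LE15     = c(47.0395, NA, NA, NA)
)

# Keep only the rows for one location (logical row indexing).
world <- un_data[un_data$Location == "World", ]

# Keep only the rows where LE15 is actually recorded.
world_known <- world[!is.na(world$LE15), ]

# Summarize a column, ignoring missing values.
mean_le15 <- mean(world$LE15, na.rm = TRUE)

print(world_known)
print(mean_le15)
```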
#<REPLACE THIS COMMENT WITH YOR ANSWER>
#<REPLACE THIS COMMENT WITH YOR ANSWER>
#<REPLACE THIS COMMENT WITH YOR ANSWER>
#<REPLACE THIS COMMENT WITH YOR ANSWER>
#<REPLACE THIS COMMENT WITH YOR ANSWER>
#<REPLACE THIS COMMENT WITH YOR ANSWER>
Footnotes
1. United Nations, Department of Economic and Social Affairs, Population Division (2022). World Population Prospects 2022, Online Edition.