R is a programming language and computing environment specialized for statistical analysis and data manipulation. It’s commonly used for performing statistical tests, creating data visualizations, and writing data analysis reports. Despite its focus on statistics, it’s a full-fledged programming language and relatively easy to learn. You should have installed R and RStudio on the first day of SDS 100. If you did not, please follow the guide here.
git is a tool for version control and collaboration. It is widely used by data science teams big and small to keep track of code. Think of it like Track Changes in Word or Google Docs, but for code files. You will also need an account on GitHub. Please create one here. git on Windows: Follow these step-by-step instructions if you’re installing git on a Windows machine:
GitHub is an online code repository that greatly expands the utility of git. It acts as a clearinghouse for code and is used worldwide by researchers, government, and industry. Create an Account: First, we need to create an account on github.com. Navigate to the site and click the Sign up button in the upper right. Enter your email and create a password.
DB Browser is an ultra-lightweight viewer for SQLite databases. It is designed to let those familiar with spreadsheets work more easily with the common SQLite format. However, that is all it does; you cannot use it with other database types. DB Browser on Windows: First, head to the DB Browser download page and select the version that matches your system.
OpenRefine is an open-source tool used to clean and pre-process messy data. While most people are familiar with data cleaning in their coding tool of choice (R, Python, Julia, etc.), OpenRefine is designed to provide powerful cleaning capabilities with minimal overhead. One of its most helpful capabilities is checking for possible duplicates and misspellings in text data using its text facet tools.
R uses a number of packages to work with data, most of which are community-created. This means many of them do not come pre-installed with R. Here is a list of packages we will use this semester. You should be able to paste this into the R console and press Enter to install them all at once. install.packages("tidyverse"); install.packages("dplyr"); install.packages("skimr"); install.packages("ggplot2"); install.packages("mosaic"); install.packages("plotly"); install.packages("todor"); install.packages("compareDF"); install.packages("future"); install.packages("rvest"); install.
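As an alternative to one `install.packages()` call per package, the same installs can be driven from a character vector so that only packages not already present are downloaded. The following is a minimal sketch, not the course's official script: `course_packages`, `missing_packages()`, and `install_missing()` are hypothetical names introduced here, and the vector contains only the package names visible in the list above, which is cut off in this copy.

```r
# Packages named in the list above (the original list is truncated,
# so this vector may be incomplete).
course_packages <- c(
  "tidyverse", "dplyr", "skimr", "ggplot2", "mosaic",
  "plotly", "todor", "compareDF", "future", "rvest"
)

# Return the packages in `wanted` that are not yet installed.
# `installed` defaults to the names of all packages found in the
# current library paths.
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

# Install only what is missing. Returns (invisibly) the names it
# attempted to install.
install_missing <- function(wanted) {
  to_install <- missing_packages(wanted)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Running `install_missing(course_packages)` in the R console would then install whatever is absent and skip the rest, so it can be rerun without reinstalling packages that are already present.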
Windows can now install a Linux operating system on your machine without the use of an emulator. This gives you a full-featured Linux environment that can interact with your normal files. Check your Windows version: First, please check the build version of Windows that you are using.