OpenRefine

Overview

OpenRefine is an open-source tool for cleaning and pre-processing messy data. While most people are familiar with data cleaning in their coding tool of choice (R, Python, Julia, etc.), OpenRefine is designed to provide powerful cleaning capabilities with minimal overhead. One of its most helpful capabilities is checking text data for possible duplicates and misspellings using its text facet tools.
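To give a feel for how duplicate detection of this kind can work, here is a minimal Python sketch of a "fingerprint" style key: values are normalized, and any values that end up with the same key are grouped as likely duplicates. This is an illustrative approximation, not OpenRefine's actual implementation; the function names `fingerprint` and `cluster` and the sample values are made up for this example.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so that near-identical strings collide."""
    # Trim surrounding whitespace and ignore case.
    text = value.strip().lower()
    # Fold accented characters to plain ASCII (e.g. "ü" becomes "u").
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Drop ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split into words, remove repeated words, and sort so word order is ignored.
    tokens = sorted(set(text.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    cities = ["New York", "new york ", "York, New", "Boston", "Zürich", "Zurich"]
    for group in cluster(cities):
        print(group)
```

A key like this catches differences in case, spacing, punctuation, accents, and word order. It will not catch a true misspelling such as a swapped or missing letter; that requires a distance-based comparison instead.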

OpenRefine on Windows

Open your web browser of choice and navigate to the OpenRefine homepage at https://openrefine.org/. Click on the download button in the left sidebar.

On the download page, scroll to the latest version of OpenRefine and select the Windows kit. If you are unsure whether Java is installed on your system, choose the Windows kit with embedded Java instead.
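If you would rather check for Java before choosing a kit, the following Python sketch looks for a `java` command on your PATH and reports its version. It assumes Python is already installed, and the helper name `java_version_text` is made up for this example. A Java installation that is not on the PATH will not be found this way, so when in doubt the kit with embedded Java remains the safer choice.

```python
import shutil
import subprocess


def java_version_text():
    """Return the output of `java -version`, or None if Java was not found."""
    # Look for a `java` executable on the PATH.
    java_path = shutil.which("java")
    if java_path is None:
        return None
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    if result.returncode != 0:
        # The command exists but did not run successfully.
        return None
    # `java -version` normally writes its report to stderr rather than stdout.
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    version = java_version_text()
    if version is None:
        print("No working Java found on PATH; consider the kit with embedded Java.")
    else:
        print(version)
```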

Once the download has completed, open the zip file and move its contents to a convenient location on your computer.

Open the resulting directory and double-click the openrefine.exe executable.

The OpenRefine executable will start a terminal window and, shortly afterwards, open a tab in your default web browser with the OpenRefine interface.
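If the browser tab does not appear, you can check whether OpenRefine is running with a short Python sketch like the one below. It assumes OpenRefine is serving its interface at the default local address, http://127.0.0.1:3333/; adjust the URL if your setup uses a different host or port. The function name `openrefine_is_up` is made up for this example.

```python
import urllib.error
import urllib.request


def openrefine_is_up(url="http://127.0.0.1:3333/", timeout=3.0):
    """Return True if something answers with HTTP 200 at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: treat as not running.
        return False


if __name__ == "__main__":
    if openrefine_is_up():
        print("OpenRefine is responding; open the address in your browser.")
    else:
        print("No response; make sure the OpenRefine terminal window is still open.")
```

Keep the terminal window open while you work, since closing it stops OpenRefine.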

OpenRefine on Mac

First, head to the download page for OpenRefine and choose the latest version for Mac.

Once the download has finished, open the downloaded file. Your browser will most likely show an alert.

Open your Applications folder in the Finder, and drag OpenRefine into the folder.

Once you have dragged the application into the Applications folder, try to open it. If you receive an alert that prevents OpenRefine from opening, continue to the next step.

Hold down the Control key and click on OpenRefine. Click Open in the menu.

A dialog will give you the option to open OpenRefine. Click Open.

You will then be asked whether you want OpenRefine to control Safari and access your files. Click OK.

A Safari window will then open showing the OpenRefine interface. If that is the case, you are all done!

Thanks to the UC Davis DataLab’s Install Guide for providing a portion of this guide.