Pre-class Prep
Please be sure to complete the following before class:
Install required packages
install.packages("palmerpenguins") # a package containing the `penguins` data set, which we'll use for plotting practice
install.packages("tidyverse") # a collection of packages used for data wrangling / manipulation and visualization (including {ggplot2})
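To confirm that both installs worked, you could run a quick check like the one below. This is a minimal sketch rather than an official course step: the object names (`required`, `is_installed`, `not_installed`) are made up for illustration, and the scatter plot is just one arbitrary way to confirm that `penguins` loads and {ggplot2} can draw it.

```r
# Check whether each required package can be found (object names here are illustrative)
required <- c("palmerpenguins", "tidyverse")
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!is_installed]

if (length(not_installed) > 0) {
  # Something is missing: report which package(s) still need install.packages()
  message("Still need to install: ", paste(not_installed, collapse = ", "))
} else {
  # Both packages are available: load them and draw a simple test plot
  library(tidyverse)
  library(palmerpenguins)

  p <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
    geom_point()
  print(p)
}
```

If everything is installed, a scatter plot should appear. {ggplot2} may warn about removed rows because `penguins` contains a few missing values; that warning is expected.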
Create your EDS-240-class-examples repository
We’ll be coding together quite a bit throughout this course. To stay organized, we’ll complete all of our in-class examples in one repository (repo). Create and clone a GitHub repository named EDS-240-class-examples. For step-by-step instructions, unfold the following note (collapsed to save space):
Lecture Materials
Week 1 instruction is broken down into three lessons:
Course logistics & syllabus
Intro to data visualization
{ggplot2} review
Discussion Materials
Pre-discussion Prep
Before coming to section, you’ll need to install some packages and download data. For step-by-step instructions, unfold the following note (collapsed to save space):
Background
By now, you may have heard / read something like, “Data scientists spend 80% of their time preparing their data for analysis and / or visualization.” And while that may not be totally accurate for all data scientists or all projects, you will spend lots of time wrestling with data. You’ll spend this week’s discussion cleaning up a messy data set on hydraulic fracturing (aka fracking), with the goal of (re)familiarizing yourselves with some commonly used tidyverse functions.
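For a rough flavor of the kind of wrangling involved, here is a small sketch using a few common tidyverse functions. The `messy` tibble below is entirely made up for illustration; it is not the FracFocus data, and its column names and values are assumptions rather than anything taken from the real data set or the discussion solution.

```r
library(tidyverse)

# Hypothetical messy data (NOT the real FracFocus data; names and values are invented)
messy <- tibble(
  `State Name`         = c("texas", "COLORADO", NA, "Texas"),
  `Total Water Volume` = c(120000, 98000, 50000, NA)
)

clean <- messy |>
  # convert column names to lower snake_case
  rename_with(~ str_to_lower(str_replace_all(.x, " ", "_"))) |>
  # standardize inconsistent capitalization, e.g. "texas" and "COLORADO"
  mutate(state_name = str_to_title(state_name)) |>
  # drop rows with a missing water volume
  drop_na(total_water_volume) |>
  # drop rows with a missing state name
  filter(!is.na(state_name)) |>
  # keep only the columns of interest, largest volume first
  select(state_name, total_water_volume) |>
  arrange(desc(total_water_volume))

clean
```

The real data set will need different (and more) steps, but the pattern of chaining small, single-purpose functions with a pipe is the same.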
This week’s data comes courtesy of Jeremy Singer-Vine’s Data is Plural weekly newsletter of useful / curious data sets (the 2023.09.27 edition). Singer-Vine’s description:
Since launching in 2011, FracFocus has become the largest registry of hydraulic fracturing chemical disclosures in the US. The database, available to explore online and download in bulk, contains 210,000+ such disclosures from fracking operators; it details the location, timing, and water volume of each fracking job, plus the names and amounts of chemicals used. The project is managed by the Ground Water Protection Council, “a nonprofit 501(c)6 organization whose members consist of state ground water regulatory agencies”. As seen in: The latest installment of the New York Times’ Uncharted Water series.
Interested in reading more about fracking? Check out this communications piece from USGS to start.
Solution
You’ll get the most out of discussion section if you physically type out the code yourself (rather than copying / pasting)!
Note: Some of the wrangling in the solution (collapsed, below) may seem a bit superfluous – we’ve included lots of steps to (re)introduce as many of the common wrangling functions as possible, given our data set.
Assignment Reminders
Assignment Type | Assignment Title | Date Assigned | Date Due |
---|---|---|---|
EOC | End-of-class survey (week 1) | Mon 01/08/2024 | Mon 01/08/2024, 11:55pm PT |
SR | Pre-course reflection (SR #1) | Mon 01/08/2024 | Sat 01/13/2024, 11:59pm PT |
HW | Homework Assignment #1 | Mon 01/08/2024 | Sat 01/20/2024, 11:59pm PT |