Tyler Simko

Fundamentals of Data Science

South Amboy Enrichment: Summer 2021
July 6th-29th. Mon., Tu., Thu. 9-11am EST

From presidential campaigns to medical trials to Netflix recommendations, data analysis is at the heart of the 21st-century world. Fundamentals of Data Science is a six-week summer course that will teach you how to collect, analyze, and visualize data. Students will conduct an independent research project on any topic of their choice, culminating in a final analysis that can be used as a writing sample for college, internship, and job applications.

Students will learn:

Content Schedule

Lesson 0: The Tools of Data Science

First, we'll set up and get familiar with the tools you'll be using throughout this course: R, a programming language, and RStudio, a free application that makes working with R easier.

Lesson 1: First Steps of Data Science

What is data? What are the different types of data? How do we store and interact with data?

Lesson 2: Working with Data

Here, we dive deeper into data manipulation and learn how to work with data to answer the questions we are interested in.
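As a small preview of what data manipulation can look like, here is a minimal sketch that uses only functions built into R. The `songs` table and its values are made up for illustration, and the course itself may use different tools or packages for these steps.

```r
# Hypothetical toy data: a few made-up songs
songs <- data.frame(
  title = c("Song A", "Song B", "Song C", "Song D"),
  year = c(1999, 2005, 2012, 2020),
  minutes = c(3.5, 4.2, 2.9, 3.1)
)

# Keep only the rows for songs released in 2005 or later
recent <- subset(songs, year >= 2005)

# Add a new column computed from an existing one
recent$seconds <- recent$minutes * 60

# Sort the rows from shortest to longest song
recent <- recent[order(recent$minutes), ]

print(recent)
```

Each step answers a small question about the data: which rows do we care about, what new quantity do we need, and in what order should we look at the results?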

Lesson 3: Data Visualization

We enter the world of data visualization in R. How can we use data visualization to communicate ideas and analyses? What makes a successful graphic?

Lesson 4: Summarizing Data

How do we learn from data? We improve our data manipulation and visualization skills by calculating statistics and visualizing trends.
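To make "calculating statistics" concrete, here is a minimal sketch in base R. The `scores` table is made up for illustration and is not course data.

```r
# Hypothetical toy data: test scores for students in two made-up classes
scores <- data.frame(
  class = c("A", "A", "A", "B", "B", "B"),
  score = c(80, 90, 100, 60, 70, 95)
)

# Summary statistics for all students together
overall_mean <- mean(scores$score)
overall_median <- median(scores$score)

# Average score within each class
class_means <- aggregate(score ~ class, data = scores, FUN = mean)

print(overall_mean)
print(overall_median)
print(class_means)
```

Comparing the overall average with the per-class averages is a simple example of how a summary can hide, or reveal, differences between groups.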

Lesson 5: Mapping and Merging Data

So far, we have worked exclusively with a single dataset in normal spreadsheet format. How do we work with more complex data, like geographic data for maps? How can we combine information from multiple data sources?
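One common way to combine two data sources is to match rows on a shared column. The sketch below uses base R's `merge()` function; the `students` and `projects` tables are made up for illustration, and the course may teach a different approach.

```r
# Hypothetical toy data: two small tables that share an "id" column
students <- data.frame(
  id = c(1, 2, 3),
  name = c("Ana", "Ben", "Cam")
)
projects <- data.frame(
  id = c(1, 2, 4),
  topic = c("music", "sports", "movies")
)

# Keep only the ids that appear in both tables
both <- merge(students, projects, by = "id")

# Keep every student, filling in NA where no project matches
all_students <- merge(students, projects, by = "id", all.x = TRUE)

print(both)
print(all_students)
```

Notice that student 3 has no matching project and project 4 has no matching student. Deciding what to do with unmatched rows is one of the main choices when merging data.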

Lesson 6: Review and Advanced Plotting

We have already covered an astounding amount of material. Here, we take a step back and review what we have learned so far. We also cover a few advanced plotting techniques.

Lesson 7: Text Data

Many interesting research questions involve text, such as song lyrics, books, campaign speeches, and Tweets. Here, we'll learn how to work with text data in more detail.
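A typical first step with text is counting how often each word appears. Here is a minimal sketch using only base R. The sentence is made up, and the course may rely on dedicated text-analysis packages instead.

```r
# Hypothetical example text
text <- "The cat sat on the mat and the dog sat too"

# Lowercase the text, then split it wherever a non-letter appears
words <- strsplit(tolower(text), "[^a-z]+")[[1]]

# Drop any empty strings left over from the split
words <- words[words != ""]

# Count each word and sort from most to least frequent
word_counts <- sort(table(words), decreasing = TRUE)

print(head(word_counts, 3))
```

Lowercasing before counting means "The" and "the" are treated as the same word, which is usually what we want.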

Lesson 8: Relationships Between Variables

A great deal of scientific research explores the relationship between multiple variables. Does increased school funding improve student outcomes? Does smoking cause lung cancer? These are fundamentally questions of causal inference, which we will discuss here.
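As a starting point, we can measure how strongly two variables move together. The sketch below uses base R with a made-up `study` table. It measures association only: a strong correlation by itself does not show that one variable causes the other.

```r
# Hypothetical toy data: hours studied and test scores for five students
study <- data.frame(
  hours = c(1, 2, 3, 4, 5),
  score = c(52, 58, 61, 70, 74)
)

# Correlation: ranges from -1 to 1, with values near 1 meaning the
# two variables tend to rise together
correlation <- cor(study$hours, study$score)

# Fit a straight line predicting score from hours
fit <- lm(score ~ hours, data = study)

# The slope is the change in predicted score for each extra hour
slope <- coef(fit)[["hours"]]

print(correlation)
print(slope)
```

Even if the line fits well, other explanations are possible. For example, students who study more might also differ in other ways that affect their scores.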

Final Project

The purpose of data science is to gain knowledge and insights from data. The ultimate goal of this course is to prepare you to use the tools of data science to conduct an independent analysis of your own. This is your opportunity to apply everything we've learned in the class to a topic that you are passionate about.

The final project instructions are found here.