Data Bootcamp: Undergrad Fall 2018

This page is your key resource for the course. Everything you need is here! Below are links to key documents such as the syllabus, the book, the blog, and my GitHub repository for the class. There is also a date-by-date list of topics, with links to the material used in each class. Please check this site regularly to stay up to date.

Last update: 1/22/2019


Where and When



Important Dates


Week By Week Guide…


Topic 1. Introduction: Data + Python = Magic!

Handouts: Book | Three ideas
Summary: It’s nice to have skills; installing Anaconda; Jupyter/IPython; data; questions; idea machines.


Topic 2. Python fundamentals 1

Handouts: Book chapter | Code
Summary: Calculations; assignments; strings; lists; tuples; built-in functions; objects; methods; tab completion.
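
As a rough illustration of the items in this summary, here is a minimal sketch. The variable names and values are made up for this example and are not taken from the course code.

```python
# Calculations and assignments
x = 2 * 3 + 1            # multiplication binds tighter than addition
ratio = 7 / 2            # "/" always returns a float in Python 3

# Strings, lists, and tuples
course = "Data Bootcamp"
numbers = [4, 8, 15]     # lists can be changed after creation
pair = (1, "one")        # tuples cannot

# Built-in functions
n_items = len(numbers)
total = sum(numbers)
kind = type(course)

# Methods are functions attached to objects
shout = course.upper()   # returns a new string; course is unchanged
numbers.append(16)       # changes the list in place
```

In Jupyter, typing `course.` and pressing Tab lists the methods available on the string.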


Topic 3. Python fundamentals 2

Handouts: Book chapter | Code
Summary: True and False; comparisons; conditionals; slicing; loops; function definitions and returns; dictionaries.
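
The ideas in this summary fit together in a few lines. The sketch below is only an illustration; the function `describe` and the list `values` are hypothetical names, not part of the course handouts.

```python
def describe(n):
    """Return a label for n using comparisons and a conditional."""
    if n > 0:
        return "positive"
    elif n == 0:
        return "zero"
    return "negative"

values = [-2, 0, 3, 5]
first_two = values[:2]        # slicing: the items at positions 0 and 1

# Loop over the list and store each result in a dictionary
labels = {}
for v in values:
    labels[v] = describe(v)

is_big = values[-1] >= 5      # a comparison evaluates to True or False
```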


Topic 4. Python fundamentals 3

Handouts: Book chapter | Code
Summary: True and False; comparisons; conditionals; slicing; loops; function definitions and returns; dictionaries.


Topic 5. Intro to Pandas

Handouts: Book chapter | Code
Summary: Packages; import; Pandas; csv files; reading csv/xls files; dataframes; columns; index; APIs.
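
A minimal sketch of the import / read / inspect workflow follows. To keep it self-contained, the csv text is typed inline and read through `io.StringIO`; the item names and prices are invented for illustration. In practice you would pass a file path or URL to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Toy csv contents (made up for this example)
csv_text = """item,price,quantity
apple,0.50,4
bread,2.25,1
milk,1.75,2
"""

df = pd.read_csv(io.StringIO(csv_text))   # returns a DataFrame
columns = list(df.columns)                # the column labels

df = df.set_index("item")                 # use the item column as the index
bread_price = df.loc["bread", "price"]    # look up by index label and column
```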


Topic 6. Python graphics: Matplotlib fundamentals

Handouts: Book chapter | Code
Summary: Approach to graphics focused on the fig/ax objects and methods; lines, scatters, bars, horizontal bars, histograms, styles.
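
The fig/ax approach can be sketched as follows, assuming Matplotlib is installed. The data are arbitrary, and the `Agg` backend line is only there so the example runs without a display; it is not needed inside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")              # non-interactive backend; omit in Jupyter
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

# Create one figure holding two axes, side by side
fig, ax = plt.subplots(1, 2, figsize=(8, 3))

# Each plot is made by calling a method on a specific axes object
ax[0].plot(x, y)
ax[0].set_title("Line")

ax[1].bar(x, y)
ax[1].set_title("Bar")

fig.tight_layout()
```

Other plot types follow the same pattern, for example `ax.scatter`, `ax.barh`, and `ax.hist`.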



Topic 7. Thinking about projects

Handouts: Outline | Project Examples | Code (examples | current indicators | demography | Airbnb)
Summary: Projects: say something interesting with data. Idea machines. Examples.


Topic 8. More Pandas: Combining

Handouts: Code
Summary: Often we need to combine data from two or more dataframes. We explore the merge feature of Pandas. Along the way we take an extended detour to review methods for downloading and unzipping compressed files.
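
Here is a hedged sketch of the unzip-then-merge pattern. To avoid depending on a network connection, it builds a small zip archive in memory as a stand-in for a downloaded file; with a real download you might instead wrap the downloaded bytes in `io.BytesIO`. The file name, column names, and values are all invented for illustration.

```python
import io
import zipfile
import pandas as pd

# Stand-in for a downloaded zip file: build one in memory
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("ratings.csv", "movie_id,rating\n1,4.0\n2,3.5\n1,5.0\n")
buffer.seek(0)

# Open the archive and read the csv file inside it
with zipfile.ZipFile(buffer) as zf:
    with zf.open("ratings.csv") as f:
        ratings = pd.read_csv(f)

# A second dataframe that shares the movie_id column
movies = pd.DataFrame({"movie_id": [1, 2, 3], "title": ["A", "B", "C"]})

# Keep every row of ratings and attach the matching title
combined = pd.merge(ratings, movies, on="movie_id", how="left")
```

The `how` argument controls which rows survive: "left" keeps all rows of the first dataframe, while "inner" keeps only keys found in both.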


Topic 9. More Pandas: Cleaning

Handouts: Code
Summary: Pandas has incredible facilities for managing data. We look at fixing numbers misidentified as strings, managing missing observations, selecting variables and observations, and the isin and contains methods. Application: What is the price of Guacamole at Chipotle?
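
The cleaning steps named above might look like the sketch below. The rows are invented and only loosely modeled on an order dataset; the column names `item_name` and `item_price` are assumptions, and the prices are not real Chipotle data.

```python
import pandas as pd

# Toy data: prices stored as strings, one missing item name
orders = pd.DataFrame({
    "item_name": ["Chips and Guacamole", "Chicken Bowl", "Side of Chips", None],
    "item_price": ["$4.45", "$8.75", "$1.69", "$2.39"],
})

# Fix numbers misidentified as strings: drop the "$" and convert to float
orders["item_price"] = (
    orders["item_price"].str.replace("$", "", regex=False).astype(float)
)

# Manage missing observations: drop rows with no item name
orders = orders.dropna(subset=["item_name"])

# contains: rows whose name includes a substring
has_guac = orders["item_name"].str.contains("Guacamole")

# isin: rows whose name matches any entry in a list
sides = orders["item_name"].isin(["Side of Chips", "Chips and Guacamole"])

# Select observations with the boolean mask, then summarize
guac_price = orders.loc[has_guac, "item_price"].mean()
```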

Topic 10. More Pandas: Shaping

Handouts: Code
Summary: Understand and be able to apply the melt/stack/unstack/pivot methods.
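
A small sketch of how the four methods relate, using a made-up two-country table; all names and numbers are illustrative only.

```python
import pandas as pd

# Wide format: one row per country, one column per year
wide = pd.DataFrame({
    "country": ["A", "B"],
    "2017": [1.0, 3.0],
    "2018": [2.0, 4.0],
})

# melt: wide -> long, one row per (country, year) pair
long = wide.melt(id_vars="country", var_name="year", value_name="value")

# pivot: long -> wide, with country as the index and years as columns
back = long.pivot(index="country", columns="year", values="value")

# stack: move the column labels into the index (gives a Series)
stacked = back.stack()

# unstack: move the inner index level back out to the columns
unstacked = stacked.unstack()
```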


Optional Topic: Stats Models

Optional Topic: Mapping and GeoPandas

Optional Topic: Time Series Methods

Optional Topic: Basic Machine Learning