Data Bootcamp: Undergraduate Fall 2019
Where and When
 Instructor: Benjamin Zweig (bzweig@stern.nyu.edu)
 Teaching Fellow: Richard Li (rjl448@nyu.edu)
 Meeting times: Tues/Thurs (2:00PM – 3:15PM; 3:30PM – 4:45PM)
 Meeting place: KMEC, Room 390, Washington Square
Important Links

THE SYLLABUS All the important details about the course, procedures, important dates, etc.

THE BOOK The topics in the first half are all in the book. We will follow this closely. At the book link, click the large blue Read button to read online – or download the pdf. Both come with links.

NOTEBOOKS Github repository of notebooks used in class.

DISCUSSION GROUP Post your questions on the NYU Classes forum tab.

Final Project (Due Dec 19, 2019)
Problem Set Submissions
Assignments will be posted on NYU Classes. Submit your Python code as an IPython notebook (.ipynb) on NYU Classes.
Week By Week Topic Guide…
Python Fundamentals: (9/3 & 9/5)
Handouts: 9/3 Outline  9/5 Outline  Book  Three ideas
Examples: Gapminder  Cancer Screening  Uber in NYC  Medical Expenditures  Mortality  Earthquake  Gender Pay Gap  Fertility  Vaccines
Summary: Intro; calculations; assignments; strings; lists; tuples; built-in functions; objects; methods; tab completion; True and False; comparisons; conditionals; slicing; loops; function definitions and returns; dictionaries.
What’s due:
Python Fundamentals II, Intro to Packages: (9/10 & 9/12)
Handouts: Outline  Book chapter
Summary: Slicing; loops; function definitions and returns; dictionaries; packages; import; Pandas.
What’s due: Problem Set 1 (9/12);
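As a quick illustration of several fundamentals listed in the summaries above (slicing, conditionals, loops, function definitions and returns, dictionaries), here is a minimal sketch. The scores and the `letter_grade` function are made up for this example and are not taken from the course notebooks.

```python
# Hypothetical data: these scores are invented for illustration.
scores = [72, 88, 95, 61, 79]

# Slicing: the first three items and the last two items of a list.
first_three = scores[:3]
last_two = scores[-2:]


def letter_grade(score):
    """Return a letter grade for a numeric score using conditionals."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    return "F"


# Loop + dictionary: count how many scores fall under each letter.
grade_counts = {}
for s in scores:
    grade = letter_grade(s)
    grade_counts[grade] = grade_counts.get(grade, 0) + 1

print(first_three, last_two)
print(grade_counts)
```

Typing `scores.` and pressing Tab in a notebook lists the methods available on a list, which is the tab completion mentioned in the first week's summary.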
9/17 – NO CLASS
Cleaning & Filtering: (9/19, 9/24 & 9/26)
Handouts: Outline  (Code_Pandas_CleaningApplications)
Summary: Cleaning and filtering data.
What’s due: Problem Set 2 (9/26); Team submission (just the team member names) (9/26)
10/1 – NO CLASS (HOLIDAY)
Shaping and Matplotlib: (10/3, 10/8, 10/10 & 10/17)
Handouts: Shaping Outline  Matplotlib Outline  Book chapter
Code_Pandas_Shaping 
Code_Matplotlib (Download “Raw” as ipynb)
Code (examples  current indicators  demography  Airbnb)
Summary: Aggregations and grouping data; three approaches to graphics: dataframe plot methods, plot(x,y), and fig/ax objects and methods; lines, scatters, bars, horizontal bars, styles.
What’s due: Problem Set 3 (10/8); Project ideas submission (10/8); Problem Set 4 (10/17);
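The three approaches to graphics named in the summary above can be sketched side by side. This is only an illustration under assumptions: the dataframe, its column names, and the numbers are invented, and the `Agg` backend line is there only so the script runs without a display (it is normally unnecessary inside a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; usually not needed in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers, for illustration only.
df = pd.DataFrame({"year": [2016, 2017, 2018, 2019],
                   "sales": [10.0, 12.5, 11.0, 14.0]})

# Approach 1: dataframe plot method (returns a matplotlib Axes object).
ax1 = df.plot(x="year", y="sales", kind="line")

# Approach 2: plot(x, y) through the pyplot interface.
plt.figure()
lines = plt.plot(df["year"], df["sales"])

# Approach 3: create fig/ax objects explicitly and call their methods.
fig, ax = plt.subplots()
ax.bar(df["year"], df["sales"])
ax.set_title("Sales by year")
ax.set_xlabel("Year")
ax.set_ylabel("Sales")
```

The fig/ax approach takes a few more lines, but it gives direct access to every element of the figure, which tends to help once a plot needs customization.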
10/15 & 10/22 – NO CLASS (HOLIDAY)
10/24 – EXAM REVIEW
10/29 – MIDTERM EXAM
Merging and Data Analysis Workflow I: (10/31, 11/5 & 11/7)
Handouts: Outline
Summary: Merging and Data Analysis Workflow I.
What’s due: Problem Set 5
11/12 – PROJECT TOUCHPOINT
Regression: (11/14, 11/19 & 11/21)
Handouts:
Summary: Basic Regression Analysis
What’s due: Nothing!
11/26 – PROJECT TOUCHPOINT
11/28 – NO CLASS (THANKSGIVING)
Machine Learning: (12/3, 12/5 & 12/10)
Handouts:
(Code_Pandas_Combining  Summarizing)
Summary: Combining dataframes (merge, concatenate). We will also cover scikit-learn, a machine learning package for classification, regression, and clustering.
What’s due: Problem Set 6; Submit project data & show input with basic diagnostics
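To make the difference between merging and concatenating concrete, here is a minimal pandas sketch. The tables, column names, and values are hypothetical and chosen only to show how rows are matched or stacked.

```python
import pandas as pd

# Hypothetical tables, invented for illustration.
students = pd.DataFrame({"id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
grades = pd.DataFrame({"id": [1, 2, 4], "grade": [90, 85, 70]})

# Merge: match rows across tables on a shared key column.
# An inner merge keeps only ids present in both tables.
inner = students.merge(grades, on="id", how="inner")

# A left merge keeps every student; missing grades become NaN.
left = students.merge(grades, on="id", how="left")

# Concatenate: stack tables that share the same columns.
more_students = pd.DataFrame({"id": [5], "name": ["Di"]})
stacked = pd.concat([students, more_students], ignore_index=True)

print(inner)
print(left)
print(stacked)
```

A rough rule of thumb: merge adds columns by matching on a key, while concatenate adds rows (or columns) by position.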
Wrap Up & Data Analysis Workflow: (12/12)
Handouts: (Code_Pandas_Combining  Summarizing)
Summary: More on ML and project discussions. We will walk through a data analysis pipeline: importing, exploring, cleaning, visualizing, and analyzing data.
What’s due: Problem Set 7
Final Project Due Date: Dec 19, 2019