Data Bootcamp: MBA Summer 2019
Where and When
 Instructor: Benjamin Zweig (bzweig@stern.nyu.edu)
 Teaching fellow: Supriya Jha (sj2685@nyu.edu)
 Meeting times: Mon & Wed 6PM to 9PM (May 22 to Jul 3)
 Meeting place: KMEC Room 280, Washington Square
Important Links

THE SYLLABUS. All the important details about the course, procedures, important dates, etc.

THE BOOK. The topics in the first half of the course are all in the book, and we will follow it closely. At the book link, click the large blue Read button to read online, or download the PDF. Both versions include links.

NOTEBOOKS. GitHub repository of the notebooks used in class.

DISCUSSION GROUP. Post your questions on the Forum tab in NYU Classes.

Final Project Due Date: Jul 10, 2019
Problem Set Submissions
Assignments will be posted on NYU Classes. Submit your Python code there as a Jupyter (IPython) notebook (.ipynb file).
Week By Week Guide…
Class 1 (May 22, 2019): Python Fundamentals 1
Handouts: Outline  Book  Three ideas
Examples: Gapminder  cancer screening  Uber in NYC  medical expenditures  mortality  earthquake  Gender pay gap  Fertility  Vaccines
Summary: It’s nice to have skills; installing Anaconda; Spyder and Jupyter/IPython; data; questions; idea machines.
Class 2 (May 29, 2019): Python Fundamentals 2
Handouts: Outline  Book chapter
Summary: Calculations; assignments; strings; lists; tuples; built-in functions; objects; methods; tab completion.
What’s due: Problem Set 1
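The Class 2 topics can be previewed with a minimal sketch. The variable names and numbers below are invented for illustration and are not taken from the course notebooks.

```python
# Calculations and assignments
price = 40.0
quantity = 3
total = price * quantity            # 120.0

# Strings are objects with methods
course = "data bootcamp"
title = course.title()              # "Data Bootcamp"

# Lists can be changed in place; tuples cannot
scores = [88, 92, 75]
scores.append(60)                   # the list now has four items
point = (2, 5)                      # a tuple is fixed once created

# Built-in functions work on many kinds of objects
n_scores = len(scores)              # 4
best = max(scores)                  # 92
average = sum(scores) / n_scores    # 78.75

print(total, title, n_scores, best, average)
```

In Jupyter or Spyder, typing `course.` and pressing Tab lists the methods available on the string, which is the tab completion mentioned in the summary.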
Class 3 (Jun 3, 2019): Python Fundamentals 3, Intro to Packages and Pandas
Handouts: Outline  Book chapter 
Summary: True and False; comparisons; conditionals; slicing; loops; function definitions and returns; dictionaries; packages; import; Pandas.
What’s due: Problem Set 2; Team submission (team member names only)
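As a rough illustration of the Class 3 topics, the sketch below combines a function definition, a conditional, a loop, a dictionary, and a slice. The function name `letter_grade`, the cutoffs, and the scores are hypothetical and chosen only for this example.

```python
def letter_grade(score):
    """Return a letter grade for a numeric score (cutoffs are made up)."""
    if score >= 90:          # a comparison evaluates to True or False
        return "A"
    elif score >= 80:
        return "B"
    else:
        return "C"

# A dictionary maps keys (names) to values (scores)
scores = {"ana": 93, "ben": 85, "cy": 71}

# Loop over the key-value pairs and build a second dictionary
grades = {}
for name, score in scores.items():
    grades[name] = letter_grade(score)

names = sorted(scores)       # sorted list of the dictionary's keys
first_two = names[:2]        # slicing: items at positions 0 and 1

print(grades)
print(first_two)
```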
Class 4 (Jun 5, 2019): Cleaning & Filtering
Handouts: Outline  Code_Pandas_Cleaning (applications)
Summary: Cleaning and filtering data.
What’s due: Problem Set 3; Project ideas submission
Class 5 (Jun 12, 2019): Matplotlib
Handouts: Outline  Book chapter  Code (Download “Raw” as ipynb) 
Summary: Three approaches to graphics: dataframe plot methods, plot(x,y), and fig/ax objects and methods; lines, scatters, bars, horizontal bars, styles.
What’s due: Problem Set 4
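The three approaches to graphics named in the Class 5 summary might look like the sketch below. The dataframe and its column names are invented for illustration, and the `Agg` backend line is an assumption that lets the code run as a plain script; leave it out in Jupyter so that figures display inline.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: no display available; omit this line in Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({"year": [2015, 2016, 2017, 2018],
                   "gdp": [1.0, 1.2, 1.5, 1.9]})

# Approach 1: dataframe plot method (returns a matplotlib Axes object)
ax1 = df.plot(x="year", y="gdp", kind="line")

# Approach 2: plot(x, y) through the pyplot interface
plt.figure()
lines = plt.plot(df["year"], df["gdp"])

# Approach 3: create fig/ax objects, then call methods on them
fig, ax = plt.subplots()
ax.bar(df["year"], df["gdp"])
ax.set_title("GDP by year (made-up data)")
ax.set_xlabel("year")
ax.set_ylabel("gdp")
```

The third approach takes more typing but gives the most control, since every element of the figure is reachable through a method on `fig` or `ax`.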
Class 6 (Jun 17, 2019): Shaping and Continuation of Matplotlib
Handouts: https://github.com/nyusterndatabootcamp/teaching_materials/blob/master/documents/bootcamp_topic_pandasshape.pdf  Code_Pandas_Shaping
Code (examples  current indicators  demography  Airbnb)
Summary: Aggregating and grouping data.
What’s due: Problem Set 5
Class 7 (Jun 19, 2019): Regression
Handouts:
Summary: Basic regression analysis.
What’s due: Nothing!
Class 8 (Jun 24, 2019): Merging
Handouts: Outline  Code_Pandas_Cleaning (applications)
Summary: Merging dataframes.
What’s due: Problem Set 6
Class 9 (Jun 26, 2019): Machine Learning 1
Handouts: Outline
What’s due: Nothing!
Class 10 (Jul 1, 2019): Machine Learning 2
Handouts: Code_Pandas_Combiningsummarizing
Summary: Combining dataframes (merge, concatenate). We will also cover scikit-learn, a machine learning package that implements a range of classification, regression, and clustering algorithms.
What’s due: Problem Set 7; submit project data and show the input with basic diagnostics
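The difference between merging and concatenating dataframes can be sketched as follows. The store names, columns, and values are invented for illustration and do not come from the class notebooks.

```python
import pandas as pd

# Two made-up tables that share a "store" column
sales = pd.DataFrame({"store": ["A", "B", "C"],
                      "revenue": [100, 150, 90]})
cities = pd.DataFrame({"store": ["A", "B", "D"],
                       "city": ["NYC", "Boston", "Chicago"]})

# Merge matches rows on a key column.
# An inner merge keeps only the stores found in both tables (A and B).
inner = pd.merge(sales, cities, on="store", how="inner")

# A left merge keeps every row of `sales`; store C has no match,
# so its city is missing (NaN).
left = pd.merge(sales, cities, on="store", how="left")

# Concatenate stacks tables that have the same columns on top of each other.
more_sales = pd.DataFrame({"store": ["E"], "revenue": [120]})
stacked = pd.concat([sales, more_sales], ignore_index=True)

print(inner)
print(left)
print(stacked)
```

Merging widens a table by adding columns from a second source, while concatenating lengthens it by adding rows.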
Class 11 (Jul 3, 2019): Wrap Up & Data Analysis Workflow
Handouts: Code_Pandas_Combiningsummarizing
Summary: More on ML and project discussions. Walk through a data analysis pipeline: importing, exploring, cleaning, visualizing, and analyzing data.
What’s due: Problem Set 8
Final Project Due Date: Jul 10, 2019