Intro to Data Science Bootcamp (6-Week Course - 2 Live Weekends)

Overview

265 West 37th Street 206
New York, NY 10018
Register for Course
Saturday, November 10, 2018 - 9:00am to Sunday, November 18, 2018 - 5:00pm
$1,500

Details

This bootcamp introduces data science and machine learning using the Python programming language. Through this intense, 6-week program you will begin to master the skills needed to manipulate, visualize, and explore datasets and to apply machine learning to them to extract valuable insights.

Expert Instructor

This course is taught by Ted Petrou, an expert at data exploration and machine learning using Python. He is the author of Pandas Cookbook, a thorough step-by-step guide to accomplishing a variety of data analysis tasks with Pandas. He is ranked in the top 0.1% of Stack Overflow users of all time. He is also the author of the Python data exploration libraries Dexplo and Dexplot.

Small Class Size

This is a small class with at most 15 participants, which allows everyone to participate fully and get questions answered quickly.

Schedule

Oct 22 - Nov 4: Three weeks prior to live classes, students will receive a very thorough, 200-page precourse assignment on the fundamentals of Python. It contains over 100 exercises and a final project in which students build a poker game complete with a simple AI.

Nov 5-9: One week prior to live classes, students will receive another assignment on the basics of the Pandas library, with over 100 pages and 100 exercises. Two live online classes will be held from 7 - 10 p.m. on Nov 5th and 7th.

Nov 10-11 and 17-18, 9 a.m. - 5 p.m.: Live classes held near Times Square, New York City. Over 1,000 pages of material, 400 questions with detailed solutions, several mini-projects, and a few major case studies will be available. During the week between the live weekends, students will work on completing two of these case studies.

Nov 20 - Dec 2: Students will complete three end-to-end data analyses, which will be personally reviewed by Ted. Students will learn how to create interactive dashboards and use them to present their results.

Structure of Course

Learning is accomplished by working through difficult assignments, then receiving and reviewing model solutions. Using a 'flipped classroom', students will read and prepare each day's material before coming to class. In class, students will rotate between instructor-guided lessons and student-focused exercises and projects. Ted is a very active participant and will sit alongside students as they complete the exercises.

Live Classes Syllabus

Day 1: Minimally Sufficient Pandas
The Pandas library is powerful yet confusing, as there are often multiple ways to complete the same task. Students will learn a small yet powerful subset of Pandas that will allow them to complete many tasks without getting distracted by syntax. Students will also learn a simple yet effective process for building a workflow in a Jupyter Notebook.

Day 2: Split-Apply-Combine
Insights within datasets are often hidden amongst different groupings. The split-apply-combine paradigm is the fundamental procedure for exploring differences amongst distinct groups within datasets.

During the week: Tidy Data
Real-world data is messy and not immediately ready for aggregation, visualization, or machine learning. Identifying messy data and transforming it into tidy data gives the data a structure that makes further analysis easier.

Day 3: Exploratory Data Analysis
Exploratory data analysis is a process for gaining understanding and intuition about datasets. Visualizations are the foundation of EDA and communicate the discoveries made within it. Matplotlib, the workhorse for building visualizations, will be covered first, followed by pandas' convenient interface to it. Finally, the Seaborn library, which works directly with tidy data, will be used to create elegant visualizations with little effort.

Day 4: Applied Machine Learning
After tidying, exploring, and visualizing data, machine learning models can be applied to gain deeper insights into the data.
Students will build workflows for preparing data, fitting and validating models, and making predictions with Scikit-Learn, Python's powerful machine learning library. The very latest workflows for Scikit-Learn have been incorporated into this material. See this blog post from Ted for more info.

Instructor

Ted Petrou is the author of Pandas Cookbook and founder of both Dunder Data and the Houston Data Science Meetup group. He worked as a data scientist at Schlumberger, where he spent the vast majority of his time exploring data and building data products. Ted received his Master's degree in statistics from Rice University and used his analytical skills to play poker professionally and teach math before becoming a data scientist.
