
Data Science with Python: Data Analysis & Visualization



NYC Data Science Academy
500 8th Ave
Ste 905
New York, NY 10018
Register for Course
Sunday, June 2, 2019 - 9:00am


This five-week course is an introduction to data analysis with the Python programming language and is aimed at beginners. We introduce how to work with the different data structures in Python and cover the most popular modules for data analytics and visualization, including NumPy, SciPy, pandas, Matplotlib, and Seaborn. We use the IPython notebook to demonstrate the results of code and to change code interactively during class. Our past students include people with no programming experience and people with limited exposure from a previous Python class. Students have told us that our classes are engaging, interactive, and hands-on, and that they cover a lot of material.

Day 1: Introduction to Python

Python is a high-level programming language. You will learn the basic syntax and data structures of Python. IPython provides a robust and productive environment for interactive and exploratory computing, which makes it a great tool for scientific computing and teaching.

- Introduction to IPython
- Basic objects in Python
- Variables and user-defined functions
- Control flow
- Advanced data structures

Day 2: Explore deeper with Python

Python is an object-oriented programming language. Learning a little about OOP will help you understand how Python code works. To do data analysis, the first thing you need to know is how to work with files that contain data. Data is sometimes dirty and unstructured, so you will learn text processing, including regular expressions, to clean it up.

- Introduction to object-oriented programming
- How to deal with files
- Running Python scripts
- Handling and processing strings

Day 3: Scientific computation tools

Three modules for scientific computation make Python as powerful as MATLAB: NumPy, Matplotlib, and SciPy. NumPy, short for Numerical Python, is the fundamental package for scientific computing in Python. Matplotlib is the most popular Python library for producing plots and other 2D data visualizations.
SciPy is a collection of packages addressing a number of standard problem domains in scientific computing.

- NumPy
- Matplotlib (mainly the sub-module "pyplot")
- SciPy (mainly the sub-module "stats")

Day 4: Data Visualization

Python can also generate graphics easily using Seaborn, a Python visualization library based on Matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

- Seaborn

Day 5: Data manipulation with Pandas

Pandas provides rich data structures and functions designed to make working with structured data fast, easy, and expressive. The "DataFrame" object in pandas is much like the "data.frame" object in R. Pandas makes data manipulation (filter, select, group, aggregate, etc.) as easy as it is in R.

- Pandas
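
As a taste of the Day 5 material, here is a minimal sketch of the filter, select, group, and aggregate operations on a pandas DataFrame. The table, its column names, and its values are hypothetical, invented for illustration; they are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales = pd.DataFrame({
    "city": ["NYC", "NYC", "Boston", "Boston", "Chicago"],
    "product": ["book", "pen", "book", "pen", "book"],
    "units": [10, 3, 7, 12, 5],
    "price": [12.0, 1.5, 12.0, 1.5, 12.0],
})

# Filter: keep only the rows with at least 5 units sold.
large_orders = sales[sales["units"] >= 5]

# Select: keep only the columns needed, then derive a revenue column.
selected = large_orders[["city", "units", "price"]].copy()
selected["revenue"] = selected["units"] * selected["price"]

# Group and aggregate: total units and total revenue per city.
summary = selected.groupby("city").agg(
    total_units=("units", "sum"),
    total_revenue=("revenue", "sum"),
)

print(summary)
```

Each step returns a new DataFrame, so the intermediate results can be inspected one at a time in an IPython notebook.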

