Data Science with R: Machine Learning


NYC Data Science Academy
500 8th Ave
Ste 905
New York, NY 10018
Register for Course
Saturday, September 8, 2018, 10:00am to 5:00pm


This class introduces a number of statistical models for supervised and unsupervised learning using the R programming language. The goal is to understand the concepts, methods, and applications of general predictive modeling and unsupervised learning, and how they are implemented in the R environment. A selection of important models (e.g. tree-based models, support vector machines) will be introduced in an intuitive manner to illustrate the process of training and evaluating models.

Week 1: Introducing Data Mining (7 hours)
    What is data mining and how to do it
        Steps to apply data mining to your data
        Primary statistical methods and tests
        Supervised versus unsupervised learning
        Regression versus classification problems
    Review of linear models
        Simple linear regression
        Logistic regression
        Generalized linear models

Week 2: Performance Measures and Dimension Reduction (7 hours)
    Evaluating model performance
        Confusion matrices
        Beyond accuracy
        Estimating future performance
    Extension of linear models
        Subset selection
        Shrinkage methods
        Dimension reduction methods

Week 3: kNN and Naive Bayes Models (7 hours)
    The k-Nearest Neighbors model
        Understanding the kNN algorithm
        Calculating distance
        Choosing an appropriate k
        Case study
    Naive Bayes models
        Understanding joint probability
        The Naive Bayes algorithm
        The Laplace estimator
        Case study

Week 4: Tree Models and SVMs (7 hours)
    Tree models
        Regression trees and classification trees
        Tree models with party
        Tree models with rpart
        Random Forest models
        GBM models
    Support Vector Machines
        Maximal margin classifiers
        Support vector classifiers
        Support vector machines

Week 5: The Association Rule and More Models (7 hours)
    Market Basket Analysis
        Understanding association rules
        The Apriori algorithm
        Case study
    Unsupervised learning
        K-means clustering
        Hierarchical clustering
    Time series models
        Stationary time series
        The ARIMA model
        The seasonal model
