gagan-iitb/CS550

Machine Learning

Course Page for CS550 (Machine Learning) to be taught at IIT Bhilai, India in the Monsoon Semester of 2023.
Course Instructor: Dr. Gagan Raj Gupta
Office Location: ED1 Room 412

Motivation and Objectives

Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize objects, analyze sentiments, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms of machine learning from a variety of perspectives.

Topics include:

  • Supervised Learning (Regression/Classification), Linear models: Linear Regression, Logistic Regression, Generalized Linear Models, Support Vector Machines, Nonlinearity and Kernel Methods, Multi-class/Structured Outputs, Ranking/Grading
  • Evaluating Machine Learning algorithms and Model Selection, Ensemble Methods (Boosting, Bagging, Random Forests), Sparse Modeling and Estimation
  • Unsupervised Learning, Clustering: K-means/Kernel K-means, Dimensionality Reduction: PCA and Kernel PCA, Matrix Factorization and Matrix Completion
  • Generative Models (mixture models and latent factor models), Diffusion models, GAN, (Variational) Autoencoders
  • Assorted Topics: learning theory (bias/variance tradeoffs, practical advice)
  • Deep Learning and Feature Representation Learning: CNN, RNN, GNN
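
As a small illustration of the clustering topic above, here is a minimal K-means (Lloyd's algorithm) sketch in NumPy; the toy data and helper names are illustrative, not course-provided code:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two well-separated 2-D blobs of 50 points each
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
               rng.normal(5.0, 0.5, (50, 2))])

def kmeans(X, k, iters=20):
    # Lloyd's algorithm: alternate assignment and centroid update
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if empty)
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
    return labels, centroids

labels, centroids = kmeans(X, k=2)
print(np.round(centroids, 1))
```

The kernelized variants covered in class replace the Euclidean distance above with distances computed in a feature space.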

The course will also discuss recent machine-learning applications, such as computer vision, medical imaging, time-series mining, bioinformatics, web and industrial data processing.

Programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics, and algorithms currently needed by people researching machine learning.

Learning Outcomes

  • Implement and analyze existing learning algorithms, including well-studied methods for classification, regression, structured prediction, clustering, and representation learning
  • Integrate multiple facets of practical machine learning in a single system: data preprocessing, learning, regularization, and model selection
  • Describe the formal properties of models and algorithms for learning and explain the practical implications of those results
  • Compare and contrast different paradigms for learning (supervised, unsupervised, etc.)
  • Design experiments to evaluate and compare different machine learning techniques on real-world problems
  • Employ probability, statistics, calculus, linear algebra, and optimization in order to develop new predictive models or learning methods
  • Given a description of an ML technique, analyze it to identify
    1. the expressive power of the formalism;
    2. the inductive bias implicit in the algorithm;
    3. the size and complexity of the search space;
    4. the computational properties of the algorithm;
    5. any guarantees (or lack thereof) regarding termination, convergence, correctness, accuracy or generalization power.

Class Timings

Lectures (Room LHC 202): Wednesday and Friday 11:30 a.m. to 12:20 p.m., Thursday 8:30 a.m. to 9:20 a.m.
Lab (Room LHC 202): Tuesday 3:30-5:30 p.m.

Grading Policy

The grading policy has been designed to give students as much hands-on practice as possible while ensuring that the fundamentals are strong. Students who are the first to submit PRs with correct solutions to assignments/homework will receive Class Participation credits (bonus).

  • Mid-sem Exam: 20%
  • End-sem Exam: 20%
  • Theory/Conceptual Homework: 10%
  • Programming Assignments: 20%
  • Major Project: 20%
  • Surprise Quiz(es): 10% (5 in-class/lab assessments, typically after every segment). You won't get any prior intimation.

Two assignments (scaled up to a total of 600 points)

Asg 1: Regression + Classification. Concepts: feature engineering, train/val/test splits, hyper-parameter tuning, understanding of performance metrics, reporting. Optional concepts: Random Forests, Boosting, Mixing Models.
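
As a rough sketch of the workflow Asg 1 expects (this is not the assignment starter code; the data, the closed-form ridge model, and the hyper-parameter grid are all illustrative stand-ins), tune on a validation split and report test error exactly once:

```python
import numpy as np

rng = np.random.default_rng(42)
# Synthetic regression data as a stand-in for the assignment dataset
X = rng.normal(size=(200, 5))
true_w = np.array([1.5, -2.0, 0.0, 0.5, 3.0])
y = X @ true_w + rng.normal(0, 0.5, 200)

# 60/20/20 train/validation/test split
idx = rng.permutation(200)
tr, va, te = idx[:120], idx[120:160], idx[160:]

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: w = (X^T X + lam I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rmse(w, X, y):
    return np.sqrt(np.mean((X @ w - y) ** 2))

# Tune the regularization strength on the validation set only
lams = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(lams, key=lambda lam: rmse(ridge_fit(X[tr], y[tr], lam), X[va], y[va]))

# Report final performance once, on the held-out test set
w = ridge_fit(X[tr], y[tr], best_lam)
print(best_lam, round(rmse(w, X[te], y[te]), 3))
```

The key discipline graded in the assignment is the same as here: the test split is touched only for the final report, never for tuning.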

Asg 2: Image Classification. Concepts: CNN architectures, transfer learning, model customization, regularization, the trade-off between model size and accuracy.

Groups Policy

  • Students are encouraged to discuss homework and assignments with each other, but the submissions have to be original. If we find plagiarism, your grade will be reduced to D or F.
  • Assignments will be individual effort only
  • Major Project can be done in groups of 2 or 3 with appropriate justification

Late submission policy

  • Every homework and assignment will allow ample time for completion. No late days will be allowed for any homework/assignment/project deadline.
  • Students are encouraged to make regular submissions to the Canvas portal rather than waiting until the last minute.
  • In exceptional circumstances, a student may seek the instructor's permission to skip an assignment or submit it late. This will be allowed at most ONCE per student in the semester.

Attendance Policy

  • Students are expected to attend each class and lab session. There will be surprise quizzes and attendance may also be taken.
  • If a student's attendance in the lecture component falls below 75%, they will not be allowed to appear for the exam.

Pre-requisites

Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate.

  • Programming experience in a general programming language. Specifically, you need to have written programs of a few hundred lines of code from scratch. Note: for each programming assignment, you will be required to use Python. You will be expected to know, or be able to quickly pick up, that language.

  • Basic familiarity with probability and statistics: (Conditional probability, Bayes Rule, Random variable, independence, conditional independence, Expectation, Variance, Concentration Inequalities, Distributions, Gaussian, Multi-variate)

  • Linear Algebra: vectors and matrices, inner product, projection, basis (complete, orthonormal), orthogonality, linear (in)dependence; eigenvalues and eigenvectors; singular values and vectors; SVD

  • Discrete mathematics: (Proofs, Induction, Logic, Combinatorics, Graphs)

  • You must strictly adhere to these pre-requisites! Even if IIT Bhilai's registration system does not prevent you from registering for this course, it is still your responsibility to make sure you have all of these prerequisites before you register.
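
To gauge the expected probability background, a short NumPy self-check (the numbers are illustrative, not from the course): Bayes' rule on a diagnostic-test setup, plus a Monte Carlo check of expectation and variance:

```python
import numpy as np

# Bayes' rule on a classic diagnostic-test example:
# P(D) = 0.01, P(+|D) = 0.95, P(+|not D) = 0.05
p_d, p_pos_d, p_pos_nd = 0.01, 0.95, 0.05
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)   # law of total probability
p_d_pos = p_pos_d * p_d / p_pos                # Bayes' rule: P(D|+)
print(round(p_d_pos, 3))  # small despite the seemingly accurate test

# Expectation and variance of a Gaussian, checked by Monte Carlo
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=3.0, size=100_000)
print(round(samples.mean(), 1), round(samples.var(), 1))  # near 2.0 and 9.0
```

If deriving P(D|+) by hand and predicting both printed values feels routine, the probability prerequisite is in good shape.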
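
For the linear algebra prerequisite, a short NumPy self-check (our own example, not course material) covering SVD, eigenvalues, and orthogonal projection:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))

# SVD: A = U diag(s) V^T, with orthonormal columns in U, V and s >= 0
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Singular values are the square roots of the eigenvalues of A^T A
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]  # sorted descending, like s
assert np.allclose(np.sqrt(np.maximum(eigvals, 0)), s)

# Orthogonal projection of b onto the column space of A
b = rng.normal(size=4)
proj = U @ (U.T @ b)
# The residual is orthogonal to every column of A
assert np.allclose(A.T @ (b - proj), 0)
print("all checks passed")
```

Being able to predict each assertion before running it is roughly the level of fluency the course assumes.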

Books (Textbook)

  • [HML] Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Aurélien Géron, 3rd edition.
  • [ISLPy] An Introduction to Statistical Learning with Applications in Python, Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, and Jonathan Taylor.

Reference Books

  • [ML] Machine Learning, Tom Mitchell.
  • [PRML] Pattern Recognition and Machine Learning, Christopher Bishop.
  • [PML] Probabilistic Machine Learning, Kevin Murphy, 2nd edition.
  • [CIML] A Course in Machine Learning, Hal Daumé III.
  • [MML] Mathematics for Machine Learning, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong.

Class Materials

Similar Courses

Detailed Schedule


| # | Week | Topics planned in this week | Text Book Reference | Reading Notebooks |
|---|------|-----------------------------|---------------------|-------------------|
| 1 | Jul 29 | Tutorial 1: Linear Algebra and Python Libraries | MML Ch 2-4 | Linear Algebra; numpy; pandas |
| 2 | Aug 1 | The Machine Learning Landscape: Applications, Types, Challenges | HML Ch 1, CIML Ch 1 | AlgoVsModel |
| 2 | Aug 1 | End-to-End Approach: Data Collection and Preparation | HML Ch 2 | Data Carpentry |
| 2 | Aug 1 | Tutorial 2: Vector Calculus | MML Ch 5 | Calculus |
| 3 | Aug 8 | Limits of Learning | CIML Ch 2 | |
| 3 | Aug 8 | Regression: Linear, Polynomial, Regularization, Logistic | HML Ch 4 | Demo1, Demo2, UnderFitOverFit |
| 3 | Aug 8 | Tutorial 3: Continuous Optimization | MML Ch 7 | |
| 4 | Aug 15 | Geometry and Nearest Neighbors | CIML Ch 3 | |
| 4 | Aug 15 | Perceptron, Practical Issues in ML | CIML Ch 4, Ch 5 | |
| 4 | Aug 15 | SVM: Hard-margin, Soft-margin, Linear, Non-linear, SVM Regression | HML Ch 5 | |
| 4 | Aug 15 | Tutorial 4: Probability and Distributions | MML Ch 6 | Probability |
| 5 | Aug 22 | SVM: Kernelized SVM, Online SVM | HML Ch 5 | |
| 5 | Aug 22 | Decision Trees: Entropy, Regularization | HML Ch 6 | |
| 5 | Aug 22 | Tutorial 5: TensorFlow/PyTorch | | |
| 6 | Aug 29 | Ensemble Learning: Bagging, Random Forests | HML Ch 7 | |
| 6 | Aug 29 | Ensemble Learning: Boosting, AdaBoost, Gradient Boosting | HML Ch 7 | |
| 7 | Aug 29 | Probabilistic Modeling | CIML Ch 9 | |
| 6 | Sep 2 | Project: Build an ensemble of models | | |
| 6 | Sep 7 | Tierce 1 Exam | | |
| 7 | Sep 12 | Neural Networks Introduction | HML Ch 10 | |
| 7 | Sep 12 | Compute Graph, Auto-diff | HML Appendix D | |
| 8 | Sep 19 | Batch Normalization, Gradient Clipping, Regularization, Optimization Techniques | HML Ch 11 | |
| 8 | Sep 19 | CNNs, Main Architectures | HML Ch 14 | |
| 8 | Sep 26 | RNN, Attention Models | HML Ch 15 | |
| 9 | Sep 26 | Forecasting Time Series | HML Ch 15 | |
| 9 | Oct 10 | Autoencoders, GAN | | |
| 10 | Oct 10 | Generative Models | | |
| 10 | Oct 17 | Learning Theory | | |
| 11 | Oct 17 | | | |
| 11 | | Project: Applications/Paper with code | | |
| 12 | Oct 1 - Oct 9 | Mid-sem Break | | |
| 12 | Oct 21 | Tierce 2 Exam | | |
| 13 | Oct 27 | Reinforcement Learning | IVB Ch 10 | |
| 14 | Nov 4 | Basic concepts in RL, value iteration, policy iteration | | |
| 14 | Nov 11 | Model-based RL, value function approximation | | |
| 15 | Nov 18 | Fairness, algorithmic bias, explainability, privacy | IVB Ch 14 | Bias |
| 15 | Nov 25 | Fairness, algorithmic bias, explainability, privacy | | |
| 16 | Nov 25 | | | |
| 17 | Nov 25 | Project: Reinforcement Learning Application | | |

Resources

Datasets

We will benefit from other people’s efforts:

  • Google Dataset Search
  • Amazon’s AWS datasets
  • Kaggle datasets
  • Wikipedia’s list
  • UC Irvine Machine Learning Repository
  • Quora.com
  • Reddit
  • Dataportals.org
  • Opendatamonitor.eu
  • Quandl.com