Course Page for CS550 (Machine Learning) to be taught at IIT Bhilai, India in the Monsoon Semester of 2022.
Course Instructor: Dr. Gagan Raj Gupta
Machine Learning is concerned with computer programs that automatically improve their performance through experience (e.g., programs that learn to recognize objects, analyze sentiment, recommend music and movies, and drive autonomous robots). This course covers the theory and practical algorithms for machine learning from a variety of perspectives.
Topics include:
- Supervised Learning (Regression/Classification), Linear models: Linear Regression, Logistic Regression, Generalized Linear Models, Support Vector Machines, Nonlinearity and Kernel Methods, Multi-class/Structured Outputs, Ranking
- Evaluating Machine Learning algorithms and Model Selection, Ensemble Methods (Boosting, Bagging, Random Forests), Sparse Modeling and Estimation
- Unsupervised Learning, Clustering: K-means/Kernel K-means, Dimensionality Reduction: PCA and kernel PCA, Matrix Factorization and Matrix Completion
- Generative Models (mixture models and latent factor models)
- Assorted Topics: learning theory (bias/variance tradeoffs, practical advice); reinforcement learning
- Deep Learning and Feature Representation Learning
The course will also discuss recent applications of machine learning in areas such as medical imaging, data mining, bioinformatics, and text and web data processing.
Programming assignments include hands-on experiments with various learning algorithms. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning.
- Implement and analyze existing learning algorithms, including well-studied methods for classification, regression, structured prediction, clustering, and representation learning
- Integrate multiple facets of practical machine learning in a single system: data preprocessing, learning, regularization and model selection
- Describe the formal properties of models and algorithms for learning and explain the practical implications of those results
- Compare and contrast different paradigms for learning (supervised, unsupervised, etc.)
- Design experiments to evaluate and compare different machine learning techniques on real-world problems
- Employ probability, statistics, calculus, linear algebra, and optimization in order to develop new predictive models or learning methods
- Given a description of an ML technique, analyze it to identify:
  - the expressive power of the formalism;
  - the inductive bias implicit in the algorithm;
  - the size and complexity of the search space;
  - the computational properties of the algorithm;
  - any guarantees (or lack thereof) regarding termination, convergence, correctness, accuracy or generalization power.
Lectures: Friday 11:30 a.m. to 12:50 p.m., Wednesday 8:30 a.m. to 9:50 a.m.
Tutorial: Monday 5:00 p.m. to 6:30 p.m.
The grading policy has been designed to give students as much hands-on practice as possible while making sure the fundamentals are strong. To help students manage their academic workload and personal needs, there is plenty of choice in which exams, homeworks and programming assignments count toward the grade.
- Tierce Exams: 30% [There will be 3 exams, best 2 will be taken]
- Theory/Conceptual Homeworks: 10% [There will be 4 homeworks, best 2 will be taken]
- Programming Assignments: 30% [There will be 7 assignments, best 2 will be taken]
- Major Project: 30% [There will be a choice to do a major project + 2 assignments or submit at least 4 assignments.] The project requires the instructor's consent, which will be given based on performance up to Tierce 1 and the originality of the ideas.
- Class Participation: Up to 5% [This is a bonus system]
- Students are encouraged to discuss homeworks and assignments with each other, but the submissions have to be original. If we find plagiarism, your grade will be reduced to D or F.
- Assignments will be individual effort only
- Projects can be done in groups of 2 or 3 with appropriate justification
- Every homework and/or assignment will have plenty of time to complete. There will be no late days allowed for any homework/assignment/project deadline.
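As a rough illustration of how the grading weights listed above might combine, here is a minimal Python sketch. It rests on assumptions that this page does not state: every score is on a 0-100 scale, each "best 2" component is the plain average of the two highest scores, the Major Project component is summarized as a single score, and the participation bonus is added on top as percentage points. The function names `best_k_mean` and `course_total` are hypothetical and not part of any official grading tool.

```python
from statistics import mean


def best_k_mean(scores, k):
    """Average of the k highest scores (assumed to be on a 0-100 scale)."""
    if len(scores) < k:
        raise ValueError(f"need at least {k} scores, got {len(scores)}")
    return mean(sorted(scores, reverse=True)[:k])


def course_total(exams, homeworks, assignments, project, participation_bonus=0.0):
    """Weighted course total in percentage points.

    Weights (30/10/30/30 plus up to 5 bonus points) come from the grading list.
    Averaging the best two scores per component is an assumption of this sketch.
    """
    # Clamp the bonus to the stated range of 0 to 5 percentage points.
    bonus = min(max(participation_bonus, 0.0), 5.0)
    return (
        0.30 * best_k_mean(exams, 2)          # Tierce exams, best 2 of 3
        + 0.10 * best_k_mean(homeworks, 2)    # homeworks, best 2 of 4
        + 0.30 * best_k_mean(assignments, 2)  # programming assignments, best 2
        + 0.30 * project                      # major project component
        + bonus
    )
```

For example, `course_total([60, 80, 90], [100, 50, 70, 90], [70, 80, 90], 80, 2)` comes to about 86.5 under these assumptions. The actual computation is at the instructor's discretion.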
Students entering the class are expected to have a pre-existing working knowledge of probability, linear algebra, statistics and algorithms, though the class has been designed to allow students with a strong numerate background to catch up and fully participate.
- Programming experience in a general programming language. Specifically, you need to have written, from scratch, programs consisting of a few hundred lines of code. Note: Every programming assignment must be done in Python. You will be expected to know, or be able to quickly pick up, that programming language.
- Basic familiarity with probability and statistics: conditional probability, Bayes' rule, random variables, independence, conditional independence, expectation, variance, concentration inequalities, distributions, Gaussian, multivariate
- Linear algebra: vectors and matrices, inner product, projection, basis (complete, orthonormal), orthogonality, linear (in)dependence, eigenvalues and eigenvectors, singular values and vectors, SVD
- Discrete mathematics: proofs, induction, logic, combinatorics, graphs
- You must strictly adhere to these prerequisites! Even if IIT Bhilai's registration system does not prevent you from registering for this course, it is still your responsibility to make sure you have all of these prerequisites before you register.
- [HML] Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, Aurélien Géron
- [CIML] A Course in Machine Learning, Hal Daumé III
- [MML] Mathematics for Machine Learning, Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
- [ML] Machine Learning, Tom Mitchell
- [PRML] Pattern Recognition and Machine Learning, Christopher Bishop
- [PML] Probabilistic Machine Learning, Kevin Murphy (2nd edition)
- Google Drive Link:
- Canvas Link:
- CheatSheet: https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-supervised-learning
- Example Projects: https://cs229.stanford.edu/proj2021spr/
- Stanford CS229: https://cs229.stanford.edu/syllabus-fall2021.html
- Harvard CS181: https://harvard-ml-courses.github.io/cs181-web/schedule
- MIT 6.036: https://canvas.mit.edu/courses/7509
- IIT Delhi COL341: https://www.cse.iitd.ernet.in/~rahulgarg/Teaching/COL341.htm
- UW CSE446: https://courses.cs.washington.edu/courses/cse446/22wi/
- UCB CS189: https://people.eecs.berkeley.edu/~jrs/189/
Legend:
Week # | Week of | Topics planned in this week | Textbook Reference | Reading | Notebooks |
---|---|---|---|---|---|
1 | Jul 29 | Tutorial 1: Linear Algebra and Python Libraries | MML Ch2-4 | Linear Algebra; numpy; pandas | |
2 | Aug 1 | The Machine Learning Landscape: Applications, Types, challenges | HML Ch1, CIML Ch1 | AlgoVsModel | |
2 | Aug 1 | End to End Approach: Data collection and preparation | HML Ch 2 | Data Carpentry | |
2 | Aug 1 | Tutorial 2: Vector Calculus | MML Ch 5 | Calculus | |
3 | Aug 8 | Limits of Learning | CIML Ch2 | ||
3 | Aug 8 | Regression: Linear, Polynomial, Regularization, Logistic | HML Ch4 | Demo1, Demo2, UnderFitOverFit | |
3 | Aug 8 | Tutorial 3: Continuous Optimization | MML Ch 7 | ||
4 | Aug 15 | Geometry and Nearest Neighbors | CIML Ch3 | ||
4 | Aug 15 | Perceptron, Practical Issues in ML | CIML Ch4, Ch5 | ||
4 | Aug 15 | SVM: Hard-margin, Soft-margin, Linear, Non-linear, SVM Regression | HML Ch5 | ||
4 | Aug 15 | Tutorial 4: Probability and Distributions | MML Ch 6 | Probability | |
5 | Aug 22 | SVM: Kernelized SVM, Online SVM | HML Ch5 | ||
5 | Aug 22 | Decision Trees: Entropy, Regularization | HML Ch6 | ||
5 | Aug 22 | Tutorial 5: TensorFlow/PyTorch | |||
6 | Aug 29 | Ensemble Learning: Bagging, Random Forests | HML Ch7 | ||
6 | Aug 29 | Ensemble Learning: Boosting, AdaBoost, Gradient Boosting | HML Ch7 | ||
7 | Aug 29 | Probabilistic Modeling | CIML Ch9 | ||
6 | Sep 2 | Project: Build an ensemble of models | |||
6 | Sep 7 | Tierce 1 Exam | |||
7 | Sep 12 | Neural Networks Introduction | HML Ch 10 | ||
7 | Sep 12 | Compute Graph, Auto-diff | HML Appendix D | ||
8 | Sep 19 | Batch Normalization, Gradient Clipping, Regularization, Optimization Techniques | HML Ch 11 | ||
8 | Sep 19 | CNNs, Main Architectures | HML Ch 14 | ||
8 | Sep 26 | RNN, Attention Models | HML Ch 15 | ||
9 | Sep 26 | Forecasting Time Series | HML Ch 15 | ||
9 | Oct 10 | Autoencoders, GAN | |||
10 | Oct 10 | Generative Models | |||
10 | Oct 17 | Learning Theory | |||
11 | Oct 17 | ||||
11 | | Project: Applications/Paper with code | |||
12 | Oct 1 - Oct 9 | Mid Sem Break | |||
12 | Oct 21 | Tierce 2 Exam | |||
13 | Oct 27 | Reinforcement Learning | |||
13 | Oct 27 | | IVB Ch10 | ||
14 | Nov 4 | Basic concepts in RL, value iteration, policy iteration. | |||
14 | Nov 4 | ||||
14 | Nov 11 | Model-based RL, value function approximator. | |||
15 | Nov 18 | Fairness, algorithmic bias, explainability, privacy | Bias | ||
15 | Nov 18 | | IVB Ch14 | ||
15 | Nov 25 | Fairness, algorithmic bias, explainability, privacy | |||
16 | Nov 25 | ||||
17 | Nov 25 | Project: Reinforcement Learning Application | |||
- Linear Algebra Notes from Stanford - short
- 3Blue1Brown - beautiful animated explanations
- Linear Algebra Notes
- Linear Algebra Review
- The Matrix Cookbook - It won't teach you linear algebra, but this free desktop reference on matrices may come in handy.
- Probability Notes from Stanford - short
- Review of probability from a course by David Blei at Princeton
- Andrew Moore's Probability tutorial slides (somewhat incomplete)
- Another probability review, from UCI
- https://www.cs.utoronto.ca/~fidler/teaching/2015/slides/CSC411/
- https://www.cs.cmu.edu/~epxing/Class/10701/lecture.html
- https://web.cs.ucla.edu/~sriram/courses/cs188.winter-2017/html/index.html
- https://people.eecs.berkeley.edu/~jrs/189/
- https://alex.smola.org/teaching/cmu2013-10-701/
- https://sli.ics.uci.edu/Classes/2015W-273a
We will benefit from other people’s efforts:
- Google Dataset Search
- Amazon’s AWS datasets
- Kaggle datasets
- Wikipedia’s list
- UC Irvine Machine Learning Repository
- Quora.com
- Dataportals.org
- Opendatamonitor.eu
- Quandl.com