A commitment to learn ML (and related topics) every day for 365 days starting Jan 1 2023.
- The Elements of Statistical Learning (ESLR)
- Serrano.Academy YouTube Channel
- Ritvik Math YouTube Channel
- LinkedIn Learning
- 2 Minute Papers YouTube Channel
- StatQuest YouTube Channel by Josh Starmer
- Arxiv.org
Introduction to Supervised Learning, Variable Types, Encodings, Two Simple Approaches to Prediction: Least Squares and Nearest Neighbors, Other Models as a variant of these two approaches
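The two prediction approaches named above can be contrasted on a toy 1-D problem. A minimal pure-Python sketch (the data and function names are my own, purely illustrative):

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.9, 5.1, 7.0, 9.1]   # toy data, roughly y = 2x + 1

def least_squares_fit(xs, ys):
    """Return (intercept, slope) minimising the residual sum of squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

def knn_predict(xs, ys, x0, k=2):
    """Average the targets of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k

b0, b1 = least_squares_fit(xs, ys)   # global, rigid fit
y_hat = knn_predict(xs, ys, 2.5)     # local, flexible fit
```

Least squares fits one global line (low variance, potentially high bias); k-NN averages locally (low bias, potentially high variance) — the tension the chapter keeps returning to.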
Statistical Decision Theory: Probabilistic Setup, Conditional Mean/Median as Regression Function for Squared Loss/Absolute Loss, Linear Model Estimates and Nearest Neighbor Model Estimates from the Regression Function, Solution for Categorical Target Variable, Bayes Classifier.
Local Methods in High Dimensions: Curse of Dimensionality, why Nearest Neighbours are no longer really "near" in High Dimensions, Bias-Variance Decomposition of MSE, Linear (and other Rigid) Assumptions to avoid the Curse of Dimensionality.
Statistical Model for Pr(X,Y), Additive error Model, Supervised Learning, Function Approximation by Least Squares Method and Maximum Likelihood Method, Structured Regression Models: Using implicit or explicit neighborhood restrictions (usually complexity constraints)
Roughness Penalty or Regularization, Kernel Functions and Local Regression, Basis Functions, Splines, Dictionary Methods (Adaptively Chosen Basis Functions, eg: Neural Networks)
Model Selection and the Bias-Variance Tradeoff, K-Nearest Neighbours Example, Test Error, Overfitting and Underfitting
Gaussian Mixture Models, Iterative Approach to fit a Mixture of Gaussians for Clustering.
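The iterative (EM-style) fit of a two-component Gaussian mixture can be sketched in 1-D. This is a minimal sketch on toy, well-separated data; all names and the crude initialisation are my own:

```python
import math

def em_gmm_1d(data, iters=50):
    """Fit a two-component 1-D Gaussian mixture by alternating E and M steps."""
    s = sorted(data)
    mu = [s[len(s) // 4], s[(3 * len(s)) // 4]]   # crude spread-out start
    var = [1.0, 1.0]
    wts = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            dens = [wts[k] * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                    / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: responsibility-weighted means, variances, mixing weights
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / nk + 1e-6
            wts[k] = nk / len(data)
    return mu, var, wts

mu, var, wts = em_gmm_1d([0.9, 1.0, 1.1, 4.9, 5.0, 5.1])
```

The soft responsibilities are what make this a clustering method: each point fractionally belongs to both Gaussians.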
Linear Methods for Regression: Introduction, Generalisations and Basis Expansions, Least Squares Method of finding Model Coefficients, Normality Assumptions, Significance of Coefficients
Significance of Linear Coefficients, Gauss-Markov Theorem, Multiple Regression from Univariate Regression, Gram-Schmidt Orthogonalisation to find Coefficients, Linear Regression with Multiple Outputs
Filling gaps from Day 9
Subset Selection: Best Subset Selection, Forward and Backward Stepwise Selection, Forward Stagewise Regression
Shrinkage Methods: Ridge and Lasso Regression
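Ridge has a closed form while lasso does not (it needs soft-thresholding / coordinate descent). For a single centred feature the ridge estimate is just the least-squares estimate with an inflated denominator — a sketch with made-up toy data:

```python
def ridge_slope(xs, ys, lam):
    """Single-feature ridge: minimises sum((y - b*x)^2) + lam * b^2.
    Closed form b = <x, y> / (<x, x> + lam); data assumed pre-centred."""
    return (sum(x * y for x, y in zip(xs, ys))
            / (sum(x * x for x in xs) + lam))

xs = [-2, -1, 0, 1, 2]
ys = [-4.1, -2.0, 0.0, 2.0, 4.1]
ols = ridge_slope(xs, ys, 0.0)       # lam = 0 recovers ordinary least squares
shrunk = ridge_slope(xs, ys, 10.0)   # larger lam shrinks the slope toward zero
```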
Comparison of Subset Selection, Ridge Regression and Lasso Regression
Least Angle Regression
Least Angle Regression
Methods Using Derived Input Directions: Principal Components Regression
Matrix Differentiation Propositions and Proofs.
Partial Least Squares
Comparison of Selection and Shrinkage Methods
Multiple Outcome Shrinkage and Selection
Multiple Outcome Shrinkage and Selection
OTTO Multi Objective Recommender System: Learnt about ranking models as opposed to supervised and unsupervised models, and started off the competition with a naive baseline model submission.
OTTO Multi Objective Recommender System: Added more logic to predicting the next 'cart' and 'order' item and improved the score.
OTTO Multi Objective Recommender System:
How does Netflix recommend movies?
K-Means and Hierarchical Clustering
OTTO Multi Objective Recommender System: Tried a Label Propagation method to identify cluster structure among products from the order in which they were browsed.
OTTO Multi Objective Recommender System: Tried a rule-based method of finding key candidates (inferred from the training data), and another method that frames it as an ML problem by creating training data and ranking the predicted probabilities.
OTTO Multi Objective Recommender System: Bug fixes and time optimisation of the code.
GoDaddy Microbusiness Density Forecasting: Registered for the competition
Latent Dirichlet Allocation and Gibbs Sampling
OTTO Multi Objective Recommender System: Competition deadline, final tries
Restricted Boltzmann Machines (RBM)
Going through top ranked submissions.
Incremental Forward Stagewise Regression
Coding Incremental Forward Stagewise Regression from Scratch
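For reference, an independent sketch of the algorithm (not the code written that day; the orthogonal toy design and names are my own). Each step nudges, by a tiny eps, the one coefficient whose predictor is most correlated with the current residual:

```python
def incremental_stagewise(X, y, eps=0.01, steps=2000):
    """Incremental forward-stagewise regression (assumes centred y and
    standardised columns of X)."""
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    for _ in range(steps):
        # inner product of each predictor with the residual
        corrs = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        j = max(range(p), key=lambda k: abs(corrs[k]))
        delta = eps if corrs[j] > 0 else -eps
        beta[j] += delta
        for i in range(n):
            resid[i] -= delta * X[i][j]
    return beta

X = [[1, 0], [0, 1], [-1, 0], [0, -1]]   # orthogonal toy design
y = [2.0, 0.5, -2.0, -0.5]               # generated with coefficients (2, 0.5)
beta = incremental_stagewise(X, y)
```

With enough steps the coefficients creep to within eps of the least-squares solution, tracing a path similar to the lasso's.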
Piecewise Linear Path Algorithms
Dantzig Selector
Grouped Lasso
Further Properties of Lasso
Pathwise Coordinate Optimization
Computational Considerations
Denoising and Variational Autoencoders
Principal Component Analysis
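The leading principal component can be found without any linear-algebra library by power iteration on the covariance matrix. A 2-D sketch (toy data lying on the line y = x; all names are my own):

```python
def first_pc(data, iters=200):
    """First principal component of 2-D data via power iteration
    on the 2x2 sample covariance matrix."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    centred = [(x - mx, y - my) for x, y in data]
    cxx = sum(x * x for x, _ in centred) / n
    cyy = sum(y * y for _, y in centred) / n
    cxy = sum(x * y for x, y in centred) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        # multiply by the covariance matrix, then renormalise
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

v = first_pc([(0, 0), (1, 1), (2, 2), (-1, -1)])   # expect direction (1,1)/sqrt(2)
```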
Linear Methods for Classification : Introduction
Linear Regression of an Indicator Matrix
Linear Discriminant Analysis
Linear Discriminant Analysis: Regularised Discriminant Analysis
Linear Discriminant Analysis: Reduced-Rank Discriminant Analysis
Linear Discriminant Analysis: Computations for LDA
Logistic Regression
Fitting Logistic Regression Models
Logistic Regression: South African Heart Disease Example
Logistic Regression: Quadratic Approximations and Inference
Completed the course Transformers: Text Classification for NLP using BERT.
L1 Regularized Logistic Regression
Logistic Regression or LDA?
Separating Hyperplanes
Rosenblatt's Perceptron Learning Algorithm
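Rosenblatt's update rule is short enough to write out in full: whenever a point is misclassified, add y*x to the weights and y to the bias; on linearly separable data this converges in finitely many updates. A 2-D sketch on made-up separable data:

```python
def perceptron(points, labels, epochs=100):
    """Rosenblatt's perceptron: for each misclassified point,
    w += y * x and b += y; stop once an epoch makes no mistakes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += y * x1
                w[1] += y * x2
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b

points = [(2, 1), (3, 2), (-1, -1), (-2, -2)]
labels = [1, 1, -1, -1]
w, b = perceptron(points, labels)
```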
Optimal Separating Hyperplanes
A Friendly Introduction to Generative Adversarial Networks
Basis Expansions and Regularization: Introduction
Piecewise Polynomials and Splines
Natural Cubic Splines
South African Heart Disease Example
Phoneme Recognition
Filtering and Feature Extraction, Smoothing Splines
Degrees of Freedom and Smoothing Splines
Automatic Selection of the Smoothing Parameters
Fixing Degrees of Freedom
Automatic Selection of the Smoothing Parameters: The Bias-Variance Tradeoff
Non Parametric Logistic Regression
Multidimensional Splines
Regularization and Reproducing Kernel Hilbert Spaces
Spaces of Functions Generated by Kernels
Examples of RKHS
Wavelet Smoothing
Wavelet Bases and Wavelet Transform
Adaptive Wavelet Filtering
DALL-E 2 for Music Generation
Computation for Splines
Computation for Smoothing Splines
Introduction to Kernel Smoothing Methods
1 Dimensional Kernel Smoothing
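A 1-D kernel smoother in a few lines: the Nadaraya-Watson estimate is a kernel-weighted average of the targets around the query point. A sketch using the Epanechnikov kernel (toy data and names are my own):

```python
def epanechnikov(t):
    """Epanechnikov kernel: 0.75 * (1 - t^2) for |t| <= 1, else 0."""
    return 0.75 * (1 - t * t) if abs(t) <= 1 else 0.0

def nadaraya_watson(xs, ys, x0, h):
    """Kernel-weighted average of y near x0 with bandwidth h."""
    w = [epanechnikov((x - x0) / h) for x in xs]
    total = sum(w)
    if total == 0:
        raise ValueError("no training points within one bandwidth of x0")
    return sum(wi * yi for wi, yi in zip(w, ys)) / total
```

The bandwidth h plays the role that k plays for nearest neighbours: larger h means a smoother, more biased fit.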
Local Linear Regression
Local Polynomial Regression
Decision Trees, Gini Impurity
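Gini impurity, the split criterion mentioned above, is one minus the sum of squared class proportions in a node — zero for a pure node, maximal for an even mix. A tiny sketch:

```python
def gini(labels):
    """Gini impurity of a node: 1 - sum over classes of p_class^2."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))
```

A tree split is chosen to minimise the size-weighted impurity of the two child nodes.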
Gradient Boost Part 1: Main Regression Ideas
Decision Trees and Pruning
XGBoost for Regression
OpenAI GPT4 - The Future is Here
Midjourney AI - A League Above DALL-E 2
EA's New AI - Next Level Games are Coming
XGBoost for Classification
XGBoost: A Scalable Tree Boosting System
DeepMind's AlphaFold AI
OpenAI's GPT4
Microsoft's new AI clones your voice in 3 seconds
OpenAI's ChatGPT took an IQ test
Selecting width of a Kernel
Local Regression in R^p
Structured Kernels
Structured Regression Functions
Local Likelihood and Other Models
Kernel Density Estimation
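The Parzen estimate places a kernel bump of bandwidth h at every observation and averages them. A minimal Gaussian-kernel sketch (names are my own):

```python
import math

def gaussian_kde(data, x, h):
    """Parzen density estimate at x: mean of Gaussian bumps of
    bandwidth h centred at each observation."""
    n = len(data)
    return (sum(math.exp(-((x - xi) / h) ** 2 / 2) for xi in data)
            / (n * h * math.sqrt(2 * math.pi)))
```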
Kernel Density Classification
Naive Bayes Classifier
Radial Basis Functions and Kernels
OpenAI's GPT4 - Next Level AI Assistant!
Mixture Models for Density Estimation and Classification
Midjourney AI Version 5
Computational Considerations
NVIDIA's New AI: Better Games are Coming.
25 ChatGPT AIs play a game.
DeepMind's New AI: 10 Years of Learning in Seconds
OpenAI's Whisper Learnt 680,000 hours of speech
Stable Diffusion is getting Outrageously Good
CatBoost Part 1: Ordered Target Encoding
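The core idea of ordered target encoding can be sketched directly: each row's encoding uses only the target values of earlier rows in the same category, so a row's own target never leaks into its feature. A sketch of the ordered target statistic, roughly (sum of earlier targets + prior) / (count of earlier rows + 1) — this is a simplification, not CatBoost's implementation:

```python
def ordered_target_encode(categories, targets, prior=0.5):
    """Ordered target statistic computed over preceding rows only."""
    counts, sums, encoded = {}, {}, []
    for cat, y in zip(categories, targets):
        n = counts.get(cat, 0)       # earlier rows of this category
        s = sums.get(cat, 0.0)       # their summed targets
        encoded.append((s + prior) / (n + 1))
        counts[cat] = n + 1
        sums[cat] = s + y
    return encoded
```

In practice CatBoost averages such statistics over several random row orderings to reduce the dependence on any one permutation.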
Model Assessment and Selection
Bias, Variance and Model Complexity
The Bias Variance Decomposition
Example: Bias Variance Tradeoff
Optimism of the training error rate
Estimates of In-Sample Prediction Error
The Effective Number of Parameters
Bayesian Information Criteria
Minimum Description Length
VC Dimension
Cross Validation
Cross Validation
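K-fold cross-validation is mechanical enough to write generically: split the indices into folds, fit on all but one fold, score on the held-out fold, average. A sketch (the fit/predict interface is my own convention):

```python
def kfold_indices(n, k):
    """Split range(n) into k contiguous folds (shuffle indices first in practice)."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_error(xs, ys, k, fit, predict):
    """Mean squared held-out error over k folds for any fit/predict pair."""
    total, count = 0.0, 0
    for fold in kfold_indices(len(xs), k):
        hold = set(fold)
        tr_x = [x for i, x in enumerate(xs) if i not in hold]
        tr_y = [y for i, y in enumerate(ys) if i not in hold]
        model = fit(tr_x, tr_y)
        for i in fold:
            total += (ys[i] - predict(model, xs[i])) ** 2
            count += 1
    return total / count

# intercept-only example: "fit" stores the training mean, "predict" returns it
err = cv_error([0, 1, 2, 3, 4, 5], [2.0] * 6, 3,
               fit=lambda tx, ty: sum(ty) / len(ty),
               predict=lambda m, x: m)
```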
Bootstrap Methods
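The bootstrap estimate of a statistic's standard error: resample the data with replacement B times, recompute the statistic on each resample, and take the standard deviation of the replicates. A sketch (names and the fixed seed are my own):

```python
import random

def bootstrap_se(data, stat, B=500, seed=0):
    """Bootstrap standard error of stat(data)."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    n = len(data)
    reps = []
    for _ in range(B):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(stat(sample))
    mean = sum(reps) / B
    return (sum((r - mean) ** 2 for r in reps) / (B - 1)) ** 0.5

mean_stat = lambda s: sum(s) / len(s)
```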
Conditional or Expected Test Error
Revision of Introduction to Supervised Learning
Revision
Model Inference and Averaging: Introduction
A Smoothing Example
Maximum Likelihood Inference
Bootstrap vs Maximum Likelihood
Bayesian methods
Relationship between bootstrap and Bayesian inference
EM Maximization
Structure and Content Guided Video Synthesis with Diffusion models
Two component Mixture Model
The EM Algorithm in General
EM as a Maximization-Maximization Procedure
MCMC for Sampling from the Posterior
Gibbs Sampling for Mixtures
Bagging
Bagging Example: Trees with Simulated Data
Model Averaging and Stacking
Stochastic Search: Bumping
Google Bard: Is it better than ChatGPT?
Generalized Additive Models
DeepMind's AI Athletes Play in the Real World
Fitting Additive Models
OpenAI's GPT4 - Eccentric Genius AI
Example: Additive Regression Model
Example: Predicting Email Spam
Summary of Additive Regression Model
Tree Based Methods: Background
Tree Based Methods: Classification Trees, Regression Trees and Other Issues
Patient Rule Induction Method
Multivariate Adaptive Regression Splines
NVIDIA's New AI mastered Minecraft 15x faster
MARS Examples and Issues
Missing Data and Computational Considerations
Photoshop's New AI Feature is Amazing
Boosting Methods
Boosting Fits An Additive Model
Forward Stagewise Additive Model
Exponential Loss and Adaboost
NVIDIA's New AI: Making Games Come Alive
Why Exponential Loss?
Loss Functions and Robustness: Robust Loss Functions for Classification
Robust Loss Functions for Regression
DeepMind AlphaDev
Off the Shelf Procedures for Data Mining
Example: Spam Data
Boosting Trees
Numerical Optimization via Gradient descent
Google's New AI: Next Level Virtual World
Stable Diffusion XDSL: Text to Video ZeroScope v2 and More
Steepest Descent
Gradient Boosting
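For squared-error loss the gradient is just the residual, so gradient boosting reduces to: repeatedly fit a small tree to the current residuals and add a shrunken copy to the model. A sketch using one-split stumps on 1-D inputs (toy data and names are my own):

```python
def fit_stump(xs, resid):
    """Best single-split stump on 1-D inputs under squared loss."""
    best = None
    for split in sorted(set(xs))[:-1]:   # largest value would empty the right side
        left = [r for x, r in zip(xs, resid) if x <= split]
        right = [r for x, r in zip(xs, resid) if x > split]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lm, rm)
    return best[1], best[2], best[3]

def gradient_boost(xs, ys, rounds=50, lr=0.1):
    """Squared-error gradient boosting with stumps and learning rate lr."""
    base = sum(ys) / len(ys)
    stumps = []
    pred = [base] * len(ys)
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, pred)]
        split, lm, rm = fit_stump(xs, resid)
        stumps.append((split, lm, rm))
        pred = [p + lr * (lm if x <= split else rm)
                for p, x in zip(pred, xs)]
    def predict(x):
        out = base
        for split, lm, rm in stumps:
            out += lr * (lm if x <= split else rm)
        return out
    return predict

model = gradient_boost([0, 1, 2, 3], [0.0, 0.0, 10.0, 10.0])
```

The learning rate is the shrinkage discussed a few entries below: smaller lr needs more rounds but regularises the fit.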
Implementations of Gradient Boosting
Right Sized Trees for Boosting
Regularisation
Stable Diffusion - 8 New Amazing Results
Google's New AI - Blurry Photos no More
1 Billion Tokens LLM
Meta's Music Gen - Text To Music
When AI tries to Reason with itself - AutoGPT and more
Regularization - Shrinkage
Regularization - Subsampling
Interpretation - Relative Importance of Predictor Variables
Partial Dependence Plots
Illustrations - California Housing
Illustrations - New Zealand Fish
Illustrations - Demographics Data
Introduction
Projection Pursuit Regression
Neural Networks
Fitting Neural Networks
Issues in Training Neural Networks - Starting Values
Overfitting
Scaling of the Inputs
Number of Hidden Units and Layers
Multiple Minima
Simulated Data Example
Zip Code Data Example
Discussion on projection pursuit regression and Neural Networks
NVIDIA Did it: Ray Tracing 10000 times faster
Midjourney AI: Text to Image Supercharged
Unreal Engine 5.2: Incredible Simulations
Bayesian Neural Nets and the NIPS 2003 Challenge
Bayes, Boosting and Bagging
Performance Comparisons
Computational Considerations
NVIDIA's New AI: Text to Image Supercharged!
Microsoft's AI watched 100 million YouTube Videos
NVIDIA's New AI trained for 5 billion steps
Stable Diffusion XL is here
AI Generated South Park, Llama 2, HyperDreamBooth and More
The Voice Cloning AIs they never tell you about (and how they work)
Support Vector Machines and Flexible Discriminants: Introduction
Support Vector Classifier
Computing the Support Vector Classifier
Mixture Example
Support Vector Machines and Kernels
Computing SVM for Classification
SVM as a penalisation method
Function Estimation and Reproducing Kernels
SVMs and the Curse of Dimensionality
A Path Algorithm for the SVM Classifier
Support Vector Machines for Regression
Regression and Kernels
Generalizing Linear Discriminant Analysis
Flexible Discriminant Analysis
Computing the FDA estimates
Penalised Discriminant Analysis
Mixture Discriminant Analysis
Example: Waveform Data
Computational Considerations