I have extensive experience in machine learning, generative AI, bioinformatics, and research and development, with a focus on applying advanced ML, AI, statistical, and mathematical methods to complex problems. I also have a long track record of designing, building, deploying, and maintaining machine learning and forecasting models that increase the efficiency, accuracy, and utility of internal data processing.
-
I am motivated to use pre-trained LLMs, reinforcement learning, and retrieval-augmented generation to incorporate DNA sequence and evolutionary information into classification tasks.
-
I am working on redesigning and improving consensus features nested cross-validation (cnCV) with heterogeneous data types for feature selection, model selection, and risk prediction.
Programming languages: Python (Pandas, NumPy, Matplotlib, scikit-learn, SciPy, NLTK, Keras, TensorFlow, PyTorch), R (ggplot2, igraph, glmnet, randomForest, XGBoost, caret, e1071, rpart, devtools, dplyr, R Markdown, Shiny), MATLAB, C, C++ (Eigen, Armadillo), SQL
Machine Learning: Supervised Learning (Linear Models, Tree-based Models, Distance-based Models, Kernel-based Models, Network-based Models), Unsupervised Learning (Clustering Methods, Dimensionality Reduction Methods, Community Detection Methods, Centrality-based Methods)
Deep Learning: Neural Networks & CNNs (TensorFlow, Keras), Natural Language Processing (NLTK, VADER, Pattern), GANs
HPC: SLURM, PBS
Frameworks & Other: Git, Anaconda