Yale School of Medicine - Yale University
Starred repositories
Plotting library for IPython/Jupyter notebooks
Python toolkit for quantitative finance
Shortest solutions for CS231n 2021-2024
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
DSPy: The framework for programming—not prompting—foundation models
Unsupervised text tokenizer for Neural Network-based text generation.
PyTorch implementation of the TabNet paper: https://arxiv.org/pdf/1908.07442.pdf
Convert PDF to HTML without losing text or format.
Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Collection of data science projects in Python
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Things that you should (and should not) do in your Materials Informatics research.
A guideline for building practical production-level deep learning systems to be deployed in real world applications.
Notes and links from the book club meetings
Interactive roadmaps, guides and other educational content to help developers grow in their careers.
Efficient few-shot learning with Sentence Transformers
A curated list of reinforcement learning with human feedback resources (continually updated)
The Open Source Feature Store for Machine Learning
Open-source Library PyGDebias: Graph Datasets and Fairness-Aware Graph Mining Algorithms
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials,…
📖 A collection of pure bash alternatives to external processes.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Efficiently tune any LLM from HuggingFace using distributed training (multiple GPUs) and DeepSpeed. Uses Ray AIR to orchestrate the training on multiple AWS GPU instances.