
Zoomable, animated scatterplots in the browser that scale to over a billion points

TypeScript 1,003 57 Updated Jul 19, 2024

Embedding Vector Oriented Clustering

Python 93 3 Updated Jun 18, 2024

run embeddings in MLX

Python 62 6 Updated Jun 19, 2024

A Python library for fast and easy access to genomic resources such as sequence, data tracks, and annotations

Python 25 Updated Jul 24, 2024

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Python 139 9 Updated Apr 3, 2024

structured outputs for llms

Python 6,844 550 Updated Jul 26, 2024

PyTorch extensions for high performance and large scale training.

Python 3,066 272 Updated Jun 18, 2024

Train Models Contrastively in PyTorch

Python 483 36 Updated Jul 18, 2024

Machine Learning Engineering Open Book

Python 10,296 617 Updated Jul 26, 2024

Data and tools for generating and inspecting OLMo pre-training data.

Python 859 83 Updated Jul 24, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,474 326 Updated Jun 16, 2024

Tools for evaluating the performance of MT metrics on data from recent WMT metrics shared tasks.

Python 76 13 Updated Jul 4, 2024

HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels

Jupyter Notebook 406 26 Updated Jan 26, 2023

Solve puzzles. Learn CUDA.

Jupyter Notebook 5,438 315 Updated Jul 5, 2024

My small cheatsheets for data science, ML, computer science and more.

1,623 121 Updated Feb 2, 2023

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 15,202 1,459 Updated Jul 26, 2024
Python 94 15 Updated May 30, 2023

A playbook for systematically maximizing the performance of deep learning models.

25,961 2,162 Updated Jun 18, 2024
Python 371 113 Updated Nov 4, 2022
Python 27 6 Updated Mar 2, 2023

The open source initiative for anonymized, elite-level athletic motion capture data. Run by Driveline Baseball.

Jupyter Notebook 196 47 Updated Jul 25, 2024

Playground for integrating large language models into the Modern Data Stack for entity matching

Python 104 5 Updated Apr 1, 2023

A collection of tasks to probe the effectiveness of protein sequence representations in modeling aspects of protein design

Jupyter Notebook 87 12 Updated Jun 25, 2023

Community-curated resources for research at the intersection of AI and molecular sciences

14 1 Updated May 24, 2022

Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design

Python 16 4 Updated Dec 14, 2022

PEER Benchmark, appearing at the NeurIPS 2022 Datasets and Benchmarks Track (https://arxiv.org/abs/2206.02096)

Python 77 10 Updated Mar 18, 2023

Interact, analyze and structure massive text, image, embedding, audio and video datasets

Python 1,171 156 Updated Jul 22, 2024

[IJCAI 2023 survey track] A curated list of resources for chemical pre-trained models

473 49 Updated Jun 17, 2023

Ask Me Anything language model prompting

Python 534 45 Updated Jul 5, 2023

Diffusion models of protein structure; trigonometry and attention are all you need!

Jupyter Notebook 491 50 Updated Dec 12, 2023