Tool for interactive embeddings visualization

Python 279 21 Updated May 27, 2020

This is an extension of the popular 21cmFAST code that interfaces with CLASS to generate initial conditions at recombination that are consistent with the input cosmological model

Jupyter Notebook 1 1 Updated Jul 5, 2024

Data Toolkit for Sailor Language Models

Python 68 6 Updated May 15, 2024

Extract structured text from PDFs quickly

Python 241 18 Updated May 27, 2024

Data and tools for generating and inspecting OLMo pre-training data.

Python 847 81 Updated Jul 6, 2024

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python 2,526 154 Updated Feb 11, 2024

A latent text-to-image diffusion model

Jupyter Notebook 66,564 9,974 Updated Jun 18, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 22,305 3,143 Updated Jul 7, 2024

A Survey on Data Selection for Language Models

111 6 Updated Jun 4, 2024

Lightweight clipboard manager for macOS

Swift 11,148 480 Updated Jul 4, 2024

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Python 883 54 Updated Jun 27, 2024

✨ Build AI interfaces that spark joy

Python 4,979 321 Updated Jul 5, 2024

What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets

Python 153 16 Updated Jun 10, 2024

ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering

Python 1,108 48 Updated Jun 26, 2024

A library to manipulate font files from Python.

Python 4,169 448 Updated Jul 5, 2024

Microsoft.Recognizers.Text provides recognition and resolution of numbers, units, date/time, etc. in multiple languages (ZH, EN, FR, ES, PT, DE, IT, TR, HI, NL. Partial support for JA, KO, AR, SV).…

C# 1,654 428 Updated Nov 15, 2023

BookNLP, a natural language processing pipeline for books

Python 769 90 Updated Apr 11, 2023

Creative interactive views of any dataset.

Python 816 42 Updated Feb 25, 2024

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Scala 135 33 Updated Feb 27, 2024

Lexical Generalization Improves with Larger Models and Longer Training (EMNLP 2022)

Python 3 Updated Feb 26, 2023

Accurately separates a URL’s subdomain, domain, and public suffix, using the Public Suffix List (PSL).

Python 1,812 211 Updated Jul 6, 2024

Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing

Python 718 99 Updated Apr 17, 2024

Python script which prints out a summary of your free slots from your Google calendar(s) so you can paste into a scheduling email.

Python 41 4 Updated Oct 28, 2022

A template repo for Python packages

Python 386 66 Updated Nov 3, 2023

Tools for checking ACL paper submissions

Python 551 47 Updated May 15, 2024

Open-Source Neural Machine Translation in TensorFlow

Python 800 271 Updated Dec 9, 2022

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphic…

Python 6,884 916 Updated Jul 5, 2024

A reading list for papers on causality for natural language processing (NLP)

471 56 Updated Oct 7, 2023