A Python tool to enforce dependencies, using modular architecture 🌎 Open source 🐍 Installable via pip 🔧 Able to be adopted incrementally - ⚡ Implemented with no runtime impact ♾️ Interoperable with…

Python 707 24 Updated Jul 25, 2024

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.

Python 2 Updated Jul 25, 2024

Triton implementation of FlashAttention2 that adds Custom Masks.

Python 16 2 Updated Jul 21, 2024

OpenGL Rendering Pipelines for Python

Python 168 11 Updated Jul 23, 2024

A stable, fast and easy-to-use inference library with a focus on a sync-to-async API

Python 1 Updated Jul 20, 2024
Python 2 Updated Jul 9, 2024

Website Speed and Performance Optimization and monitoring

Go 104 10 Updated Mar 3, 2022

Lightweight ML model proxy and autoscaler for kubernetes

Go 100 6 Updated Jul 11, 2024

Open language modeling toolkit based on PyTorch

Python 26 6 Updated Jul 22, 2024

Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores,…

C++ 1,920 64 Updated Jun 30, 2024

A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php.

52 6 Updated Jul 26, 2024

Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"

Python 344 28 Updated Jun 28, 2024

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

Python 200 69 Updated Jul 25, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,591 96 Updated Jul 26, 2024

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

Python 6,323 437 Updated Jul 26, 2024

Imitate OpenAI with Local Models

Python 80 9 Updated Jun 11, 2024

We write your reusable computer vision tools. 💜

Python 18,147 1,396 Updated Jul 26, 2024

Kuwa GenAI OS: An open, free, secure, and privacy-focused Generative-AI Operating System.

JavaScript 80 12 Updated Jul 26, 2024

Firebase for AI Agents: Open-source backend platform that puts powerful generative models at the core of your database. With managed memory and RAG capabilities, developers can easily build AI agen…

Python 71 1 Updated Jul 23, 2024

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 12,649 1,237 Updated Jul 26, 2024

Declarative AI Pipelines

Python 17 Updated Jun 24, 2024

RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry

Python 3,036 233 Updated Jul 26, 2024

Train Models Contrastively in PyTorch

Python 483 36 Updated Jul 18, 2024

Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️

C++ 518 31 Updated Sep 1, 2023

Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_uring ☎️

C 1,099 39 Updated May 22, 2024

Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & A…

C 822 42 Updated Jul 11, 2024

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Python 8,075 569 Updated Jul 26, 2024

This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Large Language Models with an OpenAI compatible vLLM server.

Python 10 3 Updated Jul 20, 2024