- gradient.ai @Preemo-Inc
- San Francisco
- UTC -07:00
- michaelfeil.eu
- in/michael-feil
- @feilsystem
Stars
A Python tool to enforce dependencies, using modular architecture 🌎 Open source 🐍 Installable via pip 🔧 Able to be adopted incrementally ⚡ Implemented with no runtime impact ♾️ Interoperable with…
freshworksinc / freddy-infinity
Forked from michaelfeil/infinity
Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
Triton implementation of FlashAttention2 that adds Custom Masks.
ffdfo / embed
Forked from michaelfeil/embed
A stable, fast and easy-to-use inference library with a focus on a sync-to-async API
Website Speed and Performance Optimization and monitoring
Lightweight ML model proxy and autoscaler for kubernetes
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging SWAR and SIMD on Arm Neon and x86 AVX2 & AVX-512-capable chips to accelerate search, sort, edit distances, alignment scores,…
A curated list of amazingly awesome Modal applications, demos, and shiny things. Inspired by awesome-php.
Official repository for the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients"
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
We write your reusable computer vision tools. 💜
Kuwa GenAI OS: An open, free, secure, and privacy-focused Generative-AI Operating System.
Firebase for AI Agents: Open-source backend platform that puts powerful generative models at the core of your database. With managed memory and RAG capabilities, developers can easily build AI agen…
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️
Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_uring ☎️
Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, C, and Swift, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & A…
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Large Language Models with an OpenAI compatible vLLM server.