Stars
Implementation of a local in-memory cache for Triton Inference Server's TRITONCACHE API
ModelScope: bring the notion of Model-as-a-Service to life.
Anbox is a container-based approach to boot a full Android system on a regular GNU/Linux system
Redis Vector Library (RedisVL) interfaces with Redis' vector database for realtime semantic search, RAG, and recommendation systems.
🦄🔒 Awesome list of secrets in environment variables 🖥️
The AI-native open-source embedding database
Generic Helm chart for all kinds of applications
Rapid fuzzy string matching in Python using various string metrics
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
A high-throughput and memory-efficient inference and serving engine for LLMs
🦁 Lion, a new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in PyTorch
Fast and memory-efficient exact attention
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Full-featured and highly configurable SFTP, HTTP/S, FTP/S and WebDAV server - S3, Google Cloud Storage, Azure Blob
Open Source Continuous File Synchronization
aria2 is a lightweight multi-protocol & multi-source, cross-platform download utility operated from the command line. It supports HTTP/HTTPS, FTP, SFTP, BitTorrent and Metalink.
Provision infrastructure and install OpenShift 3.
⚡ LangChain apps in production using Jina & FastAPI
Vertica dialect for SQLAlchemy using the vertica-python client