  • Republic of Korea

Starred repositories

Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

Python 12,902 1,505 Updated Oct 16, 2024

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 11,625 740 Updated Oct 14, 2024

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 1,953 155 Updated Oct 3, 2024

Build real-time multimodal AI applications 🤖🎙️📹

Python 3,429 336 Updated Oct 16, 2024

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

Rust 8,090 175 Updated Oct 14, 2024

Code-based QR code generator

JavaScript 2,282 57 Updated Oct 13, 2024

React app for inspecting, building and debugging with the Realtime API

JavaScript 1,699 539 Updated Oct 7, 2024

TypeScript 1,631 68 Updated Oct 16, 2024

This node implements OCR text recognition and is primarily based on Easy-OCR.

Python 22 4 Updated Aug 5, 2024

The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.

Python 53,331 5,650 Updated Oct 16, 2024

A tool for Python developers to easily debug the HTTP(S) client requests in a Python program.

Python 569 14 Updated Oct 16, 2024

Things you can do with the token embeddings of an LLM

Python 1,276 40 Updated Oct 14, 2024

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Python 16,921 1,163 Updated Oct 15, 2024

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

Python 5,772 469 Updated Jul 11, 2024

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,417 449 Updated Oct 13, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 2,878 260 Updated Oct 16, 2024

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,656 50 Updated Sep 11, 2024

A JavaScript library that brings vector search and RAG to your browser!

TypeScript 54 7 Updated Aug 15, 2024
Swift 253 18 Updated Oct 1, 2024

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.

Python 20,097 2,775 Updated Oct 16, 2024

A curated list of awesome open-source libraries for production LLM

328 26 Updated Sep 2, 2024

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 93,482 7,382 Updated Oct 15, 2024

An AI personal tutor built with Llama 3.1

TypeScript 1,322 192 Updated Aug 1, 2024

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Python 19,578 1,963 Updated Oct 16, 2024

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.

Python 2,094 135 Updated Aug 21, 2024

The all-in-one solution for RAG. Build, scale, and deploy state of the art Retrieval-Augmented Generation applications

Python 3,425 259 Updated Oct 16, 2024

Automatic Korean word spacing with Python

Python 402 118 Updated Jul 4, 2024

A programming framework for agentic AI 🤖

C# 31,992 4,655 Updated Oct 16, 2024

This repository is an implementation of inferring the PaliGemma Vision Language Model on Android using Hugging Face-Gradio Client API for tasks such as zero-shot object detection, image captioning …

Kotlin 16 2 Updated Oct 10, 2024

A third-party implementation of the paper Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection.

Python 409 64 Updated Jun 25, 2024