- Berkeley, CA
- https://charliejcj.github.io/
- https://huggingface.co/CharlieJi
- @charlie_jcj02
Stars
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
The repository contains code for Adaptive Data Optimization
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper
ElasticTok: Adaptive Tokenization for Image and Video
Typer, build great CLIs. Easy to code. Based on Python type hints.
A Survey on Data Selection for Language Models
A reading list on LLM based Synthetic Data Generation 🔥
A system for agentic LLM-powered data processing and ETL
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]
Evaluate your LLM's response with Prometheus and GPT4 💯
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
Build resilient language agents as graphs.
An Analytical Evaluation Board of Multi-turn LLM Agents
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
Framework for enhancing LLMs for RAG tasks using fine-tuning.
Run PyTorch LLMs locally on servers, desktop and mobile
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Collection of Summer 2025 tech internships!
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
IDE-style autocomplete for your existing terminal & shell
Agentless🐱: an agentless approach to automatically solve software development problems
BigCodeBench: Benchmarking Code Generation Towards AGI
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.