Charlie Cheng-Jie Ji CharlieJCJ

Data Curation @bespokelabsai | Gorilla LLM | UC Berkeley '25 CS + DS | Research Assistant @ucbsky | Ex AWS, Tencent, TA @DS-100

53 followers · 292 following

Highlights

Developer Program Member
Pro

Lists (7)

Stars

lmarena / PPE

Jupyter Notebook · 22 stars · 5 forks · Updated Oct 27, 2024

HKUDS / LightRAG

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python · 6,846 stars · 752 forks · Updated Oct 31, 2024

yidingjiang / ado

The repository contains code for Adaptive Data Optimization

Python · 16 stars · Updated Oct 15, 2024

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python · 4,073 stars · 202 forks · Updated Oct 29, 2024

ScalerLab / JudgeBench

Python · 26 stars · Updated Oct 17, 2024

Yale-LILY / SummEval

Resources for the "SummEval: Re-evaluating Summarization Evaluation" paper

Python · 369 stars · 42 forks · Updated Jun 23, 2024

LargeWorldModel / ElasticTok

ElasticTok: Adaptive Tokenization for Image and Video

Python · 28 stars · Updated Oct 30, 2024

fastapi / typer

Typer, build great CLIs. Easy to code. Based on Python type hints.

Python · 15,692 stars · 668 forks · Updated Nov 1, 2024

alon-albalak / data-selection-survey

A Survey on Data Selection for Language Models

175 stars · 11 forks · Updated Oct 13, 2024

mlfoundations / dclm

DataComp for Language Models

HTML · 1,149 stars · 104 forks · Updated Oct 27, 2024

wasiahmad / Awesome-LLM-Synthetic-Data

A reading list on LLM based Synthetic Data Generation 🔥

687 stars · 41 forks · Updated Oct 21, 2024

ucbepic / docetl

A system for agentic LLM-powered data processing and ETL

Python · 1,199 stars · 108 forks · Updated Nov 2, 2024

bespokelabsai / bespokelabs

3 stars · Updated Oct 14, 2024

Liyan06 / MiniCheck

MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]

Python · 98 stars · 9 forks · Updated Oct 13, 2024

prometheus-eval / prometheus-eval

Evaluate your LLM's response with Prometheus and GPT4 💯

Python · 787 stars · 49 forks · Updated Sep 9, 2024

ollama / ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go · 95,951 stars · 7,620 forks · Updated Nov 2, 2024

langchain-ai / langgraph

Build resilient language agents as graphs.

Python · 6,453 stars · 1,027 forks · Updated Nov 2, 2024

hkust-nlp / AgentBoard

An Analytical Evaluation Board of Multi-turn LLM Agents

SAS · 242 stars · 25 forks · Updated May 20, 2024

SalesforceAIResearch / xLAM

Python · 305 stars · 24 forks · Updated Sep 26, 2024

tencent-ailab / persona-hub

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Python · 854 stars · 58 forks · Updated Sep 25, 2024

IntelLabs / RAG-FiT

Framework for enhancing LLMs for RAG tasks using fine-tuning.

Python · 497 stars · 36 forks · Updated Oct 15, 2024

pytorch / torchchat

Run PyTorch LLMs locally on servers, desktop and mobile

Python · 3,339 stars · 213 forks · Updated Nov 2, 2024

huggingface / peft

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python · 16,290 stars · 1,604 forks · Updated Nov 1, 2024

cvrve / Summer2025-Internships

Collection of Summer 2025 tech internships!

3,114 stars · 109 forks · Updated Nov 2, 2024

Unstructured-IO / unstructured

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

HTML · 8,985 stars · 740 forks · Updated Nov 1, 2024

withfig / autocomplete

IDE-style autocomplete for your existing terminal & shell

TypeScript · 24,580 stars · 5,492 forks · Updated Nov 1, 2024

OpenAutoCoder / Agentless

Agentless🐱: an agentless approach to automatically solve software development problems

Python · 700 stars · 84 forks · Updated Oct 29, 2024

bigcode-project / bigcodebench

BigCodeBench: Benchmarking Code Generation Towards AGI

Python · 217 stars · 22 forks · Updated Nov 2, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python · 1,740 stars · 113 forks · Updated Oct 30, 2024

ethz-spylab / agentdojo

A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

Jupyter Notebook · 56 stars · 9 forks · Updated Oct 29, 2024

Lists (7)

AIGC 3D generation

Education

Gradient Methods

Leetcode

MusicGen/TTS

🚀 My stack

Red Teaming
