Request for Repo: "Open Source, Daily Auto-Generated SOTA LLM Model Benchmarks" #505

Marviel · 2023-03-29T15:33:35Z

✨ Open Source Daily Auto-Generated SOTA LLM Model-Comparisons Repository

(Sorry for posting here, just not sure where to ask)

Does ^^ this already exist?

If so, where?

Ecosystem Graphs does not appear to have benchmarks.

It's hard to keep up.

There are now so many open source LLM/AI repositories that it is becoming impossible for basically any human to keep pace with them.

There are many awesome "snippets" posted in public channels, but not all models hold up well and generalize in practice.

💭 Feature Requests

Here's what I'd like to see: (Please add your own in comments)

License

AGPLv3 (GNU Affero General Public License v3.0)

Update Cadence

Daily evaluation runs, auto-updating the GitHub repo with up-to-date evaluation results as described below.

Eval Result DB

Description

Each time the daily cron job runs, the evaluations should be written into a database.

Properties

The output of the cron job should be a set of entries in the EvalDB, each showing:
(1) The prompt / input
(2) The model, including its known current parameters and limitations at the time of the eval run
(3) The output
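
To make the three properties above concrete, here is a minimal sketch of what one EvalDB entry and the daily write step could look like. Everything named here is an assumption for illustration, not an existing project: the `eval_results` table, the `EvalRecord` fields, and the idea that each model is just a Python callable taking a prompt and returning text. SQLite is used only because it ships with Python.

```python
import json
import sqlite3
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EvalRecord:
    """One EvalDB entry (hypothetical shape): prompt, model snapshot, output."""
    prompt: str
    model_name: str
    model_params: dict  # known parameters/limitations at the time of the run
    output: str
    run_at: str  # ISO 8601 timestamp of the eval run


def init_db(conn):
    # Hypothetical table layout; one row per (run, model, prompt).
    conn.execute(
        """CREATE TABLE IF NOT EXISTS eval_results (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               run_at TEXT NOT NULL,
               model_name TEXT NOT NULL,
               model_params TEXT NOT NULL,
               prompt TEXT NOT NULL,
               output TEXT NOT NULL
           )"""
    )


def write_record(conn, rec):
    # The parameter snapshot is stored as JSON so it can change between runs.
    conn.execute(
        "INSERT INTO eval_results (run_at, model_name, model_params, prompt, output) "
        "VALUES (?, ?, ?, ?, ?)",
        (rec.run_at, rec.model_name,
         json.dumps(rec.model_params, sort_keys=True), rec.prompt, rec.output),
    )
    conn.commit()


def run_daily_eval(conn, models, prompts, now=None):
    """Run every prompt against every model and store the results.

    `models` maps a model name to (params_dict, generate_callable).
    Returns the number of rows written.
    """
    run_at = now or datetime.now(timezone.utc).isoformat()
    count = 0
    for name, (params, generate) in models.items():
        for prompt in prompts:
            write_record(conn, EvalRecord(prompt, name, params, generate(prompt), run_at))
            count += 1
    return count
```

A cron job (or any scheduler) would call `run_daily_eval` once a day with real model clients in place of the callables. Storing the parameter snapshot per row is what lets old results stay interpretable after a model's limits change.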

Up-To-Date Model DB

Model DB Description

There should be an up-to-date database of models, displayed in the README.md.
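
As a rough sketch of how a model database could be displayed in the README.md, the snippet below renders a list of model rows as a Markdown table and swaps it into the README between two marker comments. The marker strings, the function names, and the dict-based row format are all assumptions made up for this example.

```python
# Hypothetical markers delimiting the auto-generated section of README.md.
START = "<!-- MODEL_DB:START -->"
END = "<!-- MODEL_DB:END -->"


def render_model_table(models, columns):
    """Render a list of dicts as a Markdown table; missing values show as '?'."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    rows = [
        "| " + " | ".join(str(m.get(c, "?")) for c in columns) + " |"
        for m in models
    ]
    return "\n".join([header, divider, *rows])


def update_readme(readme_text, table):
    """Replace the text between the markers, or append the section if absent."""
    if START not in readme_text or END not in readme_text:
        return readme_text.rstrip("\n") + f"\n\n{START}\n{table}\n{END}\n"
    before, _, rest = readme_text.partition(START)
    _, _, after = rest.partition(END)
    return f"{before}{START}\n{table}\n{END}{after}"
```

The daily job would read README.md, call `update_readme`, write the file back, and commit it. Keeping the generated part between markers leaves the hand-written parts of the README untouched.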

Model DB Properties

This should include the following:

API or Self-Hosted

Whether a model is reached through an API or self-hosted is critical for both speed and price.

Modes & Mode Parameters

- Text
  - Chat vs. Standard Completion
  - Cost-Per-Token
  - Average Speed-Per-Token
  - Context Length
  - Unicode-Support (most models?)
- Image
  - Max Context Size
  - Cost-Per-Pixel(?)
- Video
  - Max Context Length
  - Max Input
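
One possible way to pin down the hosting and mode properties listed above is a small typed schema. This is only a sketch: the class names, field names, and units (USD, seconds) are assumptions, and fields the list marks with a question mark are left optional on purpose.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Hosting(Enum):
    API = "api"
    SELF_HOSTED = "self-hosted"


@dataclass
class TextMode:
    chat: bool  # True for chat models, False for standard completion
    cost_per_token_usd: Optional[float] = None
    avg_seconds_per_token: Optional[float] = None
    context_length: Optional[int] = None
    unicode_support: Optional[bool] = None


@dataclass
class ImageMode:
    max_context_size: Optional[int] = None
    cost_per_pixel_usd: Optional[float] = None  # uncertain in the request


@dataclass
class VideoMode:
    max_context_length: Optional[int] = None
    max_input: Optional[int] = None


@dataclass
class ModelEntry:
    name: str
    hosting: Hosting
    text: Optional[TextMode] = None  # None means the mode is not supported
    image: Optional[ImageMode] = None
    video: Optional[VideoMode] = None


def estimate_text_cost(entry, n_tokens):
    """Estimated USD cost for n_tokens, or None if the price is unknown."""
    if entry.text is None or entry.text.cost_per_token_usd is None:
        return None
    return entry.text.cost_per_token_usd * n_tokens
```

Making each mode optional lets one entry describe a text-only model and another a multimodal one without separate tables, and unknown values stay explicitly unknown instead of defaulting to zero.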

Included Training Datasets

Tags

A free-form field for anything that doesn't fit into the above schema.

Does Something Like This Exist?

?????

What else do we need?

?????

I'm trying to make something to fill this niche myself and will link here shortly.

SierotkaM · 2023-03-29T15:51:50Z

On Wed, 29 Mar 2023, 17:33, Luke Bechtel ***@***.***> wrote:

…

✨ Open Source Daily Auto-Generated SOTA LLM Model-Comparisons Repository

It's hard to keep up.

All the open source LLM/AI repositories are becoming impossible for basically any human to keep pace with.

There are many awesome "snippets" that are posted in public channels, but not all models hold up well and generalize after practice.

💭 Feature Requests

Here's what I'd like to see: (Please add your own in comments)

License
- GPLv3.0 Affero

Update Cadence
- Daily Evaluation Runs -- auto-updating the Github Repo with up-to-date Evaluation Results as described below.

Eval Result DB

Description

Each time the cron job is run (daily) the evaluations should be written into a database.

Properties

The Output of the Cron Job should be a set of entries into the EvalDB, showing:
(1) The prompt / Input
(2) The model, *including its known current parameters and limitations at the time of the eval run*
(3) The output

Up-To-Date Model DB

Model DB Description

There should be an updated database of models which is displayed on the README.md

Model DB Properties

This should include the following:

API or Self-Hosted

This is critical for both speed and price reasons.

Modes & Mode Parameters
- Text
  - Chat vs. Standard Completion
  - Cost-Per-Token
  - Average Speed-Per-Token
  - Context Length
  - Unicode-Support (most models?)
- Image
  - Max Context Size
  - Cost-Per-Pixel(?)
- Video
  - Max Context Length
  - Max Input

Included Training Datasets

Tags

A free-form field for anything that doesn't fit into the above schema.

Does Something Like This Exist?

?????

What else do we need?

?????

I'm trying to make something to fill this niche myself and will link here shortly.

Marviel changed the title ~~RFC: "Open Source Daily Auto-Generated SOTA LLM Model-Comparisons Repository"~~ Request for Repo: "Open Source Daily Auto-Generated SOTA LLM Model-Comparisons Repository" Mar 29, 2023

Marviel changed the title ~~Request for Repo: "Open Source Daily Auto-Generated SOTA LLM Model-Comparisons Repository"~~ Request for Repo: "Open Source Daily Auto-Generated SOTA LLM Model-Comparisons" Mar 29, 2023

Marviel changed the title ~~Request for Repo: "Open Source Daily Auto-Generated SOTA LLM Model-Comparisons"~~ Request for Repo: "Open Source, Daily Auto-Generated SOTA LLM Model Benchmarks" Mar 29, 2023
