Refactor Inference Engine extensions to Backend #2783

Closed
Tracked by #2782 ...
louis-jan opened this issue Apr 22, 2024 · 1 comment

Description:

All current inference engines primarily run within the browser host. We now need to refactor them to operate on the backend. This change will let UI components make requests with only minimal model knowledge, such as the model's ID and the messages.

The client can simply send an OpenAI-compatible request, with no extra parameters and no detailed knowledge of how the model is run. Models will be loaded on the server side using the default settings read from model.json.
See: #2758
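
For illustration, a minimal sketch of what the client-side call could look like after the refactor. The base URL, route, and model ID below are placeholders assumed for the example, not values specified in this issue.

```typescript
// Minimal client-side sketch, assuming the backend exposes an OpenAI-compatible
// /chat/completions route; the base URL and model ID are placeholders.
async function sendChat(): Promise<void> {
  const response = await fetch("http://localhost:1337/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      // Only the model ID and messages come from the UI; loading parameters
      // are resolved from model.json on the server side.
      model: "some-model-id",
      messages: [{ role: "user", content: "Hello!" }],
    }),
  });
  const completion = await response.json();
  console.log(completion.choices[0].message.content);
}
```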

This approach would also help scale the model hub, since clients can retrieve the latest supported model list from the backend, which can be updated dynamically.
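
As a hedged sketch of that idea, the UI could refresh its model list from the backend rather than bundling model metadata. The /v1/models route below mirrors the OpenAI API shape and is an assumption, not an endpoint confirmed by this issue.

```typescript
// Hedged sketch: fetch the dynamically updatable model list from the backend.
async function listSupportedModels(baseUrl = "http://localhost:1337/v1"): Promise<string[]> {
  const res = await fetch(`${baseUrl}/models`);
  const { data } = await res.json();
  // Each entry is expected to carry at least an id the UI can display
  // and pass back in chat/completion requests.
  return data.map((m: { id: string }) => m.id);
}
```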

```mermaid
graph LR
  UI[UI Components] -->|chat/completion| Backend
  Backend -->|retrieve| Model[model.json]
  Model -->|settings| Load[Model Loader]
  Load -->|inference| Inference[Inference Engines]
  Inference -->|Response| UI
```
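
A rough server-side sketch of the flow in the diagram above: retrieve model.json, load the model with its default settings, then run inference. `loadModel` and `runInference` are stand-ins for the inference-engine extension surface, and the model.json layout shown is illustrative rather than the actual schema.

```typescript
import { promises as fs } from "node:fs";
import path from "node:path";

// Placeholders for the inference-engine extension surface; not actual Jan APIs.
declare function loadModel(id: string, settings: Record<string, unknown>): Promise<unknown>;
declare function runInference(session: unknown, messages: unknown[]): Promise<unknown>;

// Backend handler sketch: the client supplies only the model ID and messages.
async function chatCompletion(modelId: string, messages: unknown[]) {
  const modelJsonPath = path.join("models", modelId, "model.json");
  const modelJson = JSON.parse(await fs.readFile(modelJsonPath, "utf8"));

  // Default load settings come from model.json, not from the client request.
  const settings: Record<string, unknown> = modelJson.settings ?? {};

  const session = await loadModel(modelId, settings);
  return runInference(session, messages);
}
```
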
@imtuyethan
Contributor

deprecated

@imtuyethan closed this as not planned on Jul 2, 2024