Suggestion: Interactive Demo (second example code) #31

Open · chatbots opened this issue Mar 3, 2023 · 2 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

chatbots commented Mar 3, 2023

Is it possible to modify the C++ code to create a second example source file that loads the model once, then answers new prompts read from STDIN in a loop?

After backing up the original C++ source file, I modified the code to read a prompt from STDIN in a loop, instead of from argv. The loop ran without errors and generated responses, except that it kept generating new responses to the first prompt read from STDIN, over and over, instead of processing the subsequent prompts.

The funny part is that these unintended results may be useful for prompt engineering in the future, to keep the context. But first, the goal is to save time by avoiding reloading the model for each new prompt in the loop; a sketch of the loop I had in mind follows below. Lastly, this is a suggestion for a second, separate example source file. The first example source file is correct and very useful.
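
For reference, here is a minimal sketch of the loop described above. `Model`, `load_model`, `tokenize`, and `generate` are hypothetical stand-ins (with stub bodies so the sketch compiles), not ggml's actual API; the point is that the model is loaded once while all per-prompt state is rebuilt on each iteration:

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-ins for the example's real model API; these stubs
// only make the sketch compile and are not part of ggml.
struct Model {};
static Model *load_model(const std::string & /*path*/) { return new Model(); }
static std::vector<int> tokenize(const Model *, const std::string &text) {
    return std::vector<int>(text.begin(), text.end());
}
static std::string generate(Model *, const std::vector<int> &tokens) {
    return "(response to " + std::to_string(tokens.size()) + " prompt tokens)";
}

int main(int argc, char **argv) {
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <model-path>\n";
        return 1;
    }

    Model *model = load_model(argv[1]); // expensive: do this exactly once

    std::string prompt;
    while (std::getline(std::cin, prompt)) { // one prompt per line
        if (prompt.empty()) continue;

        // Re-tokenize and rebuild all per-prompt state on every iteration.
        // Carrying the first prompt's token buffer or past-context counter
        // into later iterations would reproduce the "keeps answering the
        // first prompt" behavior described above.
        std::vector<int> tokens = tokenize(model, prompt);
        std::cout << generate(model, tokens) << std::endl;
    }

    delete model;
    return 0;
}
```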

ggerganov added the enhancement and good first issue labels on Mar 6, 2023
koogle (Contributor) commented Jun 28, 2023

Hi @ggerganov, would you be open to a contribution on this, or on a slightly more advanced use case? We basically want to run GGML in an interactive process that accepts new prompts via a socket and writes the responses back.

For demo purposes we could also add the ability to answer multiple prompts via stdin.

If that makes sense, I could have a basic draft of what I am thinking ready this week, so we can figure out whether we are on the right track.

Basically, I want to add an interactive flag that performs the prompting in a loop after the model is loaded once; see the sketch below.
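
A rough sketch of what the flag handling could look like; the `-i`/`--interactive` names and the overall structure are assumptions, not an existing ggml option:

```cpp
#include <cstring>
#include <iostream>
#include <string>

int main(int argc, char **argv) {
    bool interactive = false;
    std::string prompt;

    // Proposed: -i/--interactive switches from one-shot argv prompting
    // to a read-evaluate loop over stdin (flag names are assumptions).
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-i") == 0 ||
            std::strcmp(argv[i], "--interactive") == 0) {
            interactive = true;
        } else {
            prompt = argv[i]; // one-shot prompt, as in the current example
        }
    }

    // ... load the model exactly once here ...

    if (interactive) {
        std::string line;
        while (std::getline(std::cin, line)) {
            // ... evaluate `line` against the already-loaded model ...
        }
    } else {
        // ... evaluate `prompt` once, as the current example does ...
        std::cout << "prompt: " << prompt << std::endl;
    }
    return 0;
}
```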

ggerganov (Owner) commented
I think at some point https://github.com/ggerganov/llama.cpp will start supporting most of the available LLMs (not just LLaMA) so it will serve as a good "interactive" example.

But maybe we can also have a simple interactive mode in ggml. It should be very minimal, to avoid maintenance effort.

CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue Dec 18, 2023
* Apply fixes suggested to build on windows

Issue: ggerganov/llama.cpp#22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Use std::vector in place of stack allocations.
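
For context on the last item: MSVC does not support variable-length arrays, and the usual fix (an assumption about what this commit did, not its actual diff) is to replace the VLA with std::vector:

```cpp
#include <vector>

void fill(int n) {
    // float buf[n];  // VLA: a C99/GNU extension that MSVC rejects

    // Portable replacement: std::vector gives heap-backed,
    // zero-initialized storage of runtime size.
    std::vector<float> buf(n, 0.0f);
    for (int i = 0; i < n; ++i) {
        buf[i] = static_cast<float>(i);
    }
}

int main() { fill(8); }
```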