Reuse gpt2_graph #697

Closed
datduonguva opened this issue Jan 14, 2024 · 2 comments


@datduonguva

In the gpt-2 example, I can see that during inference each step calls gpt2_eval(), which in turn calls gpt2_graph() to rebuild the graph.

Why can't we create the graph just once and reuse it?

@ggerganov
Owner

Even though the node types in the graph are the same for each invocation (there are exceptions, though), the tensor dimensions do change. This is mainly because the sizes of the attention tensors are a function of the number of tokens, and the number of input tokens in the batch can also differ between calls. In certain scenarios, it could be beneficial to pre-initialize the graphs at startup (let's say, one graph for each possible n_past with a single new token) to avoid creating them at runtime.
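
To make the shape dependence concrete, here is a minimal sketch that does not use ggml at all. `GraphSketch`, `build_graph_sketch`, and `GraphCache` are hypothetical names introduced only for illustration. The sketch assumes the attention score matrix has one entry per (key, query) pair, so its size is (n_past + n_tokens) by n_tokens, and it shows the pre-initialization idea as a table of graphs keyed by n_past for a single new token.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a compute graph. It records only the shapes that
// depend on the call arguments; it is NOT the ggml graph type.
struct GraphSketch {
    int64_t n_past;    // tokens already processed (assumed to sit in the KV cache)
    int64_t n_tokens;  // new tokens in this batch
    int64_t kq_keys;   // keys each query attends to: n_past + n_tokens
    int64_t kq_queries; // number of queries: n_tokens
};

// Builds the shape record for one evaluation step. Two calls with different
// (n_past, n_tokens) produce different shapes, which is why a graph built for
// one step cannot be reused as-is for the next.
GraphSketch build_graph_sketch(int64_t n_past, int64_t n_tokens) {
    return GraphSketch{n_past, n_tokens, n_past + n_tokens, n_tokens};
}

// Pre-builds one graph per possible n_past for single-token decoding, so that
// no graph has to be constructed at runtime (illustrative design only).
class GraphCache {
public:
    explicit GraphCache(int64_t n_ctx) {
        for (int64_t n_past = 0; n_past < n_ctx; ++n_past) {
            graphs_.emplace(n_past, build_graph_sketch(n_past, 1));
        }
    }

    // Returns the pre-built graph for the given n_past; throws
    // std::out_of_range if n_past is outside [0, n_ctx).
    const GraphSketch & get(int64_t n_past) const { return graphs_.at(n_past); }

    std::size_t size() const { return graphs_.size(); }

private:
    std::unordered_map<int64_t, GraphSketch> graphs_;
};
```

The trade-off in this sketch is memory for setup time: the cache holds one entry per context position, and it only covers the single-token case, so a prompt with several tokens would still need a graph built on demand.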

@datduonguva
Author

Thank you!
