Reuse gpt2_graph #697
The nodes in the graph have the same types on each invocation (with some exceptions), but the tensor dimensions change. This is mainly because the sizes of the attention tensors are a function of the number of tokens, and the number of input tokens in the batch can also differ between calls. In certain scenarios, it could be beneficial to pre-init the graphs (let's say, a graph for each possible |
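To make the point above concrete, here is a minimal sketch of what "pre-init and reuse" could look like. It does not use the ggml API: `GraphPlan`, `build_plan`, and `PlanCache` are hypothetical names, and the attention score shape `[n_past + n_tokens, n_tokens]` is an assumption about how the attention tensors scale with the token counts. The idea is only that a graph can be reused when the values that determine its tensor shapes are the same.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a built compute graph; this is not a ggml type.
struct GraphPlan {
    int n_tokens;                   // tokens processed in this call
    int n_past;                     // tokens already processed in earlier calls
    std::vector<int64_t> kq_shape;  // assumed attention score shape per head
};

// Hypothetical builder: the node types are fixed, but the shapes depend on the inputs.
GraphPlan build_plan(int n_tokens, int n_past) {
    assert(n_tokens > 0 && n_past >= 0);
    return GraphPlan{n_tokens, n_past,
                     {int64_t(n_past) + n_tokens, int64_t(n_tokens)}};
}

// Cache keyed by the values that determine the tensor shapes.
class PlanCache {
public:
    const GraphPlan & get(int n_tokens, int n_past) {
        const auto key = std::make_pair(n_tokens, n_past);
        auto it = plans_.find(key);
        if (it == plans_.end()) {
            // First time this shape is seen: build the plan once and keep it.
            ++builds_;
            it = plans_.emplace(key, build_plan(n_tokens, n_past)).first;
        }
        return it->second;
    }

    // Number of plans actually built (cache misses).
    int builds() const { return builds_; }

private:
    std::map<std::pair<int, int>, GraphPlan> plans_;
    int builds_ = 0;
};
```

With a cache like this, a prompt of 8 tokens followed by single-token steps would build one plan per distinct `(n_tokens, n_past)` pair, which is why the number of pre-built graphs can grow quickly during generation.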
Thank you!
In the gpt-2 example, I can see that during inference each step invokes the gpt2_eval() function, which in turn invokes gpt2_graph() to recreate the graph.
Why can't we create the graph just once and reuse it?