interactive_conditional_samples not checking if prompt length is greater than hparams.n_ctx / 2 #121
Comments
yeah feel free to send a PR!
Been busy learning RL, but for sure when I catch a break! Did you see my pull req #119? Is it just not worth the risk of a bug? Not totally sure, but from my reading/mental diagram I think it has literally ZERO effect on the output of the model.
I'll actually test it when I have time. It runs perfectly fine, I just haven't tested it on the same input and seeds.
Or you could test it better; you have more resources, I think.
reviewed! thanks for the work on it! would be great if you could test it (we're pretty busy around here...)
sure, will get to it! Though I imagine you don't actually have the infra set up to test it better than me either.
It may also crash. There's an error I'm getting when the input was ~460 words (GPT-2). I read it may be related to the vocabulary, but I'm not sure that's the case here:

```
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)
```
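That error is an out-of-range embedding lookup: each token's position is used as a row index into the position-embedding table, which for this model size has exactly 1024 rows, so only indices 0..1023 are valid. A minimal pure-Python illustration of the failure mode (the table here is a tiny stand-in, not the real weights):

```python
# Stand-in for the wpe (word position embedding) table: one row per
# position, n_ctx rows total. Only indices 0 .. n_ctx - 1 are valid.
n_ctx = 1024
wpe = [[0.0] * 4 for _ in range(n_ctx)]  # tiny embedding dim, for illustration

def lookup_position(position: int):
    """Mimic the gather that TensorFlow performs for position embeddings."""
    if not (0 <= position < n_ctx):
        # TensorFlow reports the equivalent condition as:
        # InvalidArgumentError: indices[0,0] = 1024 is not in [0, 1024)
        raise IndexError(f"indices[0,0] = {position} is not in [0, {n_ctx})")
    return wpe[position]

lookup_position(1023)      # last valid position: fine
try:
    lookup_position(1024)  # one past the end: fails like the reported crash
except IndexError as e:
    print(e)
```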
Wouldn't the fix actually be to count the encoded tokens and check that they don't exceed the length variable? We should base it on the encoded tokens, not the inputted text itself. I found that when I printed the value, exactly 512 tokens worked OK and >512 crashed. However, I have somewhat modified code, so I can just give some hints:
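A sketch of the kind of check being suggested, counting encoded tokens rather than raw text. The `encode` function here is a hypothetical stand-in for the repo's BPE encoder (the real one typically emits more tokens than a whitespace split), and `length` / `n_ctx` mirror the sample script's variables; treat this as an assumption, not the actual patch:

```python
def encode(raw_text: str):
    # Hypothetical stand-in for the real BPE encoder.
    return raw_text.split()

def check_prompt(raw_text: str, n_ctx: int = 1024, length: int = 512):
    """Reject prompts whose *encoded* token count leaves no room for
    `length` generated tokens within the model's n_ctx window."""
    context_tokens = encode(raw_text)
    max_prompt_tokens = n_ctx - length
    if len(context_tokens) > max_prompt_tokens:
        raise ValueError(
            f"Prompt is {len(context_tokens)} tokens; at most "
            f"{max_prompt_tokens} allowed when generating {length} tokens "
            f"with n_ctx={n_ctx}."
        )
    return context_tokens

check_prompt("hello " * 512)  # exactly 512 tokens: passes, matching the report
```

Checking the encoded count (rather than characters or words) matters because BPE token counts don't map cleanly onto word counts, which is why the crash showed up around "~460 words".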
This fixed the problem for me and it's no longer crashing! I modified the original; pull request: #142
If the length is greater, this will break the model because the word position embedding (the wpe tensor) is not large enough. Should a check be added?
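The arithmetic behind the breakage, under the assumption that the script generates `length` new tokens after the prompt: position indices run from 0 up to prompt + length - 1, and every index must stay below n_ctx. That is exactly why a prompt of 512 tokens is the largest that works with the default length of 512:

```python
def max_position_index(prompt_tokens: int, length: int) -> int:
    # The final generated token sits at this position index.
    return prompt_tokens + length - 1

n_ctx, length = 1024, 512
print(max_position_index(512, length))  # 1023: last valid row of wpe, OK
print(max_position_index(513, length))  # 1024: not in [0, 1024), crashes
```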