
understanding context length behaviors #1642

Open
simran-arora opened this issue Mar 27, 2024 · 1 comment
Labels
asking questions For asking for clarification / support on library usage.

Comments

@simran-arora
Contributor

Hi, I have a quick question. What is the behavior of the harness if the input examples exceed the model's sequence length in long-document tasks (and how do few-shot examples influence this)? Thank you!

@haileyschoelkopf
Contributor

haileyschoelkopf commented Apr 7, 2024

Hi!

The current (intended) behavior is to simply left-truncate inputs so they won't exceed the model's max length.

  • for perplexities, no context is discarded--we chunk into non-overlapping blocks of length (max length), with a BOS token prefixed to the first chunk. This is equivalent to strided PPL evaluation with a stride of (max length)--we may expose the option for shorter sliding-window sizes / strides soon.
  • for generative tasks, we left-truncate the context to (max length - max tokens to generate), so that the prompt plus generation fits within (max length)
  • for multiple-choice and loglikelihood-type tasks (e.g. LAMBADA), we left-truncate the (context + continuation) to (max length)
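To make the three rules concrete, here is a minimal illustrative sketch (not the harness's actual code) operating on plain token-ID lists; the names `max_length`, `max_gen_toks`, and `bos_id` are assumptions standing in for the model's real limits:

```python
# Illustrative sketch of the three truncation rules described above.
# Works on plain lists of token IDs; not the harness's actual implementation.

def chunk_for_perplexity(tokens, max_length, bos_id):
    """Split into non-overlapping blocks of `max_length`, with a BOS
    token prefixed to the first chunk (stride == max_length)."""
    tokens = [bos_id] + tokens
    return [tokens[i:i + max_length] for i in range(0, len(tokens), max_length)]

def truncate_for_generation(context, max_length, max_gen_toks):
    """Left-truncate the context so that context + generated tokens
    fit within max_length."""
    return context[-(max_length - max_gen_toks):]

def truncate_for_loglikelihood(context, continuation, max_length):
    """Left-truncate the full (context + continuation) to max_length."""
    full = context + continuation
    return full[-max_length:]
```

Left-truncation keeps the tokens closest to the point of prediction, which is usually what matters for the continuation or generation.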

(PS. this is all described for HFLM, but the other local model impls are meant to match HF as much as possible in behaviors like this.)

There aren't currently any optimizations beyond these for truncating more intelligently. Because models' tokenizers aren't currently exposed to the tasks or to the construction of string inputs, it's a pain to figure out ahead of time what would be truncated and, e.g., provide only the maximum number of few-shot examples that fit while keeping the prefixed task description and few-shot format intact. We'd like to improve this behavior in future, or at minimum make it clear via logging when requests are being truncated!
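The improvement discussed above could look roughly like the following sketch. This is hypothetical, not harness API: `tokenize`, `budget`, and the argument names are all illustrative assumptions about what tokenizer access would enable:

```python
# Hypothetical sketch: with tokenizer access, keep only as many few-shot
# examples as fit in the token budget, always preserving the task
# description and the query. Not part of the harness's current API.

def max_shots_that_fit(description, shots, query, budget, tokenize):
    """Return the largest prefix of `shots` whose token count, together
    with the description and query, stays within `budget` tokens."""
    used = len(tokenize(description)) + len(tokenize(query))
    kept = []
    for shot in shots:
        cost = len(tokenize(shot))
        if used + cost > budget:
            break
        used += cost
        kept.append(shot)
    return kept
```

Taking a prefix (rather than an arbitrary subset) keeps the few-shot ordering deterministic, so results stay reproducible across runs.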

Hope this helps!

@haileyschoelkopf haileyschoelkopf added the asking questions For asking for clarification / support on library usage. label Apr 7, 2024