
Negative perplexity values #1595

Open
shikhar-srivastava opened this issue Mar 17, 2024 · 4 comments
Labels: asking questions (for asking for clarification / support on library usage)

shikhar-srivastava commented Mar 17, 2024

Hey folks,

So, the perplexity values on a per-sample/doc basis are ALL negative.
Can someone explain why this is?

This is with the --log_samples option.

Command:

lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
    --tasks lambada_openai \
    --device cuda:0 \
    --batch_size 32 \
    --log_samples \
    --output_path test_nlu_eval/

Sample output:

{
    "doc_id": 0,
    "doc": {
      "text": "In my palm is a clear stone, and inside it is a small ivory statuette. A guardian angel.\n\n\"Figured if you're going to be out at night getting hit by cars, you might as well have some backup.\"\n\nI look at him, feeling stunned. Like this is some sort of sign. But as I stare at Harlin, his mouth curved in a confident grin, I don't care about signs"
    },
    "target": " signs",
    "arguments": [
      [
        "In my palm is a clear stone, and inside it is a small ivory statuette. A guardian angel.\n\n\"Figured if you're going to be out at night getting hit by cars, you might as well have some backup.\"\n\nI look at him, feeling stunned. Like this is some sort of sign. But as I stare at Harlin, his mouth curved in a confident grin, I don't care about",
        " signs"
      ]
    ],
    "resps": [
      [
        [
          -8.487064361572266,
          false
        ]
      ]
    ],
    "filtered_resps": [
      [
        -8.487064361572266,
        false
      ]
    ],
    "perplexity": -8.487064361572266,
    "acc": 0
  },
shikhar-srivastava changed the title from "Negative Perplexity values" to "Negative perplexity values" on Mar 17, 2024
haileyschoelkopf added the "asking questions" label on Mar 17, 2024
haileyschoelkopf self-assigned this on Mar 17, 2024
haileyschoelkopf (Contributor) commented:

Hi! Sorry, I've been meaning to document more clearly the sample log formats and the semantic meaning of per-sample metrics such as perplexity.

Here the per-sample "perplexity" values are loglikelihoods of the target string for each document, and the list of these loglikelihoods goes through

@register_aggregation("perplexity")
def perplexity(items):
    return math.exp(-mean(items))

to turn it into the dataset-level perplexity. Sorry for the confusion!
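
As a quick worked illustration of that aggregation: the first loglikelihood below is the one logged in the sample above, and the other two values are made up for this example.

import math

# Per-sample values logged by --log_samples are target-string loglikelihoods.
loglikelihoods = [-8.487064361572266, -5.2, -6.9]

# Dataset-level perplexity = exp of the negated mean loglikelihood,
# so it is always positive even though every logged value is negative.
ppl = math.exp(-sum(loglikelihoods) / len(loglikelihoods))
print(ppl)  # ~956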

shikhar-srivastava (Author) commented Mar 17, 2024

Thanks for clarifying!
Another great thing to add would be a [predicted word] along with the [target word] to the per-sample outputs.

Any quick ways to get that too?

haileyschoelkopf (Contributor) commented:

Ah, unfortunately not at the moment, though you can tell from the 0/1 logged for acc on lambada whether the word was correctly predicted.
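
For example, one could post-process the logged samples to pull out the incorrectly predicted ones. A minimal sketch; the sample file name and its exact format depend on your lm-eval version and --output_path, so treat both as assumptions:

import json

# Hypothetical path: --log_samples writes per-task sample files under --output_path.
with open("test_nlu_eval/samples_lambada_openai.jsonl") as f:
    samples = [json.loads(line) for line in f]

# acc == 0 means the target word was not predicted correctly.
wrong = [s for s in samples if s["acc"] == 0]
print(f"{len(wrong)}/{len(samples)} samples were predicted incorrectly")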

shikhar-srivastava (Author) commented:

Ah, I see. Unfortunately, I need access to the predicted word in the per-sample output.
How would you suggest going about that?
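
One workaround outside lm-eval is to re-run the model on the logged context and take the greedy next-token prediction yourself. A minimal sketch with Hugging Face transformers, assuming the same pythia-160m checkpoint from the command above; note that a multi-token target word would need repeated decoding steps rather than a single argmax:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Same checkpoint as in the command above.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m", revision="step100000")

# Tail of the logged context; in practice use the full string from "arguments".
context = ("But as I stare at Harlin, his mouth curved in a "
           "confident grin, I don't care about")

inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy prediction for the single token following the context.
next_id = logits[0, -1].argmax()
print(tokenizer.decode(next_id))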
