
globally normalized models #960

Open
denizyuret opened this issue Nov 3, 2023 · 1 comment

Comments

@denizyuret

Globally normalized models score sequences using the sum of unnormalized logits, as opposed to locally normalized ones, which take a log_softmax at each token position before summing to compute the sequence score. Globally normalized models are strictly more expressive than locally normalized ones (i.e. they can express a superset of probability distributions over sequences). It would take a one-line change in lm-evaluation-harness/lm_eval/base.py to allow evaluation of globally normalized models:

            # multi_logits = F.log_softmax(
            #     self._model_call(batched_inps), dim=-1
            # ).cpu()  # [batch, padding_length, vocab]
            multi_logits = self._model_call(batched_inps).cpu()

I just wanted to start a discussion to see if there is any interest and if there is a way we could make this an option for the evaluator. I can submit a pull request. (And I have some globally normalized models I'd like to share on hf-leaderboard ;).
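For illustration, the difference between the two scoring schemes can be sketched in plain Python with toy numbers (the function name and inputs below are hypothetical, not part of the harness):

```python
import math

def sequence_scores(logits, tokens):
    """Score one token sequence under both normalization schemes.

    logits: list of per-position vectors of unnormalized scores
    tokens: list of target token ids at each position
    Returns (locally_normalized_score, globally_normalized_score).
    """
    local_score = 0.0   # sum of per-position log_softmax values (a log-probability)
    global_score = 0.0  # sum of raw, unnormalized logits
    for vec, tok in zip(logits, tokens):
        # log of the local partition function at this position
        log_z = math.log(sum(math.exp(x) for x in vec))
        local_score += vec[tok] - log_z
        global_score += vec[tok]
    return local_score, global_score

# Toy example: 2 positions, vocabulary of size 3
logits = [[2.0, 0.5, -1.0], [0.1, 1.2, 0.3]]
tokens = [0, 1]
local, glob = sequence_scores(logits, tokens)
# glob is simply 2.0 + 1.2 = 3.2; local is always <= 0 since it is a log-probability
```

The proposed patch corresponds to skipping the `vec[tok] - log_z` normalization and using the raw `vec[tok]` sum directly.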

@StellaAthena
Member

Sure, I don't see why we shouldn't include this.
