
Correct Alibi implementation #351

Merged
merged 2 commits into from
Jul 11, 2023
Conversation

daulet
Contributor

@daulet daulet commented Jul 6, 2023

The current implementation doesn't reflect the intended definition of ALiBi. Here is the reference implementation linked from Papers with Code. This was discovered when using ggml on a model trained with the standard definition of ALiBi.
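For reference, the standard ALiBi bias can be sketched in Python roughly as follows. This is a hedged sketch of the published scheme (head-specific slopes forming a geometric sequence, times the query-key distance), not the exact code from either ggml or the reference repo; function names here are illustrative.

```python
import math

def get_alibi_slopes(n_heads):
    """Head-specific slopes: a geometric sequence starting at 2^(-8/n)."""
    def power_of_2_slopes(n):
        start = 2 ** (-8.0 / n)
        return [start ** (i + 1) for i in range(n)]
    if math.log2(n_heads).is_integer():
        return power_of_2_slopes(n_heads)
    # For non-power-of-two head counts, interleave slopes from the two
    # nearest powers of two, as the reference implementation does.
    closest = 2 ** math.floor(math.log2(n_heads))
    return (power_of_2_slopes(closest)
            + power_of_2_slopes(2 * closest)[0::2][: n_heads - closest])

def alibi_bias(n_heads, seq_len):
    """bias[h][i][j] = -slope[h] * (i - j): a linear penalty on the
    distance between query position i and key position j, added to the
    attention scores before softmax (causal masking applied separately)."""
    slopes = get_alibi_slopes(n_heads)
    return [[[-s * (i - j) for j in range(seq_len)]
             for i in range(seq_len)] for s in slopes]
```

For 8 heads this yields slopes 1/2, 1/4, ..., 1/256, so each head attends over a different effective context length.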

MPT and Replit models appear to be using this implementation, but I don't know whether they've been verified to produce correct completions. Alternatively, I can add this as an alibi_classic op to avoid breaking those models.

Tests: I'd love to add them, but it wasn't clear where they fit in the repo.

@ggerganov
Owner

Hi, thanks for finding this. I ran a few tests with MPT and it seems to generate coherent text.

It would be nice to add some tests for that, but we don't have a good way of doing that yet.
We should somehow compare the results between the ggml implementation and the reference Python implementation.

Open to suggestions
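One minimal way to do such a comparison, sketched here as a hypothetical harness (neither the dump format nor the helper exists in the repo): dump the ggml-computed tensor values, load them in Python, and check the element-wise deviation from the reference implementation against a tolerance.

```python
def max_abs_diff(a, b):
    """Largest element-wise difference between two nested lists of floats,
    e.g. a dumped ggml tensor vs. reference Python values."""
    def flat(x):
        if isinstance(x, list):
            for v in x:
                yield from flat(v)
        else:
            yield x
    return max(abs(x - y) for x, y in zip(flat(a), flat(b)))

# A test would then assert max_abs_diff(ggml_values, reference) < 1e-5,
# with the tolerance chosen to absorb quantization/float error.
```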

@ggerganov ggerganov merged commit 2c1536b into ggerganov:master Jul 11, 2023
2 checks passed
@ggerganov
Owner

Ah, I forgot we have a --perplexity option available for mpt.

make -j && ./bin/mpt -m ./models/mpt-7b/ggml-model-q4_0.bin -t 8 --perplexity -f ./wiki.test.raw

Before change

| Chunk | PPL cumulative | PPL chunk   |
|-------|----------------|-------------|
| 1     | 20.51003208    | 20.51003208 |
| 2     | 30.93104099    | 46.64689421 |
| 3     | 39.46465368    | 64.24447954 |
| 4     | 36.14406908    | 27.76663723 |
| 5     | 32.26795364    | 20.49788003 |

After change

| Chunk | PPL cumulative | PPL chunk   |
|-------|----------------|-------------|
| 1     | 20.49744252    | 20.49744252 |
| 2     | 30.92985719    | 46.67197213 |
| 3     | 39.39654819    | 63.91733896 |
| 4     | 36.08996864    | 27.74415508 |
| 5     | 32.21732477    | 20.45979870 |

So the change brings the perplexity down a bit which is good.
Thanks!
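As a reminder of what the numbers above mean, perplexity is the exponential of the mean per-token negative log-likelihood, so lower is better. A minimal sketch (the function name is illustrative, not ggml's code):

```python
import math

def perplexity(nlls):
    """nlls: per-token negative log-likelihoods (natural log).
    Perplexity = exp(mean NLL); a perfect model scores 1.0."""
    return math.exp(sum(nlls) / len(nlls))
```

The "PPL cumulative" column is this quantity over all tokens seen so far, while "PPL chunk" is computed over each chunk alone, which is why the chunk values fluctuate more.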
