Add support for PEGASUS model #63

Merged on Jan 18, 2022 (2 commits)

thomas-chong
Contributor

I would like to add support for PEGASUS to model-config.yaml.

PEGASUS is an encoder-decoder model whose implementation is inherited entirely from BartForConditionalGeneration, so its config is similar to the one for BART.

Note: This is my first pull request to an open-source project; I hope it helps!

Added support of PEGASUS model
@jalammar
Owner

Thank you for the contribution @thomas-chong. Were you able to run the model and use it to generate text?

@thomas-chong
Contributor Author

thomas-chong commented Jan 14, 2022

Yes, @jalammar. I implemented it, and it worked perfectly for generating an abstractive summary with PEGASUS.


@jalammar
Owner

@thomas-chong Brilliant! Could you please change the token prefix to the character '▁' (U+2581) instead of the normal underscore '_'?
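
To illustrate why the distinction matters: SentencePiece-style tokenizers, which PEGASUS uses, mark the start of a word with '▁' (U+2581), a character that looks like an underscore but is a different code point. The following is a minimal sketch using only the standard library. The `strip_token_prefix` helper and the sample tokens are hypothetical, written for illustration, and are not part of ecco or of any tokenizer's real output.

```python
import unicodedata

SP_PREFIX = "\u2581"   # '▁', the SentencePiece word-boundary marker
UNDERSCORE = "_"       # U+005F, the ordinary ASCII underscore


def strip_token_prefix(token: str, prefix: str) -> str:
    """Hypothetical helper: drop a leading word-boundary marker for display."""
    if prefix and token.startswith(prefix):
        return token[len(prefix):]
    return token


# Hand-written sample tokens in SentencePiece style (assumed, not real output).
tokens = ["\u2581The", "\u2581tower", "\u2581is", "\u2581tall", "er"]

with_correct_prefix = [strip_token_prefix(t, SP_PREFIX) for t in tokens]
with_wrong_prefix = [strip_token_prefix(t, UNDERSCORE) for t in tokens]

print(unicodedata.name(SP_PREFIX))   # LOWER ONE EIGHTH BLOCK
print(unicodedata.name(UNDERSCORE))  # LOW LINE
print(with_correct_prefix)           # ['The', 'tower', 'is', 'tall', 'er']
print(with_wrong_prefix == tokens)   # True: '_' never matches the '▁' marker
```

With the plain underscore configured as the prefix, no token would match, so every token would keep its leading '▁' when displayed.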

What would you think of also adding the other PEGASUS models, since they will likely use the same config? For example:

https://huggingface.co/google/pegasus-xsum
https://huggingface.co/google/pegasus-large
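
One way to picture "the same config" for several checkpoints is to stamp out one entry per model name from a shared template. The sketch below does that in plain Python. The key names and values in `SHARED_PEGASUS_CONFIG` are illustrative assumptions and are not the actual schema of model-config.yaml; only the model names and the '▁' prefix come from this thread.

```python
import copy

# Illustrative template only: the key names here are assumed, not ecco's schema.
SHARED_PEGASUS_CONFIG = {
    "type": "enc-dec",          # PEGASUS is an encoder-decoder model
    "token_prefix": "\u2581",   # '▁', the prefix requested in this thread
}

# Model names mentioned in this pull request.
PEGASUS_MODELS = [
    "google/pegasus-cnn_dailymail",
    "google/pegasus-xsum",
    "google/pegasus-large",
]


def build_entries(model_names, template):
    """Return one independent copy of the template per model name."""
    return {name: copy.deepcopy(template) for name in model_names}


entries = build_entries(PEGASUS_MODELS, SHARED_PEGASUS_CONFIG)
for name, config in entries.items():
    print(name, config)
```

Copying the template per entry keeps the entries independent, so a later change to one model's settings would not silently affect the others.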

@jalammar
Owner

Just for my records, to update the notebook with an example later:

!pip install sentencepiece

import ecco
lm = ecco.from_pretrained('google/pegasus-cnn_dailymail', verbose=False)
prompt=""" The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."""

output = lm.generate(prompt, generate=50, do_sample=True)
output

Attribution works but runs out of memory for me, so some optimization will likely be needed in the future.

added all pegasus downstream models
@thomas-chong
Contributor Author

thomas-chong commented Jan 17, 2022

@jalammar I have added all the PEGASUS downstream models to model-config.yaml as well.

@jalammar
Owner

Brilliant

@jalammar jalammar merged commit 0ef61c5 into jalammar:main Jan 18, 2022