getting decoded output #5

Open
ingride opened this issue Dec 8, 2023 · 3 comments

Comments

@ingride

ingride commented Dec 8, 2023

Hi,

Any suggestions on how to get decoded output once I have run

let output = model.forward(&input_ids, &token_ids)?;

I'm trying to go from a text input, extract the input_ids and token_ids, and then decode the result to a text output. I tried some basic conversions but didn't get anywhere. If you have an example, I would appreciate it; I'm struggling to connect the pieces.

@ToluClassics
Owner

Hi @ingride,

Once you have the output_ids, you can decode them back to text using the model's tokenizer:

tokenizer.decode(&output_ids, true)

source: https://github.com/huggingface/candle/blob/9bd94c1ffa0ccfd2bbc9526569b8b8a2a3812027/candle-examples/src/token_output_stream.rs#L27
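
To make the decode step concrete, here is a minimal, standard-library-only sketch of what going from IDs back to text involves. It is a toy stand-in, not the tokenizers crate's implementation: `toy_decode`, the vocabulary, and the special-token IDs are all hypothetical, and a real RoBERTa tokenizer uses byte-level BPE rather than a plain space join.

```rust
use std::collections::HashMap;

/// Toy stand-in for a tokenizer's decode step (hypothetical helper).
/// Maps each ID to its token string and joins the tokens with spaces.
/// When `skip_special_tokens` is true, IDs listed in `special` are dropped.
/// IDs missing from `vocab` are silently skipped.
fn toy_decode(
    vocab: &HashMap<u32, &str>,
    special: &[u32],
    ids: &[u32],
    skip_special_tokens: bool,
) -> String {
    ids.iter()
        // Drop special tokens such as <s> and </s> if requested.
        .filter(|&&id| !(skip_special_tokens && special.contains(&id)))
        // Look up each remaining ID in the (assumed) vocabulary.
        .filter_map(|id| vocab.get(id).copied())
        .collect::<Vec<&str>>()
        .join(" ")
}
```

The point of the sketch is the input type: decoding works on a flat slice of `u32` token IDs, so whatever the model returns has to be reduced to that form first.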

@ingride
Author

ingride commented Dec 14, 2023

Sorry, this might be a very dumb question, but the output of base RoBERTa is a two-dimensional tensor.

Are you suggesting flattening it, similar to what is done here: https://github.com/ToluClassics/candle-tutorial/blob/main/tests/test_roberta.rs#L81 ? I ask because decode takes a &[u32] slice as its parameter.

How do I get the output_ids from the two-dimensional tensor / the config?
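
One possible way to get from a two-dimensional tensor to a `&[u32]` slice, sketched under a loud assumption: the tensor holds per-position vocabulary logits with shape [seq_len, vocab_size]. A base encoder without a language-model head typically returns hidden states instead, and taking an argmax over hidden states does not produce token IDs. The sketch uses plain `Vec`s rather than candle tensors, and `argmax_ids` is a hypothetical helper name.

```rust
/// Hypothetical helper: for each row of `logits`, return the index of the
/// largest value as a token ID.
/// Assumes `logits` has shape [seq_len, vocab_size].
/// An empty row yields ID 0; ties resolve to the first maximum.
fn argmax_ids(logits: &[Vec<f32>]) -> Vec<u32> {
    logits
        .iter()
        .map(|row| {
            let mut best = 0usize;
            for (i, &value) in row.iter().enumerate() {
                // Strict comparison keeps the earliest index on ties.
                if value > row[best] {
                    best = i;
                }
            }
            best as u32
        })
        .collect()
}
```

The resulting `Vec<u32>` can be borrowed as `&ids`, which matches the slice that decode expects. With candle tensors, the equivalent is probably an argmax over the last dimension followed by a conversion to a `Vec<u32>`, but that mapping is an assumption to check against the candle documentation.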

@ingride
Author

ingride commented Dec 18, 2023

@ToluClassics any thoughts on this?
