✨ Add tokenization task to generation modules #351

evaline-ju · 2024-04-24T21:03:55Z

Closes #350

Because TextGenerationTGIS is not a subclass of TextGeneration but a backend type for it, any additional tasks declared on TextGenerationTGIS were never actually added. Declaring the additional tasks on TextGeneration and PeftPromptTuning themselves makes the tokenization task available on the TGIS backend implementations. The corresponding functions must also be added to those modules, even if left unimplemented, or there will be errors.
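As a rough illustration of why the stub functions are needed, here is a toy model of a module decorator that checks each declared task for a matching method. The `Task` class, the `module` decorator, and the method names are hypothetical stand-ins written for this sketch; they are not caikit's actual API.

```python
class Task:
    """Hypothetical stand-in for a task definition (not caikit's real class)."""

    def __init__(self, name, method_name):
        self.name = name
        self.method_name = method_name


# Hypothetical task objects; the method names are assumptions for this sketch.
TEXT_GENERATION = Task("text_generation", "run")
TOKENIZATION = Task("tokenization", "run_tokenizer")


def module(tasks):
    """Hypothetical decorator: record tasks on the class, requiring a method for each."""

    def decorator(cls):
        missing = [
            task.method_name
            for task in tasks
            if not callable(getattr(cls, task.method_name, None))
        ]
        if missing:
            raise TypeError(f"{cls.__name__} is missing task methods: {missing}")
        cls.tasks = tuple(tasks)
        return cls

    return decorator


@module(tasks=[TEXT_GENERATION, TOKENIZATION])
class TextGeneration:
    """Toy module that declares both tasks so a backend can serve them."""

    def run(self, text):
        return f"generated from: {text}"

    def run_tokenizer(self, text):
        # Stub only: present so the task can be declared on this module.
        raise NotImplementedError("Tokenization is not implemented for this module")
```

In this sketch, dropping `run_tokenizer` from the class makes the decorator raise at import time, which mirrors the kind of error the stubs are meant to avoid.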

The tokenization run functions could eventually be implemented, since every LLM can reasonably be expected to have a tokenizer.

Importing Std together with LaunchConfig started to fail with the latest torch release, 2.3.0. The Std object is defined in https://github.com/pytorch/pytorch/blame/main/torch/distributed/elastic/multiprocessing/api.py, not alongside LaunchConfig, and was potentially only importable from the launcher API through another import there. In addition, tee is no longer an argument on LaunchConfig. Since this is a breaking change, torch is now pinned below 2.3.0.
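A minimal sketch of importing Std from the module linked above instead of through the launcher API. The `load_std` helper is hypothetical and exists only for illustration; it returns None when torch is not installed so the snippet runs anywhere.

```python
import importlib


def load_std():
    """Return torch's Std enum from the module that defines it, or None if unavailable."""
    try:
        # Module path taken from the file linked in the description above.
        api = importlib.import_module("torch.distributed.elastic.multiprocessing.api")
    except ImportError:
        # torch is not installed in this environment.
        return None
    return getattr(api, "Std", None)


Std = load_std()
```

In code that already depends on torch, the plain form `from torch.distributed.elastic.multiprocessing.api import Std` does the same thing without the guard.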

Signed-off-by: Evaline Ju <[email protected]>

gkumbhat

Looks good to me! Thanks for fixing.

evaline-ju added 2 commits April 24, 2024 14:58

✨ Add tokenization task to generation modules

6f231bf

Signed-off-by: Evaline Ju <[email protected]>

✅ Add unimplemented function tests

4cc6b88

Signed-off-by: Evaline Ju <[email protected]>

evaline-ju requested review from alex-jw-brooks, gkumbhat, gabe-l-hart, tharapalanivel and Ssukriti as code owners April 24, 2024 21:03

evaline-ju added 3 commits April 24, 2024 15:11

📌 Pin breaking import changes for torch 2.3.0

cb9a51a

Signed-off-by: Evaline Ju <[email protected]>

🐛 Change Std import to avoid torch pin

b608e75

Signed-off-by: Evaline Ju <[email protected]>

📌 Pin torch due to changed LaunchConfig args

79f20d8

Signed-off-by: Evaline Ju <[email protected]>

gkumbhat approved these changes Apr 25, 2024

gkumbhat merged commit d81f11e into caikit:main Apr 25, 2024
5 checks passed

gkumbhat deleted the llm-tok branch April 25, 2024 03:24
