Files in this directory:

README.md
text_generation.py
text_generation_tp.py
web_demo.py

README.md

Text Generation Task

To run the text generation task in streaming mode:

python text_generation.py \
    --model 01-ai/Yi-6B \
    --tokenizer 01-ai/Yi-6B \
    --max-tokens 512 \
    --eos-token $'\n' \
    --streaming
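
To illustrate how the `--streaming`, `--eos-token`, and `--max-tokens` options interact, here is a minimal sketch of a streaming loop that emits text piece by piece and stops at the end-of-sequence marker or the token budget, whichever comes first. It is not the code in text_generation.py: the function name `stream_until_eos` and the fake model output are assumptions made for illustration. Note that `$'\n'` in the command above is bash ANSI-C quoting, so the script receives a literal newline character as the EOS token.

```python
from typing import Iterable, Iterator


def stream_until_eos(pieces: Iterable[str], eos_token: str = "\n",
                     max_tokens: int = 512) -> Iterator[str]:
    """Yield decoded pieces one at a time, stopping at EOS or the token cap.

    Hypothetical helper: `pieces` stands in for text decoded from a model.
    """
    for count, piece in enumerate(pieces):
        if count >= max_tokens:
            break  # token budget exhausted (mirrors the idea of --max-tokens)
        if eos_token in piece:
            # Emit any text that precedes the EOS marker, then stop.
            head = piece.split(eos_token, 1)[0]
            if head:
                yield head
            break
        yield piece


if __name__ == "__main__":
    # Hypothetical decoded pieces standing in for real model output.
    fake_output = ["Hello", ",", " world", "!\n", "ignored"]
    for text in stream_until_eos(fake_output):
        print(text, end="", flush=True)  # print each piece as soon as it arrives
    print()
```

With a newline as the EOS token, generation ends after the first line of output, which suits single-line completions from a base model.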

You can also pass an extra --prompt argument to try other prompts.

For extremely long input sequences, a single GPU may not have enough memory. In that case, use multiple GPUs and enable tensor parallelism during inference to avoid out-of-memory errors.
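
To give a rough sense of why long inputs strain memory, the sketch below estimates the size of the key/value cache as a function of sequence length. The architecture numbers (32 layers, 4 key/value heads, head dimension 128, 16-bit values) are illustrative assumptions, not values taken from this README; check the model's own configuration before relying on them. The estimate also ignores model weights and activations, and it assumes the cache splits evenly across tensor-parallel ranks.

```python
def kv_cache_bytes(seq_len: int, num_layers: int, num_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2,
                   tp_size: int = 1) -> int:
    """Approximate per-GPU KV cache size in bytes for one sequence.

    The factor 2 accounts for storing both keys and values. Dividing by
    tp_size assumes the KV heads are sharded evenly across GPUs.
    """
    total = 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value
    return total // tp_size


if __name__ == "__main__":
    # Assumed, illustrative architecture values (not from this README).
    layers, kv_heads, head_dim = 32, 4, 128
    for tp in (1, 2):
        size = kv_cache_bytes(200_000, layers, kv_heads, head_dim, tp_size=tp)
        print(f"tp_size={tp}: {size / 2**30:.1f} GiB of KV cache per GPU")
```

Under these assumptions the cache grows linearly with sequence length, and tensor parallelism reduces the per-GPU share roughly in proportion to the number of devices.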

To run the text generation task with tensor parallelism across 2 GPUs:

torchrun --nproc_per_node 2 \
    text_generation_tp.py \
    --model 01-ai/Yi-6B \
    --max-tokens 512 \
    --eos-token $'\n' \
    --streaming
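
For context on what `torchrun --nproc_per_node 2` does: it starts two processes and gives each one its identity through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables. The sketch below shows one way a script could read them. It is not taken from text_generation_tp.py; `DistEnv` and `read_dist_env` are hypothetical names, and the single-process defaults are an assumption for running without torchrun.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class DistEnv:
    rank: int        # global index of this process
    local_rank: int  # index of this process on its node (often the GPU index)
    world_size: int  # total number of processes


def read_dist_env(environ: Optional[Mapping[str, str]] = None) -> DistEnv:
    """Read the variables torchrun sets; fall back to a single-process setup."""
    env = os.environ if environ is None else environ
    return DistEnv(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


if __name__ == "__main__":
    dist = read_dist_env()
    print(f"process {dist.rank} of {dist.world_size}, local rank {dist.local_rank}")
```

A tensor-parallel script would typically use the local rank to pick its GPU and the world size to decide how many shards the model is split into.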
