LLMA: Large Language Model Accelerator

News

  • Outputs of LLMs often overlap significantly with reference texts (e.g., retrieved documents).
  • Lossless acceleration of LLM inference by copying from references.
  • Applicable to important LLM scenarios such as retrieval-augmented generation and multi-turn conversations.
  • 2x to 3x speed-up; no additional model required!
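
The idea in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LLMA implementation or its API: `next_token` is a hypothetical stand-in for one greedy step of an LLM, and the names `find_copy_span`, `decode_with_reference`, `match_len`, and `copy_len` are invented for this sketch. When the most recent output tokens also appear in the reference, the tokens that follow in the reference are copied as candidates and kept only as long as the model agrees with them, so the result is identical to plain greedy decoding. A real model would presumably check all candidates in a single parallel forward pass; the toy loop below checks them one by one and simply counts each batch as one step.

```python
from typing import Callable, List, Sequence, Tuple

Token = str
# Hypothetical stand-in for one greedy decoding step of an LLM.
NextToken = Callable[[Sequence[Token]], Token]


def find_copy_span(output: Sequence[Token], reference: Sequence[Token],
                   match_len: int, copy_len: int) -> List[Token]:
    """Return up to copy_len reference tokens that follow the first
    occurrence of the last match_len output tokens in the reference."""
    if len(output) < match_len:
        return []
    suffix = list(output[-match_len:])
    for i in range(len(reference) - match_len + 1):
        if list(reference[i:i + match_len]) == suffix:
            start = i + match_len
            return list(reference[start:start + copy_len])
    return []


def greedy_decode(next_token: NextToken, prompt: Sequence[Token],
                  max_new_tokens: int, eos: Token = "<eos>") -> List[Token]:
    """Baseline: one token per decoding step."""
    output = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(output)
        output.append(tok)
        if tok == eos:
            break
    return output[len(prompt):]


def decode_with_reference(next_token: NextToken, prompt: Sequence[Token],
                          reference: Sequence[Token], max_new_tokens: int,
                          match_len: int = 2, copy_len: int = 4,
                          eos: Token = "<eos>") -> Tuple[List[Token], int]:
    """Greedy decoding that may emit several tokens per step by copying
    from the reference. Returns (generated tokens, number of steps)."""
    output = list(prompt)
    generated, steps = 0, 0
    while generated < max_new_tokens:
        candidates = find_copy_span(output, reference, match_len, copy_len)
        steps += 1
        new_tokens: List[Token] = []
        # The trailing None never matches, so every step yields at least
        # the model's own next token.
        for cand in candidates + [None]:
            tok = next_token(output + new_tokens)
            new_tokens.append(tok)
            if tok != cand or tok == eos:
                break  # first disagreement (or end of text) ends the step
        new_tokens = new_tokens[:max_new_tokens - generated]
        output.extend(new_tokens)
        generated += len(new_tokens)
        if eos in new_tokens:
            break
    return output[len(prompt):], steps


# Toy "model" (an assumption for the demo): it recites a fixed script.
prompt = ["Q:"]
script = prompt + "the cat sat on the mat and slept <eos>".split()


def toy_next_token(prefix: Sequence[Token]) -> Token:
    return script[len(prefix)] if len(prefix) < len(script) else "<eos>"


reference = "a cat sat on the mat yesterday".split()
baseline = greedy_decode(toy_next_token, prompt, max_new_tokens=20)
fast, fast_steps = decode_with_reference(toy_next_token, prompt, reference,
                                         max_new_tokens=20)
print(fast == baseline, len(baseline), fast_steps)
```

Because a copied token is accepted only when it equals the token the model would have produced anyway, a poor reference costs nothing in output quality; it only reduces how many tokens each step yields.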
