🥰 Feature Description
Please consider adding the ability to display the inference speed for each interaction with the AI model.
🧐 Proposed Solution
This could be presented in a format similar to "Round trip time: 2.52s" or a more detailed breakdown like the example below:
|                    | Input | Output | Total |
|--------------------|-------|--------|-------|
| Speed (T/s)        | 868   | 723    | 731   |
| Tokens             | 33    | 480    | 513   |
| Inference Time (s) | 0.04  | 0.66   | 0.70  |
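As a rough sketch of how such a breakdown could be computed (this is not LobeChat's actual API; the types and function names below are hypothetical), the per-phase speed is just tokens divided by wall-clock time, with the total row derived from the sums:

```typescript
// Hypothetical helper: derives the speed breakdown shown above from
// token counts and measured durations (in seconds) for each phase.
interface UsageSample {
  tokens: number;  // tokens processed in this phase
  seconds: number; // wall-clock time spent in this phase
}

interface SpeedRow extends UsageSample {
  tokensPerSecond: number;
}

function speedRow({ tokens, seconds }: UsageSample): SpeedRow {
  // Guard against division by zero for instantaneous phases.
  return { tokens, seconds, tokensPerSecond: seconds > 0 ? tokens / seconds : 0 };
}

function speedBreakdown(input: UsageSample, output: UsageSample) {
  const total: UsageSample = {
    tokens: input.tokens + output.tokens,
    seconds: input.seconds + output.seconds,
  };
  return {
    input: speedRow(input),
    output: speedRow(output),
    total: speedRow(total),
  };
}

// Example using the rounded figures from the table above; the exact
// T/s values differ slightly because the table was computed from
// unrounded timings.
const breakdown = speedBreakdown(
  { tokens: 33, seconds: 0.04 },
  { tokens: 480, seconds: 0.66 },
);
console.log(`Total: ${breakdown.total.tokensPerSecond.toFixed(0)} T/s`);
```

The UI would then only need the token counts already returned by most model providers plus two timestamps per request.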
Displaying the inference speed would allow users to better understand the responsiveness of the AI model and help them gauge the performance of their queries. This information could also be useful for developers and researchers to optimize their models and improve the overall efficiency of LobeChat.
📝 Additional Information
No response
Thank you for raising an issue. We will investigate the matter and get back to you as soon as possible.
Please make sure you have given us as much context as possible.