
Proposal for integrating ggml and tiny-dnn and extending whisper.cpp and llama.cpp #66

Open
amirrastifarsad opened this issue Apr 5, 2023 · 0 comments

Dear Georgi Gerganov,

I am a fan of your work on ggml, whisper.cpp and llama.cpp. I think you have done an amazing job of creating efficient and portable deep learning libraries and models in C/C++. I am also interested in tiny-dnn, a header-only, dependency-free deep learning framework in C++ that supports various types of neural network layers, activation functions, loss functions and optimization algorithms.

I have a proposal for integrating ggml and tiny-dnn and extending whisper.cpp and llama.cpp with training and fine-tuning abilities. I think this would bring many benefits to both projects and their users, such as:

  • Reducing the memory footprint and improving the inference speed of deep learning models using ggml’s 4-bit integer quantization. This would be useful for tiny-dnn, which aims to run on limited computational resources and embedded systems, and would let it run larger models such as GPT-J and LLaMA that would otherwise not fit in memory.

  • Enhancing the performance of tiny-dnn across platforms and devices using ggml’s optimized inference kernels, which use NEON intrinsics and the Accelerate framework on Apple silicon and AVX intrinsics on x86 architectures, for models such as GPT-2, GPT-J, Whisper, LLaMA, and RWKV. This would allow tiny-dnn to support more natural language models built on attention mechanisms and transformers.

  • Experimenting with different network architectures and hyperparameters to improve the accuracy and robustness of whisper.cpp and llama.cpp models, using tiny-dnn’s neural network layers, activation functions, loss functions, and optimization algorithms. This would give whisper.cpp and llama.cpp training and fine-tuning support, which they currently lack, and would let them draw on the knowledge and resources of the tiny-dnn community.

  • Leveraging existing pre-trained models through tiny-dnn’s Caffe importer. This would let whisper.cpp and llama.cpp load models from more sources and benefit from the many Caffe models available online.

I hope you find this proposal interesting and worthwhile. I would love to hear your feedback and thoughts on this idea. I think it would be a great opportunity to collaborate and create something awesome together.

Thank you for your time and attention.

Sincerely, Amir Rasti
