Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code difference is getting more between ggml and rwkv.cpp #25

Closed
yorkzero831 opened this issue Apr 16, 2023 · 4 comments · Fixed by #28
Closed

Code difference is getting more between ggml and rwkv.cpp #25

yorkzero831 opened this issue Apr 16, 2023 · 4 comments · Fixed by #28

Comments

@yorkzero831
Copy link
Contributor

yorkzero831 commented Apr 16, 2023

Hi,
I am working on rwkv-rs project which based on ggml as well, recentlly I am refering your implemtation of ggml, but I found it was hard to keep the rust friendly type defination (which is on master branch of gglm) and Q4_1_0 support (on your master branch) together.

Do you have any plain to update your gglm or contribute the Q4_1_0 back to ggml?

Currently I manually merged part of your code (without Q4_1_0 ) with newest gglm code on my project.
and here is my project,
https://github.com/yorkzero831/rwkv-rs

@saharNooby
Copy link
Collaborator

saharNooby commented Apr 17, 2023

That's a diffucult situation.

On the one hand, I would like to keep ggml up to date, maybe even as a subrepo (so I don't need to manually pull updates). That way we get all new performance improvements and fixes for free.

On the other hand, ggml's Q4_0 and Q4_1 completely break RWKV, and we need Q4_1_O for any hope of quality; and both options "port Q4_1_O again every time ggml updates" and "just delete Q4_1_O" look very bad.

I guess my plan/vision currently is "it works fine now, so we can live without ggml updates at all".

I'll ask in ggml repo what they think about merging Q4_1_O upstream.

@saharNooby
Copy link
Collaborator

I started the refactoring in ggml-to-submodule branch.

When merged, this will update ggml in rwkv.cpp to the latest version. After that, I would need to manually pull changes into the submodule to update ggml, but since there is now no new operators and change set is greatly reduced, this would be way easier.

Ideally, I need to pull Q4_1_O format into upstream ggml. That way, I would not even need to support a ggml fork. As an alternative, vanilla ggml quantization formats may become good enough for RWKV, as mentioned in that issue.

@saharNooby
Copy link
Collaborator

@yorkzero831 I've forked most recent version of ggml here. The fork contains only one important commit: introduction of Q4_1_O format. I guess it should be usable with Rust now.

@yorkzero831
Copy link
Contributor Author

@saharNooby Really nice, that helps alot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants