
Sparsity - The next performance frontier. #93

Open
JohnnyOpcode opened this issue Apr 19, 2023 · 2 comments
@JohnnyOpcode

Great work going on with GGML. Bravo to so many contributors. You are champions!

Maybe more performance (on CPU) can be had by bringing sparsity into the workflow. Here is one of the many efforts out there at the moment:

https://github.com/neuralmagic/deepsparse

What are people's thoughts on this?
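
To make the CPU angle concrete: a pruned weight matrix stored in a compressed format lets the kernel skip the zeroed weights entirely, which is where engines like DeepSparse get their speedups. Here is a minimal NumPy sketch of that idea (illustrative only, not ggml or DeepSparse code), using a CSR matrix-vector product that only touches stored nonzeros:

```python
# Sketch: why weight sparsity can speed up CPU inference.
# A CSR matvec iterates only over stored (nonzero) weights, so a
# 90%-sparse layer does roughly 10% of the multiply-adds.
import numpy as np

def dense_matvec(W, x):
    # Baseline: touches every weight, including the zeros.
    return W @ x

def to_csr(W):
    # Convert a dense matrix with explicit zeros into CSR arrays.
    values, col_idx, row_ptr = [], [], [0]
    for row in W:
        nz = np.flatnonzero(row)
        values.extend(row[nz])
        col_idx.extend(nz)
        row_ptr.append(len(values))
    return np.array(values), np.array(col_idx), np.array(row_ptr)

def csr_matvec(values, col_idx, row_ptr, x):
    # Iterate only over the stored nonzero weights of each row.
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# Toy weight matrix with ~90% unstructured sparsity.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
W[rng.random(W.shape) < 0.9] = 0.0
x = rng.standard_normal(256)

vals, cols, rows = to_csr(W)
assert np.allclose(dense_matvec(W, x), csr_matvec(vals, cols, rows, x))
```

In practice the win depends on the sparsity pattern and the kernel: unstructured sparsity needs specialized kernels (and enough zeros) to beat a well-tuned dense BLAS call.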

@DifferentialityDevelopment

DifferentialityDevelopment commented May 22, 2023

The process of converting a model to a SparseML-compatible one doesn't seem all that complicated. Sparsity has a lot of benefits to offer for inference: quantizing a model to the GGML format reduces its size and complexity, whereas sparsifying a model involves both quantizing it and pruning away its irrelevant weights?
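
For illustration, the crudest form of that pruning step is one-shot magnitude pruning: zero out the fraction of weights with the smallest absolute values. A toy NumPy sketch (my own simplification; SparseML's actual recipes prune gradually and are training-aware):

```python
# Toy sketch of the pruning half of sparsification:
# global magnitude pruning zeroes the smallest-magnitude weights.
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    threshold = np.quantile(np.abs(W), sparsity)
    return W * (np.abs(W) >= threshold)

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))
W_pruned = magnitude_prune(W, 0.9)
print(f"sparsity: {np.mean(W_pruned == 0):.2%}")  # ~90.00% zeros
```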

@JohnnyOpcode
Author

Here is a good explanation, if anyone is interested:

https://neuralmagic.com/blog/sparsegpt-remove-100-billion-parameters-for-free/
