
[Feature request] Implement 8-bit GPT-J #5

Closed
pablogranolabar opened this issue Nov 13, 2022 · 0 comments · Fixed by #27
Labels: enhancement (New feature or request)

Comments

@pablogranolabar

Results in ~11 GB of weights vs. 16 GB; now implemented in PyTorch (Transformers) as load_in_8bit=True:

https://huggingface.co/hivemind/gpt-j-6B-8bit
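For reference, a minimal sketch of what the PyTorch-side loading looks like, assuming Transformers with bitsandbytes installed and a CUDA GPU; the model id `EleutherAI/gpt-j-6B` and the generation call are illustrative, not taken from the linked repo:

```python
# Sketch: load GPT-J with int8 weights via Transformers + bitsandbytes.
# load_in_8bit=True quantizes the fp16/fp32 checkpoint to 8-bit at load time,
# which is what brings the in-memory weight size down (~11 GB vs. ~16 GB above).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-j-6B"  # assumed standard checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",   # place layers on the available GPU(s)
    load_in_8bit=True,   # int8 weights via bitsandbytes
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```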

@ggerganov added the enhancement (New feature or request) label on Nov 14, 2022

CCLDArjun pushed a commit to CCLDArjun/ggml that referenced this issue on Dec 18, 2023:
* use hipblas based on cublas
* Update Makefile for the Cuda kernels
* Expand arch list and make it overrideable
* Fix multi GPU on multiple amd architectures with rocblas_initialize() (ggerganov#5)
* add hipBLAS to README
* new build arg LLAMA_CUDA_MMQ_Y
* fix half2 decomposition
* Add intrinsics polyfills for AMD
* AMD assembly optimized __dp4a
* Allow overriding CC_TURING
* use "ROCm" instead of "CUDA"
* ignore all build dirs
* Add Dockerfiles
* fix llama-bench
* fix -nommq help for non CUDA/HIP

---------

Co-authored-by: YellowRoseCx <[email protected]>
Co-authored-by: ardfork <[email protected]>
Co-authored-by: funnbot <[email protected]>
Co-authored-by: Engininja2 <[email protected]>
Co-authored-by: Kerfuffle <[email protected]>
Co-authored-by: jammm <[email protected]>
Co-authored-by: jdecourval <[email protected]>