Conda installation detailed instructions #73

Closed
NasonZ opened this issue Jan 8, 2024 · 25 comments
Comments

@NasonZ

NasonZ commented Jan 8, 2024

I'm trying to follow the instructions for installing unsloth in a conda environment, but conda gets stuck when running the install command.

I've tried running it twice; both times it got stuck solving the environment, and I stopped it after 30 minutes.

$ conda install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Collecting package metadata (current_repodata.json): \ WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): - WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
done
Solving environment: | 

Additional system info:

$ nvidia-smi
Mon Jan  8 20:28:55 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A10G                    Off | 00000000:00:1E.0 Off |                    0 |
|  0%   28C    P8              16W / 300W |      4MiB / 23028MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
@danielhanchen
Contributor

Do you have mamba?

Maybe try mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y

@danielhanchen
Contributor

Mamba can help with long environment-solving times.
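
Since the suggestion above depends on whether mamba is installed at all, here is a minimal sketch of a check to run before picking a command. The helper name `pick_installer` is hypothetical (it is not part of conda, mamba, or unsloth), and it only looks for the executables on PATH; it does not verify that either tool actually works.

```python
import shutil


def pick_installer(which=shutil.which):
    """Return 'mamba' if it is on PATH, otherwise 'conda', otherwise None.

    `which` is injectable so the lookup can be exercised without touching
    the real PATH; by default it is shutil.which.
    """
    for tool in ("mamba", "conda"):
        if which(tool) is not None:
            return tool
    return None


if __name__ == "__main__":
    tool = pick_installer()
    print(tool or "neither mamba nor conda found on PATH")
```

If this reports conda only, the mamba command above will fail with "command not found", as happens in the next comment.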

@NasonZ
Author

NasonZ commented Jan 9, 2024

No, I have a miniconda/anaconda which was installed via oobabooga.

(base) ubuntu@awsec2:~$ conda activate model_train_env
(model_train_env) ubuntu@awsec2:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
Command 'mamba' not found, did you mean:
  command 'samba' from deb samba (2:4.15.13+dfsg-0ubuntu1.5)
Try: sudo apt install <deb name>

@danielhanchen
Contributor

@NasonZ Hmm, another approach is to install the packages one by one, leaving out pytorch:

conda install cudatoolkit xformers bitsandbytes -c nvidia -c xformers -c conda-forge

@NasonZ
Author

NasonZ commented Jan 10, 2024

TLDR:

These are the steps I took to get my unsloth conda env working

$ conda create --name <your_unsloth_env> python=<3.10/3.9>

$ conda install pytorch torchvision torchaudio pytorch-cuda=<12.1/11.8> -c pytorch -c nvidia

$ conda install xformers -c xformers -y

$ pip install bitsandbytes

$ pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

So I tried installing the packages one by one, which raised a few issues that I was able to work around.

  1. xformers needs Python 3.9 or 3.10 (I had 3.11, as the readme.md didn't specify which Python version was needed):
(model_train_env) ubuntu@awsec2:~/dmyzer/dmyzer-data-generator$ conda install xformers -c xformers -y        
Collecting package metadata (current_repodata.json): done                                                             
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.           
Collecting package metadata (repodata.json): done                                                                     
Solving environment: failed with initial frozen solve. Retrying with flexible solve.                                  
Solving environment: -                                                                                                
Found conflicts! Looking for incompatible packages.                                                                   
This can take several minutes.  Press CTRL-C to abort.                                                                
failed                                                                                                                
                                                                                                                      
UnsatisfiableError: The following specifications were found                                                           
to be incompatible with the existing python installation in your environment:                                         
                      
Specifications:

  - xformers -> python[version='>=3.10,<3.11.0a0|>=3.9,<3.10.0a0']

Your python: python=3.11

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

The following specifications were found to be incompatible with your system:

  - feature:/linux-64::__cuda==12.2=0
  - feature:/linux-64::__glibc==2.35=0
  - feature:|@/linux-64::__glibc==2.35=0
  - python=3.11 -> libgcc-ng[version='>=11.2.0'] -> __glibc[version='>=2.17']
  - xformers -> pytorch=2.0.1 -> __cuda[version='>=11.8']

Your installed version is: 2.35
  2. Installing cudatoolkit separately led to issues when installing pytorch afterwards. cudatoolkit is already installed by pytorch-cuda, so specifying it separately was redundant in my case.

  3. Installing bitsandbytes via conda install bitsandbytes -c conda-forge -y led to the same frozen solve issue outlined originally. Installing via conda install conda-forge::bitsandbytes also didn't work: bitsandbytes threw a load of errors when running from unsloth import FastLanguageModel. I eventually got it running by installing it with the method mentioned in the bitsandbytes repo: pip install bitsandbytes.

I verified that my environment was working by running the TinyLlama notebook.
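
The Python version constraint in point 1 can be checked before running any conda command. This is a minimal sketch: the helper name `python_ok` and the `SUPPORTED` set are hypothetical, and the accepted versions are taken only from the UnsatisfiableError above (python >=3.9 and <3.11). Other xformers builds may accept different versions.

```python
import sys

# Minor versions accepted by the xformers conda package in the error above.
# Assumption: this reflects that error message only, not every xformers build.
SUPPORTED = {(3, 9), (3, 10)}


def python_ok(version_info=sys.version_info):
    """Return True if the (major, minor) pair is one xformers accepted above."""
    return (version_info[0], version_info[1]) in SUPPORTED


if __name__ == "__main__":
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    if python_ok():
        print(f"Python {current} matches the xformers constraint")
    else:
        print(f"Python {current} does not match; recreate the env with 3.9 or 3.10")
```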

@danielhanchen
Contributor

Oh my! Thanks so so much for the detailed instructions - I'll be pinning this if you don't mind :) Glad it finally worked!!

@danielhanchen danielhanchen changed the title from "Conda install stuck" to "Conda installation detailed instructions" Jan 10, 2024
@danielhanchen danielhanchen pinned this issue Jan 10, 2024
@NasonZ
Author

NasonZ commented Jan 10, 2024

No worries, happy to help others get on board with what looks to be a really useful package :)

@NasonZ NasonZ closed this as completed Jan 10, 2024
@findalexli

Hi there, I'm still getting the following error when running:

(unsloth) (base) ubuntu@ip-172-31-34-94:~$ mamba install cudatoolkit xformers bitsandbytes pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y

Could not solve for environment specs
The following packages are incompatible
└─ xformers is installable with the potential options
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.10,<3.11.0a0 , which can be installed;
   ├─ xformers [0.0.16|0.0.17|...|0.0.24] would require
   │  └─ python >=3.9,<3.10.0a0 , which can be installed;
   └─ xformers [0.0.16|0.0.20|0.0.21] conflicts with any installable versions previously reported.

I checked that I have CUDA 12.1 installed.

@Gene-Weaver

I ran into this error:

Exception has occurred: RuntimeError

        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

And used this combination of the approaches listed above to get things working:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
mamba install xformers pytorch pytorch-cuda=12.1 -c pytorch -c nvidia -c xformers -c conda-forge -y
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"
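
Alongside the `python -m bitsandbytes` diagnostic that the error message suggests, it can help to look at what PyTorch itself reports about CUDA. The sketch below is only an illustration: `cuda_report` is a hypothetical helper name, and it returns placeholder values when torch is not installed rather than failing.

```python
import importlib


def cuda_report():
    """Summarize what the installed PyTorch (if any) says about CUDA."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in this environment.
        return {"torch": None, "cuda_build": None, "cuda_available": False}
    return {
        "torch": torch.__version__,
        # CUDA version torch was built against; None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        # Whether torch can actually see a usable GPU right now.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A CPU-only torch build (cuda_build of None) would be one possible reason for bitsandbytes failing its CUDA setup, though this check alone does not confirm that.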

@felipepenhorate

Just a heads-up to anyone installing the package in a miniconda env and getting conflict errors during the xformers installation: conda now installs pytorch==2.2.1, which is not compatible with xformers. You need to pin pytorch to 2.2.0 for the installation to work properly.

This is what I used:

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install xformers -c xformers
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"
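
To confirm that the pin took effect, the installed torch version string can be compared against 2.2.0. This is a small sketch with hypothetical helper names (`base_version`, `matches_pin`); it assumes that any local build suffix such as "+cu121" should be ignored for the comparison.

```python
def base_version(version):
    """Strip a local build suffix such as '+cu121' from a version string."""
    return version.split("+", 1)[0]


def matches_pin(version, pin="2.2.0"):
    """Return True if the version, minus any build suffix, equals the pin."""
    return base_version(version) == pin


# Typical use inside the environment (requires torch to be installed):
#   import torch
#   print(matches_pin(torch.__version__))
```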

@NasonZ
Author

NasonZ commented Mar 15, 2024

@felipepenhorate Thanks for the quick fix, just encountered this issue with conda.

@danielhanchen danielhanchen unpinned this issue Mar 15, 2024
@danielhanchen
Contributor

@felipepenhorate Yes thanks so much! I'll update the readme!!

@Gene-Weaver

Also, because of the triton package requirements, this only works on Linux systems (without compiling your own triton workaround 😬). You can train on Linux and then deploy on other systems using regular Hugging Face workflows. Thanks @danielhanchen!

@ppaartha

tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint64_array’:
/tmp/tmpmemclhbv/main.c:354:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for (Py_ssize_t i = 0; i < len; i++) {
^
/tmp/tmpmemclhbv/main.c:354:3: note: use option -std=c99 or -std=gnu99 to compile your code
/tmp/tmpmemclhbv/main.c: In function ‘list_to_cuuint32_array’:
/tmp/tmpmemclhbv/main.c:365:3: error: ‘for’ loop initial declarations are only allowed in C99 mode
for (Py_ssize_t i = 0; i < len; i++) {

subprocess.CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmporgwe35u/main.c', '-O3', '-I/miniconda3/envs/LLM/lib/python3.10/site-packages/triton/common/../third_party/cuda/include', '-I/miniconda3/envs/LLM/include/python3.10', '-I/tmp/tmporgwe35u', '-shared', '-fPIC', '-lcuda', '-o', '/tmp/tmporgwe35u/cuda_utils.cpython-310-x86_64-linux-gnu.so', '-L/lib64', '-L/lib', '-L/lib64', '-L/lib']' returned non-zero exit status 1.
I'm getting this error after trying every type of unsloth env setup, and I'm stuck on it.

@danielhanchen
Contributor

Oh maybe outdated gcc?

@ppaartha

> Oh maybe outdated gcc?

My gcc version is:
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Is that the actual problem?

@danielhanchen
Contributor

@ppaartha Yes, I think that's way too old!!
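
For context on why the compiler's age matters here: GCC releases before version 5 default to the gnu90 C dialect, which rejects declarations inside a `for` initializer, and that is exactly the "only allowed in C99 mode" error shown earlier. The sketch below parses the first line of `gcc --version` output and applies that rule. The helper names are hypothetical, and the regular expression assumes the common "gcc (<vendor>) X.Y.Z" layout.

```python
import re


def gcc_version(version_output):
    """Extract (major, minor, patch) from the first line of `gcc --version`."""
    first_line = version_output.splitlines()[0]
    # Assumes the version follows the parenthesized vendor tag.
    match = re.search(r"\)\s+(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError(f"unrecognized gcc version line: {first_line!r}")
    return tuple(int(part) for part in match.groups())


def defaults_to_c99_or_later(version):
    """GCC 5 switched the default C dialect from gnu90 to gnu11."""
    return version[0] >= 5


if __name__ == "__main__":
    reported = "gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)"
    version = gcc_version(reported)
    print(version, defaults_to_c99_or_later(version))
```

To check the local compiler instead of the string from this thread, the output of `subprocess.run(["gcc", "--version"], capture_output=True, text=True).stdout` could be passed to `gcc_version`.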

@Lipapaldl

from triton.common.build import libcuda_dirs

ModuleNotFoundError: No module named 'triton.common'

@danielhanchen
Contributor

@Lipapaldl Is your Triton version 3.0.0?
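
One way to answer this without triggering the failing import is to ask Python whether the `triton.common` submodule can be found at all. This is a rough sketch: `has_module` is a hypothetical helper, and whether `triton.common` exists in a given Triton release is left as an open question here, since the thread does not confirm it.

```python
import importlib.util


def has_module(name):
    """Return True if `name` can be located, without importing the leaf module."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a parent
        # package such as 'triton' is itself missing.
        return False


if __name__ == "__main__":
    for name in ("triton", "triton.common"):
        print(f"{name}: {'found' if has_module(name) else 'missing'}")
```

Pairing this with `importlib.metadata.version("triton")` would show which Triton release is installed alongside the result.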

@ArrangingFear56

ArrangingFear56 commented Jul 9, 2024

I ran
pip install xformers==0.0.24
to keep my current torch version, since the latest xformers requires torch==2.3.0. Also,
conda install xformers -c xformers
doesn't seem to work anymore.

@danielhanchen
Contributor

I'm planning to write a better guide for conda installs in the near future

@WasamiKirua

> I ran pip install xformers==0.0.24 to retain torch version as latest xformers require torch==2.3.0 and conda install xformers -c xformers doesn't seem to work anymore.

Oh my god. I'm trying hard to use unsloth locally, but it's a pain. I followed the conda instructions, but I've been forced to downgrade xformers. I've tried the version printed in the error as well as yours, but no luck: the conflict that triggered the error is still there. I'm done.

@ArrangingFear56

ArrangingFear56 commented Jul 14, 2024

@WasamiKirua
Are you running Windows? If so, you may need to run it in WSL 2 instead; that's what eventually worked for me.
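
Given the earlier note that the triton requirement limits this to Linux, a quick check of where Python is actually running can save some guesswork. This is a sketch under stated assumptions: `classify_platform` is a hypothetical helper, and treating a kernel release string containing "microsoft" as WSL is a common heuristic, not an official API.

```python
import platform


def classify_platform(system, release):
    """Classify the host as 'linux', 'wsl', or 'non-linux'."""
    if system != "Linux":
        return "non-linux"
    # Heuristic: WSL kernels usually report 'microsoft' in the release string.
    if "microsoft" in release.lower():
        return "wsl"
    return "linux"


if __name__ == "__main__":
    print(classify_platform(platform.system(), platform.release()))
```

Running this from a native Windows Python would report non-linux, while the same script inside a WSL 2 shell should report wsl if the heuristic holds.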

@richardxoldman

conda create --name unsloth_env python=3.10
conda activate unsloth_env
conda install pytorch==2.2.0 cudatoolkit torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install xformers==0.0.24
pip install bitsandbytes
pip install "unsloth[conda] @ git+https://github.com/unslothai/unsloth.git"

It works for me (Linux)
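
After following a recipe like this one, it can be useful to record which versions actually ended up in the environment, since several comments in this thread trace failures to a mismatched torch or xformers. The sketch below uses only the standard library. `installed_versions` is a hypothetical helper, and it assumes each package registers distribution metadata under the name given, which may not hold for every conda build.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    packages = ["torch", "xformers", "bitsandbytes", "triton", "unsloth"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a bug report gives maintainers the same information they ask for above, such as the Triton version.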

@danielhanchen
Contributor

Apologies for the Conda issues. I know it can be painful sometimes. Another option is to copy and paste our Kaggle install instructions from here: https://www.kaggle.com/danielhanchen/kaggle-gemma2-9b-unsloth-notebook which might work.
