WIP test out new dockerfile with more nvidia tools #1557

Open
wants to merge 14 commits into main
WIP test out new dockerfile with more nvidia tools
winglian committed Apr 21, 2024
commit 7290e7726b9083974b4c88af76a922bf7c0552ba
36 changes: 36 additions & 0 deletions .github/workflows/beta.yml
@@ -0,0 +1,36 @@
name: beta-docker-images

on:
  workflow_dispatch:
  pull_request:

jobs:
  build-axolotl-beta:
    if: ${{ ! contains(github.event.commits[0].message, '[skip docker]') && github.repository_owner == 'OpenAccess-AI-Collective' }}
    strategy:
      fail-fast: false
    runs-on: axolotl-gpu-runner
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Docker metadata
        id: metadata
        uses: docker/metadata-action@v5
        with:
          images: winglian/axolotl-beta
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to Docker Hub
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      # guidance for testing before pushing: https://docs.docker.com/build/ci/github-actions/test-before-push/
      - name: Build and export to Docker
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./docker/Dockerfile-beta
          tags: |
            ${{ steps.metadata.outputs.tags }}
          labels: ${{ steps.metadata.outputs.labels }}
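
The comment in the workflow points at Docker's test-before-push guidance. A minimal sketch of how that pattern might look for this job follows. The local tag `winglian/axolotl-beta:test` and the `pytest --version` smoke test are hypothetical choices for illustration and are not part of this PR.

```yaml
# Sketch only: the test tag and the smoke-test command are assumptions.
steps:
  - name: Build and export to Docker
    uses: docker/build-push-action@v5
    with:
      context: .
      file: ./docker/Dockerfile-beta
      load: true  # load the built image into the runner's local Docker daemon
      tags: winglian/axolotl-beta:test
  - name: Smoke-test the image
    # pytest is installed by Dockerfile-beta, so this only checks that the image starts
    run: docker run --rm winglian/axolotl-beta:test pytest --version
```

Since `push` is not set here, nothing is published. A separate build step with `push: true` and the metadata tags could follow once the smoke test passes.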
6 changes: 6 additions & 0 deletions docker/Dockerfile-base
@@ -35,3 +35,9 @@ RUN git lfs install --skip-repo && \
    pip3 install awscli && \
    # The base image ships with `pydantic==1.8.2` which is not working
    pip3 install -U --no-cache-dir pydantic==1.10.10

WORKDIR /workspace

RUN git clone --depth=1 https://github.com/OpenAccess-AI-Collective/axolotl.git

WORKDIR /workspace/axolotl
40 changes: 40 additions & 0 deletions docker/Dockerfile-beta
@@ -0,0 +1,40 @@
FROM nvcr.io/nvidia/pytorch:24.03-py3

RUN python3 -m pip install --upgrade pip

RUN groupadd axolotl && useradd -m -g axolotl -s /bin/bash axolotl

USER axolotl

RUN mkdir -p /home/axolotl/venv

RUN python -m venv /home/axolotl/venv/axolotl

ENV PATH="/home/axolotl/venv/axolotl/bin:$PATH"

RUN echo "source /home/axolotl/venv/axolotl/bin/activate" >> /home/axolotl/.bashrc

RUN git lfs install --skip-repo && \
    pip3 install awscli

RUN pip install causal_conv1d && \
    pip install -e .[deepspeed,flash-attn,mamba-ssm,galore]

# So we can test the Docker image
RUN pip install pytest

# fix so that git fetch/pull from remote works
RUN git config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*" && \
    git config --get remote.origin.fetch

# helper for huggingface-login cli
RUN git config --global credential.helper store


ENV HF_DATASETS_CACHE="/workspace/data/huggingface-cache/datasets"
ENV HUGGINGFACE_HUB_CACHE="/workspace/data/huggingface-cache/hub"
ENV TRANSFORMERS_CACHE="/workspace/data/huggingface-cache/hub"
ENV HF_HOME="/workspace/data/huggingface-cache/hub"
ENV HF_HUB_ENABLE_HF_TRANSFER="1"

CMD ["sleep", "infinity"]