Disable pip cache for Dockerfiles #2015

Merged 2 commits on Jan 19, 2022

1 change: 0 additions & 1 deletion .github/workflows/docker_build.yml
@@ -4,7 +4,6 @@ on:
   push:
     branches:
       - master
-      - docker-build

 jobs:
   build:

18 changes: 11 additions & 7 deletions Dockerfile
@@ -2,24 +2,28 @@ FROM python:3.7.4-stretch

 WORKDIR /home/user

-RUN apt-get update && apt-get install -y curl git pkg-config cmake
+RUN apt-get update && apt-get install -y \
+    curl \
+    git \
+    pkg-config \
+    cmake \
+    libpoppler-cpp-dev \
+    tesseract-ocr \
+    libtesseract-dev \
+    poppler-utils && \
+    rm -rf /var/lib/apt/lists/*

 # Install PDF converter
 RUN wget --no-check-certificate https://dl.xpdfreader.com/xpdf-tools-linux-4.03.tar.gz && \
     tar -xvf xpdf-tools-linux-4.03.tar.gz && cp xpdf-tools-linux-4.03/bin64/pdftotext /usr/local/bin

-RUN apt-get install libpoppler-cpp-dev pkg-config -y --fix-missing
-
-# Install Tesseract
-RUN apt-get install tesseract-ocr libtesseract-dev poppler-utils -y
-
 # copy code
 COPY haystack /home/user/haystack

 # install as a package
 COPY setup.py requirements.txt README.md /home/user/
 RUN pip install --upgrade pip
-RUN pip install -r requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
 RUN pip install -e .
 RUN python3 -c "from haystack.utils.docker import cache_models;cache_models()"

13 changes: 7 additions & 6 deletions Dockerfile-GPU
@@ -1,4 +1,4 @@
-FROM nvidia/cuda:11.1-runtime-ubuntu20.04
+FROM nvidia/cuda:11.1-runtime-ubuntu20.04

 WORKDIR /home/user

@@ -11,7 +11,7 @@ RUN mkdir -p /home/user/file-upload && chmod 777 /home/user/file-upload
 # Install software dependencies
 RUN apt-get update && apt-get install -y software-properties-common && \
     add-apt-repository ppa:deadsnakes/ppa && \
-    apt-get update && apt-get install -y \
+    apt-get install -y \
         cmake \
         curl \
         git \
@@ -25,7 +25,8 @@ RUN apt-get update && apt-get install -y software-properties-common && \
         python3.7-distutils \
         swig \
         tesseract-ocr \
-        wget
+        wget && \
+    rm -rf /var/lib/apt/lists/*

 # Install PDF converter
 RUN curl -s https://dl.xpdfreader.com/xpdf-tools-linux-4.03.tar.gz | tar -xvzf - -C /usr/local/bin --strip-components=2 xpdf-tools-linux-4.03/bin64/pdftotext
@@ -40,9 +41,9 @@ COPY setup.py requirements.txt README.md /home/user/
 RUN pip install --upgrade pip
 RUN echo "Install required packages" && \
     # Install PyTorch for CUDA 11
-    pip3 install torch==1.10.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html && \
-    # Install from requirements.txt
-    pip3 install -r requirements.txt
+    pip3 install --no-cache-dir torch==1.10.1+cu111 -f https://download.pytorch.org/whl/torch_stable.html && \
+    # Install from requirements.txt
+    pip3 install --no-cache-dir -r requirements.txt

 # copy saved models
 COPY README.md models* /home/user/models/
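
The two size-saving patterns applied in the diff above can be summarized in a minimal sketch. It is an illustration only, not the project's actual Dockerfile: the base image and file names are taken from the diff, while the shortened package list and the overall layout are assumptions made for brevity.

```dockerfile
# Illustrative sketch; package list shortened, layout assumed.
FROM python:3.7.4-stretch

WORKDIR /home/user

# Update, install, and clean up in a single RUN instruction so the
# apt package lists are never committed to an image layer.
RUN apt-get update && apt-get install -y \
    curl \
    git && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt /home/user/

# --no-cache-dir tells pip not to keep downloaded packages in its
# cache directory, which would otherwise be stored in this layer.
RUN pip install --no-cache-dir -r requirements.txt
```

The cleanup has to happen in the same RUN instruction as the install. A later `RUN rm -rf /var/lib/apt/lists/*` would create a new layer but would not shrink the earlier layer that already contains the files.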