* Feat: Removing use of temp file while downloading archive from url along with adding CI for windows and mac platform
* Windows CI by default installing pytorch gpu hence updating CI to pick cpu version
* fixing mac cache build issue
* updating windows pip install command for torch
* another attempt
* updating ci
* Adding sudo
* fixing ls failure on windows
* another attempt to fix build issue
* Saving env variable of test files
* Adding debug log
* Github action differ on windows
* adding debug
* another attempt
* Windows have different ways to receive env
* fixing template
* minor fix
* Adding debug
* Removing use of json
* Adding back fromJson
* adding toJson
* removing print
* another attempt
* disabling parallel run at least for testing
* installing docker for mac runner
* correcting docker install command
* Linux dockers are not supported in windows
* Removing mac changes
* Upgrading pytorch
* using lts pytorch
* Separating win and ubuntu
* Install java 11
* enabling linux container env
* docker cli command
* docker cli command
* start elastic service
* List all service
* correcting service name
* Attempt to fix multiple test run
* convert to json
* another attempt to check
* Updating build cache step
* attempt
* Add tika
* Separating windows CI
* Changing CI name
* Skipping test which does not work in windows
* Skipping tests for windows
* create cleanup function in conftest
* adding skipif marker on tests
* Run windows PR on only push to master
* Addressing review comments
* Enabling windows ci for this PR
* Tika init is being called when importing tika function
* handling tika import issue
* handling tika import issue in test
* Fixing import issue
* removing tika fixture
* Removing fixture from tests
* Disable windows ci on pull request
* Add back extra pytorch install step

Co-authored-by: Malte Pietsch <[email protected]>
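The headline change, downloading an archive without going through a temp file, can be illustrated roughly as follows. This is a minimal sketch and not the code from this commit: the function names `extract_archive_bytes` and `fetch_and_extract` are hypothetical, and it assumes the archive is small enough to hold in memory and is either a `.zip` or a `.tar.gz`.

```python
import io
import tarfile
import zipfile
from pathlib import Path
from urllib.request import urlopen


def extract_archive_bytes(data: bytes, archive_name: str, output_dir: str) -> None:
    """Extract an archive held in memory into output_dir (hypothetical helper)."""
    buffer = io.BytesIO(data)  # in-memory file object, so no temp file is created
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    if archive_name.endswith(".zip"):
        with zipfile.ZipFile(buffer) as zip_archive:
            zip_archive.extractall(output_dir)
    elif archive_name.endswith((".tar.gz", ".tgz")):
        with tarfile.open(fileobj=buffer, mode="r:gz") as tar_archive:
            tar_archive.extractall(output_dir)
    else:
        raise ValueError(f"Unsupported archive type: {archive_name}")


def fetch_and_extract(url: str, output_dir: str) -> None:
    """Download url fully into memory, then extract it (hypothetical helper)."""
    with urlopen(url) as response:
        data = response.read()
    extract_archive_bytes(data, url, output_dir)
```

Keeping the payload in a `BytesIO` buffer sidesteps a common Windows problem where a named temporary file cannot be reopened while it is still held open by the process that created it.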
1 parent 08341f5, commit e5b4b62
Showing 12 changed files with 131 additions and 25 deletions.
@@ -0,0 +1,97 @@
+name: Windows CI
+
+on:
+  push:
+    branches: [ master ]
+  # pull_request:
+  #   branches: [ master ]
+
+jobs:
+  type-check:
+    runs-on: windows-latest
+    steps:
+    - uses: actions/checkout@v2
+    - uses: actions/setup-python@v2
+      with:
+        python-version: 3.8
+    - name: Test with mypy
+      run: |
+        pip install mypy types-Markdown types-requests types-PyYAML pydantic
+        mypy haystack
+  build-cache:
+    needs: type-check
+    runs-on: windows-latest
+
+    steps:
+    - uses: actions/checkout@v2
+    - uses: actions/setup-python@v2
+      with:
+        python-version: 3.7
+    - run: echo "date=$(date +'%Y-%m-%d')" >> $env:GITHUB_ENV
+    - name: Cache
+      id: cache-python-env
+      uses: actions/cache@v2
+      with:
+        path: ${{ env.pythonLocation }}
+        key: windows-${{ env.pythonLocation }}-${{ env.date }}-${{ hashFiles('setup.py') }}-${{ hashFiles('requirements.txt') }}-${{ hashFiles('requirements-dev.txt') }}
+    - name: Install Pytorch on windows
+      run: |
+        pip install torch==1.8.1+cpu -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html
+    - name: Install dependencies
+      if: steps.cache-python-env.outputs.cache-hit != 'true'
+      run: |
+        python -m pip install --upgrade pip
+        pip install --upgrade --upgrade-strategy eager -r requirements-dev.txt -e .
+        pip install --upgrade --upgrade-strategy eager -f https://download.pytorch.org/whl/torch_stable.html -r requirements.txt -e .
+        pip install torch-scatter -f https://data.pyg.org/whl/torch-1.9.0+cpu.html
+  prepare-build:
+    needs: build-cache
+    # With Windows it gives error, also this step only listing test files only
+    runs-on: ubuntu-20.04
+    steps:
+    - uses: actions/checkout@v2
+    - id: set-matrix
+      run: |
+        echo "::set-output name=matrix::$(cd test && ls -d test_*.py | jq -R . | jq -cs .)"
+    outputs:
+      matrix: ${{ steps.set-matrix.outputs.matrix }}
+
+  build:
+    needs: prepare-build
+    runs-on: windows-latest
+    strategy:
+      matrix:
+        test-path: ${{fromJson(needs.prepare-build.outputs.matrix)}}
+      fail-fast: false
+    steps:
+    - uses: actions/checkout@v2
+    - name: Set up Python 3.7
+      uses: actions/setup-python@v2
+      with:
+        python-version: 3.7
+    - run: echo "date=$(date +'%Y-%m-%d')" >> $env:GITHUB_ENV
+    - name: Cache
+      uses: actions/cache@v2
+      with:
+        path: ${{ env.pythonLocation }}
+        key: windows-${{ env.pythonLocation }}-${{ env.date }}-${{ hashFiles('setup.py') }}-${{ hashFiles('requirements.txt') }}-${{ hashFiles('requirements-dev.txt') }}
+
+    # Windows runner can't run Linux containers. Refer https://github.com/actions/virtual-environments/issues/1143
+    - name: Set up Windows test env
+      run: |
+        choco install xpdf-utils
+        choco install openjdk11
+        refreshenv
+        choco install tesseract --pre
+        choco install elasticsearch --version=7.9.2
+        refreshenv
+        Get-Service elasticsearch-service-x64 | Start-Service
+    # We have to remove files if not test going to run from it
+    # As on windows we are going to disable quite a few tests these, hence these files will throw error refer https://github.com/pytest-dev/pytest/issues/812
+    # Removing test_ray, test_utils, test_preprocessor, test_knowledge_graph and test_connector
+    - name: Run tests
+      if: ${{ !contains(fromJSON('["test_ray.py", "test_knowledge_graph.py", "test_connector.py"]'), matrix.test-path) }}
+      run: cd test && pytest --document_store_type=memory,faiss,elasticsearch -m "not tika and not graphdb" -s ${{ matrix.test-path }}
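The `prepare-build` job turns the list of `test_*.py` files into a compact JSON array, and the `build` job fans out over that array while skipping three files on Windows. A rough Python equivalent of that logic is sketched below for illustration only. The names `build_test_matrix`, `should_run` and `SKIPPED_ON_WINDOWS` are hypothetical, the sort order may differ from what `ls` produces under a given locale, and the membership check uses exact string matching.

```python
import json
from pathlib import Path

# Files excluded by the `if:` condition on the "Run tests" step.
SKIPPED_ON_WINDOWS = {"test_ray.py", "test_knowledge_graph.py", "test_connector.py"}


def build_test_matrix(test_dir: str) -> str:
    """Return a compact JSON array of test file names, like `jq -R . | jq -cs .`."""
    names = sorted(path.name for path in Path(test_dir).glob("test_*.py"))
    # Compact separators mimic the single-line output of `jq -c`.
    return json.dumps(names, separators=(",", ":"))


def should_run(test_path: str) -> bool:
    """Mirror the `!contains(...)` guard: run everything except the skipped files."""
    return test_path not in SKIPPED_ON_WINDOWS
```

Calling `json.loads(build_test_matrix("test"))` gives back the list that `fromJson` would expand into one matrix entry per test file.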
@@ -1,17 +1,16 @@
 import pytest
 
-from haystack.preprocessor.utils import convert_files_to_dicts, tika_convert_files_to_dicts
-from haystack.preprocessor.cleaning import clean_wiki_text
+from haystack.utils.preprocessing import convert_files_to_dicts, tika_convert_files_to_dicts
+from haystack.utils.cleaning import clean_wiki_text
 
 
 @pytest.mark.tika
-def test_convert_files_to_dicts(xpdf_fixture):
+def test_convert_files_to_dicts():
     documents = convert_files_to_dicts(dir_path="samples", clean_func=clean_wiki_text, split_paragraphs=True)
     assert documents and len(documents) > 0
 
 
 @pytest.mark.tika
-def test_tika_convert_files_to_dicts(tika_fixture):
+def test_tika_convert_files_to_dicts():
     documents = tika_convert_files_to_dicts(dir_path="samples", clean_func=clean_wiki_text, split_paragraphs=True)
     assert documents and len(documents) > 0
 