chore: fix HTMLToDocument test #8127

Merged (1 commit) on Jul 31, 2024

anakin87 (Member)

Related Issues

https://github.com/deepset-ai/haystack/actions/runs/10176181855/job/28145024482
The CI job linked above failed after a new version of Trafilatura was released. The extraction parameters are still valid, but the extraction itself is handled slightly differently.

Proposed Changes:

Turn the test into a real unit test: it now checks that Trafilatura is called with the right parameters instead of asserting on the extraction result.
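
The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `SketchHTMLConverter`, its `extract_fn` argument, and the `favor_precision` keyword used here are stand-ins chosen for the example. In the real test suite the mock would presumably replace the Trafilatura extraction call at the point where the converter looks it up.

```python
from unittest.mock import MagicMock


class SketchHTMLConverter:
    """Hypothetical stand-in for a converter that delegates to an extraction backend."""

    def __init__(self, extract_fn, extraction_kwargs=None):
        # extract_fn plays the role of the third-party extraction function.
        self.extract_fn = extract_fn
        self.extraction_kwargs = extraction_kwargs or {}

    def run(self, html):
        # Forward the HTML and the configured keyword arguments unchanged.
        return self.extract_fn(html, **self.extraction_kwargs)


def test_run_passes_extraction_kwargs():
    html = "<html><body><p>hello</p></body></html>"
    # The mock replaces the real extractor, so the test does not depend on
    # how a particular library version extracts text.
    mock_extract = MagicMock(return_value="extracted text")
    converter = SketchHTMLConverter(
        mock_extract, extraction_kwargs={"favor_precision": True}
    )

    result = converter.run(html)

    # Assert on the call, not on the quality of the extracted text.
    mock_extract.assert_called_once_with(html, favor_precision=True)
    # The converter should hand back whatever the backend returned.
    assert result == "extracted text"


test_run_passes_extraction_kwargs()
```

Because the assertion targets the call signature, a change in the extraction library's output does not break the test, while a change in how parameters are forwarded still does.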

How did you test it?

CI

Checklist

anakin87 added the ignore-for-release-notes label on Jul 31, 2024
anakin87 requested a review from a team as a code owner on July 31, 2024 08:26
anakin87 requested a review from silvanocerza and removed the request for a team on July 31, 2024 08:26
@@ -161,21 +162,23 @@ def test_serde(self):
        assert new_converter.extraction_kwargs == converter.extraction_kwargs

    def test_run_difficult_html(self, test_files_path):
        # boilerpy3's DefaultExtractor fails to extract text from this HTML file
Member Author

unrelated: removed an outdated comment

anakin87 merged commit 3d1ad10 into main on Jul 31, 2024
15 checks passed
anakin87 deleted the fix-html-test branch on July 31, 2024 08:59
Labels
ignore-for-release-notes, topic:tests
2 participants