PDFMinerToDocument buggy output #7749

Closed
1 task done
lukebfox opened this issue May 27, 2024 · 1 comment · Fixed by #7750
Labels
P1 (High priority, add to the next sprint), type:bug (Something isn't working)

Comments

@lukebfox
Contributor

Describe the bug
Connecting a PDFMiner converter to a document writer, i.e. `pipeline.connect('converter.document', 'writer.documents')`, does not work.

The output name declared in PDFMinerToDocument's `@component.output_types` does not match the key of the dictionary returned by `run`:

@component.output_types(document=List[Document])

return {"documents": documents}

document != documents
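
To make the mismatch concrete, here is a minimal sketch in plain Python. It does not use Haystack: `output_types`, `BuggyConverter` and `mismatched_outputs` are hypothetical stand-ins that only mimic the pattern of declaring output names on `run` and returning a dictionary, so treat it as an illustration rather than the library's implementation.

```python
from typing import Callable, Dict, List, Set


def output_types(**declared: type) -> Callable:
    """Hypothetical stand-in: records the declared output names on the method."""
    def decorator(run_method: Callable) -> Callable:
        run_method.declared_outputs = set(declared)
        return run_method
    return decorator


class BuggyConverter:
    # The declared output name is singular...
    @output_types(document=List[str])
    def run(self, sources: List[str]) -> Dict[str, List[str]]:
        documents = [f"text of {source}" for source in sources]
        # ...but the returned key is plural.
        return {"documents": documents}


def mismatched_outputs(component, **inputs) -> Set[str]:
    """Return the names that appear on only one side: declared or returned."""
    declared = component.run.declared_outputs
    returned = set(component.run(**inputs))
    return declared ^ returned  # symmetric difference


print(mismatched_outputs(BuggyConverter(), sources=["some.pdf"]))
```

An empty set would mean the declaration and the returned keys agree; here both `document` and `documents` are reported, which is the inconsistency described above.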

To Reproduce

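from haystack import Pipeline
from haystack.components.converters import PDFMinerToDocument
from haystack.components.writers import DocumentWriter
from haystack.document_stores.in_memory import InMemoryDocumentStore

pipeline = Pipeline()
document_store = InMemoryDocumentStore()  # instantiate the store, not just reference the class
pipeline.add_component("converter", PDFMinerToDocument())
pipeline.add_component("writer", DocumentWriter(document_store=document_store))
pipeline.connect("converter.document", "writer.documents")  # "document" is the declared (buggy) output name
response = pipeline.run({"converter": {"sources": ['some.pdf']}})
print(response['writer']['documents_written'])  # raises KeyError: 'writer'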

Error message
KeyError: 'writer'

Expected behavior
The writer node is able to use documents from the previous converter node.
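
The `KeyError` and the expected behavior can be illustrated with a toy model of output routing. This is a sketch under assumptions, not Haystack's actual scheduling logic: `converter` and `run_toy_pipeline` are hypothetical, and the only idea modelled is that a downstream component runs when the declared output name it is connected to is present in the upstream result.

```python
from typing import Dict, List


def converter(sources: List[str]) -> Dict[str, List[str]]:
    """Hypothetical converter that always returns its output under "documents"."""
    return {"documents": [f"text of {source}" for source in sources]}


def run_toy_pipeline(declared_output: str, sources: List[str]) -> Dict[str, dict]:
    """Route the converter output to a toy writer via the declared output name."""
    results: Dict[str, dict] = {}
    converter_output = converter(sources)
    if declared_output in converter_output:
        # The writer receives its input and reports how many documents it stored.
        documents = converter_output[declared_output]
        results["writer"] = {"documents_written": len(documents)}
    else:
        # Nothing reaches the writer, so it never runs and has no entry in the results.
        results["converter"] = converter_output
    return results


buggy = run_toy_pipeline("document", ["some.pdf"])   # declared name differs from returned key
fixed = run_toy_pipeline("documents", ["some.pdf"])  # declared name matches returned key

print("writer" in buggy)
print(fixed["writer"]["documents_written"])
```

In this model, indexing `buggy['writer']` fails in the same way as the reported error, while the matching name lets the writer consume the converter's documents.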

FAQ Check

System:

  • OS: NixOS
  • Haystack version (commit): d0da31a
  • DocumentStore: InMemoryDocumentStore
@anakin87
Member

Oh, you are right!

Would you like to open a PR to fix this?

@anakin87 added the labels type:bug (Something isn't working) and P1 (High priority, add to the next sprint) on May 27, 2024
2 participants