TransformersDocumentClassifier replacing FARMClassifier #1540

Merged: julian-risch merged 13 commits into master from transformers_classifier on Oct 1, 2021

@julian-risch (Member) commented Sep 29, 2021

Proposed changes:

  • Add a Transformers-based document classification node that replaces FARMClassifier

closes #1508

Status (please check what you already did):

Limitation: I did not add multi-label classification, but that would be easy to add in the future.
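
For readers new to the idea, here is a minimal sketch of what a document classification node does. It is not the code merged in this PR: `classify_documents`, `stub_classifier`, and the plain-dict document shape with a `content` key are assumptions made for illustration. The only convention taken from this thread is that the prediction ends up under `meta["classification"]`.

```python
from typing import Callable, Dict, List

# A prediction is assumed to be a dict such as {"label": "joy", "score": 0.93}.
Prediction = Dict[str, object]


def classify_documents(
    documents: List[dict],
    classifier: Callable[[List[str]], List[Prediction]],
    top_k: int = 10,
) -> List[dict]:
    """Classify the first top_k documents and store the result in their meta."""
    selected = documents[:top_k]
    predictions = classifier([doc["content"] for doc in selected])
    results = []
    for doc, prediction in zip(selected, predictions):
        # Build a copy so that the caller's documents are left untouched.
        updated = {**doc, "meta": {**doc.get("meta", {}), "classification": prediction}}
        results.append(updated)
    return results


def stub_classifier(texts: List[str]) -> List[Prediction]:
    """Stand-in for a real model: flags texts containing 'great' as positive."""
    return [
        {"label": "positive" if "great" in text.lower() else "negative", "score": 1.0}
        for text in texts
    ]


docs = [
    {"content": "Great library, works well.", "meta": {"name": "a.txt"}},
    {"content": "The build keeps failing.", "meta": {"name": "b.txt"}},
]
classified = classify_documents(docs, stub_classifier, top_k=5)
print(classified[0]["meta"]["classification"]["label"])
```

Any callable that maps a list of texts to a list of `{"label", "score"}` dicts can replace the stub, for example a Hugging Face `pipeline("text-classification", ...)` object.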

@lalitpagaria (Contributor)

It would be great to allow a zero-shot classifier as well. In some cases, users want to define their own labels and are interested in getting probabilities (both with and without multi-class).

@julian-risch (Member, Author)

> It would be great to allow a zero-shot classifier as well. In some cases, users want to define their own labels and are interested in getting probabilities (both with and without multi-class).

I thought so too. 👍
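
To make the "with and without multi-class" distinction concrete, the sketch below shows how NLI-based zero-shot classification commonly turns logits into probabilities, as done to my understanding in the Hugging Face `zero-shot-classification` pipeline. The function names and the logits are made up for illustration; nothing here is part of this PR.

```python
import math
from typing import Dict, List, Tuple


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_scores(
    logits: Dict[str, Tuple[float, float]], multi_label: bool
) -> Dict[str, float]:
    """Turn per-label (contradiction, entailment) logits into probabilities.

    The logits are assumed to come from an NLI model that was asked, for each
    user-defined label, whether the document entails a hypothesis about it.
    """
    if multi_label:
        # Each label is scored independently: entailment vs. contradiction.
        return {
            label: softmax([contra, entail])[1]
            for label, (contra, entail) in logits.items()
        }
    # Labels compete: softmax over the entailment logits of all labels.
    labels = list(logits)
    probs = softmax([logits[label][1] for label in labels])
    return dict(zip(labels, probs))


# Made-up logits for three user-defined labels (not real model output).
example_logits = {"sports": (-1.0, 2.0), "politics": (0.5, 0.0), "finance": (-0.5, 1.5)}
single = zero_shot_scores(example_logits, multi_label=False)
multi = zero_shot_scores(example_logits, multi_label=True)
print(single)
print(multi)
```

In the competing mode the scores sum to 1, so exactly one label wins; in the independent mode several labels can score high at once, which is what a multi-label use case needs.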

@julian-risch julian-risch changed the title from "WIP: Initial draft of TransformersClassifier" to "TransformersDocumentClassifier replacing FARMClassifier" Sep 30, 2021
@julian-risch julian-risch marked this pull request as ready for review September 30, 2021 12:58
@tholor (Member) left a comment

Looking good! Just left one suggestion. Your call if you want to do it right away in this PR :)

params={"Retriever": {"top_k": 10}, "Classifier": {"top_k": 5}}
)
print(res["documents"][0].to_dict()["meta"]["classification"]["label"])
__Note that print_documents() does not output the content of the classification field in the meta data__

Printing the meta data seems quite complicated here. Shall we just add an arg print_meta=True/False to print_documents? It could be helpful on other occasions as well and would simplify this example.
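
A rough sketch of what such a flag could look like follows. `format_documents` is a hypothetical helper operating on plain dicts, not Haystack's `print_documents`; only the `print_meta` argument name comes from the suggestion above, and `max_text_len` is an assumption.

```python
from typing import List


def format_documents(
    documents: List[dict], print_meta: bool = False, max_text_len: int = 60
) -> str:
    """Return a readable summary of documents, optionally including their meta."""
    lines = []
    for doc in documents:
        content = doc["content"]
        if len(content) > max_text_len:
            # Truncate long texts to keep the output compact.
            content = content[:max_text_len] + "..."
        lines.append(f"content: {content}")
        if print_meta:
            lines.append(f"meta: {doc.get('meta', {})}")
    return "\n".join(lines)


docs = [
    {
        "content": "Great library, works well.",
        "meta": {"classification": {"label": "positive", "score": 0.98}},
    }
]
without_meta = format_documents(docs)
with_meta = format_documents(docs, print_meta=True)
print(with_meta)
```

With the flag set, the classification label shows up without the nested `to_dict()["meta"]["classification"]["label"]` lookup used in the example under review.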

@julian-risch julian-risch merged commit 24483d7 into master Oct 1, 2021
@julian-risch julian-risch deleted the transformers_classifier branch October 1, 2021 09:22
Development

Successfully merging this pull request may close these issues.

Document Classification Node with Transformers instead of FARM
3 participants