TopicClassifier #5678

Open
2 tasks
Tracked by #5581
ZanSara opened this issue Aug 29, 2023 · 1 comment
Labels
2.x Related to Haystack v2.0

Comments

@ZanSara
Contributor

ZanSara commented Aug 29, 2023

Classifies the input text or document into one of a list of arbitrary categories. It's very similar to the LanguageClassifier, but more generic, because the categories are arbitrary rather than a fixed set of languages. It can be implemented with a zero-shot text classifier. Again, we may need both a text version and a Document version.

Draft I/O for TextTopicClassifier:

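from typing import List, Optional

from haystack import component  # import path as in Haystack 2.x releases

@component
class TextTopicClassifier:

    def __init__(self, topics: Optional[List[str]] = None):
        # Fall back to an empty list so the comprehension below also works without topics
        self.topics = topics or []
        # One output per topic, each carrying the strings assigned to that topic
        component.set_output_types(self, **{topic: List[str] for topic in self.topics})

    def run(self, strings: List[str]):
        # classify the strings
        # Draft output shape: {"topic_1": strings_1, "topic_2": strings_2, ...}
        return {topic: [] for topic in self.topics}  # placeholder until classification is implemented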

Draft I/O for DocumentTopicClassifier:

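from typing import List, Optional

from haystack import Document, component  # import path as in Haystack 2.x releases

@component
class DocumentTopicClassifier:

    def __init__(self, topics: Optional[List[str]] = None):
        # Fall back to an empty list so the comprehension below also works without topics
        self.topics = topics or []
        # One output per topic; each output carries Documents, not strings
        component.set_output_types(self, **{topic: List[Document] for topic in self.topics})

    def run(self, documents: List[Document]):
        # classify the documents
        # Draft output shape: {"topic_1": docs_1, "topic_2": docs_2, ...}
        return {topic: [] for topic in self.topics}  # placeholder until classification is implemented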
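
The routing step that both drafts leave open could look roughly like the sketch below. It is not part of the proposal: `route_by_topic`, the `Scorer` signature, and `keyword_scorer` are hypothetical names introduced only for illustration, and the toy scorer merely stands in for a real zero-shot text classifier. The sketch assumes each input is assigned to exactly one topic (the highest-scoring one), which matches "classifies ... into one of a list" above.

```python
from typing import Callable, Dict, List

# Hypothetical scorer signature: (text, topics) -> {topic: score}.
# A real implementation would wrap a zero-shot text classifier here.
Scorer = Callable[[str, List[str]], Dict[str, float]]


def route_by_topic(strings: List[str], topics: List[str], scorer: Scorer) -> Dict[str, List[str]]:
    """Group each string under its single highest-scoring topic."""
    if not topics:
        raise ValueError("at least one topic is required")
    # Every topic gets a key, even if no string ends up assigned to it.
    routed: Dict[str, List[str]] = {topic: [] for topic in topics}
    for text in strings:
        scores = scorer(text, topics)
        # Ties resolve to the topic listed first, because max() keeps the first maximum.
        best = max(topics, key=lambda topic: scores.get(topic, 0.0))
        routed[best].append(text)
    return routed


def keyword_scorer(text: str, topics: List[str]) -> Dict[str, float]:
    """Toy stand-in for a zero-shot model: 1.0 if the topic name occurs in the text."""
    lowered = text.lower()
    return {topic: float(topic.lower() in lowered) for topic in topics}


routed = route_by_topic(
    ["Sports news: the cup final is tonight", "New politics podcast episode"],
    ["sports", "politics"],
    keyword_scorer,
)
```

The returned dictionary has one key per topic, which lines up with the one-output-per-topic declaration in the drafts. A Document version would follow the same pattern, scoring each Document's text and appending the Document itself to the winning topic's list.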

Tasks

ZanSara added the "2.x Related to Haystack v2.0" label Aug 29, 2023
@Timoeller
Contributor

We want to turn https://prompthub.deepset.ai/?prompt=deepset%2Ftopic-classification into a more explicit component.

Timoeller added the "P2 Medium priority, add to the next sprint if no P1 available" label Oct 9, 2023
masci removed the "P2 Medium priority, add to the next sprint if no P1 available" label Feb 23, 2024