How to use the FlairScorer to predict given *one* broken word? #6

Open
kkew3 opened this issue Jun 20, 2024 · 0 comments
Hello. Thanks for the great project.

I have a handful of words that were split in two with the hyphen removed, and I want to predict whether a hyphen should be restored between the two parts. How can I achieve this with FlairScorer? Example input: ('state', 'of-the-art').

From the code:

# 2. some compound-word (keep hyphen), remove whitespace
option2 = last_word.strip() + next_word
# 3. remove hyphen, most likely to happen
option3 = last_word.strip()[:-1] + next_word
scores = self.score((option1, option2, option3))
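
As I read it, this snippet assumes that `last_word` still ends with the hyphen, which `[:-1]` then strips; my input has no hyphen at all. Below is a minimal stand-alone sketch of that candidate construction. `build_options` is a hypothetical helper name, and `option1` is my guess, since it is not shown in the excerpt above.

```python
def build_options(last_word: str, next_word: str) -> tuple:
    """Hypothetical helper mirroring the quoted snippet; not part of dehyphen."""
    # 1. ASSUMPTION (not in the excerpt): keep both the hyphen and the whitespace
    option1 = last_word.strip() + " " + next_word
    # 2. compound word: keep the hyphen, remove the whitespace
    option2 = last_word.strip() + next_word
    # 3. drop the trailing hyphen and join the two parts
    option3 = last_word.strip()[:-1] + next_word
    return option1, option2, option3


# last_word is expected to still carry its trailing hyphen here
print(build_options("require-", "ment"))
```

If this reading is right, passing a hyphen-free `last_word` such as 'require' to the library code would strip its final letter instead of a hyphen.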

I imagine the following usage:

from dehyphen import FlairScorer

class DehyphenModel:
    def __init__(self, lang: str):
        self.scorer = FlairScorer(lang=lang)

    def predict(self, last_word, next_word):
        options = [
            last_word + next_word,  # no hyphen in between
            last_word + '-' + next_word,  # keep the hyphen
        ]
        score_0, score_1 = self.scorer.score(options)
        # Return 1 (keep hyphen) if option 1 scores higher; otherwise return 0
        return int(score_1 > score_0)

# Test cases
dehyphen_model = DehyphenModel('en')
print((
    dehyphen_model.predict('p', 'value'),
    dehyphen_model.predict('state', 'of-the-art'),
    dehyphen_model.predict('hel', 'lo'),
    dehyphen_model.predict('require', 'ment'),
))

Output: (0, 0, 1, 1). I expected (1, 1, 0, 0), so every prediction is wrong!

Therefore, I suspect my usage is incorrect.
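
One possibility I have not verified: every prediction is the exact inverse of the expected one, which would be consistent with `score` returning a perplexity-like value where lower is better. Under that assumption, the comparison would simply flip. The sketch below uses a hypothetical `fake_score` stand-in with made-up numbers instead of the real FlairScorer, purely to illustrate the decision rule.

```python
from typing import Callable, Sequence


def keep_hyphen(
    last_word: str,
    next_word: str,
    score_fn: Callable[[Sequence[str]], Sequence[float]],
) -> int:
    """Return 1 to keep the hyphen, 0 to join. ASSUMES lower score is better."""
    options = [last_word + next_word, last_word + "-" + next_word]
    score_joined, score_hyphen = score_fn(options)
    return int(score_hyphen < score_joined)


def fake_score(texts: Sequence[str]) -> list:
    """Hypothetical stand-in for a scorer; the numbers are made up."""
    made_up = {"hello": 1.0, "hel-lo": 5.0, "pvalue": 6.0, "p-value": 2.0}
    return [made_up[t] for t in texts]


print(keep_hyphen("hel", "lo", fake_score), keep_hyphen("p", "value", fake_score))
```

With the real scorer, `score_fn` would be something like `self.scorer.score`; whether its values are in fact lower-is-better is exactly what I would like confirmed.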

Could you please tell me the correct way to do this? Thank you so much!
