Possible faulty prediction of the de model #1

Closed
zouharvi opened this issue Jul 26, 2022 · 2 comments

@zouharvi

According to my vague understanding of rhyme, the following poem should have the rhyme scheme ABABC, but the model does not detect it. Is this an error on my side (or in my installation), or was it simply mispredicted by the model? Are there other models that could make this work, or perhaps a setting that would increase rhyme sensitivity?

import rhymetagger

poem = """
Zwei Straßen gingen ab im gelben Wald,
Und leider konnte ich nicht beide reisen,
Da ich nur einer war; ich stand noch lang
Und sah noch nach, so weit es ging, der einen
Bis sie im Unterholz verschwand;
""".strip()

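rt = rhymetagger.RhymeTagger()
rt.load_model(model="de")  # pretrained German model
# output_format=3 prints one rhyme index per line (None where no rhyme was found)
print(rt.tag(poem.split("\n"), output_format=3))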

Output:

====================================
Model loaded with following settings:
====================================
  frequency_min: 3
           lang: de
       max_iter: 20
          ngram: 3
   ngram_length: 3
   prob_ipa_min: 0.9
 prob_ngram_min: 0.9
     same_words: False
   stanza_limit: True
         stress: True
       syll_max: 2
    t_score_min: 3.078
   vowel_length: True
         window: 5
====================================
[None, None, None, None, None]
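
To compare such output against an expected scheme like ABABC, the list of rhyme indices can be turned into letters. The following is a minimal sketch, not part of rhymetagger: `to_scheme` is a hypothetical helper, and it assumes the output is a list with one entry per line, each an integer rhyme index or None, as in the output shown above. Unrhymed lines each get a fresh letter, matching the ABABC notation used here.

```python
from string import ascii_uppercase


def to_scheme(rhyme_ids):
    """Convert a list of rhyme indices (int or None) into a letter scheme.

    Hypothetical helper: lines sharing an index share a letter; a None
    entry (no rhyme found) gets a letter of its own.
    """
    letters = {}   # rhyme index -> letter already assigned to it
    scheme = []
    next_idx = 0   # position of the next unused letter
    for rid in rhyme_ids:
        if rid is not None and rid in letters:
            letter = letters[rid]
        else:
            # New rhyme group or unrhymed line: take the next letter
            # (wraps around after Z, which only matters for long poems).
            letter = ascii_uppercase[next_idx % 26]
            next_idx += 1
            if rid is not None:
                letters[rid] = letter
        scheme.append(letter)
    return "".join(scheme)


print(to_scheme([1, 2, 1, 2, None]))
print(to_scheme([None, None, None, None, None]))
```

With this convention, an all-None result reads as ABCDE, i.e. no two lines were judged to rhyme.
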
@versotym
Owner

According to my vague understanding of German, I'd say these are "imperfect rhymes". The model was trained mainly on data from the 17th to 19th centuries, where I expect rhyming to be much more constrained. Relaxing the prob_ipa_min and prob_ngram_min parameters may do the trick.

@zouharvi
Author

> Relaxing the prob_ipa_min and prob_ngram_min parameters may do the trick.

Thank you, that's exactly what I was looking for. I only got some (partially correct) results, ABABA, once I went as low as prob_ngram_min=0.001 and prob_ipa_min=0.001, which is very iffy. I'll see whether I can find more data or another method.
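
Lowering the thresholds step by step, rather than jumping straight to 0.001, shows how far they must be relaxed before any rhyme appears. The sketch below is illustrative only: `sweep_thresholds` and `tag_fn` are hypothetical names, and how the thresholds reach RhymeTagger is left to the caller. Passing `prob_ipa_min` and `prob_ngram_min` as keyword arguments to `rt.tag` is an assumption to check against the rhymetagger documentation.

```python
def sweep_thresholds(lines, tag_fn, thresholds=(0.9, 0.5, 0.1, 0.01, 0.001)):
    """Try thresholds from strict to loose; stop at the first that finds a rhyme.

    tag_fn(lines, threshold) is a caller-supplied function (hypothetical)
    returning one rhyme index or None per line.
    Returns (threshold, rhyme_ids), or (None, all-None list) if nothing matched.
    """
    for t in thresholds:
        ids = tag_fn(lines, t)
        if any(i is not None for i in ids):
            return t, ids
    return None, [None] * len(lines)


# Possible wrapper around the tagger from the issue (assumed API, unverified):
# tag_fn = lambda lines, t: rt.tag(lines, output_format=3,
#                                  prob_ipa_min=t, prob_ngram_min=t)

# Stand-in tagger for demonstration: only "detects" rhymes at or below 0.01.
def fake_tagger(lines, threshold):
    return [1, 2, 1, 2, 1] if threshold <= 0.01 else [None] * len(lines)


threshold, ids = sweep_thresholds(["a", "b", "c", "d", "e"], fake_tagger)
print(threshold, ids)
```

A threshold that has to drop by several orders of magnitude before anything is detected, as reported here, suggests the rhymes fall outside what the model learned rather than a borderline setting.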
