Add support for Latex/Math/Equations #531
Comments
Hi @Vilhelm-Ian, thanks for your feature request!
TL;DR: I would love to see this integrated in NormCap, but due to its complexity, I probably won't have enough time to work on this on my own. But I'm definitely open to contributions here.
Some background: Tesseract, the OCR framework I leverage in NormCap, initially had some support for detecting equations. But its results were quite weak, so it got abandoned. I doubt that it is now feasible to train a Tesseract model for decent math detection. But it definitely would be possible to integrate an additional OCR framework into NormCap that is optimized for LaTeX/equations. Some open source frameworks actually deliver quite promising results, e.g. pix2text or LaTeX-OCR. However, the difficulty is to find one that satisfies the non-functional requirements of NormCap:
Unfortunately, this probably rules out all
Those are just some initial thoughts; I'm interested to read opinions from others! 🙂
Describe your problem:
While studying math from PDFs, it would be nice to be able to copy equations.
Solution you'd like to see:
Train the model on math equations.
Alternatives you considered:
No response
Additional information or remarks:
No response