How to train using the original lmdb dataset from this repo but also using a custom dataset? #422

Open
CharlyJazz opened this issue May 19, 2024 · 2 comments

Comments

@CharlyJazz

Do I need to merge the LMDB dataset that this repo uses in its instructions with my own dataset? When I train on only my dataset, the OCR is not as good as the OCR trained on the dataset created for this repo. Thanks!

@AdamKheire

@CharlyJazz did you find a solution?

@CharlyJazz
Author

There is a class ConcatenatedDataset, so you can put your dataset alongside the other datasets. Keep in mind that they are sorted by folder name, and training on the original LMDBs takes a long time, so your dataset should come first to make sure it is used early in the training session. I start my dataset's name with "a".
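
To illustrate the folder-name ordering, here is a minimal pure-Python sketch. It is not the repo's ConcatenatedDataset: the class `SimpleConcatDataset`, the folder names, and the samples are all made up for the example, and plain lists stand in for LMDB readers. It only assumes that the folders are sorted by name before being concatenated.

```python
from bisect import bisect_right
from itertools import accumulate


class SimpleConcatDataset:
    """Hypothetical stand-in: exposes several datasets as one indexable dataset."""

    def __init__(self, named_datasets):
        # Assumption: datasets are ordered by sorted folder name.
        self.names = sorted(named_datasets)
        self.datasets = [named_datasets[name] for name in self.names]
        # Running totals map a global index to (dataset, local index).
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # First dataset whose running total exceeds the global index.
        dataset_idx = bisect_right(self.cumulative_sizes, index)
        start = self.cumulative_sizes[dataset_idx - 1] if dataset_idx > 0 else 0
        return self.datasets[dataset_idx][index - start]


# Made-up folder names and (image, label) samples.
folders = {
    "mjsynth": [("mj_0.png", "hello"), ("mj_1.png", "world")],
    "synthtext": [("st_0.png", "text")],
    "a_custom": [("mine_0.png", "invoice")],
}
merged = SimpleConcatDataset(folders)
print(merged.names)
print(merged[0])
```

With these lowercase names, `a_custom` sorts first, so its samples occupy the lowest indices. Note that Python's default string sort is case-sensitive and places uppercase letters before lowercase ones, so if the sorting is done that way, a lowercase "a" prefix would still land after folders whose names start with a capital letter.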
