Regex rules and rule collections (including global rules), imports and exports #69

Open
SafeTex opened this issue Mar 2, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@SafeTex

SafeTex commented Mar 2, 2023

Hello Tommi and all

I have some questions about good practice for managing regex rules.
Here are some observations, phrased as questions, for you to comment on please: tell me whether I'm right or wrong, whether there is a better method, or whether I have misunderstood something.

  1. I should NOT create them with a trained MT engine, as I have to retrain such engines after every job and I risk losing the rules when I delete the old trained engine?

  2. So I should perhaps save them with an installed model that I then use for training, but would they be automatically transferred to the new trained MT engine?

  3. I started to play around with all this and exported a few rules and collections to see if I could reimport them into another trained MT engine. Here I got a real shock: when I looked at the collections/rules that I had saved to my special folder, they had names like "63e8e851-3936-49f8-9c3d-125476b5033e.yml". Notepad++ does not seem to like opening them either. It takes a long time, but once a file is open I can see what it contains. On reimport, though, I have to remember or note down what all these files contain and then hunt for the right one(s). This was mind-blowing, and I only had a few collections/rules on my desktop.
    Furthermore, there was no clear sign in the file as to which installed model the rule/collection had come from (French or Swedish to English). So I guess that I should always put this in the name of the regex rule/collection, and perhaps save them to different sub-folders in future.
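
Until the exported files get descriptive names, one possible stopgap is a small script that previews every exported file in a folder and writes the previews to a single index. This is only a sketch under stated assumptions: it treats the exports as plain UTF-8 text and makes no assumption about the YAML fields inside them, and the function names and the `INDEX.txt` file name are made up for illustration, not part of the plugin.

```python
import sys
from pathlib import Path


def preview_exports(folder, max_lines=5):
    """Map each .yml file name in `folder` to its first non-empty lines.

    The files are read as plain text; nothing is assumed about their
    internal YAML structure.
    """
    previews = {}
    for path in sorted(Path(folder).glob("*.yml")):
        lines = []
        with path.open(encoding="utf-8", errors="replace") as handle:
            for raw in handle:
                if len(lines) >= max_lines:
                    break
                text = raw.rstrip()
                if text:  # skip blank lines
                    lines.append(text)
        previews[path.name] = lines
    return previews


def write_index(folder, index_name="INDEX.txt", max_lines=5):
    """Write the previews to a text file inside `folder` and return its path.

    `index_name` is a hypothetical file name chosen for this sketch.
    """
    index_path = Path(folder) / index_name
    with index_path.open("w", encoding="utf-8") as out:
        for name, lines in preview_exports(folder, max_lines).items():
            out.write(name + "\n")
            for line in lines:
                out.write("    " + line + "\n")
            out.write("\n")
    return index_path


if __name__ == "__main__":
    # Usage: python index_exports.py <folder with exported .yml files>
    print(write_index(sys.argv[1]))
```

Opening the resulting index once is quicker than opening each GUID-named file in turn when hunting for the right rule to reimport.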

To sum up, this was just a first test run and I hope to get better at it, but I can see that creating rules and collections might need careful planning. I'm particularly concerned about the long, opaque file names that individual rules and collections are given, which make reimport very confusing.

Any advice please?

Thanks in advance

Dave Neve

@TommiNieminen TommiNieminen added the enhancement New feature or request label Mar 2, 2023
@TommiNieminen
Collaborator

Those are good points, I'll have to make the system more friendly with descriptive names.

@SafeTex
Author

SafeTex commented Mar 2, 2023

Thanks Tommi

I think I can get around the other problems too if we can edit or create the names ourselves. Then we can add tags like FR (French) and SE (Swedish), as some rules have exactly the same description because they do the same thing, but the regexes themselves differ because the languages lay things out differently (numbers, for example).
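
In the meantime, the same labelling can be approximated on disk by copying each exported file into a per-language sub-folder under a readable name. A minimal sketch, assuming the exports are ordinary files; the function name and naming scheme are invented for illustration, and whether the application accepts a renamed copy on reimport is untested, so the original file is left untouched.

```python
import shutil
from pathlib import Path


def copy_with_label(export_file, dest_root, lang, description):
    """Copy an exported rule file to <dest_root>/<LANG>/ under a readable name.

    The new name is LANG_description_<first 8 chars of the original stem>,
    so the copy can still be traced back to the original export.
    The naming scheme is an assumption made for this sketch.
    """
    src = Path(export_file)
    tag = lang.upper()
    slug = "-".join(description.lower().split())
    target_dir = Path(dest_root) / tag
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{tag}_{slug}_{src.stem[:8]}{src.suffix}"
    shutil.copy2(src, target)  # copy, never move: the original stays in place
    return target
```

For example, `copy_with_label("63e8e851-3936-49f8-9c3d-125476b5033e.yml", "rules", "fr", "number format")` would create `rules/FR/FR_number-format_63e8e851.yml`.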

Do you think this enhancement will be ready by next Monday? (English humour 😂😂😂)

Have a nice weekend

Dave Neve
