Rerunning Frog on already frogged FoLiA #70
Comments
We do have to consider the realistic case where somebody only runs certain modules of Frog (say PoS-tagging and lemmatisation) and someone else at a later stage wants to add something else, like NER or parsing.
The random component strategy I use is easy and works well for that.
Add a new one, since it's a completely new Frog run, which can be done at a very different time and by a very different user on a very different machine than the older one.
Yes, as a temporary solution that seems quite acceptable; it will take some time for users to run into this issue anyway.
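The random component strategy mentioned above could look roughly like the sketch below. It is only an illustration in Python, not Frog's actual code: the function name `random_processor_id` and the `<tool>.<hex>` layout of the ID are assumptions made for the example.

```python
import secrets


def random_processor_id(tool: str) -> str:
    """Build an ID such as 'frog.1a2b3c4d' for a new provenance record.

    The 8 hex characters come from a cryptographically strong random
    source, so two independent runs (different time, user, or machine)
    are very unlikely to produce the same ID.
    """
    return f"{tool}.{secrets.token_hex(4)}"


if __name__ == "__main__":
    # Each run prints a different ID, e.g. frog.9f3c12ab
    print(random_processor_id("frog"))
```

Because each run draws its own random part, a later run that adds NER or parsing gets its own record without having to know which IDs an earlier run used.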
Ok, so running a new (part of) Frog requires a new provenance record. Regarding the IDs
I already started implementing along these lines, but: So my solution will probably be to register the used IDs while processing the document. Within the Frog pipeline this won't be a problem. Other tools might use a different scheme for IDs, but that will be opaque to Frog. Although this seems relatively easy to implement, I will keep this on the wish-list for after the 2.0 release.
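Registering the used IDs while processing the document could be sketched as below. This is a hypothetical Python illustration, not the Frog implementation: the class `IdRegistry`, its methods, and the `<prefix>.<n>` numbering scheme are all assumptions. IDs from other tools are stored as opaque strings, whatever scheme they follow.

```python
class IdRegistry:
    """Remember every ID seen in a document and hand out unused ones."""

    def __init__(self) -> None:
        self._used: set[str] = set()

    def register(self, xml_id: str) -> None:
        """Record an ID found in the document; duplicates are an error."""
        if xml_id in self._used:
            raise ValueError(f"duplicate ID: {xml_id}")
        self._used.add(xml_id)

    def next_free(self, prefix: str) -> str:
        """Return and reserve the first '<prefix>.<n>' not yet in use."""
        n = 1
        while f"{prefix}.{n}" in self._used:
            n += 1
        new_id = f"{prefix}.{n}"
        self._used.add(new_id)
        return new_id


if __name__ == "__main__":
    registry = IdRegistry()
    # IDs encountered while reading an already processed document.
    for seen in ("frog.1", "some-other-tool-x7"):
        registry.register(seen)
    print(registry.next_free("frog"))  # frog.2
```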
Frog now assigns provenance data to FoLiA, which, among other things, allows us to detect a rerun of (parts of) Frog on a FoLiA document. BUT:
Handling this is quite dangerous and needs a lot of thinking.
As I don't want to postpone the FoLiA 2.0 release, I suggest for the time being to just FORBID running Frog again on FoLiA that already has Frog provenance data. That will not break existing cases, and will for sure NOT introduce artifacts that would bother us in the future.
@proycon Any comments?
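The proposed "forbid a rerun" check could be sketched as follows. This is an illustration in Python using only the standard library, not Frog's code. It assumes that provenance is stored as `processor` elements carrying a `name` attribute; the helper names `has_processor_named` and `refuse_rerun` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal made-up document; the layout of the provenance block is an assumption.
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia">
  <metadata>
    <provenance>
      <processor name="frog"/>
    </provenance>
  </metadata>
</FoLiA>"""


def has_processor_named(folia_xml: str, tool: str) -> bool:
    """Return True if any 'processor' element has the given name."""
    root = ET.fromstring(folia_xml)
    for elem in root.iter():
        # Drop the '{namespace}' part that ElementTree prepends to tags.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "processor" and elem.get("name", "").lower() == tool.lower():
            return True
    return False


def refuse_rerun(folia_xml: str, tool: str = "frog") -> None:
    """Raise an error when the document was already processed by the tool."""
    if has_processor_named(folia_xml, tool):
        raise RuntimeError(f"document already has {tool} provenance data")


if __name__ == "__main__":
    try:
        refuse_rerun(SAMPLE)
    except RuntimeError as err:
        print(f"refused: {err}")
```

Refusing outright keeps the check simple: documents without Frog provenance data pass through unchanged, so existing use cases are not affected.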