-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Current thinking on validation #118
Comments
I'm very much in favour of (A). For consistency doing the same for serialization would make sense, but I wouldn't make this a high priority! Regarding clearing errors, I don't really see any benefit to doing something more complex than (1). Surely the most common case is that you want to report errors straight after validating a model? If you do want to validate several models and keep all errors, you can create multiple Validator instances to do so. |
I'm in favour of (A), separate enablers for the model. For me it keeps the model classes clean and encapsulated better, allows for more consistency if the parser is definitely an external object, and other people can use the same form if they need to add anything bespoke. |
I am in favour of (A) and (1). (Agreed with Jonathan regarding (1).) |
A and 1 for me too. |
I am in favour of option 2. I wouldn't want the errors to be cleared until I said so. |
Depends what you mean by "said so" - for me, asking to validate a different model means you no longer care about errors from a previous model! |
Yes in the main I would agree but we would be restricting the ways in which the library can be used. By requiring a developer to explicitly clear the errors and leaving errors to accumulate you are not restricting them to single passes. Also, a lot of other libraries work this way where errors are not reset until instructed. |
While I still believe errors should be cleared whenever But I still can't see why you'd want to have the added overhead of tracking errors from multiple sources in the same validator instance? |
So then let's just have a call to clear errors at the start of the validate method and we are in a fine position going forward. Sorted. |
In the case of the |
For me that is a second level priority. Let's not concern ourselves with that at this time, that functionality can easily be added once the validator can validate models. |
I agree, getting validation of a model is the top priority. |
Fair enough- I suppose it is difficult to define what a 'valid' state is for an entity outside of its model context anyway. For instance, there is no way to determine if a variable |
How far do we want to go with MathML validation at this stage? I think we should probably try to validate the math string by parsing it with libxml2 + the W3C MathML DTD. That will also give us a something to test the MathML printer/parser with when we move on to integrating a CAS. |
yep - that seems reasonable. It is also needed to be able to easily iterate over |
In a quick discussion at the ABI last week, it was brought up that it would be useful for the
I think it makes sense to add another attribute to our
and "where" the error occurred by allowing the the user to do
Personally, I don't think this needs to be more complicated than (A) unless we still expect the Spec rule ordering to shift substantially. |
(A) would indeed be nice, but I would find it even more useful if libCellML could also provide us with the URL to the full specification rule (unless we could safely determine that URL from |
Just had a quick chat with @nickerso and it sounds like a unique identifier (B) might be the way to go. Then we can have separate methods to convert that identifier to the section title, a URL, or the full rule with a map/dictionary. Before we start adding URLs, we probably want to wait until the Specification's resting place is agreed upon (there is apparently some discussion ongoing on whether to keep it as a GoogleDoc for ease of contribution). But we can start with the |
Another minor issue in terms of implementing our I can see why this was done- we don't want to confuse a user with the differences between If we want to keep
Other possibilities are:
I would personally be fine with (3) as the most expedient option but (1) is probably the most "correct" way to sort this out. |
(3) is not only the most expedient, but also the "correct" way in my view of libCellML. |
Ah OK maybe I was overthinking this then. Is just a case of the more straightforward approach being the correct one? |
I actually agree with that. My bad for originally suggesting an 'improved' version of (A). Otherwise, agreed on (3) too. |
Building on the decision to go with (3)(validation of each |
When I originally saw the Indeed, from what I remember and see in the ... or am I missing something?... |
That was my original thinking behind suggesting (1):
I was trying to make it so we could have all our validation operations within the My thinking with (3) is that (1) might overcomplicate things for what is ultimately one or two methods. But for (3) we also need to get access to a Of course, this is also open to any further suggestions :-) |
Well, (3) is not an option for me anymore since it effectively breaks our agreed approach with regards to the So, maybe the solution consists of making the |
Just thinking about this, I think this is the same issue that is going to crop up when we want to actually do anything with How about adding a method like this: Units::getUnit(int index, std::string& name, double& prefix, double& exponent, double& multiplier, double& offset) const; or maybe: const std::string& Units::getUnitName(int index);
double Units::getUnitPrefix(int index);
double Units::getUnitExponent(int index);
... or both to match the |
@nickerso, I indeed had something like that in mind, but... then the question is whether we want libCellML users to be able to access those methods? I thought the answer was no and that it was the reason the So, if we don't mind libCellML users to have access to those methods, then your suggestion looks fine to me. On the other hand, if we don't want libCellML users to have access to them, then we would have no other choice but to make |
I think @nickerso's I think I personally have a slight preference for the |
I'm now leaning toward users have access to that information, as I think this is the "cleanest" way to expose this information. And if we want to keep the math handling external to libCellML, then we'll likely need access to that information without being able to use |
+1 for |
The current plan is to use the MathML DTD to separately validate our opaque math string. However, when parsing a valid CellML model, we may end up with a situation where the I had a chat with @nickerso and a possible solution might be to just say that we expect a valid math string to be independently valid. During parsing, We would copy any namespaces declared in parent elements into In addition to helping out with validation, this "math as a valid subdocument" requirement will probably also be useful if we want to hang on to units attributes when parsing our math string into a CAS. |
I am confused, are we not removing non-MathML elements from the math string anymore (see here)? If not, then copying the namespaces into
I clearly missed it the first time round, but what does CAS stand for exactly? |
The plan is still to remove non-MathML element to validate the MathML itself, but we also need to validate the units. But to validate the units we need to keep track of the namespace of the units attributes - information that is lost if not contained in the math string. CAS = computational algebra system. |
Edit: this comment is essentially a simultaneous duplicate of @nickerso 's above
That's still the plan but in order to get to our
Sorry that's a Computer Algebra System (see #95) for symbolic manipulation. Current thinking is still leaning toward SymEngine. |
Ok, thanks for the clarification and the meaning of CAS. |
In a meeting at the ABI on Tuesday, there was some discussion about how we might implement validation as the next step in the use-cases doc.
The team here was in favour of (A) setting up a
Validator
object, similar to how we deal with theParser
in the changeset under review in #114 . Consistent with this, it was also suggested that we should probably move our serialisation routines into a separateSerialiser
object (we might want to call thisPrinter
or something else instead), rather than doing serialisation from the objects themselves.For example, a user wishing to check if a model is valid, output it if so or output errors if not might do:
An alternative (B) might be to do validation on entities as we currently do for serialisation, for example:
While B is a bit more concise, the thinking is that it is worth the extra line or two to make our reading, writing, and validation error logging consistent. And
Parser
may need to pick up errors that do not necessarily translate into an Entity (e.g. unrecognised or mismatched XML tags).With the validator, there is also the question of when/if to automatically clear errors. For instance, consider if we call the following validation on two very simple models:
validator
might now contain the following errors regarding the requirement for valid models to have names:validate()
is called.m1
andm2
if we allow validation errors to persist until the user callsvalidator.clearErrors()
m2
only if we remove any previous errors pertaining to the object being validated from thevalidator
and add back any new errors from the updated object.I don't have a strong preference but would probably go with 3.
The text was updated successfully, but these errors were encountered: