Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to allow type coercion/cast/type conversion. #95

Closed
darrin opened this issue Jan 5, 2016 · 41 comments
Closed

Option to allow type coercion/cast/type conversion. #95

darrin opened this issue Jan 5, 2016 · 41 comments
Labels

Comments

@darrin
Copy link

darrin commented Jan 5, 2016

Hi there -

Thanks so much for writing this fantastic validator!

I'm using it for POSTed forms with checkboxes and other cases where basically I'd love to support conversion to a different type from a string and (as long as conversion from string works) changing the object in place, allowing validation to continue without errors.

I realize this is a corruption of pure validation and there would be a performance hit but I suspect there are others out there that would like coercion to work as well. An option to 'allowTypeCoercion' or 'laxMode' could allow transformation of primitives like integers, floats and boolean when a 'string' is found.

In the absence of support for this - I'm doing validation of the request.body with string types - and then if I need them manually converting each string to a primitive or other type. Beyond requiring additional code in which there's opportunity for inconsistencies with the schema - errors are found further downstream which we need to ripple upward. Support for this via an attribute would eliminate a bunch of redundant code...

It looks like the best I can do currently is to specify a string 'format' to try to enforce a numeric value for example or perhaps an enum to enforce the string 'true', 'false' in the case of a boolean. I might be able to use keywords as described in your documentation to try to make this more consistent.

Am I missing something that would allow conversion or support this in a more elegant way? Are there others interested in this sort of thing?

Thanks in advance,
-Darrin

@epoberezkin
Copy link
Member

Thanks.

It can be done.

You could try using custom keyword (say coerceType), but custom keywords are validated in the end, after all other keywords, so the type coercion it could do would be too late. So to do it with a custom keyword, there should be some option in keyword definition to add it to the beginning rather than to the end of keywords list... I don't like it to be honest.

The option coerceTypes could be implemented quite easily. Type validation is in this file: https://github.com/epoberezkin/ajv/blob/master/lib/dot/validate.jst . You have to add additional code to compiled function to coerce type if the option and type keyword are present (plus remove the code that validates types without coercion, plus tests :). laxMode would be much more problematic as it would have to affect most keywords, so probably it is not a good idea. PR for coerceTypes is welcome.

@Santinell
Copy link

+1 for this feature. if I have enough time - I can try to create PR.

@epoberezkin
Copy link
Member

Much appreciated, good luck. Any questions - feel free to ask.

@Santinell
Copy link

When i try to run tests - i get:

mocha spec/*.spec.js -R spec

module.js:340
throw err;
^

Error: Cannot find module './JSON-Schema-Test-Suite/remotes/integer.json'

@epoberezkin
Copy link
Member

You need git submodule update --init

@Santinell
Copy link

Thanks.
I have some error on changind data:
Cannot assign to read only property 'typeId' of 12

schema:

...
typeId: {type: 'integer'},
...

data:

...
typeId: '12',
...

Code

{{? $typeSchema && !$typeChecked  }}
  {{
    var $schemaPath = it.schemaPath + '.type'
      , $errSchemaPath = it.errSchemaPath + '/type'
      , $isArray = Array.isArray($typeSchema)
      , $method = $isArray ? 'checkDataTypes' : 'checkDataType'
      , $propertyKey = it.util.getKey(it.schemaPath)
      , $passData = $data + it.util.getProperty($propertyKey);
  }}

  if ({{= it.util[$method]($typeSchema, $data, true) }}) {
    {{? it.opts.coerceTypes && $propertyKey !==  undefined && {{=$passData}} !== undefined }}
      {{= $passData}} = {{= it.util.coerceType($typeSchema, $passData)}};
    {{??}}
      {{# def.error:'type' }}
    {{?}}
  }
{{?}}

@epoberezkin
Copy link
Member

I don't see all the code/error stack. Is it compile time or validation time, what is the line/code that triggers the error?

What I see wrong in the snippet (not related to the error) is that it assumes it is always possible to coerce type. But depending on data type it may not be possible to coerce it and if it isn't it should still create type error, so in case you have it.opts.coerceType option the decision whether to create error is made at validation time.

Also you won't be able to coerce type on the top level of data as there is no parent data to update, so in case it's top level it should simply fail validation .

But I guess it's work in progress...

@Santinell
Copy link

Full error message:

undefined:1
 validate = function (data, dataPath) { 'use strict'; var vErrors = null;  var errors = 0;      if ((data && typeof data === "object" && !Array.isArray(data))) {   var errs__0 = errors;var valid1 = true; for (var key0 in data) {  var isAdditional0 = !(false  || key0 == 'name'  || key0 == 'typeId'  || key0 == 'checked'  ); if (isAdditional0) {  delete data[key0];  }  }   var data1 = data.name;  if (data1 !== undefined) {   var errs_1 = errors; if (typeof data1 !== "string") {  data1.name = data1.name;  } var valid1 = errors === errs_1; }  var data1 = data.typeId;  if (data1 !== undefined) {   var errs_1 = errors; if ((typeof data1 !== "number" || (data1 % 1))) {  data1.typeId = parseInt(data1.typeId);  } var valid1 = errors === errs_1; }  var data1 = data.checked;  if (data1 !== undefined) {   var errs_1 = errors; if (typeof data1 !== "boolean") {  data1.checked = Boolean(data1.checked);  } var valid1 = errors === errs_1; }   }  else {  var err =  { keyword: 'type' , dataPath: (dataPath || '') + 

TypeError: Cannot assign to read only property 'typeId' of 12
    at validate (eval at localCompile (/home/pavel/Documents/ajv/lib/compile/index.js:75:12), <anonymous>:1:685)
    at Object.<anonymous> (/home/pavel/Documents/ajv/test.js:35:13)
    at Module._compile (module.js:435:26)
    at Object.Module._extensions..js (module.js:442:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:313:12)
    at Function.Module.runMain (module.js:467:10)
    at startup (node.js:136:18)
    at node.js:963:3

@Santinell
Copy link

changed file validate.jst from line 104 to 115
And also i don't know how use coercion only for user schemas, and don't coerce data of internal schemas...

@Santinell
Copy link

Looks like {{= $passData}} evaluetes to value of field, not to field name as i thinked...
dotJS is very hardcore ^_^

@epoberezkin
Copy link
Member

:)

@Santinell
Copy link

How get variable name to rewrite it value?
for example on root level for simple fields that simple:

$propertyKey = it.schemaPath.split('.').pop();
var field = 'data.'+$propertyKey;

but when i use array or object...
Or may be i do something wrong?

@epoberezkin
Copy link
Member

you need a previous level data + current property which is something like {{='data' + ((it.dataLvl-1)||'') + it.util.getProperty(it.dataPathArr[it.dataPathArr.length-1]) }} (assuming you are in validate.jst).

On the root level (Edited: it.dataLvl is 0) you simply can't do it (or maybe you can but the caller won't see the change, although maybe it's fine - it would still pass validation. In this case it's just data).

@darrin
Copy link
Author

darrin commented Jan 14, 2016

The original idea was indeed to replace the string property with the correctly typed value…

On Jan 14, 2016, at 7:51 AM, Evgeny Poberezkin [email protected] wrote:

you need a previous level data + current property which is something like {{='data' + ((it.dataLvl-1)||'') + it.util.getProperty(it.dataPathArr[it.dataPathArr.length-1]) }} (assuming you are in validate.jst).

On the root level ($top is true) you simply can't do it (or maybe you can but the caller won't see the change, although maybe it's fine - it would still pass validation. In this case it's just data).


Reply to this email directly or view it on GitHub #95 (comment).

@epoberezkin
Copy link
Member

not sure I follow

@darrin
Copy link
Author

darrin commented Jan 14, 2016

I'm commenting on this:

maybe it's fine - it would still pass validation. In this case it's just data

What I had in mind was that values either match the schema or need to be coerced to match the schema to pass validation. If it cannot be coerced for whatever reason it shouldn't pass validation. Just my $0.02.

It sounds like another way of getting around the level 0 shortcoming you're referring to above might be to shift everything down one level - e.g. wrap the current level 0 in an object. This would no doubt require a bigger change but should allow you to coerce level 0 properties. Just thinking aloud.

@epoberezkin
Copy link
Member

level 0 is the data passed to validate function. For example the schema is { type: 'number' } and you call var data = '0'; validate(data) that should pass in case of coercion, but the caller's variable won't be updated.

If data passed to validate is an object then its properties are level 1 already.

@epoberezkin
Copy link
Member

Btw, maybe you should be able to coerce array to object (together with length property), and object to array (if all properties are strings of numbers). But again, if it's a top level object/array, the caller won't see the change. But maybe it's overkill...

@darrin
Copy link
Author

darrin commented Jan 14, 2016

For example the schema is { type: 'number' } and you call var data = '0'; validate(data) 
that should pass in case of coercion, but the caller's variable won't be updated.

Thanks @epoberezkin that case makes perfect sense. However I would want validation to fail bc we cannot change the data we cannot coerce to make the input match the schema.

function increment (data) {
   if (validate(data)) {
       return data + 1;
   } else {
       throw new Error("Cannot increment " + data);
   }
}

Another thought occurs to me as well... just as we have 'validate.errors' we could build a coerced object that might be accessible in 'validate.coerced' or 'validate.data'. In this way we could support coercion of primitives as well as objects as you suggest (at level-0)... but it would have to be understood that for primitives the top level value cannot change. Then that actual increment line of code would look like:

           return validate.data + 1;

I don't love it by the way... but it might be a clever work-around if people needed coercion for level 0 primitives. Thoughts?


maybe you should be able to coerce array to object (together with length property), and object to array

I definitely agree it could be used beyond strings... it just felt like the simplest 90% solution to my problem. By the way - my use case doesn't require the level-0 stuff you're talking about above. I'm perfectly happy to always pass in objects. I'm just continuing the discussion bc I realize this only makes sense for you if it's a clean general solution.

@epoberezkin
Copy link
Member

Ok, thanks

I think collecting "coercions" is a bit overkill, too much code for something of limited use... Top level is a theoretic case, usually objects/arrays are validated, but for consistency it is better to coerce it even though it won't be returned to caller. Also easier to test coercions.

object <-> array coercion is also a bit too much.

What should be coerced I think should be similar (but not exactly the same) to what JS coerces:

  • boolean <-> number/integer (false <-> 0, true <-> 1, not 0 -> true)
  • boolean <-> string ('false' <-> false, 'true' <-> true, 'everything else' -> false)
  • boolean <-> null (false <-> null, true should fail to coerce to null)
  • string <-> number (string should fail to coerce if it is not a number)
  • string <-> integer (string should fail to coerce if it is not an integer)
  • string <-> null ('null' <-> null, '' -> null, other not empty strings should fail to coerce to null)
  • number/integer <-> null (0 <-> null, not 0 should fail to coerce to null)

@epoberezkin
Copy link
Member

As usual, everything is more complex than it seems on the surface

@darrin
Copy link
Author

darrin commented Jan 14, 2016

@epoberezkin +1 - Yes - you've nailed it.

The one thing that felt like we might disagree on was that I'd suggest that we don't allow coercion at the top level to indicate a pass. Rationale: the framework cannot 'correct'/coerce primitives (level-0) - therefore 'validate(data)' should be false unless type is already correct if any level-0 properties are of the wrong type. Comments on that bit?

@darrin
Copy link
Author

darrin commented Jan 14, 2016

Actually on second read - a couple questions:

string <-> null ('null' <-> null, '' -> null, other not empty strings should fail to coerce to null)

Why bother with this one? Is null a type or a value? Does it hold special significance in the framework? IMHO null is a valid value for any type - though I guess I'm not sure how this is handled in the JSON schema... so maybe this is a stupid question.

number/integer <-> null (0 <-> null, not 0 should fail to coerce to null)

I do like the idea of '' <-> null.
BUT 0 is clearly NOT null rather '0' --> 0... right?

@epoberezkin
Copy link
Member

"null" is a type in JSON-schema standard, so it can be coerced to if the schema has { type: "null" }

By the way, the whole thing gets complicated by the fact that there can be multiple types in type keyword... So I guess it should try to coerce in order, and be happy with the first that works.

So null value won't be valid for any type of data, unless you use, e.g., { type: ["string", "null"] }

And only coerce if it matches none of the types... omg

@darrin
Copy link
Author

darrin commented Jan 14, 2016

Gotcha wrt null... I still think 0 is not null... That said - I could see the case for '' - that's what I'd be faced with using this with a form most likely.

I think you're right... coerce in order sounds like a very reasonable approach.

@epoberezkin
Copy link
Member

JavaScript thinks it shouldn't coerce... English thinks it's the same... I think it's either nothing should coerce to null (like in JavaScript) or all of the above.

But if you try to stay closer to JavaScript then "false" shouldn't coerce to false... Or maybe coerceType option should have values true / "natural" and "JS" :) ... That's a Pandora box that coercion.

@darrin
Copy link
Author

darrin commented Jan 14, 2016

I'm looking at this through the lens of using it in web forms where I'll primarily be getting string data back. In that case - for the most part I'll be dealing with strings. This feature will be of no use to me without the basic string conversions you've laid out for boolean, number, integer and primitives in general.

And within that context

 string<->null: '' <-> null - makes perfect sense to me. 

I struggle a bit more with this in a numeric sense:

 integer<->null: 0<->null 

It's self-centered but bc I don't see a use case for it... I simply wouldn't implement it.

As mentioned I could see many more use cases but I would forestall implementing those types of things until there was a clear story for them.

@Santinell
Copy link

how we can coerce array?
array can have several types in items...

@epoberezkin
Copy link
Member

Please don't coerce array. validate.jst is used to generate code for any internal schema, if you mean types there.

@Santinell
Copy link

Yes, i mean some schema like that:

{
  type: 'object',
  properties: {
    arr: {
      type: 'array',
      items: [{type: 'integer'}]
    }
  }
}

We can coerce if array have one type of values. but can't if several

{
  arr: ['1', '2', '3']
}

->

{
  arr: [1, 2, 3]
}

@epoberezkin
Copy link
Member

validate.jst generates code for all subschemas, why do you need to worry about arrays in particular?

@Santinell
Copy link

Because in arrays order of types is not guaranteed
for schema

type: 'array',
items: [{type: 'integer'},{type: 'boolean'},{type: 'string'}]

data can be

[1,'false',true,6,'12', false, '0']

and we don't know how cast this types

@epoberezkin
Copy link
Member

If the schema is as above, the first subschema applies to the 1st item only, and so on. validate.jst will be executed once for each subschema to generate code for each of them.

So for the first item it has to cast to integer, but data matches already; for the second - from string to boolean, for the third - from boolean to string.

So what is the problem?

@Santinell
Copy link

My mistake - i thinked that array in types is equal items: { oneOf: [...] }

@epoberezkin
Copy link
Member

The problem of multiple types exists though if types keyword is an array. In this case you have to match if data matches any of them and if not try to coerce in order (see above).

oneOf issue I would simply disregard, if each schema in oneOf has type there potentially will be multiple coercions but given that they are all reversible I don't see it as a problem. Although depending on coercion rules string value may change after it's coerced to boolean and back for example...

@darrin oneOf/etc. with different types in subschemas is another reason (in addition to having a generic solution) to implement coercions for all scalar types and not just for strings. And probably they should be one-to-one relationships so that coercing to any type and back does not change the value.

@Santinell
Copy link

Looks like I overestimated myself. It was too difficult for me...

@epoberezkin
Copy link
Member

No problem.

@epoberezkin
Copy link
Member

So I'll do it then

@epoberezkin
Copy link
Member

In 3.4.0

@Santinell
Copy link

Super! You are awesome!

@epoberezkin
Copy link
Member

:)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants