Significant performance hit using dereferenced schemas to validate #216

Closed
mann-david opened this issue Feb 11, 2021 · 4 comments

@mann-david

Version: 9.0.7

I have some fairly basic schemas: one top-level schema that references about a dozen sub-schemas based on different conditions in the files being validated. Running my test suite on the original schemas completes in <1 second. If I run the exact same tests against a consolidated/dereferenced copy of the schemas, it takes 30+ seconds. All of my tests still pass, it just takes a long time.

I'm generating the dereferenced file like this:

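const fs = require("fs");
// Assumes the @apidevtools/json-schema-ref-parser package (the 9.x package name).
const $RefParser = require("@apidevtools/json-schema-ref-parser");

// Dereference the top-level schema and write the consolidated copy to disk.
$RefParser.dereference(inFile, (err, schema) => {
    if (err) {
        console.error(err);
        return;
    }
    // Report write failures instead of silently ignoring them.
    fs.writeFile(outFile, JSON.stringify(schema), (writeErr) => {
        if (writeErr) console.error(writeErr);
    });
});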

inFile is the top-level schema which references the sub-schemas. I've looked at outFile - the consolidated schema - and everything looks right.

I've also tried using $RefParser.bundle instead of $RefParser.dereference but that throws other errors.

The consolidated schema is here: https://gist.github.com/mann-david/818a84405b4a122dd086f26368352a46

And a test file for validation: https://gist.github.com/mann-david/38817ffca42e38fa941c825f9e5859c1

Looking at some timings output by my test suite, the first file that is validated takes 99.9% of the time (i.e. in the last run, it took 37.2 seconds and the whole suite ran in just over 38 seconds). The remaining ~25 files are each validated in milliseconds.

I'm also looking into the validation library I'm using (Manatee.Json) to see if the issue could be there. I've confirmed that it is not in the load time of either the schema file or test file - it is strictly when I call validate which leads me to believe that it is something in the consolidated schema file that is causing the problem.
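One way to confirm that the cost sits in the first validate call rather than in every call is to time each call on its own. The sketch below is only an illustration in JavaScript: `timeEach` and `makeLazyValidator` are hypothetical names, and the validator is a stand-in that does one-time setup on first use. It is not Manatee.Json and does not reproduce its behaviour.

```javascript
const { performance } = require("perf_hooks");

// Time each call separately so a one-off first-call cost stands out
// from the steady-state cost of later calls.
function timeEach(validate, inputs) {
  return inputs.map((input, index) => {
    const start = performance.now();
    const result = validate(input);
    return { index, ms: performance.now() - start, result };
  });
}

// Hypothetical stand-in validator: expensive setup happens once, on first use.
function makeLazyValidator() {
  let prepared = null;
  return (input) => {
    if (prepared === null) {
      // Simulate one-time work such as walking and caching a large schema.
      let acc = 0;
      for (let i = 0; i < 5e6; i++) acc += i % 7;
      prepared = { required: ["id"], acc };
    }
    return prepared.required.every((key) => key in input);
  };
}

const timings = timeEach(makeLazyValidator(), [{ id: 1 }, { id: 2 }, {}]);
for (const { index, ms, result } of timings) {
  console.log(`input ${index}: valid=${result} in ${ms.toFixed(2)} ms`);
}
```

If the first entry dominates the same way against the real validator, the time is probably going into one-time schema processing rather than into checking each file.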

Any ideas?
Thanks.

@philsturgeon
Member

I would expect a dereferenced schema to be much slower for validation. You're making a very, very large file full of duplicate information simply to avoid having any $refs.

I see you tried to use bundle, which would create better results, but found some errors. Could you share those errors so we can get you using bundle?
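To make the duplication concrete, here is a minimal sketch that inlines local `$ref`s by hand and compares the serialized sizes. It assumes a hypothetical schema in which a dozen properties point at one shared definition. `resolvePointer`, `inlineRefs`, and `buildSchema` are illustrative helpers, not part of json-schema-ref-parser, and the inliner has no cycle detection.

```javascript
// Resolve a local JSON Pointer such as "#/definitions/person" against root.
function resolvePointer(root, ref) {
  if (ref === "#") return root;
  return ref
    .slice(2) // drop the leading "#/"
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, key) => (node == null ? undefined : node[key]), root);
}

// Naively replace every local $ref with a copy of its target.
// No cycle detection: a sketch for acyclic schemas only.
function inlineRefs(node, root) {
  if (Array.isArray(node)) return node.map((item) => inlineRefs(item, root));
  if (node === null || typeof node !== "object") return node;
  if (typeof node.$ref === "string") {
    return inlineRefs(resolvePointer(root, node.$ref), root);
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    out[key] = inlineRefs(value, root);
  }
  return out;
}

// Hypothetical schema: one shared definition referenced from many properties.
function buildSchema(refCount) {
  const properties = {};
  for (let i = 0; i < refCount; i++) {
    properties[`owner${i}`] = { $ref: "#/definitions/person" };
  }
  return {
    definitions: {
      person: {
        type: "object",
        properties: {
          id: { type: "string" },
          name: { type: "string" },
          email: { type: "string" },
        },
        required: ["id", "name"],
      },
    },
    properties,
  };
}

const withRefs = buildSchema(12);
const inlined = inlineRefs(withRefs, withRefs);
const sizeWithRefs = JSON.stringify(withRefs).length;
const sizeInlined = JSON.stringify(inlined).length;
console.log({ sizeWithRefs, sizeInlined });
```

Every reference becomes a full copy of the shared definition, so the inlined size grows with the number of references and compounds when definitions nest.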

@mann-david
Author

Thanks for the reply. If bundle is going to be faster then I'll go that way.

The problems are all similar to Cannot resolve schema referenced at '#/allOf/0/then/properties/payload/allOf/0/then/properties/details/properties/owners/items'.

It looks like what is happening is that one $ref is pointing to an example like the above, but if you follow the path, that referenced schema is an array and the items property points to another $ref, for example "items": { "$ref": "#/properties/requestorId" }
Is this "double hop" a problem?
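For illustration, the "double hop" can be sketched as a pointer lookup that keeps following `$ref` until it lands on a schema that is not itself a reference. The document shape and the `resolveChain` helper are hypothetical. This only shows what a resolver has to do in general and says nothing about how Manatee.Json or `$RefParser.bundle` behave.

```javascript
// Resolve a local JSON Pointer such as "#/properties/owners/items" against root.
function resolvePointer(root, ref) {
  if (ref === "#") return root;
  return ref
    .slice(2) // drop the leading "#/"
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"))
    .reduce((node, key) => (node == null ? undefined : node[key]), root);
}

// Follow a chain of $refs until a non-reference schema is reached.
// Records every hop and throws on unresolved or circular chains.
function resolveChain(root, ref) {
  const seen = new Set();
  const hops = [];
  let current = ref;
  while (true) {
    if (seen.has(current)) throw new Error(`Circular $ref chain at ${current}`);
    seen.add(current);
    hops.push(current);
    const target = resolvePointer(root, current);
    if (target === undefined) throw new Error(`Cannot resolve ${current}`);
    if (target !== null && typeof target === "object" && typeof target.$ref === "string") {
      current = target.$ref; // the target is itself a reference: hop again
      continue;
    }
    return { schema: target, hops };
  }
}

// Hypothetical document: the referenced location is itself a $ref.
const doc = {
  properties: {
    requestorId: { type: "string" },
    owners: { type: "array", items: { $ref: "#/properties/requestorId" } },
    approver: { $ref: "#/properties/owners/items" },
  },
};

const result = resolveChain(doc, doc.properties.approver.$ref);
console.log(result.hops, result.schema);
```

A resolver that stops after the first hop would return the `{ $ref: ... }` object instead of the string schema, which is one possible way a chained reference could surface as an unresolved-schema error.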

@bmogensen

I have the same problem with bundled schemas. I bundle my “raw” schemas but AJV in latest version cannot compile them. The AJV CLI 3.3.0 is able to compile though.

@philsturgeon
Member

@mann-david could you create a brand new issue for the problem you're describing? Please focus on "here's the spec I have, here's the command I am running, this is what's happening" etc.

I worry that issues like this one become too meta and expand massively, with everyone who has a performance issue coming in to see if this is the cause of their specific issue. We already have #211, which is addressing a performance regression introduced in 9.0.7 specifically. That change made things quicker for most people but had some sucky effects for some edge cases.

That might be your dereference problem, and we can look at your bundle problem in another issue.


3 participants