Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix empty queries #1783

Merged
merged 4 commits into from
Jan 17, 2020
Merged

fix empty queries #1783

merged 4 commits into from
Jan 17, 2020

Conversation

tonysun83
Copy link
Contributor

@tonysun83 tonysun83 commented Nov 30, 2018

Overview

Mango text indexes currently throw function clauses when queries in two
situations:

  1. Empty Selector
  2. Operators with empty arrays.

We fix this issue and change the behavior as follows:

  1. Any empty selector will fall back on _all_docs (like json indexes).
  2. $or, $and, $in, $all that specify an empty arrays
    will be treated as no-op when combined with other operators.
  3. A single no-nop query such as {"$or": []} will not be executed and
    return nothing just like json indexes.
  4. $nin with an empty array will return all docs that match the field,
    just like json indexes.

Testing recommendations

New tests were added to the python suite and can be run with
MANGO_TEXT_INDEXES=1 nosetests test/21-empty-selector-tests.py

This requires a clouseau setup.

Related Issues or Pull Requests

The issue is this PR.

Checklist

  • Code is written and works correctly;
  • Changes are covered by tests;
  • Documentation reflects the changes;

@willholley
Copy link
Member

I'll defer to others to comment on the Erlang side. Regarding the tests, similar to #1719 it would be good to see tests which verify the results returned are the same with/without an index present.

match({[{<<"$or">>, Args}]}, Value, Cmp) ->
Pred = fun(SubSel) -> match(SubSel, Value, Cmp) end,
lists:any(Pred, Args);

match({[{<<"$not">>, Arg}]}, Value, Cmp) ->
not match(Arg, Value, Cmp);

match({[{<<"$all">>, []}]}, _, _) ->
true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a duplicate clause from L404?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

404 is $and, this is $all

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also of note, a duplicate clause would have triggered a compilation error.

assert len(docs) == 1

def test_empty_arrays_complex(self):
resp = self.db.find({"$or": [], "a": {"$in" : []}}, explain=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"simple" combinations of empty queries will also short-circuit to no execution right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct

@michaelbreslin
Copy link

Is there a fix for this going in anytime soon?

@wohali
Copy link
Member

wohali commented Dec 20, 2018

Hi @michaelbreslin , probably not before the holiday break at this point - but I expect it'll land shortly thereafter. Looks like we may need some more tests before this can be merged.

@tonysun83
Copy link
Contributor Author

@willholley : Your additional testing suggestion revealed a conflicting issue with special/view indexes vs text indexes.

{"$or": []} 
{"$all": []} 
{"$in": []} 
{"$and": []} 

Special/View - We do a full range scan, then filter.
Text Index - The current implementation doesn't execute the query.

I think the solution is to make special/view indexes ignore the query(not execute the fabric call) as well. But it get's a little ugly as the _explain endpoint needs to reflect this as well.

cc @michaelbreslin

@willholley
Copy link
Member

@tonysun83 is the text index behaviour just an optimisation or does it generate an error? If it's just an optimisation then I agree we should be consistent and apply that at a higher level regardless of the index type. If you're proposing generating an error where Mango would previously return an empty result set then I think it needs more discussion as it would be a breaking change.

@tonysun83
Copy link
Contributor Author

@willholley : It doesn't generate an error and will be an improvement. My only concern is that it will break existing functionality. I have a test suite for that so that shouldn't be a problem. The issue I'm currently running into is getting all 3 to produce the same result.

Ex:

1) {"age": 22, "$or": []}

2) {"$or": []}

produce different results in json, _all_docs, and text depending on where part of the code is implemented. The correct solution should be IMO should be

  1. We get all documents back where the age is 22
  2. We get 0 documents back.

for all 3 index types. I'm debugging why it's currently not the case

@tonysun83
Copy link
Contributor Author

Full regression + new tests passed. I kept the existing functionality intact.

Copy link
Member

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@janl
Copy link
Member

janl commented Jan 5, 2020

After fixing the linter errors and allowing for CI flakes in view_compaction.js, this now fails with an error in the new tests:

Traceback (most recent call last):
8269  File "/home/travis/build/apache/couchdb/src/mango/test/21-empty-selector-tests.py", line 42, in test_empty_array_and_with_age
8270    self.assertEqual(resp["index"]["type"], klass.INDEX_TYPE)
8271AssertionError: 'special' != 'text'
8272- special
8273+ text

@tonysun83 do you have time to look into this?

@janl janl added this to the 3.0.0 milestone Jan 5, 2020
@jiangphcn
Copy link
Contributor

I would like to continue, and found that export MANGO_TEXT_INDEXES=1 is helpful after some investigation. Will add new commit.

tonysun83 and others added 3 commits January 16, 2020 18:30
Mango text indexes currently throw function clauses when queries in two
situations:

1) Empty Selector
2) Operators with empty arrays.

We fix this issue and change the behavior as follows:

1) Any empty selector will fall back on _all_docs (like json indexes).
2) $or, $and, $in, $all that specify an empty arrays
will be treated as no-op when combined with other operators.
3) A single no-nop query such as {"$or": []} will not be executed and
return nothing just like json indexes.
4) $nin with an empty array will return all docs that match the field,
just like json indexes.
@tonysun83 tonysun83 merged commit 018ec6b into master Jan 17, 2020
@nickva nickva deleted the fix-empty-selector-issues branch January 29, 2020 15:32
tonysun83 added a commit that referenced this pull request Apr 21, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 21, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 21, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 21, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 21, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 22, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
tonysun83 added a commit that referenced this pull request Apr 22, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.
wohali added a commit that referenced this pull request Apr 22, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.

Co-authored-by: Joan Touzet <[email protected]>
wohali added a commit that referenced this pull request Apr 22, 2020
Previously, in #1783, the logic
was wrong in relation to how certain operators interacted with empty
arrays. We modify this logic to make it such that:

{"foo":"bar", "bar":{"$in":[]}}
and
{"foo":"bar", "bar":{"$all":[]}}

should return 0 results.

Co-authored-by: Joan Touzet <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants