adds operators $sizeLte and $sizeGte #3582

Closed
wants to merge 941 commits into from

Conversation

hklarner

Overview

  • supports inequality matches for $size
    • example: match documents with between 10 and 20 customers in an array
      • "doc.customers": {"$sizeGte": 10}
      • "doc.customers": {"$sizeLte": 20}

Testing recommendations

Tests are added to

  • src/mango/test/03-operator-test.py

Related Issues or Pull Requests

None

Checklist

jjrodrig and others added 30 commits September 11, 2020 09:07
We need to call StartFun as it might add headers, etc.
This fixes a94e693 because a race
condition existed where the 'DOWN' message could be received
before the compactor pid is spawned. Adding a synchronous call to
get the compactor pid guarantees that the couch_db_updater process
has finished handling finish_compaction.
This implementation was difficult to understand and had behavior that
was too difficult to predict. It would break if view behavior changed in
significant ways from what was originally expected.
The couch_views_batch module is responsible for sensing the largest
batch sizes that can be successfully processed for a given indexer
process. It works by initially searching for the maximum number of
documents that can be included in a batch. Once this threshold is found,
it slowly increases the batch size again and backs off whenever the
limit is hit.

This approach maximises batch sizes while staying reactive to the case where
a larger batch would cross the FoundationDB transaction limits, which would
cause the entire batch to be aborted and retried, wasting time during view
builds.
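
For illustration, the search/back-off loop described above boils down to something like the following sketch (module name, state shape, and increments are made up, not the actual couch_views_batch code):

```
%% Illustrative sketch: search quickly for the maximum batch size, then creep
%% upward slowly and halve whenever a batch hits the transaction limit again.
-module(batch_size_sketch).
-export([update/2]).

-define(SEARCH_INCREMENT, 100). % fast growth while searching for the limit
-define(CREEP_INCREMENT, 10).   % slow growth once the limit has been found

%% update({Mode, Size}, Outcome) -> {NextMode, NextSize}
update({search, Size}, success) -> {search, Size + ?SEARCH_INCREMENT};
update({search, Size}, failure) -> {creep, max(1, Size div 2)};
update({creep, Size}, success)  -> {creep, Size + ?CREEP_INCREMENT};
update({creep, Size}, failure)  -> {creep, max(1, Size div 2)}.
```
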
This test doesn't fail correctly any longer. Rather than attempting to
create a new pathological view case I've just moved it to eunit where we
can use meck to throw errors directly.
`after_db_create/2` and `after_db_delete/2` are called when databases are created and
deleted respectively. The callbacks are called with both the database name and
the database instance UUID values.
Previously the data was read from the parser inside the transaction. If the
transaction had to retry, for example because of a conflict, the parser would
have been drained and exited, resulting in the request failing with a 500
"mp_parser noproc" error.

Since FDB cannot handle transactions larger than 10MB, opt to read the
attachment data into memory first, before the transaction starts.
Also, ensure to use the same options as other couch apps: force_utf8 and
dedupe_keys.
fold_jobs/4 is a generalized folding API which can be used to process all the
jobs of a particular type.

get_pending_count/2,3 can be used to get the count of pending jobs. It takes
the same options as accept, including a limit and `max_sched_time`. Just like
accept it reads the range of pending jobs as a snapshot to avoid generating
read conflicts.
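
As a hypothetical usage sketch (the callback shape below is an assumption -- a fun receiving the transaction, job id, job state, job data, and accumulator -- so check couch_jobs itself for the real spec):

```
%% Assumed callback shape; illustrative only.
all_job_ids(Tx, Type) ->
    couch_jobs:fold_jobs(Tx, Type, fun(_JTx, JobId, _JobState, _Data, Acc) ->
        [JobId | Acc]
    end, []).
```
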
These modules are not used by the new replicator.
 * Remove unused functions and some functions used only in one place, like `sum_stats/2`.

 * Update time functions to use the more modern `erlang:system_time/1` API.

 * `parse_int_param/5` and `parse_replication_states/1` were moved from the old
    _httpd_util module as they were the only ones needed from there.

 * `default_headers_map/0` Used to get the default httpd record headers as a map,
   since part of the replication data will be kept as a map object.

 * `proplist_options/1` Some parts of the replicator, like _httpc and _api_wrap,
   still use proplist options, so this function can be used to translate
   option maps to a proplist version.
This module is responsible for parsing either an HTTP `_replicate` request
body or a _replicator doc into an internal `Rep` object (an Erlang map).

`parse_transient_rep/2` parses _replicate requests. It also handles
cancellations, where request bodies look like ```{"id": ..., "cancel": true}```
instead of having all the expected parameters.

`parse_rep_doc/1` parses _replicator docs.

Parsing consists of 3 main parts:

 - Parsing the endpoint definitions: source and target url, headers, TLS bits
   and proxies

 - Parsing options into an options map, possibly using defaults from config
   parameters

 - Parsing socket parameters. These now have a hard-coded allow-list, as opposed
   to accepting all possible Erlang socket options.
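
For illustration, an allow-list filter of that kind can be as simple as the sketch below (the option names here are examples, not the replicator's actual allow-list):

```
%% Illustrative only: drop any socket option whose key is not allow-listed.
-define(ALLOWED_SOCK_OPTS, [keepalive, nodelay, recbuf, sndbuf, priority]).

filter_sock_opts(Opts) ->
    [KV || {Key, _} = KV <- Opts, lists:member(Key, ?ALLOWED_SOCK_OPTS)].
```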

The parsing functions also double as validation functions, which get called from
the _replicator's before_doc_update callback when users update replication
documents. Users get immediate feedback if their replication document
is malformed.

Everything is turned into a map object. This object should be able to be
serialized to and de-serialized from JSON.

Since maps are used, add the definitions of some common fields to
couch_replicator.hrl. Mistyping them should raise a compiler error.
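
For illustration, such shared definitions typically look like the following (the field names are examples rather than the literal contents of couch_replicator.hrl):

```
%% Example field-name macros; a typo such as ?SOURCEE fails at compile time,
%% whereas a mistyped binary literal like <<"sourcee">> would only fail at
%% runtime.
-define(SOURCE, <<"source">>).
-define(TARGET, <<"target">>).
-define(OPTIONS, <<"options">>).
```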

couch_replicator_docs lost all of its parsing functions and also the functions which
update intermediate replication doc states (triggered and error). It still
contains the functions which relate to interacting with _replicator docs.
The goal is to keep everything below the _api_wrap module level relatively
intact. To achieve that, option maps are handled in some places, or translated
back to proplists or `#httpd{}` records in others.

The `couch_replicator_api:db_from_json/1` function is where endpoint map objects
from a `Rep` object are translated into `#httpdb{}` records. Headers are
translated back to lists and ibrowse options into proplists with atom keys.
This module is responsible for calculating replication IDs. It inspects all the
replication options which may affect the replication results and hashes them
into a single ID. The CouchDB replicator tries to maintain compatibility with
older versions of itself, so it keeps track of how to calculate the replication
IDs used by previous versions of CouchDB. Replication ID calculation algorithms
have their own version; the latest one is version 4.

One of the goals of this update is to not alter the replication ID algorithm
and keep it at version 4, such that for all the same parameters the replication
IDs should stay the same as they would be on CouchDB <= 3.x. That is why in
some internal functions, options maps and binaries are turned back into proplists
and tuples before hashing is performed. There is a unit test which asserts
that the replication ID calculated with this update matches what was calculated
in CouchDB 3.x.
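
The conversion matters because `term_to_binary/1` of a map is not byte-identical to that of the equivalent proplist, so hashing the map form would silently change every replication ID. A quick shell check (illustrative only; the replicator's actual hashing call may differ):

```
%% The two hashes differ, which is why options are converted back to the
%% 3.x-era proplist/tuple shapes before hashing.
Opts = [{<<"continuous">>, true}],
erlang:md5(term_to_binary(Opts)) =:= erlang:md5(term_to_binary(maps:from_list(Opts))).
%% => false
```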

The internal representation of the replication ID has changed slightly. Previously
it was represented by a tuple of `{BaseId, ExtId}`, where `BaseId` was the ID
without any options such as `continuous` or `create_target`, and `ExtId` was
the concatenated list of those options. In most cases it was useful to operate
on the full ID and only in a few places was the `BaseId` needed. So the
calculation function was updated to return `{RepId, BaseId}` instead. `RepId`
is a binary that is the full replication ID (base + extensions) and `BaseId` is
just the base.

The function which calculates the base ID was renamed from `replication_id/2`
to `base_id/2`.

Another update to the module is a function which calculates replication job
IDs. A `JobId` is used to identify replication jobs in the `couch_jobs` API. A
`JobId`, unlike a `RepId`, never requires making a network round-trip to
calculate. For replications created from `_replicator` docs, `JobId` is defined
as the concatenation of the database instance UUID and the document ID. For
transient jobs it is calculated by hashing the source and target endpoint
parameters and the replication options. In fact, it is almost the same as a
replication ID, with one important difference: the filter design doc name
and function name are used instead of the contents of the filter from the
source, so no network round-trip is necessary to calculate it.
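
A simplified sketch of the two `JobId` flavours (a hypothetical helper; the real logic is part of the module described above):

```
%% Illustrative only: _replicator doc jobs get "UUID ++ DocId", transient jobs
%% get a hash of endpoints and options (filter names only, no filter fetch).
job_id_sketch({db_doc, DbUUID, DocId}) ->
    <<DbUUID/binary, DocId/binary>>;
job_id_sketch({transient, Source, Target, Options}) ->
    couch_util:to_hex(erlang:md5(term_to_binary({Source, Target, Options}))).
```
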
This is the `couch_jobs` abstraction module. All replicator calls to
`couch_jobs` should go through it. This module takes care of adding types to
some of the API calls, handles maintenance of the RepId -> JobId mappings when
jobs are added and removed, and implements some subscription logic.

The `fabric2.hrl` include file is updated with the definition of the
`?REPLICATION_IDS` prefix, where the RepId -> JobId keyspace lives.
The frontend is the part responsible for parsing replication parameters and
creating replication jobs. Most of that happens in the `couch_replicator`
module.

 - `couch_replicator:replicate/2` is the main API for creating transient
   replication jobs.

 - Replication jobs from `_replicator` documents are updated from the
   `couch_replicator:after_*` callbacks. `after_db_create/2`, besides being
   called on db creation, also gets called when a database is undeleted, and
   the `add_jobs_from_db/1` function will attempt to parse all of its replicator docs.

`couch_replicator` exports the monitoring functions `docs/2,3` and `jobs/0,1`. Those
get called from the HTTP handlers for `_scheduler/docs` and `_scheduler/jobs`
respectively.

For hands-on remsh access there are some debugging functions such as:

 - rescan_jobs/0,1 : Simulates a db being re-created so all the jobs are added
 - reenqueue_jobs/0,1 : Deletes all the jobs from a db then re-adds them
 - remove_jobs/0 : Removes all the replication jobs
 - get_job_ids/0 : Read the RepId -> JobId mapping area
Backend replicator modules execute replication jobs. The two main modules
responsible for job management and execution are `couch_replicator_job` and
`couch_replicator_job_server`.

`couch_replicator_job`

 - Is the main process of each replication job. When this process starts, it
   waits in the `couch_jobs:accept/2` call. This may take an indefinite amount
   of time. A spawned `couch_replicator_job` waiting in accept like that is
   internally called an "acceptor". The main pattern of execution is that
   multiple acceptors are started, and after some of them accept jobs, they
   become "workers".

 - After it accepts a job, it parses the `couch_jobs` job data, which contains
   the `Rep` object, and calculates the replication ID from it. Replication ID
   calculation may involve making a call to the source endpoint in order to
   fetch the contents of the javascript filter. Then, the `Rep` object and the
   replication ID are used to construct the internal `#rep_state{}` state record
   of the `gen_server`.

 - Multiple replication jobs may end up trying to run the same replication
   (with the same replication ID) concurrently. To manage these types of
   collisions, the `check_ownership/3` function is called to determine if the
   current job is the correct `owner` of that replication. If it is not, the
   job may fail and exit.

 - There is a periodic checkpoint timer which sends a `checkpoint` message. The
   checkpoint frequency is calculated as the minimum of the `couch_jobs`
   activity timeout and the configured checkpoint interval. During each
   checkpoint attempt, there is a call to `couch_jobs:update/3` which updates
   the job's data with the latest state and ensures the job doesn't get
   re-enqueued due to inactivity.

 - If the job completes, then `couch_jobs:finish/3` is called and the
   replication process exits `normal`. If the job crashes, there is a
   consecutive error count field (`?ERROR_COUNT`) which is used to calculate
   the backoff penalty. There is an exponential backoff schedule that starts
   with the base value, then doubles, but only up to a maximum value. Both the
   base and the maximum values are configurable with the
   `min_backoff_penalty_sec` and `max_backoff_penalty_sec` settings
   respectively. This is an improvement over before, where users could only
   influence the maximum backoff penalty by reducing the number of failed
   events kept by each job.
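
The penalty schedule described above amounts to roughly the following (a sketch only; the real code also reads the settings from config):

```
%% Exponential backoff sketch: MinSec * 2^(ErrorCount - 1), capped at MaxSec.
backoff_penalty_sec(ErrorCount, MinSec, MaxSec) when ErrorCount > 0 ->
    min(MaxSec, MinSec * (1 bsl (ErrorCount - 1))).
```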

`couch_replicator_job_server`

 - This is a module which spawns and keeps track of `couch_replicator_job`
   processes.

 - Periodically, every `interval_sec` seconds, it runs the `reschedule` function
   which checks for pending jobs. If there are some, it will start acceptors to
   run them. If those acceptors become workers, and if the total number of
   workers goes above the `max_jobs` setting, the oldest `continuous` workers
   will be stopped until the total number of jobs falls below the `max_jobs` value.

 - In addition to the `max_jobs` limit, there is a `max_churn` limit which
   determines how many job starts to allow during each scheduling interval. As
   jobs are started, they reduce the available churn "budget" for that cycle,
   and once it is used up no more jobs can start until the next cycle (a rough
   sketch follows after this list).

 - This module also performs transient job cleanup. Previously, transient jobs
   simply vanished after they stopped running; with this update they may
   linger for at least `transient_job_max_age_sec` seconds.
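
A rough sketch of the per-cycle limits mentioned above (hypothetical names; the real `reschedule` logic also stops the oldest `continuous` workers when `max_jobs` is exceeded):

```
%% Illustrative only: how many acceptors one scheduling cycle may start,
%% bounded by both the remaining churn budget and the max_jobs cap.
acceptors_to_start(Pending, Running, MaxJobs, ChurnLeft) ->
    min(Pending, min(ChurnLeft, max(0, MaxJobs - Running))).
```
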
Stitch everything together: the backend, frontend and HTTP handlers.

The supervisor `couch_replicator_sup` handles starting a set of frontend or
backend children. It may also start both or neither.

The HTTP layer for monitoring and creating jobs is simpler than before since
there is no rpc or clustering involved.
Update settings with defaults. Also comment out values which are already set to
default in the code.
Remove sections which don't apply anymore and describe briefly how frontend and
backend interact.
Update tests to use the new replicator. Also clean up redundancy and re-use
some of the newer macros from fabric2 (?TDEF_FE).

Make sure replicator tests are included in `make check`
Previously, in 3.x we re-parsed the endpoint URLs with
`ibrowse_lib:parse_url/1` when stripping credentials, which threw an error if
the URL was invalid. So we try to preserve that same logic.

Backport some tests from 3.x to make sure URL stripping works when the URL is
valid, and also use the nicer ?TDEF and ?TDEF_FE test helpers.
Job exits are asynchronous so we ensure we wait for exit signals to be handled
before checking the state.
Caught during Elixir tests. I've added a unit test to `ebtree.erl` to
ensure we don't regress in the future.
This fixes compilation if CouchDB is used as a dependency.
Terreii and others added 17 commits May 5, 2021 10:31
Previously the default value for the md5 field was `<<>>`. This value changed to
undefined when we switched to using maps instead of Erlang records.
The change breaks the `couch_att:to_json/4` function because `base64:encode/1`
cannot handle the atom `undefined`.

The issue happens for inline attachments only, which is why it wasn't discovered
earlier.
Handle the case when md5 field is undefined
Set the `worker_trap_exits = false` setting to ensure our replication worker
pool properly cleans up worker processes.

Ref: #3208
Add `buggify-elixir-suite` target to run Elixir integration tests
under FoundationDB's client buggify mode [1]. In this mode, the FDB C
client in the `erlfdb` application will periodically throw mostly
retryable errors (`1009`, `1007`, etc). Transaction closures should
properly handle retryable errors without side-effects such as
re-sending response data to the user more than once, or attempting to
re-read data from the socket after it was already read once.

In order to avoid false positives, provide a custom .ini settings file
which disables transaction timeouts (`1031` errors). Those are not
retryable by default, as far as the `on_error` callback is
concerned. If we do have timeouts set ( = 60000), it signals to the
FoundationDB client that we expect to handle timeouts in buggify mode,
so it starts throwing them [2]. Since we don't handle those everywhere,
we get quite a few false positive errors.

The buggify settings are, I believe, the defaults -- a 25% chance to activate
an error, and a 25% chance of firing the error when the code passes over
that section. In most test runs this should result in a pass, but
sometimes, due to lingering bugs, there will be timeouts, 409
conflicts and other failures, so we cannot yet turn this into a
reliable integration test step.

[1] https://apple.github.io/foundationdb/client-testing.html

[2] https://github.com/apple/foundationdb/blob/master/fdbclient/ReadYourWrites.actor.cpp#L1191-L1194
During buggify runs we disable max tx retries by setting it to -1. That's FDB's
documented way of doing it. However, when we re-use that setting to handle the
restart_tx logic we don't account for -1, and that's what this PR fixes.
Switching crypto functions to use the new ones such as:

```
crypto:hmac(Alg, Key, Message) -> crypto:mac(hmac, Alg, Key, Message)
```

To simplify Erlang 24 support, in which some crypto functions have
been removed, bump the minimum version to 22.

Other fixes were in dependencies:

  * Bumped meck to 0.9.2. New meck from upstream supports Erlang
    24. Also required bumping folsom since it depends on meck

  * The example in the passage module would not compile, so the parse
    transform was commented out. This required bumping jaeger passage since it
    depends on passage
When ApiMod:acquire_map_context returns with {error, any()}, the caller(s) fail with badmatch. This commit changes the caller hierarchy so that those {error, _} are propagated back and a failing update gets retried (up to retry_limit times).
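
A sketch of the resulting control flow (hypothetical shape; the real change propagates the error through the existing callers rather than a generic helper like this):

```
%% Illustrative only: instead of a badmatch on {error, _}, return the error to
%% the caller and retry the whole update, up to RetryLimit times.
update_with_retry(UpdateFun, State, Retry, RetryLimit) when Retry =< RetryLimit ->
    case UpdateFun(State) of
        {ok, Result} -> {ok, Result};
        {error, _Reason} -> update_with_retry(UpdateFun, State, Retry + 1, RetryLimit)
    end;
update_with_retry(_UpdateFun, _State, _Retry, _RetryLimit) ->
    {error, retry_limit_exceeded}.
```
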
Previously, if a transaction got a `commit_unknown_result`, during the next
successful attempt that result would be returned to the user. However, if the
next attempt was another retryable error, then the committed result was ignored
and the whole transaction would be applied again. This could result in document
update transactions conflicting with themselves as described in issue
#3560

To prevent that from happening we remember that there was a
`commit_unknown_result` error during the course of any previous retries and try
to return that result.
Solved conflicts from "cherry-pick" 3.x commit
Eliminate compiler warnings
@@ -75,6 +75,21 @@ def test_keymap_match(self):
docs = self.db.find({"foo": {"$keyMapMatch": {"$eq": "aa"}}})
self.assertEqual(len(docs), 1)

def test_size(self):
Member

Could you add another test that shows it handles the case where the sizeGte value is not an integer, and another where it is acting on a field that is not an array.

Author

I have added tests for failures but I don't know how to "assert failure" in this test script (need help).

@glynnbird
Contributor

glynnbird commented May 26, 2021

Thanks for the contribution @hklarner. On a broader question: the new operators you are creating:

  • $sizeGte
  • $sizeLte

have no precedent in MongoDB, which was the original inspiration for the Mango language. If I'm reading the MongoDB docs right, the MongoDB way of performing this query would be to allow the following syntax:

{
  "selector": {
     "customers": {
        "$size:": { "$gte": 20}
      }
   }
}

The advantage of this approach is that

a) we already have a $size operator.
b) we have plenty of other comparison operators ($eq, $gte, etc.)
c) it wouldn't be creating new operators that differ from the "MongoDB way"

Note that currently the Mango query parser doesn't support this syntax, but I'm suggesting it could be modified to, rather than adding lots of new $size... operators.

What do you think @hklarner & @garrensmith?

@hklarner
Author

@glynnbird I see. Agreed, the MongoDB consistency is preferable. But I don't know how to implement it 😅

@garrensmith
Member

@hklarner take a look at some of the other operators that handle nested values. Maybe look at $and and try implementing something similar.

@jjrodrig
Contributor

Hi @hklarner,

I was checking this. It seems quite simple to introduce nesting behaviour for the $size operator. I've been testing with these changes and it seems to work.

--- a/src/mango/src/mango_selector.erl
+++ b/src/mango/src/mango_selector.erl
@@ -145,6 +145,8 @@ norm_ops({[{<<"$keyMapMatch">>, Arg}]}) ->
 
 norm_ops({[{<<"$size">>, Arg}]}) when is_integer(Arg), Arg >= 0 ->
     {[{<<"$size">>, Arg}]};
+norm_ops({[{<<"$size">>, {_}=Arg}]}) ->
+    {[{<<"$size">>, norm_ops(Arg)}]};
 norm_ops({[{<<"$size">>, Arg}]}) ->
     ?MANGO_ERROR({bad_arg, '$size', Arg});
 
@@ -581,8 +583,10 @@ match({[{<<"$regex">>, Regex}]}, Value, _Cmp) when is_binary(Value) ->
 match({[{<<"$regex">>, _}]}, _Value, _Cmp) ->
     false;
 
-match({[{<<"$size">>, Arg}]}, Values, _Cmp) when is_list(Values) ->
+match({[{<<"$size">>, Arg}]}, Values, _Cmp) when is_list(Values), is_integer(Arg) ->
     length(Values) == Arg;
+match({[{<<"$size">>, Arg}]}, Values, _Cmp) when is_list(Values) ->
+    match(Arg, length(Values), _Cmp);
 match({[{<<"$size">>, _}]}, _Value, _Cmp) ->
     false;

I hope this could help.

@hklarner
Author

hklarner commented May 28, 2021

@jjrodrig that's great! Hey, because of your answer on stackoverflow I opened this PR! Thanks for helping.

@bessbd
Member

bessbd commented Jun 2, 2021

@hklarner : #3568 is very close to a merge. Do you want me to hold off with it so that this can be merged before?

@hklarner
Author

hklarner commented Jun 2, 2021

@bessbd no, because I won't be able to continue with this for at least another week. All I have to do to be up to date is pull before I continue, right?

@bessbd
Member

bessbd commented Jun 2, 2021

> @bessbd no, because I won't be able to continue with this for at least another week. All I have to do to be up to date is pull before I continue, right?

That's pretty much it. In case you are changing files that are modified by the reformat, the conflict resolution may become a little difficult, but that shouldn't be a big deal.
In case you need help with anything, please let me know.

@hklarner closed this by deleting the head repository Jan 18, 2023