Cannot index deleted documents with Mango #1355

Open
garbados opened this issue May 30, 2018 · 5 comments

@garbados
Contributor

Mango indexes cannot be used to index deleted documents, though it would sure be nice if they could.

Expected Behavior

Given a Mango index like this:

{
	"index": {
		"partial_filter_selector": {
			"_deleted": {
				"$exists": true
			}
		},
		"fields": ["_deleted"]
	},
	"ddoc": "deleted",
	"name": "deleted",
	"type": "json"
}

A query against that index should return any deleted documents. For example, this request with curl:

$ curl -H 'content-type:application/json' $COUCH_URL/mango-delete-test/_find -d '{"selector":{"_deleted":{"$exists":true}},"use_index":"deleted"}'

{"docs":[
{"_id":"6231e1e26655d19f51ea23e183001081","_rev":"1-967a00dff5e02add41819138abb3284d","_deleted":true}
],
"bookmark": "nil"}

It should return deleted documents.

Current Behavior

Indexing the _deleted field creates indexes that return no documents. For example, consider this query using the index from above:

$ curl -H 'content-type:application/json' $COUCH_URL/mango-delete-test/_find -d '{"selector":{"_deleted":{"$exists":true}},"use_index":"deleted"}'

{"docs":[
],
"bookmark": "nil"}

It returns nothing, but it's clear that Mango considers the index valid and indeed can detect that it has been requested.

Whereas a Map-Reduce view can be used to filter a changes feed to return changes pertaining to deleted documents, using a Mango index in this way results in an error:

$ curl "$COUCH_URL/mango-delete-test/_changes?view=deleted/deleted&filter=_view"

{"error":"error","reason":"{timeout,{gen_server,call,\n                     [couch_proc_manager,\n                      {get_proc,{doc,<<\"_design/deleted\">>,\n                                     {1,\n                                      [<<124,254,100,228,235,159,75,216,229,\n                                         48,135,104,37,85,76,42>>]},\n                                     {[{<<\"language\">>,<<\"query\">>},\n                                       {<<\"views\">>,\n                                        {[{<<\"deleted\">>,\n                                           {[{<<\"map\">>,\n                                              {[{<<\"fields\">>,\n                                                 {[{<<\"_deleted\">>,\n                                                    <<\"asc\">>}]}},\n                                                {<<\"partial_filter_selector\">>,\n                                                 {[{<<\"_deleted\">>,\n                                                    {[{<<\"$exists\">>,\n                                                       true}]}}]}}]}},\n                                             {<<\"reduce\">>,<<\"_count\">>},\n                                             {<<\"options\">>,\n                                              {[{<<\"def\">>,\n                                                 {[{<<\"partial_filter_selector\">>,\n                                                    {[{<<\"_deleted\">>,\n                                                       {[{<<\"$exists\">>,\n                                                          true}]}}]}},\n                                                   {<<\"fields\">>,\n                                                    [<<\"_deleted\">>]}]}}]}}]}}]}}]},\n                                     [],false,[]},\n                                {<<\"_design/deleted\">>,\n                                 <<\"1-7cfe64e4eb9f4bd8e530876825554c2a\">>}},\n                      
5000]}}"}

As such, it is currently not possible to index deleted documents or to filter changes for deleted documents using a Mango index.

Possible Solution

🤷‍♀️

If a Map-Reduce view can be used to filter a changes feed for deleted (or un-deleted) documents, then perhaps Mango indexes can be applied in a similar way to achieve the same result.
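
For comparison, here is a rough sketch of the Map-Reduce side of that idea, written in Python only to assemble the pieces. The design document name `tombstones` and view name `deleted` are hypothetical, chosen so they do not clash with the Mango index above, and the sketch assumes the `_view` changes filter passes deleted documents to the map function, as described in this issue. It makes no network calls; it only builds the design document body and the `_changes` path.

```python
import json
from urllib.parse import urlencode

# Hypothetical design document and view names, chosen for illustration only.
DDOC_NAME = "tombstones"
VIEW_NAME = "deleted"

# Design document with a JavaScript map function that emits only for
# documents carrying the _deleted flag.
design_doc = {
    "_id": f"_design/{DDOC_NAME}",
    "views": {
        VIEW_NAME: {
            "map": "function (doc) { if (doc._deleted) { emit(doc._id, null); } }"
        }
    },
}


def changes_path(db):
    """Build the _changes path that uses the view above as a filter."""
    query = urlencode(
        {"filter": "_view", "view": f"{DDOC_NAME}/{VIEW_NAME}"}, safe="/"
    )
    return f"/{db}/_changes?{query}"


print(json.dumps(design_doc, indent=2))
print(changes_path("mango-delete-test"))
```

The printed design document would be stored in the database first, and the printed path appended to `$COUCH_URL` for the changes request.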

Steps to Reproduce (for bugs)

Here are the curl commands I used to produce this issue:

# create the db
$ curl -X PUT $COUCH_URL/mango-delete-test
# create and delete a document ({id} and {rev} are placeholders for the values returned by the POST)
$ curl -X POST -H 'content-type:application/json' $COUCH_URL/mango-delete-test -d '{}'
$ curl -X DELETE "$COUCH_URL/mango-delete-test/{id}?rev={rev}"
# create the index
$ curl -X POST -H 'content-type:application/json' $COUCH_URL/mango-delete-test/_index -d '{"index":{"partial_filter_selector":{"_deleted":{"$exists":true}},"fields":["_deleted"]},"ddoc":"deleted","name":"deleted","type":"json"}'
# query the index
$ curl -H 'content-type:application/json' $COUCH_URL/mango-delete-test/_find -d '{"selector":{"_deleted":{"$exists":true}},"use_index":"deleted"}'

That final request yields a response that includes no documents, even though the database contains a deleted document.

Context

It is occasionally important to find deleted documents en masse, such as while managing tombstones under conditions where other mitigation techniques are not available.

Your Environment

  • Version used: 2.1.1
  • Browser Name and version: curl 7.47.0 (x86_64-pc-linux-gnu) libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3
  • Operating System and version (desktop or mobile): Ubuntu 16.04.4 LTS
  • Link to your project: N/A
@wohali
Member

wohali commented May 31, 2018

Interestingly, this issue goes all the way back to COUCHDB-1530, aka "Add a mode to _all_docs to include deleted docs". In that issue @davisp was +1 on the idea.

Right now the only way to get a list of tombstones in a DB is to pull the changes feed and filter on _deleted:true.
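
A minimal sketch of that workaround, assuming the response of `GET /{db}/_changes` has already been fetched and parsed into a Python dict. In a normal-mode changes response each row for a tombstone carries a top-level `"deleted": true` flag (the `_deleted` field appears on the document body itself). The function name `tombstone_ids` and the sample ids and revs are made up for illustration.

```python
def tombstone_ids(changes):
    """Return the ids of rows flagged as deleted in a parsed _changes response."""
    return [
        row["id"]
        for row in changes.get("results", [])
        if row.get("deleted") is True
    ]


# Sample data shaped like a normal-mode _changes reply (ids and revs are invented).
sample = {
    "results": [
        {"seq": "1-abc", "id": "doc-live", "changes": [{"rev": "1-aaa"}]},
        {"seq": "2-def", "id": "doc-gone", "changes": [{"rev": "2-bbb"}], "deleted": True},
    ],
    "last_seq": "2-def",
    "pending": 0,
}

print(tombstone_ids(sample))
```

This walks the whole feed, so the cost grows with the full change history of the database rather than with the number of tombstones.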

@rdewolff

Is this still the case? Is pulling tombstones from the changes feed still the only way to get a list of deleted documents? 😞

@wohali
Member

wohali commented Aug 30, 2020

Yes.

@rdewolff

Why? Why did CouchDB choose this strange pattern...

@wohali
Member

wohali commented Aug 31, 2020

Because no one has written the code to change it yet. It clearly hasn't been a priority for anyone.

Complaining here isn't going to change anything. Writing the code to change the behaviour, and submitting it as a pull request, probably would.
