Using custom revision numbers when creating documents #782

Closed
S-Aggarwal opened this issue Aug 30, 2017 · 4 comments

Comments


S-Aggarwal commented Aug 30, 2017

We were using our own revision numbers with CouchDB 1.6 to mask the number of edits made to a document. We did this by adding a "_rev" property to the JSON document we inserted into the database and making the API request with a _rev parameter (as if performing an update). The revision number was generated with the following Go code:

// randomRev produces a deterministic revision string from the plaintext: a numeric
// prefix in the range 0 to 2^16-1, a "-", and a truncated SHA-1 hash of the salted JSON.
// After compaction, the number of revisions for a doc is masked on the server.
// The input should be the document to be stored.
func randomRev(doc interface{}) string {
	encoded, err := json.Marshal(doc)
	if err != nil {
		log.Error(errors.Trace(err))
		return ""
	}
	multihash, err := mh.Sum(append([]byte("revsalt"), encoded...), mh.SHA1, 16)
	if err != nil {
		log.Error(errors.Trace(err))
		return ""
	}
	revNum := binary.BigEndian.Uint16(multihash[15:17]) // two bytes, i.e. 0 to 2^16-1
	return strconv.Itoa(int(revNum)) + "-" + multihash.HexString()[4:]
}

In CouchDB 1.6, the server treated the document creation as an update, and the stored revision number was simply an increment of the one we provided (e.g. if we gave _rev=41543-c1a310b73245bfac2a69583319a2470c, then after completion the stored document had _rev=41544-<some hash value hexstring>).
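To make the increment concrete, here is a minimal sketch that splits a revision string of the form N-suffix into its numeric prefix and suffix. The helper name splitRev is hypothetical (it is not a CouchDB API), and the sketch only illustrates the string format used in this report.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitRev splits a revision string "N-suffix" into its numeric prefix
// and the suffix after the first "-". Hypothetical helper for illustration.
func splitRev(rev string) (int, string, error) {
	prefix, suffix, ok := strings.Cut(rev, "-")
	if !ok || suffix == "" {
		return 0, "", fmt.Errorf("revision %q is not of the form N-suffix", rev)
	}
	n, err := strconv.Atoi(prefix)
	if err != nil {
		return 0, "", fmt.Errorf("revision %q has a non-numeric prefix", rev)
	}
	return n, suffix, nil
}

func main() {
	n, suffix, err := splitRev("41543-c1a310b73245bfac2a69583319a2470c")
	if err != nil {
		panic(err)
	}
	// Under the 1.6 behavior described above, the stored revision
	// would begin with n+1.
	fmt.Println(n, suffix, n+1)
}
```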

In CouchDB 2.1, this mostly fails with a Document update conflict, but strangely it sometimes works.
For example, when inserting

{
	"_id": "randomid10",
	"_rev": "64769-1a0347892643024c68e945015bfd0141",
	"par" : "test"
}

we get a 409: Document update conflict.

But when inserting

{
	"_id": "randomid123",
	"_rev": "41543-c1a310b73245bfac2a69583319a2470c",
	"par": "test"
}

the document is successfully created. Also adding this same document to a different database (or even the same one after wiping it completely) gives back an update conflict.

I am not sure what is happening behind the scenes, but this looks like some kind of bug.

Member

wohali commented Aug 30, 2017

We do not support creating your own _rev values in 1.6 or forward. You should be treating the _rev value as opaque. The only thing we guarantee is that, upon saving a new version using the correct previous _rev value, the new one will be monotonically higher.

wohali closed this as completed Aug 30, 2017

@ondra-novak

Custom revisions work without any issue in CouchDB 2.1.1. You just need to specify new_edits=false and include a correct _revisions field. It is simpler than it looks and more logical than using the last revision field.

PUT https://localhost:5984/custom_revs/customRevTest?new_edits=false

{
"_id":"customRevTest",
"_revisions":{"start":1,"ids":["my_custom_revision"]},
"foo":"bar"
}
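A request like the one above could be assembled in Go roughly as follows. This is only a sketch: the helper name newEditsFalseRequest and the revisions struct are hypothetical, the base URL and database name are taken from the example, and the request is built but never sent.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// revisions mirrors the "_revisions" object from the example above.
type revisions struct {
	Start int      `json:"start"`
	IDs   []string `json:"ids"`
}

// newEditsFalseRequest builds (but does not send) a PUT request with
// new_edits=false. fields holds the ordinary document fields.
func newEditsFalseRequest(baseURL, db, id string, revs revisions, fields map[string]any) (*http.Request, error) {
	doc := map[string]any{}
	for k, v := range fields {
		doc[k] = v
	}
	doc["_id"] = id
	doc["_revisions"] = revs

	body, err := json.Marshal(doc)
	if err != nil {
		return nil, err
	}
	target := baseURL + "/" + url.PathEscape(db) + "/" + url.PathEscape(id) + "?new_edits=false"
	req, err := http.NewRequest(http.MethodPut, target, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newEditsFalseRequest("https://localhost:5984", "custom_revs", "customRevTest",
		revisions{Start: 1, IDs: []string{"my_custom_revision"}},
		map[string]any{"foo": "bar"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Sending the request would need an http.Client plus whatever authentication and TLS settings the server requires, which are left out here.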

The _rev field can be omitted if the document contains a _revisions field: the document's current revision is always the first entry in ids, so CouchDB can calculate it.

{
"_id":"customRevTest",
"_revisions":{"start":2,"ids":["my_super_extra_long_custom_revision","my_custom_revision"]},
"foo":"bar2"
}
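The calculation is presumably just the start number joined to the first entry of ids, which matches the "_rev":"4-short" returned by the GET further down. A small sketch, with the hypothetical helper currentRev:

```go
package main

import (
	"errors"
	"fmt"
)

// revisions mirrors the "_revisions" object used in the examples.
type revisions struct {
	Start int      `json:"start"`
	IDs   []string `json:"ids"`
}

// currentRev derives the document's current revision string as
// "<start>-<first id>". Hypothetical helper, not a CouchDB API.
func currentRev(r revisions) (string, error) {
	if len(r.IDs) == 0 {
		return "", errors.New("_revisions.ids is empty")
	}
	return fmt.Sprintf("%d-%s", r.Start, r.IDs[0]), nil
}

func main() {
	rev, err := currentRev(revisions{
		Start: 2,
		IDs:   []string{"my_super_extra_long_custom_revision", "my_custom_revision"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(rev)
}
```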

I tested a revision ID longer than usual, and it was accepted:

{
"_id":"customRevTest",
"_revisions":{"start":3,"ids":["long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long","my_super_extra_long_custom_revision","my_custom_revision"]},
"foo":"bar4"
}

Apparently you don't need to include the whole revision history. Once there is a matching record, CouchDB is happy.

{
"_id":"customRevTest",
"_revisions":{"start":4,"ids":["short","long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long"]},
"foo":"bar5"
}
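One way to picture the "matching record" observation: expand each _revisions object into full revision strings (entry i of ids has number start-i) and look for the newest submitted revision that the stored history already contains. The sketch below only illustrates that idea; it is not CouchDB's actual merge algorithm, and the names expand and joinPoint as well as the "newest" ID are hypothetical.

```go
package main

import "fmt"

// revisions mirrors the "_revisions" object used in the examples.
type revisions struct {
	Start int
	IDs   []string
}

// expand lists the full revision strings, newest first:
// ids[i] belongs to revision number start-i.
func expand(r revisions) []string {
	out := make([]string, 0, len(r.IDs))
	for i, id := range r.IDs {
		out = append(out, fmt.Sprintf("%d-%s", r.Start-i, id))
	}
	return out
}

// joinPoint returns the newest submitted revision that is already part of
// the stored history, or "" if the two histories share nothing.
func joinPoint(submitted, stored revisions) string {
	known := map[string]bool{}
	for _, rev := range expand(stored) {
		known[rev] = true
	}
	for _, rev := range expand(submitted) {
		if known[rev] {
			return rev
		}
	}
	return ""
}

func main() {
	stored := revisions{Start: 2, IDs: []string{"my_super_extra_long_custom_revision", "my_custom_revision"}}
	// A partial history: only the new revision and its direct parent.
	submitted := revisions{Start: 3, IDs: []string{"newest", "my_super_extra_long_custom_revision"}}
	fmt.Println(joinPoint(submitted, stored))
}
```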

CouchDB still tracks the revision history.

GET https://localhost:5984/custom_revs/customRevTest?revs=true

{"_id":"customRevTest","_rev":"4-short","foo":"bar5","_revisions":{"start":4,"ids":["short","long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long_long","my_super_extra_long_custom_revision","my_custom_revision"]}}

So you can generate a custom revision and encode various information into it, for instance:

N-<timestamp>_<userid>_<random>

so you can easily track who made changes and when.

Does it have disadvantages? Yes. I still consider this technique non-standard. It works because it is part of the replication protocol: the database must accept any unusual revisions as long as they form a correct history chain. From the database's perspective, the request looks like part of an ongoing replication.

However, if an inconsistent history is submitted, there is no 409 error. Instead, it results in the creation of a conflicted document.

Also note that two different documents with the same _rev and _id are considered equal.


touy commented May 31, 2018

This seems like a big disadvantage for CouchDB. I use WebSockets with Node.js: when a user double-clicks a link to send a request and write some log, the 2 clicks have the same _rev. Could this be slowing down the system?

Member

wohali commented May 31, 2018

@touy If you don't use what the previous poster said, your situation won't create 2 revisions, because the documents are identical and your 2 versions would both be updating with the same prior _rev value. So only one new revision will be created.
