Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inquiry regarding id generation for encrypted record #19

Open
eric1234 opened this issue Apr 19, 2022 · 3 comments
Open

Inquiry regarding id generation for encrypted record #19

eric1234 opened this issue Apr 19, 2022 · 3 comments
Labels
question Further information is requested

Comments

@eric1234
Copy link

When generating the id for the encrypted record it looks like you are hashing all the data in the record:

https://github.com/garbados/comdb/blob/master/index.js#L101-L103

I understand the reason for needing the ID to be deterministic but why use the hash of the entire record? It seems if you change any data on a record that would create a new record vs doing a revision on the previous record. I'm unsure what impact this has on syncing but at the very least it means compaction won't work as from PouchDB's perspective those are two different records rather than the current record and a previous record.

Would it be perhaps better to just hash the cleartext ID to create an obfuscated deterministic ID for the encrypted record? If you were worried about someone determining the cleartext ID (which might contain sensitive data since PouchDB encourages intelligent keys) via rainbow table or something you could always generate a per-record salt to be stored with the encrypted data.

Just wanted to inquire to make sure I'm not missing anything obvious on why that is a bad idea.

@garbados
Copy link
Owner

I haven't exactly thought hard about this in a while so I think you're probably spot on. I wanted each document in the encrypted database to represent one document transformation, whether that's a create, update, or delete, so that an attacker wouldn't be able to know which documents are being modified. If encrypted records map 1:1 to decrypted documents, then an attacker might concentrate their decryption efforts on a file that changes often, or by monitoring CouchDB server activity (such as if the attacker is the server admin) to know exactly when specific files were updated. Depending on the attacker's knowledge of the program, that information -- what documents are being updated often, and when documents are being updated -- could be very revealing. Instead, although hashing the entire document is a bit heftier of an approach, the attacker can only determine when some kind of document interaction occurred: not which document, nor what kind of interaction.

This is probably overkill for most security paradigms. If you wanted to produce a PR with a different approach, I'd be happy to review it.

@eric1234
Copy link
Author

I really appreciate the insight. With the right program I could see how keeping a consistent ID might reveal something about the information being stored. I'd be curious if the changing of the ID impacts the Couch replication in any way. Right now the only impact I know of is that compaction wouldn't work. If that is the only impact it's just a trade-off between security vs storage costs.

For your needs, it sounds like changing IDs is worth the storage cost. For other applications being able to compact might be worth the slight loss in security. And of course if you did decide to have a consistent ID do you just hash the real ID or do you hash the real ID + some salt? That would depend on if the application is using the real ID to store information (intelligent key) or the ID is random (UUID).

Because of these different needs if there was a PR it seem the method of generating the encrypted ID would need to be configurable. Maybe a callback function so the application could inject whatever behavior they want. This callback could have a reasonable default. The breaking of compaction seems surprising to me so it seems the default would be best if it kept the ID consistent. Since Pouch/Couch encourages intelligent keys salting that hash also seems good for a default. But then an application could override with something different depending on their needs.

What I'm not sure of is adding an option like this making things too complicated? What about backwards compat? If the default callback was different than the current behavior that would be a breaking change.

@garbados
Copy link
Owner

You're right about the trade-offs. I could see providing a toggle around how ComDB hashes IDs, allowing the user to choose between salt-hashing the doc (current behavior) or just the decrypted ID, but nothing more complicated than a toggle.

If you feel a toggle is insufficient, I'd be open to hearing a further case for other alternatives.

@garbados garbados added the question Further information is requested label May 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants