Safely maintaining a rolling history of large documents #8244
Hello @jpike88, intriguing case you have here. My first question is: do you need replication to CouchDB or Cloudant? If that is the case, maybe you should read these carefully:

Long story short, it is not recommended to store, transfer, and replicate large documents through the CouchDB replication protocol. I have had my share of nightmares with large images (between 5 and 20 MB) between PouchDB and Cloudant. It is not very performant, and it can block access to Cloudant if several users are replicating at the same time (lots of 409 errors). You also have to take into consideration that when you "delete" a document in PouchDB/CouchDB, the revisions are kept. You can recover a little disk space with compaction, but it is minimal.

If you do not need or plan for replication and only need to store attachments or files, maybe you should consider storing them directly in IndexedDB. You do not need to worry too much about free space. There are a couple of libraries I recommend as complements to PouchDB: localForage (https://localforage.github.io/localForage/) and idb (https://github.com/jakearchibald/idb). Both of them can store files in IndexedDB.

When I need both replication and attachments, I save the file with localForage under a key, and that key is added to a document in PouchDB for offline access; then, if I have an internet connection, I upload the file to Cloudinary and save the key to PouchDB.

With all that said, if you really need to store attachments directly in PouchDB, I suggest you first make sure they are in Blob format and save them individually. Hope all of this makes sense to you.
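The hybrid pattern described above (big payload in IndexedDB, small reference document in PouchDB) can be sketched roughly like this. The `fileDocFor` helper and the `file:` id prefix are hypothetical names for illustration, not anything PouchDB or localForage prescribes:

```javascript
// Sketch: only a small reference document goes into PouchDB (and thus
// into replication); the heavy blob stays in IndexedDB under a key.
function fileDocFor(fileKey) {
  return {
    _id: `file:${fileKey}`,
    fileKey,           // key under which the blob lives in localForage
    uploaded: false,   // flip to true once pushed to remote storage
  };
}

// Usage (assumes localforage and a PouchDB instance `db` are available):
//   await localforage.setItem(fileKey, blob); // big payload, IndexedDB only
//   await db.put(fileDocFor(fileKey));        // tiny doc, cheap to replicate
const doc = fileDocFor('photo-123');
```

The point of the split is that replication only ever moves the few-hundred-byte reference document, never the multi-megabyte blob.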
I should also add: the files are actually just JSON, so they've naturally worked as documents. Is there a big performance/reliability difference between storing them in their natural JSON form versus using them as attachments?
Hello @jpike88, I am terribly sorry. I did not read that you specifically pointed out that the actual documents were that size. Personally speaking, I would not work with JSON documents larger than 2 MB; I would rather have several KB-sized documents than one single 1 MB document. This Stack Overflow question was shared with me a long time ago: To sum up, you can store larger documents as attachments, but it is not recommended. I would store the larger JSON directly in IndexedDB.
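The "several small documents rather than one large one" advice can be sketched as a simple splitter that emits one PouchDB document per top-level key of the big object. The `_id` scheme (`autosave:<chunk>`) is a hypothetical convention, assumed here for illustration:

```javascript
// Sketch: split one large JSON object into several smaller documents,
// one per top-level key, so no single document carries the whole payload.
function splitIntoDocs(docId, bigObject) {
  return Object.entries(bigObject).map(([key, value]) => ({
    _id: `${docId}:${key}`, // e.g. "autosave:settings"
    chunk: key,
    data: value,
  }));
}

// Usage: write all chunks in one call with db.bulkDocs(docs).
const docs = splitIntoDocs('autosave', {
  settings: { theme: 'dark' },
  items: [1, 2, 3],
});
```

Reassembly on load is the inverse: fetch all docs whose `_id` starts with `autosave:` and merge their `data` fields back under their `chunk` names.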
We don't actually store JSON natively in IndexedDB currently; there were problems with deeply nested objects, so we end up stringifying them via https://github.com/nolanlawson/vuvuzela, although that's code I would like to remove. One thing to be very aware of is that we will store a number of copies of every object saved; that number can be configured via
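The comment above trails off, but the option it presumably refers to is PouchDB's documented `revs_limit` constructor option, which caps how many revision copies a local database keeps (the follow-up below mentions "max copies set to 1", which matches). A minimal config sketch:

```javascript
// Config sketch: keep only 1 revision copy per document in a local-only
// database. Do not set this low on a database you replicate, since
// replication relies on revision history being available.
const db = new PouchDB('autosaves', { revs_limit: 1 });
```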
@daleharvey thanks for your insight. Do you have a recommendation for my particular scenario? I already have max copies set to 1, as I don't really need that feature. Is there any change in risk if I don't use PouchDB for this particular large-JSON autosave use case? (I still use PouchDB for plenty of other things, and it works great.)
This issue is stale because it has been open 60 days with no activity. Remove the stale label or comment, or this will be closed in 7 days.
Got a question...
Up until now I have had an 'autosave' bucket into which I dump a large JSON blob (which can be 30 to 300 MB) from time to time, under the key 'autosave'. I just upsert when a new autosave is done, so as not to bloat storage. I've noticed that in rare cases the autosave process can fail in some way, and for some reason, after a refresh, the autosave fails to complete correctly and data loss can occur (I'm still not sure how this happens, but it's a rare occurrence).
I want to upgrade this by keeping a sort of rolling history of autosaves. Is there a benefit to using 10 buckets with one document each versus 10 documents in one bucket? How does this affect the risk of data corruption or failure, and is there a solid failsafe approach here? Should such an approach be recommended in the PouchDB documentation?
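One common way to build the rolling history described above, regardless of whether the slots live in one bucket or ten, is a fixed set of N slots written round-robin, so a failed write can at worst lose the newest slot while the older ones survive. This is a sketch under that assumption; the `autosave-<n>` id scheme and the `upsert` helper are hypothetical, not PouchDB API:

```javascript
// Sketch: N autosave slots written round-robin. Only the pure slot
// arithmetic is shown; persistence is left as commented pseudocode.
const SLOTS = 10;

function nextSlot(currentSlot, slots = SLOTS) {
  return (currentSlot + 1) % slots;
}

// Usage (assumes a PouchDB instance `db` and some upsert helper that
// reads the current _rev before writing):
//   const slot = nextSlot(lastSlot);
//   await upsert(db, `autosave-${slot}`, { data, savedAt: Date.now() });
//   lastSlot = slot; // persist lastSlot too, e.g. in its own tiny doc

nextSlot(9); // wraps back to slot 0
```

On recovery, you would scan all N slots and load the newest one whose document parses cleanly, which is the failsafe property the round-robin scheme buys you.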