Schema extraction #1525

wohali · 2018-08-07T15:44:04Z

I have half an (old) patch that extracts the top-level fields from a document and stores them, together with a hash, in an “attachment” to the database header. Each document then stores only its values and the schema hash. First of all, this saves storage at the cost of CPU time (I haven’t measured anything yet). More interestingly, we could use that schema data to do smart things, like auto-generating a validation function or Mango expression based on the data that is already in the database. Other fun things would be easier schema migration operations that are native to CouchDB and thus a lot faster than external ones. For the curious: I got the idea from V8’s property access optimisation strategy.
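To make the idea concrete, here is a rough Python sketch of what is described above. It is not the patch itself (which is not shown in this thread), and every name in it (`SchemaStore`, `pack`, `unpack`, `mango_selector`) is hypothetical. It assumes a schema is the sorted list of top-level field names with their JSON types, that SHA-256 over a canonical JSON encoding is an acceptable hash, and that an in-memory dict can stand in for the “attachment” on the database header.

```python
import hashlib
import json


def json_type(value):
    """Map a Python value to a JSON type name (the names Mango's $type uses)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # test bool before int: bool subclasses int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def extract_schema(doc):
    """Split a document into (schema, values) over its top-level fields."""
    fields = sorted(doc)  # sort so field order does not change the hash
    schema = [[name, json_type(doc[name])] for name in fields]
    values = [doc[name] for name in fields]
    return schema, values


def schema_hash(schema):
    """Hash a canonical JSON encoding of the schema (SHA-256 is an assumption)."""
    canonical = json.dumps(schema, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SchemaStore:
    """Hypothetical stand-in for the schema 'attachment' on the db header."""

    def __init__(self):
        self.schemas = {}  # schema hash -> schema

    def pack(self, doc):
        """Store the schema once; keep only values and the hash per document."""
        schema, values = extract_schema(doc)
        digest = schema_hash(schema)
        self.schemas.setdefault(digest, schema)
        return {"schema": digest, "values": values}

    def unpack(self, packed):
        """Rebuild the original document from its values and schema hash."""
        schema = self.schemas[packed["schema"]]
        return {name: value for (name, _), value in zip(schema, packed["values"])}

    def mango_selector(self, digest):
        """Derive a Mango selector that requires each field to have its type."""
        return {name: {"$type": kind} for name, kind in self.schemas[digest]}


store = SchemaStore()
a = store.pack({"name": "alice", "age": 30})
b = store.pack({"age": 41, "name": "bob"})  # same fields, different order
selector = store.mango_selector(a["schema"])
```

In this sketch both documents share one stored schema, so each packed document carries only its values and a hash. The generated selector uses Mango's `$type` operator; a real implementation would have to decide how to treat nested objects, optional fields, and CouchDB's own `_id` / `_rev` fields, none of which this handles.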

@kocolosk:

Cloudant has done some work on a metadata system that computes the schemas for various clusters of documents in a database. The first use case for us was schlepping the data into a relational data warehouse for analytics. Not sure if we can open source the code, but agreeing on a schema format would be good.

wohali added api feature roadmap labels Aug 7, 2018

wohali mentioned this issue Aug 7, 2018

Schema Extraction janl/couchdb-next#42

Closed

wohali added this to In Discussion in Roadmap Aug 7, 2018

wohali moved this from Proposed for 3.x to Proposed (backlog) in Roadmap Jul 11, 2019
