Add aggregation functions to Mango #1254
Comments
@katsel Hi! Really glad to see this pop up here. Feel free to post a pull request when you're ready. If you want a code review prior to the code being 100% ready, just mark your PR with [WIP] and people will know that it's not in its final state. Thank you again for taking the initiative on this work! |
I believe #1323 could make use of this |
This is the destination of the original roadmap ticket, janl/couchdb-next#19 |
I was asked by @willholley to contribute this Markdown which describes a possible
What would Mango aggregation look like?
At the moment, the Mango query language only performs data selection - a portion of a larger data set can be returned by providing a JSON query. If my data looks like this:
{
  _id: "someid",
  date: "2018-08-24",
  status: "provisional",
  invoiceAddress: {
    street: "10 Front Street",
    city: "Dallas",
    state: "Texas"
  },
  amount: 7.99,
  tax: 0,
  totalAmount: 7.99,
  customerId: "A65522",
  lineItems: [
    {
      productId: "P1",
      name: "fish",
      cost: 6.0
    },
    {
      productId: "P2",
      name: "chips",
      cost: 1.99
    }
  ]
}
and I want only the "complete" orders from a database I could perform a query.
{
  selector: {
    status: "complete"
  },
  fields: ["totalAmount", "date"]
}
which would give a paged result set in blocks of 200 records:
{
  "docs": [
    {
      "totalAmount": 7.99,
      "date": "2018-08-24"
    },
    {
      "totalAmount": 4.50,
      "date": "2018-08-24"
    },
    ...
  ],
  "bookmark": "g1AAAAA6eJzLYWBgYMpgSmHgKy5JLCrJTq2MT8lPzkzJBYqzVqUmFSWCJDlgkgjhLADXERDn",
  "execution_stats": {
    "total_keys_examined": 0,
    "total_docs_examined": 10,
    "total_quorum_docs_examined": 0,
    "results_returned": 8,
    "execution_time_ms": 2.75
  },
  "warning": "no matching index found, create an index to optimize query time"
}
But what I really want is a grand total of the
This document imagines that Mango magically does support aggregation. Everything that follows is fictional syntax in my imaginary world.
--- start of imaginary world ---
My first Mango aggregation
I can use the new "aggregator" object in a Mango query. My query still has a selector, because I don't want to aggregate ALL the documents, only the "complete" orders as before:
{
  selector: {
    status: "complete"
  },
  aggregator: {
    operation: "sum",
    of: ["totalAmount"]
  }
}
Note:
If I'm using the
{
  selector: {
    status: "complete"
  },
  aggregator: {
    operation: "count"
  }
}
A grand total example
Using an
or fields:
or by emptying the "selector", all documents are aggregated:
Grouping
In this example I am performing a more complex selector and introducing grouping:
Any valid Mango selector is allowed in the "selector" object. The "selector" is evaluated at index time and decides which portion of the data makes it to the index. If the selector is changed, a new index is required to calculate the result. The optional "aggregator.group" is an array of keys by which the sum is grouped in the result set. It is an array, so I can have multi-dimensional grouping:
The above example groups by "invoiceAddress.state" which demonstrates selecting data from a sub-object.
How this works
CouchDB cannot perform aggregations without an index, but instead of insisting that the user perform an additional "index" step, CouchDB will create the appropriate index (if it doesn't already exist) and return the results once indexing is complete.
** this is a big leap - but, hey, this is an imaginary world **
e.g. for an aggregation query like this:
we need an index that looks like this in old-school JavaScript MapReduce
map:
function(doc) {
  if (doc.status === 'complete') {
    emit([doc.date], [doc.amount]);
  }
}
reduce:
"_sum"
The "selector" in an aggregation is really a
Limit the result set with range
The trouble with this approach is that the aggregation query always returns all of the aggregations, e.g. one sum per day for every day that you have data. You may only want this month's per-day aggregates. This is where
The
---- end of imaginary world --- |
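As a rough illustration of the index described under "How this works" above, the sketch below imitates in plain JavaScript what a map function plus the built-in "_sum" reduce would compute when grouped by key. It runs outside CouchDB. The helper `sumByKey`, the explicit `emit` parameter, and the sample documents are made up for illustration and are not CouchDB or Mango APIs. It also emits a plain number rather than the one-element array used in the map function above, purely to keep the sketch short.

```javascript
// Hypothetical helper, not a CouchDB API: imitates map + "_sum" with grouping.
function sumByKey(docs, mapFn) {
  const totals = new Map();
  for (const doc of docs) {
    // Add each emitted value to the running total for its (serialized) key.
    const emit = (key, value) => {
      const k = JSON.stringify(key);
      totals.set(k, (totals.get(k) || 0) + value);
    };
    mapFn(doc, emit);
  }
  // Return rows shaped loosely like a grouped view result.
  return [...totals.entries()].map(([k, value]) => ({ key: JSON.parse(k), value }));
}

// Same filter and key as the map function above, with emit passed in explicitly.
const completeOrdersByDate = (doc, emit) => {
  if (doc.status === 'complete') {
    emit([doc.date], doc.amount);
  }
};

// Made-up sample data.
const sampleDocs = [
  { status: 'complete', date: '2018-08-24', amount: 7.99 },
  { status: 'complete', date: '2018-08-24', amount: 4.5 },
  { status: 'provisional', date: '2018-08-24', amount: 100 },
  { status: 'complete', date: '2018-08-25', amount: 2 },
];

const rows = sumByKey(sampleDocs, completeOrdersByDate);
console.log(rows);
```

Only the "complete" orders contribute, and each distinct date ends up with a single summed row, which is the per-day aggregate the imagined query would return.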
@glynnbird a really nice concept. This has some really great potential. Thanks for sharing. |
Is there any time estimate for the aggregate feature to be released? |
Not at the moment. It’s not being worked on at the moment.
|
@glynnbird Wow, that is how feature requests should be described. This is exactly what I'm missing in Mango queries. Too bad this was left in an imaginary world... |
Is it implemented? If yes, may I know in which version of CouchDB it is implemented? I have the same requirement, so I just want to know. |
@NagaPranavi9 it is not implemented yet, we'll close the issue as and when that happens. |
The Mango query language provides CRUD operations and basic selector syntax for document retrieval.
There are no aggregation functions.
Thus, Mango currently has no equivalent to the 'reduce' and 'rereduce' functions provided by JavaScript MapReduce.
Expected Behavior
The _find endpoint should provide syntax and sensible keywords such as sum, average, minimum, maximum to facilitate the aggregation of values into a single field.
This change would enable users to write more powerful queries in Mango without prior knowledge of JavaScript or the MapReduce framework.
Current Behavior
None.
Mango is limited to selector/'map' syntax, so to aggregate/'reduce' data, users have to turn to JavaScript MapReduce.
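For context, here is a minimal sketch of that JavaScript MapReduce route: a design document whose view sums totalAmount per date for "complete" orders. The design document id `_design/orders`, the view name `total_by_date`, and the field names (borrowed from the example order document elsewhere in this thread) are assumptions for illustration only.

```javascript
// Assumed names: "_design/orders" and "total_by_date" are illustrative only.
const designDoc = {
  _id: '_design/orders',
  views: {
    total_by_date: {
      // View map functions are stored as strings in the design document;
      // emit() is provided by CouchDB when the view is indexed.
      map: function (doc) {
        if (doc.status === 'complete') {
          emit(doc.date, doc.totalAmount);
        }
      }.toString(),
      // Built-in reduce that sums the emitted values.
      reduce: '_sum',
    },
  },
};

// Body to save in the database, e.g. with PUT /{db}/_design/orders.
console.log(JSON.stringify(designDoc, null, 2));
```

Once such a document is saved, querying the view with group=true (GET /{db}/_design/orders/_view/total_by_date?group=true) returns one summed row per date. That is the kind of result this issue would like to obtain from _find without writing JavaScript.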
Context
I'm a student in the final stages of my master's degree in computer science. I made a few non-code contributions to CouchDB. I would like to add this functionality to Mango, and make this my first code contribution to the project.