Grouped reductions break ICU collation #2008

Closed
davisp opened this issue Apr 19, 2019 · 3 comments

@davisp
Member

davisp commented Apr 19, 2019

ICU Collation of Grouped Reductions is broken

Description

This was reported by @nevans on the CouchDB Slack. It turns out that we're not properly applying UCA sorting to grouped reduce rows, because the rows are stored in an Erlang dict.
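To make the failure mode concrete, here is a minimal sketch in Python rather than Erlang. It is not CouchDB code: `toy_sort_key` is a hypothetical stand-in for an ICU sort key that only drops U+FE0F, the single code point that differs between the two keys used in the script below. It assumes, as this report states, that the two keys collate as equal.

```python
# -*- coding: utf-8 -*-

def toy_sort_key(s):
    # Hypothetical stand-in for an ICU/UCA sort key (assumption, not a real
    # collator): it only drops U+FE0F, the variation selector.
    return s.replace(u"\ufe0f", u"")

keys = [u"\u2708\ufe0f New Arrivals", u"\u2708 New Arrivals"]

# A dict groups by exact key equality, so the two keys land in
# separate entries even though they collate as equal.
rows_by_raw_key = {}
for key in keys * 10:
    rows_by_raw_key[key] = rows_by_raw_key.get(key, 0) + 1

print(len(rows_by_raw_key))                              # 2
print(toy_sort_key(keys[0]) == toy_sort_key(keys[1]))    # True
```

The dict ends up with two entries for what the collator treats as a single key, which is the shape of the bug described above.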

Steps to Reproduce

#!/usr/bin/env python

import json

import requests


s = requests.session()
s.auth = ("adm", "pass")
s.headers["Content-Type"] = "application/json"

DB_URL = "https://127.0.0.1:15984/test"

s.delete(DB_URL)
r = s.put(DB_URL)
# q=1&n=1 "fixes" the issue because fabric only
# ends up with a single group
#r = s.put(DB_URL, params={"q": "1", "n": "1"})
r.raise_for_status()


ddoc = {
    "_id": "_design/bar",
    "views": {
        "uca": {
            "map": "function(doc) {emit(doc.value, null);}",
            "reduce": "_count"
        }
    }
}
r = s.put(DB_URL + "/_design/bar", data=json.dumps(ddoc))
r.raise_for_status()

values = [u"\u2708\ufe0f New Arrivals", u"\u2708 New Arrivals"]

for i in range(10):
    for v in values:
        doc = {"value": v}
        r = s.post(DB_URL, data=json.dumps(doc))
        r.raise_for_status()

print("View Rows")
r = s.get(DB_URL + "/_design/bar/_view/uca", params={"reduce": "false"})
r.raise_for_status()
print(json.dumps(r.json(), indent=4, sort_keys=True))
print("")

print("Total Reduce")
r = s.get(DB_URL + "/_design/bar/_view/uca")
r.raise_for_status()
print(json.dumps(r.json(), indent=4, sort_keys=True))
print("")

print("Grouped Reduce")
r = s.get(DB_URL + "/_design/bar/_view/uca", params={"group": "true"})
r.raise_for_status()
print(json.dumps(r.json(), indent=4, sort_keys=True))

Expected Behaviour

The script above returns two grouped rows whose keys sort as equal under UCA, where a single grouped row is expected.
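As a rough illustration of the desired result, the sketch below merges grouped rows from two shards by comparing keys through a collation key instead of by exact equality. Everything in it is an assumption made for illustration: the per-shard rows and counts are made up, `merge_grouped_rows` is a hypothetical helper, and `toy_sort_key` merely drops U+FE0F in place of a real ICU collator. It is not how fabric merges rows.

```python
# -*- coding: utf-8 -*-
from itertools import groupby

def toy_sort_key(s):
    # Hypothetical stand-in for an ICU/UCA sort key (assumption).
    return s.replace(u"\ufe0f", u"")

def merge_grouped_rows(shard_rows):
    # Flatten rows from all shards and order them by collation key.
    rows = sorted(
        (row for shard in shard_rows for row in shard),
        key=lambda row: toy_sort_key(row["key"]),
    )
    merged = []
    # Fold adjacent rows whose keys collate as equal into one row.
    for _, group in groupby(rows, key=lambda row: toy_sort_key(row["key"])):
        group = list(group)
        merged.append({
            # Which of the equal-collating keys is reported is arbitrary
            # here; this sketch keeps the first one seen.
            "key": group[0]["key"],
            "value": sum(row["value"] for row in group),
        })
    return merged

# Made-up per-shard results for a _count reduce with group=true.
shard_rows = [
    [{"key": u"\u2708\ufe0f New Arrivals", "value": 12}],
    [{"key": u"\u2708 New Arrivals", "value": 8}],
]

merged = merge_grouped_rows(shard_rows)
print(len(merged))         # 1
print(merged[0]["value"])  # 20
```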

Your Environment

Unimportant

Additional context

There's previous work that addresses a similar, but not quite identical, issue for map views. A fix for this issue will look somewhat like the changes in this commit:

1be5506

@kocolosk
Member

kocolosk commented Nov 4, 2021

Rediscovered in #3773 and fixed by #3783

@kocolosk kocolosk closed this as completed Nov 4, 2021
@nevans

nevans commented Nov 20, 2021

Fixed? 🎉 🥳 🎉

@kocolosk
Member

Better late than never!
