frequency: add --other option #1774

Closed
jqnatividad opened this issue Apr 23, 2024 · 5 comments · Fixed by #1775

Comments

@jqnatividad
Owner

When compiling frequency tables with the various limit options, give the user the option to add an "other" row that sums the counts of all values beyond the limit. For example, running frequency with the default limit of 10 would produce:

field,value,count
state,NY,100
state,NJ,70
state,CA,60
state,MA,55
state,FL,45
state,TX,43
state,NM,40
state,AZ,39
state,NV,38
state,MI,35
state,OTHER,250
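
To illustrate the aggregation described above, here is a minimal Python sketch. It is not qsv's implementation (qsv is written in Rust); the function name `frequency_with_other` and its parameters are hypothetical, chosen only for this example.

```python
from collections import Counter


def frequency_with_other(values, limit=10, other_text="OTHER"):
    """Return (value, count) rows for the `limit` most common values,
    followed by one extra row summing the counts of everything else.

    Hypothetical helper for illustration; not part of qsv.
    """
    ranked = Counter(values).most_common()  # sorted by count, descending
    rows = ranked[:limit]
    remainder = ranked[limit:]
    if remainder:
        # Collapse every value beyond the limit into a single row.
        rows.append((other_text, sum(count for _, count in remainder)))
    return rows


states = ["NY"] * 3 + ["NJ"] * 2 + ["CA", "MA"]
for value, count in frequency_with_other(states, limit=2):
    print(f"state,{value},{count}")
```

With `limit=2`, the two single-occurrence states (CA and MA) are folded into one `OTHER` row with a count of 2.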
@rzmk
Collaborator

rzmk commented Apr 23, 2024

Another thing that may be useful is a flag that adds a column with the percentage (or decimal fraction), computed as frequency / (total rows). Could maybe use qsv count for the total rows part.

@rzmk
Collaborator

rzmk commented Apr 23, 2024

Also I'm curious how --other would handle a value that is itself named OTHER.

@jqnatividad
Owner Author

The percentage option is a great idea. And for --other, maybe we can add another option called --other-text with a default value of OTHER. It should be easy to add some field value collision logic on the off chance a valid field value is OTHER.

@rzmk
Collaborator

rzmk commented Apr 23, 2024

Maybe for --other it could be consolidated to --other <text> with default OTHER?
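
One way to picture a single flag that is off by default, falls back to OTHER when given without a value, and accepts custom text otherwise is the sketch below. It uses Python's `argparse` purely for illustration; qsv has its own argument parsing in Rust, and the `build_parser` helper here is an assumption, not the project's code.

```python
import argparse


def build_parser():
    """Hypothetical parser showing an optional-value --other flag."""
    parser = argparse.ArgumentParser(prog="frequency-sketch")
    parser.add_argument(
        "--other",
        nargs="?",        # the value after --other is optional
        const="OTHER",    # used when --other is given with no value
        default=None,     # used when --other is absent (feature off)
        metavar="TEXT",
        help="add a row aggregating values beyond the limit, labeled TEXT",
    )
    return parser


parser = build_parser()
print(parser.parse_args([]).other)                   # flag absent
print(parser.parse_args(["--other"]).other)          # flag with default text
print(parser.parse_args(["--other", "REST"]).other)  # flag with custom text
```

The design choice is that one flag carries both the on/off switch and the label, so a separate --other-text option is not needed.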

@jqnatividad
Owner Author

jqnatividad commented Apr 23, 2024

Given @rzmk's feedback, we'll add a percentage column:

field, value, count, percentage
state, NY, 100, 12.90
state, NJ, 70, 9.03
state, CA, 60, 7.74
state, MA, 55, 7.10
state, FL, 45, 5.81
state, TX, 43, 5.55
state, NM, 40, 5.16
state, AZ, 39, 5.03
state, NV, 38, 4.90
state, MI, 35, 4.52
state, OTHER (40), 250, 32.26

with the OTHER text followed by the number of other unique field values that were aggregated, enclosed in parentheses.
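
The arithmetic behind the table above can be checked with a short Python sketch. It assumes each percentage is count / (sum of all counts) × 100, formatted to two decimals, which matches the figures shown (for example 100 / 775 ≈ 12.90). The function name `frequency_table` is hypothetical, and the 40 filler values are made up only so that the remainder sums to 250.

```python
def frequency_table(counts, limit=10, other_text="OTHER"):
    """Return (value, count, percentage) rows from a value -> count mapping.

    Values beyond `limit` are folded into one row labeled
    "<other_text> (<number of unique values folded in>)".
    Hypothetical helper for illustration; not part of qsv.
    """
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top, rest = ranked[:limit], ranked[limit:]
    rows = list(top)
    if rest:
        rows.append((f"{other_text} ({len(rest)})", sum(c for _, c in rest)))
    # Percentages are taken against the grand total, including the folded rows.
    return [(value, count, f"{100 * count / total:.2f}") for value, count in rows]


top_ten = {"NY": 100, "NJ": 70, "CA": 60, "MA": 55, "FL": 45,
           "TX": 43, "NM": 40, "AZ": 39, "NV": 38, "MI": 35}
# 40 made-up values whose counts sum to 250 (10 * 7 + 30 * 6).
filler = {f"X{i}": (7 if i < 10 else 6) for i in range(40)}

for value, count, pct in frequency_table({**top_ten, **filler}):
    print(f"state,{value},{count},{pct}")
```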
