Idea: Bivariate analyses #171

sebastian · 2020-06-19T14:25:13Z

At TeamBank, the data scientist said it would be radical if, when viewing the analysis/distribution of ages, he could select a second dimension (say gender) and see the breakdown per age.

If you do multi-column analyses, then this is likely data you already have available. We should just think about a way of meaningfully exposing it!

dandanlen · 2020-11-02T10:39:46Z

Now that we have multi-column correlations / joint probabilities, this is indeed closer to being a possibility. Enabling this functionality on-demand can be done in one of two ways, one more client-heavy, the other more explorer-heavy:

Packaging up and returning the joint probability matrices for all available column combinations, then using these client-side to build the 2-dimensional breakdown.

😞 Requires some client-side data analysis: the API consumer will need to perform some non-trivial work to generate visualisation data from raw probabilities.
😄 Easy to implement in the explorer.
😞 Requires a lot of extra data to be returned through the API (for n columns there are on the order of n^2 pairs).

Storing the matrices in the explorer and exposing a new API endpoint to request a multi-column data summary.

😄 Reuses existing explorer logic / datatypes for data analysis.
😄 Minimises work on the client-side.
😞 Requires new infrastructure in the explorer to allow persisting exploration results for subsequent requests.
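
To give a feel for the client-side work in the first option, here is a minimal sketch of turning a joint probability matrix into a per-age breakdown. The matrix layout (rows are age buckets, columns are genders), the function name and the numbers are all assumptions made up for illustration; the explorer's actual output format may differ.

```python
from typing import List


def breakdown_per_row(joint: List[List[float]]) -> List[List[float]]:
    """For each row value (e.g. an age bucket), return the conditional
    distribution over the column values (e.g. gender).

    Each row of the joint matrix is divided by its own sum, i.e.
    P(column | row) = P(row, column) / P(row).
    """
    result = []
    for row in joint:
        total = sum(row)  # marginal probability of this row value
        if total == 0:
            # No mass in this row: nothing meaningful to normalise.
            result.append([0.0] * len(row))
        else:
            result.append([p / total for p in row])
    return result


# Hypothetical joint probabilities: rows = age buckets, columns = genders.
joint = [
    [0.10, 0.15],  # age 20-29
    [0.30, 0.20],  # age 30-39
    [0.05, 0.20],  # age 40-49
]

conditional = breakdown_per_row(joint)        # gender split within each age bucket
marginal_age = [sum(row) for row in joint]    # the plain age distribution
```

The row sums recover the existing single-column distribution, so a client could draw the usual age histogram and split each bar by the conditional values.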

sebastian · 2020-11-02T10:51:57Z

I vote for providing more metrics through the API and letting the client make decisions based on the data.

sebastian · 2020-11-03T14:21:21Z

Let's ignore it for now. We can revisit given time.
What needs to be decided in that case is what shape the data should take to be somewhat consumable.
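
As a starting point for that discussion, one possible shape is a small self-describing record per column pair. This is only a sketch: the class name, the field names and the example values are hypothetical and not part of any existing API.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class PairSummary:
    """Hypothetical payload for one pair of columns."""
    columns: List[str]                 # the two column names, in [row, column] order
    row_values: List[str]              # labels for the rows of the matrix
    column_values: List[str]           # labels for the columns of the matrix
    probabilities: List[List[float]]   # joint probabilities, row-major

    def validate(self) -> None:
        # Check that the matrix dimensions match the labels.
        if len(self.columns) != 2:
            raise ValueError("expected exactly two column names")
        if len(self.probabilities) != len(self.row_values):
            raise ValueError("one matrix row is needed per row value")
        for row in self.probabilities:
            if len(row) != len(self.column_values):
                raise ValueError("one matrix entry is needed per column value")


summary = PairSummary(
    columns=["age", "gender"],
    row_values=["20-29", "30-39"],
    column_values=["f", "m"],
    probabilities=[[0.25, 0.25], [0.125, 0.375]],
)
summary.validate()
payload = json.dumps(asdict(summary))  # what would go over the wire
```

Carrying the labels next to the matrix keeps each record consumable on its own, at the cost of repeating the value labels for every pair a column takes part in.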

sebastian changed the title ~~Bivariate analyses~~ Idea: Bivariate analyses Jun 19, 2020

sebastian added the "low priority" label Nov 3, 2020
