Documentation for data format #91

SichongP · 2020-03-17T05:36:08Z

A draft of the documentation for the data format. Some HTML tables rendered fine with Sphinx but look weird on GitHub. This is my first time working with rst files, so let me know if there are any errors. Thanks :)

jnothman · 2020-03-17T22:26:15Z

This is looking like a good start, though I am not particularly happy with all the HTML tables in there. I think I'd be happier using nbsphinx so that you can just implement the doc in a Jupyter notebook. WDYT?

jnothman · 2020-03-17T22:26:52Z

Please also add the data format guide to the toctree in index.rst.

jnothman · 2020-03-17T22:30:47Z

The current version here is focused on "here's a representation and here's what it means and how to use it". Do you think it would be more informative if it was structured around different use cases (as listed in #57) and for each use case would give some examples of how you may already have the data represented, describing how to transform it for use in upsetplot?

SichongP · 2020-03-17T22:50:00Z

> This is looking like a good start, though I am not particularly happy with all the HTML tables in there. I think I'd be happier using nbsphinx so that you can just implement the doc in a Jupyter notebook. WDYT?

I actually wrote it in a notebook, so yes, nbsphinx sounds like a great idea!

SichongP · 2020-03-17T23:26:12Z

> The current version here is focused on "here's a representation and here's what it means and how to use it". Do you think it would be more informative if it was structured around different use cases (as listed in #57) and for each use case would give some examples of how you may already have the data represented, describing how to transform it for use in upsetplot?

I thought it'd be helpful to start by explaining the data structures required for input in more detail, so people can reformat their data (which may come in a wide variety of forms) to this format.

I do think that a few examples of different use cases can be very helpful. Here is my understanding of your points:

Representing counts only

Is the movies dataset good enough for this use case? We can demonstrate how to use subset_size='count' with a DataFrame, as well as how to group data, filter by counts, and plot with sum_over.
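To make the grouping and filtering steps concrete, here is a minimal pandas-only sketch. The small `movies` frame, its column names, and the `votes` attribute are made up for illustration and are not the real movies dataset. The boolean-indexed results are the kind of input I'd expect to hand to upsetplot, with `subset_size='count'` or `sum_over` covering the counting and summing at plot time.

```python
import pandas as pd

# Hypothetical stand-in for the movies data: one row per movie,
# one boolean column per genre, plus a numeric attribute.
movies = pd.DataFrame({
    "Action": [True, True, False, False, True],
    "Comedy": [False, True, True, True, False],
    "votes": [100, 40, 250, 10, 60],
})

genres = ["Action", "Comedy"]

# Index each row by its boolean genre memberships.
indexed = movies.set_index(genres)

# Group: number of movies in each genre combination.
counts = indexed.groupby(level=genres).size()

# Filter by counts: keep only combinations with at least two movies.
frequent = counts[counts >= 2]

# Sum an attribute per combination instead of counting rows.
vote_sums = indexed.groupby(level=genres)["votes"].sum()

print(counts)
print(frequent)
print(vote_sums)
```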

Representing data elements and their associated sets

Are you talking about something like this?

[
    ['cat0', 'cat1'],
    ['cat0', 'cat2'],
    ['cat1']
]

I find it difficult to see a use case with this kind of data format... Do you have an example I can start with?
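For reference, a list like the one above can be read as "one inner list per element, naming the categories that element belongs to" (for instance, the genres of each movie). upsetplot has a `from_memberships` helper for this kind of input; the sketch below only uses pandas to show the boolean layout such a list turns into, and the variable names are my own.

```python
import pandas as pd

# One inner list per data element, naming the categories it belongs to.
memberships = [
    ["cat0", "cat1"],
    ["cat0", "cat2"],
    ["cat1"],
]

# Every category that appears anywhere, in a stable order.
categories = sorted({cat for members in memberships for cat in members})

# One row per element, one boolean column per category.
indicators = pd.DataFrame(
    [[cat in members for cat in categories] for members in memberships],
    columns=categories,
)

# Number of elements in each combination of categories,
# indexed by the boolean membership flags.
counts = indicators.groupby(categories).size()

print(indicators)
print(counts)
```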

Representing additional attributes for each data element

Would this be the Boston data example in the docs? I think that is a great example by itself and could just be reused here.

jnothman · 2020-03-17T23:30:18Z

> I actually wrote it in notebook so yes nbsphinx sounds like a great idea!

Are you able to make this change?

SichongP · 2020-03-18T18:58:27Z

> > I actually wrote it in notebook so yes nbsphinx sounds like a great idea!
>
> Are you able to make this change?

Okay hopefully it should be working with my ipynb file now.
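For anyone following along, enabling nbsphinx usually comes down to listing it in the Sphinx `conf.py`. This is a minimal sketch, not the exact configuration from this PR; the other extension shown and the exclude pattern are assumptions.

```python
# conf.py (fragment): a sketch of enabling notebook rendering in Sphinx.

extensions = [
    "sphinx.ext.autodoc",  # assumed to be present already
    "nbsphinx",            # renders .ipynb files as documentation pages
]

# Skip Jupyter's autosave copies so they are not built as pages.
exclude_patterns = ["_build", "**.ipynb_checkpoints"]
```

The notebook then still has to be listed in the toctree in index.rst, as requested earlier in the thread.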

jnothman · 2021-06-06T15:24:12Z

Thank you @SichongP! Sorry for the delay!

Documentation for data format

d8ceb51

SichongP mentioned this pull request Mar 17, 2020

Warnlarge #84

Closed

SichongP added 3 commits March 18, 2020 11:41

remove rst files

c4bf465

add ipynb file

1d74ed1

add nbsphinx module

69c9440

SichongP and others added 2 commits March 21, 2020 08:06

Add format guide doc to index

a4bcba2

Tweaks to style

e2d27d1

jnothman merged commit b5fb27f into jnothman:master Jun 6, 2021
