Handle cytokit csv #14

Open
mccalluc opened this issue Feb 25, 2020 · 2 comments

Comments

@mccalluc
Contributor

mccalluc commented Feb 25, 2020

Jesus Penaloza gave us a sample CSV with this header:

region_index,tile_index,tile_x,tile_y,rid,rx,ry,id,x,y,z,cm:circularity,cm:diameter,cm:diameter_vx,cm:perimeter,cm:size,cm:size_vx,cm:solidity,nm:circularity,nm:diameter,nm:diameter_vx,nm:perimeter,nm:size,nm:size_vx,nm:solidity,cg:n_neighbors,cg:neighbor_ids,cg:adj_neighbor_pct,cg:adj_bg_pct,cb:on_border,nb:on_border,ci:DAPI-002:mean,ci:CD31:mean,ci:CD8:mean,ci:CD45:mean,ci:DAPI-003:mean,ci:CD20:mean,ci:Ki67:mean,ci:CD3e:mean,ci:DAPI-004:mean,ci:Actin:mean,ci:Podoplanin:mean,ci:CD68:mean,ci:DAPI-005:mean,ci:PanCK:mean,ci:CD21:mean,ci:CD4:mean,ci:DAPI-006:mean,ci:EMPTY:mean,ci:CD45RO:mean,ci:CD11c:mean,ci:DAPI-007:mean,ci:EMPTY:mean.1,ci:E_CAD:mean,ci:CD107a:mean,ci:DAPI-008:mean,ci:EMPTY:mean.2,ci:CD44:mean,ci:HistoneH3:mean,ni:DAPI-002:mean,ni:CD31:mean,ni:CD8:mean,ni:CD45:mean,ni:DAPI-003:mean,ni:CD20:mean,ni:Ki67:mean,ni:CD3e:mean,ni:DAPI-004:mean,ni:Actin:mean,ni:Podoplanin:mean,ni:CD68:mean,ni:DAPI-005:mean,ni:PanCK:mean,ni:CD21:mean,ni:CD4:mean,ni:DAPI-006:mean,ni:EMPTY:mean,ni:CD45RO:mean,ni:CD11c:mean,ni:DAPI-007:mean,ni:EMPTY:mean.1,ni:E_CAD:mean,ni:CD107a:mean,ni:DAPI-008:mean,ni:EMPTY:mean.2,ni:CD44:mean,ni:HistoneH3:mean

... and then it has 120K rows of data.

We want to make this into cells.json/arrow. Other questions:

Q:

For the first columns in the file, like “cm:diameter_vx” or “nm:solidity” is there documentation about the meaning of these fields?

A (Maria Keays):

Cytokit’s documentation is a bit thin currently. I think I’ve seen something somewhere about what these abbreviations mean, so I will try to dig that out … if memory serves, “cm” means “cell morphology” and “ni” means “nucleus intensity”.
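
Until the documentation turns up, the columns can at least be grouped by their prefix. The following is a minimal sketch, assuming the prefix is whatever precedes the first colon; `group_columns_by_prefix` is a hypothetical helper, and the code does not assign any meaning to the prefixes (only “cm” and “ni” are explained above, and only from memory).

```python
from collections import defaultdict


def group_columns_by_prefix(header):
    """Group column names by the text before the first colon.

    Columns without a colon (e.g. "x", "tile_index") go under the key "".
    """
    groups = defaultdict(list)
    for name in header:
        prefix, sep, _rest = name.partition(":")
        groups[prefix if sep else ""].append(name)
    return dict(groups)


# A short excerpt of the header quoted above.
header = (
    "id,x,y,z,cm:circularity,cm:diameter,nm:solidity,"
    "ci:CD31:mean,ni:CD31:mean"
).split(",")

groups = group_columns_by_prefix(header)
for prefix, names in groups.items():
    print(repr(prefix), names)
```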

Q:

To clarify the division of responsibilities, your pipelines would not give us the polygon, but instead we would need to compute it, given centroids and the segmentation mask?

A (Jesus Penaloza):

Yes.
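
One rough way to get a polygon per cell from the mask is sketched below. It assumes the segmentation mask is a 2D grid of integer labels with 0 as background, which this thread does not confirm, and it swaps in a convex hull (Andrew's monotone chain) over pixel coordinates as a stand-in for a real outline. Actual cell boundaries may be concave, so a contour-tracing routine would be more faithful. `cell_pixels` and `convex_hull` are hypothetical helper names.

```python
def cell_pixels(mask, cell_id):
    """Return (x, y) coordinates of every mask pixel labelled cell_id."""
    return [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value == cell_id
    ]


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in order around the boundary."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); the sign gives the turn direction.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


# Made-up 4x4 mask with a single cell labelled 7.
mask = [
    [0, 0, 0, 0],
    [0, 7, 7, 0],
    [0, 7, 7, 0],
    [0, 0, 0, 0],
]
polygon = convex_hull(cell_pixels(mask, 7))
print(polygon)
```

The vertices are pixel centres, so the polygon is slightly smaller than the labelled region; whether that matters depends on how the polygons are rendered.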

Q:

We’ve seen something like this before, and weren’t sure what the “EMPTY”s meant. Or would you be happy if we just presented the headers the same downstream?

A:

Empty and blank headers should be disregarded downstream. These files are used for background subtraction but are not needed for analysis.
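
Dropping those columns could look like the sketch below. It assumes the intensity columns follow the `prefix:marker:stat` pattern seen in the header above, and that a suffix such as `.1` or `.2` only appears after the statistic; `is_empty_channel` and `drop_empty_channels` are hypothetical names.

```python
def is_empty_channel(column):
    """True for columns such as "ci:EMPTY:mean" or "ni:EMPTY:mean.2"."""
    parts = column.split(":")
    return len(parts) == 3 and parts[1] == "EMPTY"


def drop_empty_channels(header):
    """Return the header without the EMPTY marker columns."""
    return [column for column in header if not is_empty_channel(column)]


# Excerpt of the header quoted above.
header = [
    "id",
    "ci:CD4:mean",
    "ci:EMPTY:mean",
    "ci:EMPTY:mean.1",
    "ni:EMPTY:mean.2",
    "ni:CD44:mean",
]
kept = drop_empty_channels(header)
print(kept)
```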

Q:

Is the neighbor information something you’d like us to do something with?

A:

It would be great to see this, since it could help further classify each single cell not only by marker but also by proximity.
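
A first step toward using proximity would be turning `cg:neighbor_ids` into an adjacency map. The sample rows are not quoted in this thread, so the encoding is an assumption: the sketch below guesses a delimited string of integer ids with `;` as the separator (exposed as a parameter), and assumes `id` identifies a cell uniquely within the rows passed in. `parse_neighbor_ids` and `build_adjacency` are hypothetical names, and the rows are made up.

```python
def parse_neighbor_ids(raw, sep=";"):
    """Split a delimited string of ids into a list of ints; blank gives []."""
    return [int(token) for token in raw.split(sep) if token.strip()]


def build_adjacency(rows, sep=";"):
    """Map each cell id to the list of its neighbours' ids."""
    return {
        int(row["id"]): parse_neighbor_ids(row["cg:neighbor_ids"], sep)
        for row in rows
    }


# Made-up rows; the separator is an assumption, not taken from the real file.
rows = [
    {"id": "1", "cg:neighbor_ids": "2;3"},
    {"id": "2", "cg:neighbor_ids": "1"},
    {"id": "3", "cg:neighbor_ids": ""},
]
adjacency = build_adjacency(rows)
print(adjacency)
```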

Q:

Each row is one cell, right?

A:

Yes sir
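
Given one row per cell, the conversion could be sketched as below. The output layout (`xy` plus a `features` map) is hypothetical and not the actual cells.json schema, the sample values are made up, and only numeric columns are handled; columns such as `cg:neighbor_ids` would need their own parsing. It also assumes `id` is unique across the file.

```python
import csv
import io
import json

# Made-up sample using a few column names from the header above.
SAMPLE = """id,x,y,cm:size,ci:CD4:mean
1,10.5,20.0,88.0,0.25
2,30.0,40.5,120.0,1.5
"""


def rows_to_cells(lines):
    """Build a dict keyed by cell id from an iterable of CSV lines."""
    cells = {}
    for row in csv.DictReader(lines):
        cell_id = row.pop("id")
        x = float(row.pop("x"))
        y = float(row.pop("y"))
        cells[cell_id] = {
            "xy": [x, y],
            # Everything left over is treated as a numeric feature.
            "features": {name: float(value) for name, value in row.items()},
        }
    return cells


cells = rows_to_cells(io.StringIO(SAMPLE))
print(json.dumps(cells, indent=2))
```

Because `csv.DictReader` streams rows, the same function can be pointed at an open file handle for the full 120K-row CSV.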

@mccalluc
Contributor Author

I think this was asked in a separate channel which I can't find right now, but can the CSV which my code receives be given a more distinctive name? Perhaps cytokit.csv, or something even more descriptive, so we can distinguish it from other CSVs that might be in the directory.

@mccalluc
Contributor Author

I learned on the call this morning that this is not the correct file. Waiting to be pointed at the correct one.
