Pre-processed data for the multimodal representation learning task #34

Closed
hathawayxxh opened this issue Aug 20, 2024 · 5 comments

Comments

@hathawayxxh

Hi Authors,

Thanks for your excellent work. I am very interested in developing algorithms based on the HEST-1k database.
I would like to know how to get access to the pre-processed data for the multimodal representation learning task, which corresponds to the experimental results in Table 2.
I look forward to your reply.

Best,
Xiaohan

@guillaumejaume
Collaborator

Hi Xiaohan, the experiments presented in Table 2 are based on all human Xenium breast samples (see the HEST-1k metadata). You can query those samples using our download pipeline (see tutorial 1). The only preprocessing we applied was log1p normalization. The code for contrastive alignment is not public yet, but it is quite standard.
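
For readers following along, here is a minimal sketch of the two steps mentioned above. It is not the authors' code (which is not public): `log1p_normalize` applies log(1 + x) to a spots-by-genes count matrix, and `info_nce` is one common form of contrastive alignment loss (symmetric InfoNCE). The function names, the list-of-lists data layout, and the temperature value of 0.07 are all assumptions made for illustration.

```python
import math


def log1p_normalize(counts):
    """Apply log(1 + x) element-wise to a spots-by-genes count matrix."""
    return [[math.log1p(c) for c in row] for row in counts]


def info_nce(image_emb, expr_emb, temperature=0.07):
    """Symmetric InfoNCE loss; row i of each input is assumed to be a matched pair.

    Assumption: this is a generic contrastive objective, not necessarily
    the exact loss used for Table 2.
    """
    def l2_normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]

    a = [l2_normalize(v) for v in image_emb]
    b = [l2_normalize(v) for v in expr_emb]

    # Cosine similarity between every image/expression pair, scaled by temperature.
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]

    def diagonal_cross_entropy(matrix):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(matrix):
            row_max = max(row)
            log_sum_exp = row_max + math.log(sum(math.exp(z - row_max) for z in row))
            total += log_sum_exp - row[i]
        return total / len(matrix)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(transposed))


# Toy usage with made-up values.
normalized = log1p_normalize([[0, 1, 9], [3, 0, 0]])
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

In practice the embeddings would come from an image encoder and an expression encoder, and the loss would be computed with an autodiff framework rather than plain Python lists.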

@hathawayxxh
Author

Hi Guillaume,
Thanks for your prompt reply. I have checked the metadata and noticed there are six Xenium invasive breast cancer samples (i.e., TENX 94-99). However, you indicated that you used five samples to finetune the CONCH model. Could you please indicate which five samples you used for finetuning? Thanks.

@guillaumejaume
Collaborator

We used NCBI783 (IDC), NCBI785 (IDC), TENX95 (IDC), TENX99 (IDC) and TENX96 (ILC). The others are duplicates that use different gene panels. You can still use them (3 additional samples), but there is redundancy.

@guillaumejaume
Collaborator

You can refer to the patient entry in the metadata.
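
A minimal sketch of how the patient entry could be used to drop redundant samples, assuming the metadata has been loaded as a list of dictionaries. The key names `id` and `patient` and the toy rows are hypothetical placeholders, not actual HEST-1k metadata values.

```python
def one_sample_per_patient(rows, id_key="id", patient_key="patient"):
    """Keep the first sample seen for each patient and return the kept sample IDs.

    Assumption: `id_key` and `patient_key` are illustrative names; check the
    real metadata for the actual column names.
    """
    seen_patients = set()
    kept_ids = []
    for row in rows:
        patient = row[patient_key]
        if patient in seen_patients:
            continue  # Same patient already represented, so skip the duplicate.
        seen_patients.add(patient)
        kept_ids.append(row[id_key])
    return kept_ids


# Made-up rows: S1 and S2 share a patient, so only S1 is kept.
toy_rows = [
    {"id": "S1", "patient": "P1"},
    {"id": "S2", "patient": "P1"},
    {"id": "S3", "patient": "P2"},
]
kept = one_sample_per_patient(toy_rows)
```

Which duplicate to keep is a design choice; this sketch simply keeps the first one encountered, so sort the rows beforehand if a particular gene panel is preferred.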

@hathawayxxh
Author

Thanks, I will try.
