Keys in hdf5 #31

Open
maverick0004 opened this issue Jun 14, 2019 · 2 comments

Comments

@maverick0004

Hi, nice work done here.
I wanted to ask: after preprocessing the raw data to an hdf5 file, the only keys are primary, mask and tertiary. Does this mean that model training only looks at the amino acid sequence? According to AlQuraishi's paper, shouldn't the input be amino acid sequence + PSSM?

@JeppeHallgren
Collaborator

Hey @maverick0004! Correct, currently this only uses the amino acid sequence. However, since the PSSM data is in the ProteinNet data set, it should be quick to include it in the hdf5/model :) The relevant code for parsing the ProteinNet format is here: https://github.com/OpenProtein/openprotein/blob/master/preprocessing.py#L53
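
For illustration, here is a rough sketch of what including the PSSM might look like. It is not the code in preprocessing.py. It assumes the ProteinNet text layout, where the `[EVOLUTIONARY]` section holds one whitespace-separated row per profile channel (commonly 21 rows: 20 PSSM columns plus information content), each with one value per residue. The function names and the amino acid ordering below are hypothetical, chosen only for this example.

```python
# Sketch only: helper names and the amino acid ordering are assumptions,
# not taken from OpenProtein's preprocessing.py.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 residues


def parse_evolutionary(lines):
    """Turn [EVOLUTIONARY] rows (channels x L) into per-residue vectors (L x channels)."""
    rows = [[float(x) for x in line.split()] for line in lines if line.strip()]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("profile rows are missing or differ in length")
    # Transpose so that each residue gets its own list of channel values.
    return [list(column) for column in zip(*rows)]


def one_hot(sequence):
    """One-hot encode each residue against AMINO_ACIDS."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in sequence]


def build_features(sequence, evolutionary_lines):
    """Concatenate one-hot sequence and profile values for every residue."""
    profile = parse_evolutionary(evolutionary_lines)
    if len(profile) != len(sequence):
        raise ValueError("profile length does not match sequence length")
    return [encoded + channels for encoded, channels in zip(one_hot(sequence), profile)]
```

With 21 profile rows, each residue ends up with a 41-dimensional vector (20 one-hot values plus 21 profile values). That per-residue matrix could then be stored under an additional hdf5 key next to primary, mask and tertiary, and the model's input size would need to change to match.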

@maverick0004
Author

maverick0004 commented Jun 17, 2019

@JeppeHallgren So if the model takes only the amino acid sequence as input, aren't the predictions less accurate than those obtained with sequence + PSSM, as done by AlQuraishi?
