Some model checkpoints are expired or not available #14
Comments
Hi @vishaal27,
Hey @rtaori, thanks for your blazingly fast response! Is there anyone else with access who would be able to check this?
Potentially, let me check. But if you don't hear back within the next week, then probably there's no way to get these checkpoints :(
Sure, thanks for checking, really appreciate this :)
Hey,
I was running model evaluations on my own custom data-split for all models in the registry using:
where `<model>` comes from all the models in the registry (`python db.py --list-models-registry`).
However, for many of the models, I see a pickling error due to the checkpoint not being loaded correctly. See stack-trace below:
I see that this error happens for all of the low-resource models like `resnet18_100k_x_epochs`, `resnet18_50k_x_epochs`, etc. To fully ensure this is not an artefact of my own custom data-split, I also tested this on the imagenet-val split, with no success.
Are the low-resource models not available as checkpoints from the server?
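For anyone hitting the same pickling error, here is a minimal sketch of how one might check whether a downloaded checkpoint file is really a serialized checkpoint or something else (for example an error response saved in its place). The function name `sniff_checkpoint` and the returned labels are hypothetical and not part of this repository; the check only relies on two general facts: `torch.save` has written zip archives by default since PyTorch 1.6, and pickle streams of protocol 2 or higher start with the byte `0x80`.

```python
# Hypothetical helper, not part of this repository: guess what kind of file
# a downloaded "checkpoint" really is by looking at its first bytes.

ZIP_MAGIC = b"PK\x03\x04"  # zip archive header (default torch.save format since PyTorch 1.6)
PICKLE_MAGIC = b"\x80"     # PROTO opcode that begins pickle protocol 2+ streams


def sniff_checkpoint(path):
    """Return one of 'zip', 'pickle', 'markup', 'empty', 'unknown'."""
    with open(path, "rb") as f:
        head = f.read(64)  # the first few bytes are enough for this check
    if not head:
        return "empty"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(PICKLE_MAGIC):
        return "pickle"
    # HTML/XML bodies (e.g. a saved error page) start with '<' after whitespace.
    if head.lstrip().startswith(b"<"):
        return "markup"
    return "unknown"
```

If this reports `markup` or `empty` for a file, that would suggest the download itself failed and the unpickling error is a symptom rather than the cause; `zip` or `pickle` would point to a problem elsewhere.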
Also, another set of errors I get when running this is due to some checkpoints still being stored on the vasa endpoint, see:
Are some of the checkpoints not migrated fully yet?
Sorry for the long verbose issue, but hope we can get this resolved :)