Data viewing / Batch Reconstruction utility #35

Closed
5 of 6 tasks
haileyschoelkopf opened this issue Dec 18, 2022 · 1 comment
Assignees
Labels
feature request New feature or request

Comments

@haileyschoelkopf
Collaborator

haileyschoelkopf commented Dec 18, 2022

I need to upload a utility / sample guide showing how to inspect the data ordering and extract the batch seen at a given timestep. This will essentially be a cleaned-up version of the memorized-sequence util I'm working on.

Features we want:

  • Verified correct dataloader construction (verify w/ memorization)
  • Tool to extract + save a single timestep's batch to a numpy file
  • Take as argument a YML file from this repo
  • Detach from GPT-NeoX repo fully (or preserve this utility in a separate branch of NeoX)
  • Argparse to select: model, mode, some default statistics?
  • Support stepping through data over all timesteps + recording a statistic
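As a rough illustration of the batch-extraction and statistic-recording items above, here is a minimal sketch. It assumes the pre-shuffled training data is available as a 2-D array of token IDs whose rows are already in training order, so that step `s` consumes rows `s * batch_size` through `(s + 1) * batch_size - 1`. That layout, and the names `extract_batch`, `save_batch`, and `statistic_per_step`, are hypothetical and not taken from this repo or from GPT-NeoX; the real dataloader may index data differently.

```python
import os
import tempfile

import numpy as np


def extract_batch(tokens: np.ndarray, step: int, batch_size: int) -> np.ndarray:
    """Return the rows consumed at training step `step`.

    Assumption (not from the repo): `tokens` has shape
    (num_sequences, seq_len) and its rows are stored in training order.
    """
    num_steps = len(tokens) // batch_size
    if not 0 <= step < num_steps:
        raise IndexError(f"step {step} out of range [0, {num_steps})")
    start = step * batch_size
    return tokens[start:start + batch_size]


def save_batch(tokens: np.ndarray, step: int, batch_size: int, out_path: str) -> np.ndarray:
    """Extract one step's batch and write it to a .npy file."""
    batch = extract_batch(tokens, step, batch_size)
    np.save(out_path, batch)
    return batch


def statistic_per_step(tokens: np.ndarray, batch_size: int, stat_fn) -> list:
    """Step through every full batch in order and record stat_fn(batch)."""
    num_steps = len(tokens) // batch_size
    return [stat_fn(extract_batch(tokens, s, batch_size)) for s in range(num_steps)]


if __name__ == "__main__":
    # Toy stand-in for the real token data: 6 sequences of length 4.
    toy_tokens = np.arange(24).reshape(6, 4)

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "batch_step1.npy")
        save_batch(toy_tokens, step=1, batch_size=2, out_path=path)
        reloaded = np.load(path)
        print(reloaded)

    # Example statistic: mean token ID per batch.
    print(statistic_per_step(toy_tokens, 2, lambda b: float(b.mean())))
```

For a dataset too large to hold in memory, the same slicing would work on an array opened with `np.memmap`, since only the requested rows are read. Reading the batch size and data path from a YML config and exposing `step` through argparse would sit on top of these functions.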
@haileyschoelkopf haileyschoelkopf self-assigned this Dec 18, 2022
@haileyschoelkopf haileyschoelkopf added the feature request New feature or request label Dec 18, 2022
@haileyschoelkopf
Collaborator Author

This was completed in #47!

Projects
Status: Done