Modifying WeatherDataset.__getitem__ to return timestamps #64

Open
leifdenby opened this issue Jun 24, 2024 · 2 comments · May be fixed by #66

@leifdenby
Member

I'm making good progress on #54, and in going through it I noticed that @sadamov you modified the return signature of WeatherDataset.__getitem__ to also return batch_times (which it looks like are np.datetime64 values converted to strings). I can see the use of this, e.g. for being able to plot the input and predictions from the model with timestamps. I think if we want to be able to make these plots with timestamps we can avoid returning the time here too. I'm not sure about using strings though...

What are your thoughts on this @sadamov and @joeloskarsson?

@joeloskarsson
Collaborator

That is indeed the idea of including this in the batch.

I don't have any strong opinions on what format these should have, as long as they can be easily converted to np.datetime64. Can they just be kept as np.datetime64? We don't really need to turn these into PyTorch objects, and there is no need to send them to the GPU.

Optimally we would not have to write a custom collate function (https://pytorch.org/docs/stable/data.html#working-with-collate-fn) for this, but just use the default one (https://github.com/pytorch/pytorch/blob/35c8f93fd238d42aaea8fd6e730c3da9e18257cc/torch/utils/data/dataloader.py#L196). I think it would be sufficient to just batch these up in a python list rather than something more fancy.
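
For reference, a minimal sketch of how the candidate formats convert to and from np.datetime64, using NumPy only. The helper names (encode_times, decode_times, times_to_strings) and the sample timestamps are made up for illustration and are not part of the codebase. The int64 variant is included on the assumption that plain integer arrays are something the default collate function can batch; that part is not exercised here.

```python
import numpy as np


def encode_times(times):
    """Hypothetical helper: np.datetime64 values -> int64 nanoseconds since the epoch."""
    return np.asarray(times, dtype="datetime64[ns]").astype(np.int64)


def decode_times(encoded):
    """Hypothetical helper: int64 nanoseconds since the epoch -> np.datetime64[ns]."""
    return np.asarray(encoded, dtype=np.int64).astype("datetime64[ns]")


def times_to_strings(times):
    """Hypothetical helper: np.datetime64 values -> ISO 8601 strings (second precision)."""
    return np.datetime_as_string(np.asarray(times, dtype="datetime64[ns]"), unit="s")


# Two illustrative timestamps, three hours apart.
sample_times = np.array(
    ["2024-06-24T00:00", "2024-06-24T03:00"], dtype="datetime64[ns]"
)

# Integer round trip: lossless at nanosecond precision.
encoded = encode_times(sample_times)
restored = decode_times(encoded)

# String round trip: lossless as long as the chosen unit keeps enough precision.
as_strings = times_to_strings(sample_times)
from_strings = as_strings.astype("datetime64[ns]")
```

Both round trips recover the original values, so either format satisfies the "easily converted to np.datetime64" requirement; the difference is mainly in how each one behaves when batched.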

@sadamov
Collaborator

sadamov commented Jun 24, 2024

Having them in a simple format available for plotting sounds good to me; no strong opinion about which format exactly (currently it is indeed a list of strings). @leifdenby you once mentioned that you would rather keep track of the datetime in another fashion and remove the batch_times from __getitem__. Did you have another solution in mind already?

@sadamov sadamov linked a pull request Sep 6, 2024 that will close this issue
20 tasks