Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question about the calculation of time-gap matrix #294

Closed
GMavrak opened this issue Jan 10, 2024 · 5 comments
Closed

Question about the calculation of time-gap matrix #294

GMavrak opened this issue Jan 10, 2024 · 5 comments

Comments

@GMavrak
Copy link

GMavrak commented Jan 10, 2024

Hello,
I wish you a happy new year.

I have a question about the calculation of the delta matrix. By examining the nested function of the _parse_delta_torch function, the following code for populating the matrix for time-step > 1:
d.append(torch.ones(1, n_features, device=device) + (1 - mask[step]) * d[-1])
Shouldn't be mask[step - 1] instead of mask[step], as the calculation of the time-gap in time-step t depends upon the mask value of the time-step t-1?

Thank you,
George

@WenjieDu
Copy link
Owner

Hi there 👋,

Thank you so much for your attention to PyPOTS! You can follow me on GitHub to receive the latest news of PyPOTS. If you find PyPOTS helpful to your work, please star⭐️ this repository. Your star is your recognition, which can help more people notice PyPOTS and grow PyPOTS community. It matters and is definitely a kind of contribution to the community.

I have received your message and will respond ASAP. Thank you for your patience! 😃

Best,
Wenjie

@WenjieDu
Copy link
Owner

@GMavrak Thank you! I'll investigate this.

@WenjieDu
Copy link
Owner

Hi @GMavrak, I've made PR #297 to keep the delta matrix calculation the same as the paper proposal. Could you please review it?

And here are some of my comments that may deserve attention:

  1. I didn't observe obvious performance change after the modification in my PR;
  2. given a missing mask with 5 time steps and 2 features
array([[False,  True],
       [ True,  True],
       [ True,  True],
       [False,  False],
       [ True,  False]])

Clearly, the missingness masks of the 1st and 2nd features are different, especially at the 5th step.

With the current calculation method, the delta matrix is

array([[0., 0.],
       [1., 1.],
       [1., 1.],
       [2., 2.],
       [1., 3.]])

and is different in the 1st feature and 2nd one.

But with the calculation in the PR, the delta matrix we get is

array([[0., 0.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [2., 2.]])

both features have the same values.

It seems that the current calculation is more reasonable.

@GMavrak
Copy link
Author

GMavrak commented Jan 15, 2024

@WenjieDu I have reviewed the PR and confirm that it is OK.

Concerning the comments, I see the point. However, the way that the time-gap matrix is defined, as far as I understand it, is:

  • not dependent on the features, i.e. it does not make sense to compare the features in this context.
  • each element in the delta matrix is the time-gap from the last (valid) observation to the current timestamp, irrespective of the timestamp's value.

Thank you,
George

@WenjieDu
Copy link
Owner

Thanks, George. Merged the PR.

That's true and I agree with you. My comments are left for issue tracking and for related information retrieval in the future.

If you like PyPOTS, please consider joining us on Slack https://join.slack.com/t/pypots-org/shared_invite/zt-1gq6ufwsi-p0OZdW~e9UW_IA4_f1OfxA 🤗

@GMavrak GMavrak closed this as completed Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants