This issue was moved to a discussion.

does sequence length of each TS sample have to be the same? #315

Closed
Wulin-Tan opened this issue Mar 19, 2024 · 7 comments
Labels
question Further information is requested

Comments

@Wulin-Tan

Issue description

Hi PyPOTS team,
PyPOTS is a great tool!
I tried the PyPOTS example that uses the 'physionet_2012' data. That example contains many samples, each of which is a time series, which is similar to my own data. However, the 'physionet_2012' samples all have the same sequence length, while my samples have different sequence lengths and some missing values that need to be imputed before the next step of analysis.

Can I still use PyPOTS for this? Any suggestions?

Wulin-Tan added the "question" label Mar 19, 2024
@WenjieDu
Owner

Hi there 👋,

Thank you so much for your attention to PyPOTS! You can follow me on GitHub to receive the latest news about PyPOTS. If you find PyPOTS helpful to your work, please star⭐️ this repository. Your star is your recognition, and it helps more people notice PyPOTS and grows the PyPOTS community. It matters and is definitely a kind of contribution to the community.

I have received your message and will respond ASAP. Thank you for your patience! 😃

Best,
Wenjie

@WenjieDu
Owner

Hi Wulin, thanks for raising this question. The answer is yes: so far, all algorithms in PyPOTS require the input samples to have the same length. This has already been discussed in another issue, #139. We are planning to make our methods able to handle samples of different lengths. You can follow me on GitHub and join our community to receive the latest news about PyPOTS.

@WenjieDu
Owner

BTW, here is a suggestion that can help you continue modeling your dataset with PyPOTS. If your time series samples do not have strict beginnings and ends, you can apply a sliding window function to split each long time series sample into several shorter ones. After that, all of your data samples share the same length and you can continue with PyPOTS.
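A minimal NumPy sketch of the sliding-window idea, under stated assumptions: each sample is an array of shape `(n_steps, n_features)` with `NaN` marking missing values, and the helper names `window_starts` and `split_into_windows` are hypothetical, not part of PyPOTS. The sample lengths, window length, and step are made up for illustration.

```python
import numpy as np


def window_starts(n_steps, window_len, step):
    """Start indices of fixed-length windows covering a series of n_steps."""
    if n_steps < window_len:
        return []  # too short to fill even one window
    starts = list(range(0, n_steps - window_len + 1, step))
    # Add a final window flush with the end so no trailing steps are dropped.
    if starts[-1] + window_len < n_steps:
        starts.append(n_steps - window_len)
    return starts


def split_into_windows(series, window_len, step):
    """Cut one (n_steps, n_features) series into equal-length windows."""
    starts = window_starts(series.shape[0], window_len, step)
    if not starts:
        return np.empty((0, window_len, series.shape[1])), starts
    windows = np.stack([series[s:s + window_len] for s in starts])
    return windows, starts


# Two made-up samples with different lengths; NaN marks a missing value.
rng = np.random.default_rng(0)
sample_a = rng.normal(size=(35, 3))
sample_b = rng.normal(size=(50, 3))
sample_a[4, 1] = np.nan

windows_a, starts_a = split_into_windows(sample_a, window_len=20, step=15)
windows_b, starts_b = split_into_windows(sample_b, window_len=20, step=15)

# All windows now share one length, so they can be stacked into one dataset.
dataset = np.concatenate([windows_a, windows_b])
```

Keeping the start indices next to the windows is a design choice: they are what later lets each window be placed back at the right position in its original sample.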

@Wulin-Tan
Author

Wulin-Tan commented Mar 19, 2024

Hi, @WenjieDu
I'm interested in your idea. To make good use of PyPOTS, what window size and step size do you suggest?
Also, do you plan to include a sliding window + PyPOTS example in the tutorial? I would like to give it a try.

@WenjieDu
Owner

No, we don't offer such an example in our current tutorials. The idea is simple, and it is more about data preprocessing than about the toolbox itself. But you can use the sliding window function offered in PyPOTS: https://docs.pypots.com/en/latest/pypots.data.html#pypots.data.utils.sliding_window

If this scenario is important to you, as I mentioned above, I'd suggest you follow me and join the PyPOTS community, because we will release something awesome in the coming months.

@Wulin-Tan
Author

Wulin-Tan commented Mar 21, 2024

Hi, @WenjieDu
I am trying your suggestion.

  1. After cutting the time series into small chunks with sliding(window_len=20, sliding=15), sample A became sub-samples A1 and A2, and sample B became sub-samples B1, B2, and B3 (A and B have different lengths, but A1, A2, B1, B2, and B3 all have the same length). How do I proceed with PyPOTS? Do I need to run PyPOTS on A1/A2/B1/B2/B3 one by one?
  2. After PyPOTS processing, is there any function in PyPOTS that merges the imputed A1/A2/B1/B2/B3 back into A and B for downstream analysis?
  3. I look forward to your further suggestions. Thank you.

@WenjieDu
Owner

WenjieDu commented Apr 1, 2024

Hi Wulin, yes, you can use the models in PyPOTS to impute all of your A1, A2, B1, B2, and B3 samples and then merge them back into the raw samples A and B. For the overlapping parts, you can use an averaging strategy: average the imputed values where windows overlap to get the final imputation result. PyPOTS currently doesn't offer a function for this merging operation, but you can write one yourself; I believe it's not difficult.
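The averaging strategy could be sketched roughly as follows in plain NumPy. Everything here is an assumption for illustration: `merge_windows` is a hypothetical helper rather than a PyPOTS function, the lengths of A (35) and B (50) are made up so that a window of 20 with a step of 15 yields two and three windows, and the `imputed` array is a stand-in for model output, not a real imputation.

```python
import numpy as np


def merge_windows(windows, starts, n_steps):
    """Average overlapping windows back into one (n_steps, n_features) series."""
    window_len, n_features = windows.shape[1], windows.shape[2]
    total = np.zeros((n_steps, n_features))
    count = np.zeros((n_steps, 1))
    for window, start in zip(windows, starts):
        total[start:start + window_len] += window
        count[start:start + window_len] += 1
    merged = np.full((n_steps, n_features), np.nan)  # uncovered steps stay NaN
    covered = count[:, 0] > 0
    merged[covered] = total[covered] / count[covered]
    return merged


# Hypothetical bookkeeping for the stacked batch [A1, A2, B1, B2, B3]:
# which sample each window came from and where it started.
owners = ["A", "A", "B", "B", "B"]
starts = [0, 15, 0, 15, 30]
lengths = {"A": 35, "B": 50}

# Stand-in for imputed windows of shape (5, 20, 1); window i is filled with i
# so the effect of averaging is easy to see.
imputed = np.arange(5, dtype=float).reshape(5, 1, 1) * np.ones((5, 20, 1))

restored = {}
for name, n_steps in lengths.items():
    idx = [i for i, owner in enumerate(owners) if owner == name]
    restored[name] = merge_windows(imputed[idx], [starts[i] for i in idx], n_steps)
```

Because all windows have the same shape, they can be imputed together as one batch rather than one by one. After merging, it may be preferable to keep the originally observed values and take only the missing positions from the merged result, for example with `np.where(np.isnan(original), merged, original)`.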

Repository owner locked and limited conversation to collaborators Apr 8, 2024
WenjieDu converted this issue into discussion #341 Apr 8, 2024
