How can I customize my own dataset to fit PyPOTS SOTA imputation models? #141
Comments
Hi there 👋, Thank you so much for your attention to PyPOTS! If you find PyPOTS helpful to your work, please star⭐️ this repository. Your star is your recognition, and it helps more people notice PyPOTS and grows the PyPOTS community. It matters and is definitely a kind of contribution to the community. I have received your message and will respond ASAP. Thank you for your patience! 😃 Best, |
Hi, thank you for raising this issue. The only thing you need to do is make sure that, after preprocessing, the data you feed into the models has 3 dimensions: [n_samples, n_steps, n_features]. |
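To make the expected shape concrete, here is a minimal sketch that cuts a 2D table of readings into non-overlapping samples. The helper name `to_3d` and the toy data are assumptions for illustration only; they are not part of PyPOTS.

```python
import numpy as np


def to_3d(values, n_steps):
    """Split a 2D array [total_steps, n_features] into non-overlapping
    samples and return an array [n_samples, n_steps, n_features].

    Hypothetical helper for illustration, not a PyPOTS function.
    """
    values = np.asarray(values, dtype=float)
    if values.ndim != 2:
        raise ValueError("expected a 2D array [total_steps, n_features]")
    total_steps, n_features = values.shape
    n_samples = total_steps // n_steps
    if n_samples == 0:
        raise ValueError("n_steps is longer than the series")
    # Drop the trailing remainder that does not fill a whole sample.
    trimmed = values[: n_samples * n_steps]
    return trimmed.reshape(n_samples, n_steps, n_features)


table = np.arange(20, dtype=float).reshape(10, 2)  # 10 time steps, 2 features
table[3, 1] = np.nan  # missing values may stay in the model input
X = to_3d(table, n_steps=5)
print(X.shape)
```

Here n_steps is the number of consecutive time steps inside one sample, and n_samples is how many such samples the series yields.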
What does n_steps indicate in my dataset? Does data preprocessing consist of: |
Yes, of course, cleaning and normalization are part of preprocessing. Machine learning is not magic; you have to prepare the data before a model can process it. |
In my case, the number of time steps in each sample is the same as the length of the dataframe. |
Please try searching on Google or GitHub; I believe you can figure it out quickly. This is not a complicated algorithm, just a simple method. |
Thanks a lot! |
My pleasure! @abhishekju06 I just remembered that you can find such a sliding window function among the data-processing utilities in the SAITS repo here. If you are using the SAITS model for your data imputation and find it helpful, please kindly consider starring 🌟 the SAITS repo so that more people notice this useful model. Many thanks! |
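For readers who want to see the idea without opening the SAITS repo, here is a minimal sliding-window sketch. It is not the function from the SAITS repo; the name `sliding_windows` and its parameters are assumptions made for this example.

```python
import numpy as np


def sliding_windows(values, window_len, stride=1):
    """Cut one long series [total_steps, n_features] into overlapping
    windows, returning [n_samples, window_len, n_features].

    Illustrative sketch only, not the SAITS utility.
    """
    values = np.asarray(values, dtype=float)
    total_steps = values.shape[0]
    if window_len > total_steps:
        raise ValueError("window_len is longer than the series")
    # Start index of every window; the last window must fit entirely.
    starts = range(0, total_steps - window_len + 1, stride)
    return np.stack([values[s : s + window_len] for s in starts])


series = np.arange(24, dtype=float).reshape(12, 2)  # 12 time steps, 2 features
windows = sliding_windows(series, window_len=4, stride=2)
print(windows.shape)
```

A single dataframe whose length equals the whole recording then yields many samples instead of one, with window_len playing the role of n_steps.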
Can you please help me understand
import h5py
for key in f.keys():
for key in group_train.keys():
dataset_for_training = {
dataset_for_validating = {
dataset_for_testing = {
from pypots.optim import Adam
saits = SAITS(
I am getting this error:
File "C:\Users\1234\AppData\Local\Temp\ipykernel_13552\1758113713.py", line 41, in
File "C:\Users\1234\Anaconda3\envs\pypots\lib\site-packages\pypots\imputation\saits\model.py", line 420, in fit
File "C:\Users\1234\Anaconda3\envs\pypots\lib\site-packages\pypots\imputation\base.py", line 352, in _train_model
AttributeError: 'float' object has no attribute 'item'
Please help! |
According to the info you provided, I think the error is caused by input data that was not properly prepared. Please check whether there are |
Should NaN values be present in my input dataset before I generate the dataset, i.e., can my attribute columns contain NaN values? |
Datasets with missing values are fine; PyPOTS is designed for data with missing values. But after generation, indicating_mask and X_intact should not contain NaNs, and the missing part of X_intact should be filled with some value such as 0, because PyPOTS uses them for loss calculation. NaN in indicating_mask or X_intact will result in a NaN loss, as in your case. |
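The rule above can be sketched with NumPy alone. The function `make_holdout`, the holdout rate, and the masked-error check are assumptions for illustration; only the names X, X_intact, and indicating_mask come from this thread, and the snippet does not call PyPOTS.

```python
import numpy as np


def make_holdout(X_raw, holdout_rate=0.2, seed=0):
    """Build (X, X_intact, indicating_mask) from data that may contain NaN.

    Hypothetical helper: hides a random share of the observed values.
    """
    rng = np.random.default_rng(seed)
    X_raw = np.asarray(X_raw, dtype=float)
    observed = ~np.isnan(X_raw)
    # Hold out only values that were actually observed.
    holdout = observed & (rng.random(X_raw.shape) < holdout_rate)
    X = X_raw.copy()
    X[holdout] = np.nan  # model input: original plus artificial missing values
    X_intact = np.nan_to_num(X_raw, nan=0.0)  # reference values, NaN filled with 0
    indicating_mask = holdout.astype(np.float32)  # 1 where a value was held out
    return X, X_intact, indicating_mask


X_raw = np.arange(24, dtype=float).reshape(2, 4, 3)
X_raw[0, 1, 2] = np.nan  # an originally missing value
X, X_intact, indicating_mask = make_holdout(X_raw)

# Neither array used for the error computation contains NaN,
# so a masked error over the held-out positions stays finite.
prediction = np.zeros_like(X_intact)
masked_mae = (np.abs(prediction - X_intact) * indicating_mask).sum() / (
    indicating_mask.sum() + 1e-9
)
print(np.isnan(X_intact).any(), np.isnan(indicating_mask).any(), np.isfinite(masked_mae))
```

Only X keeps its NaNs; a single NaN left in X_intact or indicating_mask would propagate through the sum and turn the whole error into NaN.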
So |
Sorry for missing a |
I have replaced NaN values with 0 in the 'indicating_mask'. dataset_for_validating = { If not, please let me know what X_intact stands for. In SAITS/dataset_generating_scripts, line 86 reads: X_hat[indices_for_holdout] = np.nan # X_hat contains artificial missing values. It is evident that X_hat must contain NaN, as it represents artificial missing values. So where am I going wrong? |
Please read the paper first https://arxiv.org/abs/2202.08516. Thanks. |
Hi, without your help it is not possible for me to understand. |
Please read it carefully and take a look at the model's implementation code here for reference. |
Thanks a ton! |
No problem. If you have further questions regarding the SAITS model, you're welcome to raise issues in the SAITS repo https://github.com/WenjieDu/SAITS/issues. |
1. Feature description
I want to run PyPOTS SOTA models on my own dataset.
2. Motivation
I have a multivariate dataset and want to check how well PyPOTS models perform on it for data imputation.
3. Your contribution
None so far