Scaling with NaN #135
Comments
If needed, we could refactor the code to make this possible, but it has never come up until now. The reason is that we normally only calculate metrics such as correlation on matched time series, in which NaN values have already been removed from both series. What is the application of having a scaled time series that includes NaN values? The NaN values do not change anyway, and if you need arrays of consistent size across your application you can add them back after the scaling is done.
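The "drop, scale, add back" workflow described in this comment could look roughly like the sketch below. It is only an illustration: `scale_then_restore_nans` is a hypothetical helper, not a pytesmo function, and it uses a plain least-squares fit via `numpy.polyfit` as a stand-in for the library's linreg scaling.

```python
import numpy as np
import pandas as pd


def scale_then_restore_nans(src, ref):
    """Hypothetical helper: scale only the matched pairs, then pad with NaN."""
    # Keep only the timestamps where both series have a value.
    matched = pd.concat({"src": src, "ref": ref}, axis=1).dropna()
    # Least-squares fit ref ~ slope * src + intercept on the matched pairs.
    slope, intercept = np.polyfit(matched["src"], matched["ref"], deg=1)
    scaled = intercept + slope * matched["src"]
    # Restore the original length; dropped timestamps come back as NaN.
    return scaled.reindex(src.index)


index = pd.date_range("2020-01-01", periods=5, freq="D")
src = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0], index=index)
ref = pd.Series([3.0, np.nan, 7.0, 9.0, 11.0], index=index)

result = scale_then_restore_nans(src, ref)
```

Note that with this approach the second candidate value (2.0) ends up as NaN in `result`, because its reference counterpart is missing. The array size is preserved, but the candidate value itself is lost.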
It's not about keeping the NaNs, but about keeping values in the candidate that don't have a counterpart in the reference.
What you want is probably something similar to what we already provide for CDF matching. You want to calculate slope and intercept in one function and then apply them in another. See https://github.com/TUW-GEO/pytesmo/blob/master/pytesmo/scaling.py#L240 and https://github.com/TUW-GEO/pytesmo/blob/master/pytesmo/scaling.py#L266 for the CDF matching example. Feel free to refactor the
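A minimal sketch of that split, assuming a simple least-squares regression: one function estimates slope and intercept from the coinciding values only, and a second applies them to the whole candidate series. The names `linreg_params` and `linreg_apply` are made up for this illustration and are not claimed to match anything in pytesmo.

```python
import numpy as np


def linreg_params(src, ref):
    """Hypothetical: fit ref ~ slope * src + intercept on pairs without NaN."""
    src = np.asarray(src, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Use only positions where both series have a valid value.
    valid = ~np.isnan(src) & ~np.isnan(ref)
    slope, intercept = np.polyfit(src[valid], ref[valid], deg=1)
    return slope, intercept


def linreg_apply(src, slope, intercept):
    """Hypothetical: apply the fitted line to every candidate value."""
    # NaN in src stays NaN; all other values are scaled.
    return intercept + slope * np.asarray(src, dtype=float)


src = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
ref = np.array([3.0, np.nan, 7.0, 9.0, 11.0])

slope, intercept = linreg_params(src, ref)
scaled = linreg_apply(src, slope, intercept)
```

Here the candidate value 2.0 has no reference counterpart, yet it is still scaled because the parameters are applied to the full `src` array. Only the position where `src` itself is NaN remains NaN.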
Exactly. I will try something like this. If you want, I can make a PR, and then you can decide whether to include it.
Please make a PR. It shouldn't change much and could be useful.
I made a PR. It was a very minor change after all, but it "fixes" the linreg scaling as I described above. If you want it in a separate function, we could do that (but I don't see a benefit yet).
Hi! I'm using the linreg scaling for bias correction. Currently I get an error when the candidate and reference time series contain NaNs at different points in time, because the regression cannot be calculated if the two time series don't match. But why can't we fit the model on the coinciding values (dropna before linreg) and apply the correction to ALL values of the candidate (including its NaNs)? Then every non-NaN candidate value would be scaled and no values would be dropped. I think this would be a better solution, but I guess there's a reason why you implemented it differently?