Allow some tolerance in test comparison #219

castelao · 2024-06-17T16:22:21Z

Is it OK to downgrade the comparison from array_equal to allclose with equal_nan? In this particular case I don't see any reason why the results wouldn't be identical, so array_equal should work, but the test is now failing on Kestrel. Has anyone else had the same issue?

Note that the defaults are relative tol = 1e-5 & absolute tol = 1e-8. So the question is whether this is acceptable or whether we need to dig in to understand where this tiny difference comes from.

I'm also considering NaNs as equal values.
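For readers unfamiliar with the two comparisons, here is a minimal sketch of how they differ. The arrays are made-up stand-ins, not the actual test data; the tolerance rule in the comment is the one NumPy documents for `np.allclose`.

```python
import numpy as np

# Hypothetical reference and recomputed outputs (not the real test data).
expected = np.array([10.0, 5.0, 0.0])
actual = expected + np.array([1e-6, 0.0, 0.0])  # tiny floating-point drift

# Exact comparison fails on any difference, however small.
exact = np.array_equal(actual, expected)

# np.allclose checks element-wise: |actual - expected| <= atol + rtol * |expected|
rtol, atol = 1e-5, 1e-8  # NumPy defaults
close = np.allclose(actual, expected, rtol=rtol, atol=atol)

# The same criterion written out by hand.
manual = bool(np.all(np.abs(actual - expected) <= atol + rtol * np.abs(expected)))

# NaN is not considered close to NaN unless equal_nan=True is passed.
with_nan = np.array([1.0, np.nan])
nan_default = np.allclose(with_nan, with_nan)
nan_as_equal = np.allclose(with_nan, with_nan, equal_nan=True)
```

Note that the rule is not symmetric: the relative tolerance is scaled by the second argument, so it is usually clearer to pass the reference array second.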

grantbuster · 2024-06-17T16:38:49Z

Can you answer:

Why are there NaNs in the output? What fraction of the outputs are NaN? I don't think fwp should ever result in NaN outputs.
It looks like the outputs are u100m/v100m; what is the max difference?

castelao · 2024-06-17T17:03:49Z

Removed equal_nan.
Max difference: 1.92e-6
allclose() passes with rtol=1e-05, atol=1e-10.
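As a rough illustration of how numbers like these can be obtained, the sketch below computes a maximum absolute difference and then applies the same tolerances. The arrays are hypothetical and only chosen so that the largest difference matches the reported 1.92e-6; they are not the real u100m/v100m outputs.

```python
import numpy as np

# Hypothetical stand-ins for the stored and recomputed wind components.
expected = np.array([3.2, -7.5, 12.1, 0.8])
actual = expected + np.array([1.92e-6, -4.0e-7, 0.0, 1.0e-7])

# Largest element-wise absolute difference.
max_abs_diff = np.max(np.abs(actual - expected))

# With atol this small, the relative term rtol * |expected| does the work.
passes = np.allclose(actual, expected, rtol=1e-5, atol=1e-10)

# The same absolute difference on a value near zero would not pass,
# because rtol * |expected| shrinks with the magnitude of the value.
near_zero_expected = np.array([0.01])
near_zero_actual = near_zero_expected + 1.92e-6
fails_near_zero = not np.allclose(
    near_zero_actual, near_zero_expected, rtol=1e-5, atol=1e-10
)
```

So a very small atol keeps the check strict for values close to zero while still absorbing small relative differences on larger values.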

grantbuster · 2024-06-17T17:31:30Z

Okay, this all looks good. Just wanted to make sure there weren't NaNs in the output... that would be problematic haha

castelao · 2024-06-17T17:38:42Z

In that case, shall I put equal_nan back and add a separate check for whether there are any NaNs? If this ever fails, we would have a better hint about what the problem is.

Another question. I'm intrigued by why we didn't have this difference before. I know that something changed in cuDNN to approximate some values for speed, but I have no idea if that is the source. This difference seems small enough to ignore for now, but should I open an issue so we don't forget to check this in the future?

grantbuster · 2024-06-17T17:53:00Z

Don't add equal_nan back... The current test will fail if there are any NaNs, which is appropriate. If you want to add another line that is assert not np.isnan(arr).any(), that makes sense to me. Fine to open an issue if you think it will be useful, but I don't want you spending a ton of time tracking this down, as this really should not affect the sup3r outputs.
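A minimal sketch of that suggestion, assuming a small helper is acceptable in the test module. The name `assert_close_no_nan` is hypothetical and not part of sup3r or NumPy; the tolerances are the ones quoted earlier in this thread.

```python
import numpy as np


def assert_close_no_nan(actual, expected, rtol=1e-5, atol=1e-10):
    """Hypothetical helper: fail loudly on NaN, then compare within tolerance."""
    actual = np.asarray(actual)
    expected = np.asarray(expected)

    # Explicit NaN check first, so a failure points at the real problem.
    assert not np.isnan(actual).any(), "output contains NaN"

    # equal_nan is left at its default (False), so NaNs can never pass here.
    assert np.allclose(actual, expected, rtol=rtol, atol=atol), (
        f"max abs difference {np.max(np.abs(actual - expected)):.3e} "
        "exceeds tolerance"
    )


# Passes: the difference is well inside the relative tolerance.
assert_close_no_nan([1.0, 2.0], [1.0, 2.0 + 1e-6])
```

Keeping the two assertions separate means a NaN in the output is reported as such rather than as a generic tolerance failure.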

Allow some tolerance in test comparison

d8dfd06

Note that the default is relative tol = 1e-5 & absolute tol = 1e-8. I'm also considering NaNs as equal values.

castelao requested review from grantbuster and bnb32 June 17, 2024 16:22

castelao self-assigned this Jun 17, 2024

Removing nan_equal

11f0547

As suggested by @grantbuster since it was not supposed to have any NaN.

grantbuster approved these changes Jun 17, 2024

castelao merged commit 2f53a4c into main Jun 17, 2024
9 checks passed

castelao deleted the fix/forward_pass_test_tolerance branch June 17, 2024 18:39
