Fix flaky ModelInsight tests #395

TuanNguyen27 · 2019-08-27T18:32:59Z

Related issues
https://github.com/salesforce/TransmogrifAI/blob/master/core/src/test/scala/com/salesforce/op/ModelInsightsTest.scala#L740

https://github.com/salesforce/TransmogrifAI/blob/master/core/src/test/scala/com/salesforce/op/ModelInsightsTest.scala#L762

are flaky due to a low tolerance threshold.

Describe the proposed solution
Change the test to compute a smaller ratio.

codecov · 2019-08-27T18:47:30Z

Codecov Report

Merging #395 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #395   +/-   ##
=======================================
  Coverage   86.82%   86.82%           
=======================================
  Files         336      336           
  Lines       10962    10962           
  Branches      572      572           
=======================================
  Hits         9518     9518           
  Misses       1444     1444


Jauntbox · 2019-08-27T19:56:07Z

core/src/test/scala/com/salesforce/op/ModelInsightsTest.scala

@@ -755,8 +755,8 @@ class ModelInsightsTest extends FlatSpec with PassengerSparkFixtureTest with Dou
 val bigCoeffSum = orginalbigCoeff * math.sqrt(smallFeatureVariance) / labelStd + descaledbigCoeff
 val absError2 = math.abs(originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd - descaledsmallCoeff)
 val smallCoeffSum = originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd + descaledsmallCoeff
- absError / bigCoeffSum < tol shouldBe true
- absError2 / smallCoeffSum < tol shouldBe true
+ absError / (2 * bigCoeffSum) < tol shouldBe true


Why did you hardcode this change instead of adjusting the tolerance? Is there a reason for a factor of 2 here?

I think back when I wrote this test you suggested abs(x_1 - x_2) / 2 * (x_1 + x_2) as one way to compute how close x_1 is to x_2, or did I misremember?

I suggested difference / avg. value, which is abs(x_1 - x_2) * 2 / (x_1 + x_2). It probably makes more sense to change it to that, but you should probably increase the overall tolerance too to cut down on flakiness.
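To make the formulas in this exchange concrete, here is a minimal Scala sketch. It is not the code in `ModelInsightsTest.scala`: the object and method names are hypothetical, and it assumes `x1 + x2 > 0`. It also illustrates that an expression written as `a / 2 * b` is evaluated as `(a / 2) * b` in Scala, so the parentheses matter.

```scala
object RelativeDifference {
  // Difference divided by the average value: |x1 - x2| / ((x1 + x2) / 2).
  // Assumes x1 + x2 > 0.
  def diffOverAverage(x1: Double, x2: Double): Double =
    2 * math.abs(x1 - x2) / (x1 + x2)

  // Difference divided by twice the sum: |x1 - x2| / (2 * (x1 + x2)).
  // This is a quarter of diffOverAverage, so it passes a fixed tolerance more easily.
  def diffOverTwiceSum(x1: Double, x2: Double): Double =
    math.abs(x1 - x2) / (2 * (x1 + x2))

  // Same symbols without parentheses: `/` and `*` associate left to right,
  // so this computes (|x1 - x2| / 2) * (x1 + x2), which is not a ratio at all.
  def unparenthesized(x1: Double, x2: Double): Double =
    math.abs(x1 - x2) / 2 * (x1 + x2)

  def main(args: Array[String]): Unit = {
    val (x1, x2) = (1.0, 3.0) // hypothetical values for illustration
    println(s"difference / average:   ${diffOverAverage(x1, x2)}")
    println(s"difference / (2 * sum): ${diffOverTwiceSum(x1, x2)}")
    println(s"unparenthesized:        ${unparenthesized(x1, x2)}")
  }
}
```

Because `diffOverAverage` is four times `diffOverTwiceSum`, switching between them changes how strict a given `tol` is, which is presumably why the tolerance is discussed alongside the formula.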

gerashegalov · 2019-08-29T21:58:34Z

core/src/test/scala/com/salesforce/op/ModelInsightsTest.scala

@@ -755,8 +755,8 @@ class ModelInsightsTest extends FlatSpec with PassengerSparkFixtureTest with Dou
 val bigCoeffSum = orginalbigCoeff * math.sqrt(smallFeatureVariance) / labelStd + descaledbigCoeff
 val absError2 = math.abs(originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd - descaledsmallCoeff)
 val smallCoeffSum = originalsmallCoeff * math.sqrt(bigFeatureVariance) / labelStd + descaledsmallCoeff
- absError / bigCoeffSum < tol shouldBe true
- absError2 / smallCoeffSum < tol shouldBe true
+ 2 * absError / bigCoeffSum < tol shouldBe true


If I understand your fix correctly, you double the left-hand side, so the test would also be fixed by:

absError should be < tol * smallCoeffSum

At any rate, please use "should be < tol", as it seems to read more clearly.

that would also provide a better error message ;) cool!
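The matcher style suggested in this thread can be sketched as follows. This is an illustrative spec, not the test in the PR: `ToleranceExample`, `ToleranceSpec`, and the numeric values are hypothetical, and the imports assume the ScalaTest 3.0.x package layout that matches the `FlatSpec` seen in the diff (newer releases use `org.scalatest.flatspec.AnyFlatSpec` and `org.scalatest.matchers.should.Matchers` instead).

```scala
import org.scalatest.{FlatSpec, Matchers}

object ToleranceExample {
  // Hypothetical values, chosen only for illustration.
  val absError = 0.004
  val coeffSum = 2.0
  val tol = 0.01
}

class ToleranceSpec extends FlatSpec with Matchers {
  import ToleranceExample._

  "a boolean assertion" should "pass when the ratio is under the tolerance" in {
    // On failure this can only report that a Boolean was false;
    // the compared numbers are already lost.
    (absError / coeffSum < tol) shouldBe true
  }

  "a comparison matcher" should "pass when the error is under the scaled tolerance" in {
    // Multiplying through by coeffSum (assumed positive) avoids the division,
    // and on failure the matcher can report both compared values.
    absError should be < tol * coeffSum
  }
}
```

Both tests check the same inequality when `coeffSum > 0`. The second form keeps the actual numbers visible to the matcher, which is the "better error message" benefit mentioned above.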

Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandadrdScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:
- Update tika version [#382](#382)

Update ModelInsightsTest.scala

1084fcf

TuanNguyen27 requested review from leahmcguire and tovbinm as code owners August 27, 2019 18:33

salesforce-cla bot added the cla:signed label Aug 27, 2019

TuanNguyen27 requested a review from Jauntbox August 27, 2019 18:33

TuanNguyen27 added the ready for review label Aug 27, 2019

Jauntbox reviewed Aug 27, 2019

TuanNguyen27 and others added 4 commits August 28, 2019 14:22

Update ModelInsightsTest.scala

ba82754

Update ModelInsightsTest.scala

5dc762c

Merge branch 'master' into tn/fix_flaky

9815b59

Merge branch 'master' into tn/fix_flaky

72f8f4b

tovbinm requested review from gerashegalov and wsuchy as code owners August 29, 2019 18:15

gerashegalov suggested changes Aug 29, 2019

TuanNguyen27 added 2 commits August 30, 2019 14:23

Update ModelInsightsTest.scala

5afb446

Update ModelInsightsTest.scala

b28865c

tovbinm approved these changes Aug 30, 2019

tovbinm merged commit b91ffe3 into master Aug 30, 2019

tovbinm deleted the tn/fix_flaky branch August 30, 2019 22:04

gerashegalov mentioned this pull request Sep 8, 2019

0.6.1 release #403

Merged
