Ensure correct metrics despite model failures on some CV folds #404

leahmcguire · 2019-09-09T20:28:49Z

Related issues
The GLM optimizer is not very stable, so runs sometimes fail in some CV folds but not in others. This caused metrics and grids to be computed incorrectly, because we assumed that the same number of parameter grids ran successfully in every cross-validation fold.

Describe the proposed solution
Throw out grids that are not present in every CV run (since they are likely to fail in some modeling runs).
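
A minimal sketch of that filtering step, using only the Scala standard library. `FoldGrids` and `gridsInEveryFold` are hypothetical names invented for illustration, and grids are represented as plain strings rather than Spark `ParamMap`s. The sketch compares each grid's count with the number of folds, whereas the diff under review compares against the largest count observed.

```scala
// Hypothetical stand-in for one fold's results: the parameter grids that
// trained successfully in that fold, identified here by plain strings.
final case class FoldGrids(grids: Seq[String])

object GridFilter {
  /** Returns the grids that appear in every fold. */
  def gridsInEveryFold(folds: Seq[FoldGrids]): Set[String] = {
    // Count the number of folds each grid appears in; `distinct` guards
    // against a grid being listed twice within a single fold.
    val gridCounts: Map[String, Int] =
      folds.flatMap(_.grids.distinct)
        .groupBy(identity)
        .map { case (grid, occurrences) => grid -> occurrences.size }

    // Keep only the grids that were seen in all folds.
    gridCounts.collect { case (grid, count) if count == folds.size => grid }.toSet
  }
}
```

With three folds where grid `b` fails in the second one, only the grids that survived everywhere remain candidates for the best model.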

Describe alternatives you've considered
Use grids with metrics in a map to make sure that the correct metrics are combined across CV folds, and simply take the best metric.


codecov · 2019-09-09T20:48:25Z

Codecov Report

Merging #404 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #404      +/-   ##
==========================================
- Coverage   86.96%   86.95%   -0.02%     
==========================================
  Files         337      337              
  Lines       11054    11060       +6     
  Branches      361      591     +230     
==========================================
+ Hits         9613     9617       +4     
- Misses       1441     1443       +2

Impacted Files	Coverage Δ
...orce/op/stages/impl/tuning/OpCrossValidation.scala	`97.95% <100%> (+0.28%)`	⬆️
...es/src/main/scala/com/salesforce/op/OpParams.scala	`85.71% <0%> (-4.09%)`	⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 95a77b1...6687a34.

tovbinm · 2019-09-09T21:05:56Z

core/src/main/scala/com/salesforce/op/stages/impl/tuning/OpCrossValidation.scala

- log.info(s"Best set of parameters:\n${grid(bestIndex)}")
+ require(folds.map(_.model.uid).toSet.size == 1) // Should be called only on instances of the same model
+ val gridCounts = folds.map(_.grids.map(_ -> 1).toMap).reduce(_ + _)
+ val maxFolds = gridCounts.maxBy(_._2)._2


For better readability, please replace `_._2` in `val maxFolds = gridCounts.maxBy(_._2)._2` with `val (_, maxFolds) = gridCounts.maxBy { case (_, folds) => folds }`.
The same applies to a few operations below.

leahmcguire · 2019-09-09T21:58:48Z

@tovbinm @gerashegalov I discovered some fun behavior when using the algebird implicits for maps:

import com.twitter.algebird.Operators._

val ones = Seq(Map("a" -> 1, "b" -> 1), Map("a" -> 1, "b" -> 1))
val combinedOnes = ones.reduce(_ + _) // Map("a" -> 2, "b" -> 2), as expected

val zeros = Seq(Map("a" -> 0, "b" -> 0), Map("a" -> 0, "b" -> 0))
val combinedZeros = zeros.reduce(_ + _) // Map() :-(

I think I may need to write out the reduce function...
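
One way such a hand-written reduce function could look, modelled on the `keys.map(...)` lines quoted later in this review. This is a sketch in plain Scala without algebird; `MapSum` and `plus` are hypothetical names, and `Int` values are used to match the example above.

```scala
object MapSum {
  /** Adds two maps key by key, keeping every key even when its sum is 0. */
  def plus(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] = {
    // Visit the union of keys so that a key missing on one side counts as 0.
    val keys = m1.keySet.union(m2.keySet)
    keys.map(k => k -> (m1.getOrElse(k, 0) + m2.getOrElse(k, 0))).toMap
  }
}
```

Reducing the all-zero input with `MapSum.plus` keeps both keys `a` and `b` with value 0 instead of collapsing to an empty map.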

gerashegalov · 2019-09-09T23:32:17Z

core/src/main/scala/com/salesforce/op/stages/impl/tuning/OpCrossValidation.scala

- else metrics.zipWithIndex.minBy(_._1)
- log.info(s"Best set of parameters:\n${grid(bestIndex)}")
+ require(folds.map(_.model.uid).toSet.size == 1) // Should be called only on instances of the same model
+ val gridCounts = folds.map(_.grids.map(_ -> 1).toMap).reduce(_ + _)


Is it the same as the following?

folds.flatMap(_.grids.map(_ -> 1)).sumByKey
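
A small plain-Scala comparison of the two counting styles may help answer that. `sumPairsByKey` is a hand-written stand-in for algebird's `sumByKey` (an assumption made so the sketch has no dependencies), and folds are modelled as sequences of grid names. Under these assumptions the two styles agree when each grid appears at most once per fold, and differ when a fold lists a grid twice, because `toMap` collapses duplicates before counting.

```scala
object CountStyles {
  // Stand-in for algebird's sumByKey: sums the values of equal keys.
  def sumPairsByKey(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy { case (key, _) => key }
      .map { case (key, kvs) => key -> kvs.map { case (_, v) => v }.sum }

  // Style 1: build one map per fold, then merge the maps pairwise.
  def countViaMaps(folds: Seq[Seq[String]]): Map[String, Int] =
    folds.map(fold => fold.map(grid => grid -> 1).toMap)
      .foldLeft(Map.empty[String, Int]) { (acc, m) =>
        m.foldLeft(acc) { case (a, (k, v)) => a.updated(k, a.getOrElse(k, 0) + v) }
      }

  // Style 2: flatten all folds into (grid, 1) pairs, then sum per grid.
  def countViaPairs(folds: Seq[Seq[String]]): Map[String, Int] =
    sumPairsByKey(folds.flatMap(fold => fold.map(grid => grid -> 1)))
}
```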

gerashegalov · 2019-09-10T00:00:17Z

core/src/main/scala/com/salesforce/op/stages/impl/tuning/OpCrossValidation.scala

+ val keys = m1.keySet.union(m2.keySet)
+ keys.map(k => k -> (m1.getOrElse(k, 0.0) + m2.getOrElse(k, 0.0))).toMap
+ }
+ .filterKeys(gridsIn.contains)


Let's filter first so that we have less to reduce, maybe:

val gridMetrics = folds.flatMap(f => f.grids.zip(f.metrics))
  .collect { case (pm, met) if gridsIn.contains(pm) => (pm, met / maxFolds) }
  .sumByKey
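
The same filter-then-sum idea can be sketched without algebird. `Fold` and `averageMetrics` are hypothetical names for illustration, grids are plain strings, and the divisor is taken to be the number of folds, which matches `maxFolds` only when every retained grid ran in all folds.

```scala
object FoldAverages {
  // Hypothetical fold result: grids and their metrics, aligned by position.
  final case class Fold(grids: Seq[String], metrics: Seq[Double])

  /** Averages each retained grid's metric across folds, dropping other grids first. */
  def averageMetrics(folds: Seq[Fold], gridsIn: Set[String]): Map[String, Double] = {
    val numFolds = folds.size.toDouble
    folds.flatMap(f => f.grids.zip(f.metrics))
      // Filter before summing, so discarded grids never reach the reduction.
      .collect { case (grid, metric) if gridsIn.contains(grid) => grid -> (metric / numFolds) }
      .groupBy { case (grid, _) => grid }
      .map { case (grid, parts) => grid -> parts.map { case (_, share) => share }.sum }
  }
}
```

Dividing each metric by the fold count before summing yields the per-grid mean directly, so no second pass over the result is needed.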

tovbinm · 2019-09-10T05:32:06Z

@leahmcguire The Map behavior is as expected under monoid rules: entries whose value equals the value monoid's zero are dropped from the summed map. To keep the map keys with 0 values, use a semigroup for the map values. Example for Map[String, Long] below:

import com.twitter.algebird._
import com.twitter.algebird.Operators._

implicit val longSemigroup = Semigroup.from[Long](_ + _)
implicit val mapLongMonoid = Monoid.mapMonoid[String, Long](longSemigroup)

// works!
(Map("a" -> 0L) + Map("b" -> 0L)) shouldBe Map("a" -> 0L, "b" -> 0L)

gerashegalov · 2019-09-10T17:10:59Z

core/src/main/scala/com/salesforce/op/stages/impl/tuning/OpCrossValidation.scala

- if (evaluator.isLargerBetter) metrics.zipWithIndex.maxBy(_._1)
- else metrics.zipWithIndex.minBy(_._1)
- log.info(s"Best set of parameters:\n${grid(bestIndex)}")
+ require(folds.map(_.model.uid).toSet.size == 1) // Should be called only on instances of the same model


Good to be defensive. My concern is that we call it on every iteration, in a private method that is entirely in scope here, and by construction we already know that the folds are for the same model.

Sure, I can remove it and just put a description on the method.

gerashegalov

LGTM, some comments

tovbinm

nice!

Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandadrdScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:
- Update tika version [#382](#382)

allow for model to fail on some CV folds and still get the metrics calculations correct

ca2afde

leahmcguire requested review from gerashegalov, Jauntbox, tovbinm and wsuchy as code owners September 9, 2019 20:28

fixed import

cf373e8

leahmcguire requested a review from mweilsalesforce September 9, 2019 20:30

changed strategy to throw out partial failures

e5b76c0

leahmcguire mentioned this pull request Sep 9, 2019

0.6.1 release #403

Merged

tovbinm reviewed Sep 9, 2019

got rid of _._1 notation and fixed test that was failing because of weird aggregator behavior

d9f07eb

gerashegalov reviewed Sep 10, 2019

leahmcguire added 3 commits September 10, 2019 09:24

changed to flatmap and sumbykey

5f92294

removing map

7f40ad2

remove toSeq

8fe125e

gerashegalov reviewed Sep 10, 2019

gerashegalov approved these changes Sep 10, 2019

leahmcguire and others added 2 commits September 10, 2019 10:26

removed require

7f8eb4b

scalastyle fix

85eeab2

salesforce-cla bot added the cla:signed label Sep 10, 2019

leahmcguire added 7 commits September 10, 2019 11:12

cleanup

917f13c

Merge branch 'lm/cvFix' of github.com:salesforce/TransmogrifAI into lm/cvFix

872fe35

explicity added summation rules

2714cd5

test fix

842d016

test fix

39189a2

test fix

71649f1

test fix

17cb9fa

tovbinm approved these changes Sep 10, 2019

cleanup

6687a34

gerashegalov changed the title ~~allow for model to fail on some CV folds and still get the metrics calculations correct~~ Ensure correct metrics despite model failures on some CV folds Sep 11, 2019

gerashegalov merged commit d223c46 into master Sep 11, 2019

gerashegalov deleted the lm/cvFix branch September 11, 2019 04:42

leahmcguire restored the lm/cvFix branch September 11, 2019 18:50

leahmcguire deleted the lm/cvFix branch September 11, 2019 19:23

tovbinm mentioned this pull request Sep 25, 2019

XGBoost error code 255 #181

Closed
