
Making model selectors robust to failing models #372

Merged: 14 commits merged into master on Aug 2, 2019

Conversation

@leahmcguire (Collaborator) commented on Jul 26, 2019

Related issues
#370

Describe the proposed solution
Make the model selector not fail when a portion of the attempted models don't work. Also exposed a parameter to limit the maximum time the model selector will wait for modeling to finish.
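A minimal sketch of the control flow this PR introduces, using plain scala.concurrent primitives; ModelSummary, fitAndScore, and validateAll below are placeholders for illustration, not the actual TransmogrifAI API:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.util.control.NonFatal

// Stand-ins for the real estimators and summaries; only the control flow is illustrated.
final case class ModelSummary(modelName: String, metric: Double)

def fitAndScore(modelName: String): ModelSummary = ModelSummary(modelName, metric = 0.8)

def validateAll(modelNames: Seq[String], maxWait: Duration): Array[ModelSummary] = {
  // Each attempted model runs in its own Future; a failure is logged and becomes None.
  val attempts: Seq[Future[Option[ModelSummary]]] = modelNames.map { name =>
    Future(Option(fitAndScore(name))).recover {
      case NonFatal(e) =>
        println(s"Model $name failed: ${e.getMessage}") // the PR uses log.warn here
        None
    }
  }
  // A single bounded wait (maxWait) covers all attempts; failed models simply drop out.
  Await.result(Future.sequence(attempts), maxWait).flatten.toArray
}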


codecov bot commented on Jul 26, 2019

Codecov Report

Merging #372 into master will increase coverage by 0.02%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #372      +/-   ##
=========================================
+ Coverage   86.77%   86.8%   +0.02%     
=========================================
  Files         336     336              
  Lines       10922   10928       +6     
  Branches      335     354      +19     
=========================================
+ Hits         9478    9486       +8     
+ Misses       1444    1442       -2
Impacted Files Coverage Δ
...op/stages/impl/selector/ModelSelectorFactory.scala 85.71% <ø> (ø) ⬆️
...sification/BinaryClassificationModelSelector.scala 98.24% <ø> (ø) ⬆️
...ssification/MultiClassificationModelSelector.scala 97.56% <ø> (ø) ⬆️
...a/com/salesforce/op/filters/RawFeatureFilter.scala 92.93% <ø> (ø) ⬆️
...op/stages/impl/tuning/OpTrainValidationSplit.scala 100% <ø> (ø) ⬆️
.../src/main/scala/com/salesforce/op/OpWorkflow.scala 87.5% <100%> (ø) ⬆️
...sforce/op/stages/impl/selector/ModelSelector.scala 98.18% <100%> (-0.07%) ⬇️
...salesforce/op/stages/impl/tuning/OpValidator.scala 94.59% <100%> (+0.47%) ⬆️
...orce/op/stages/impl/tuning/OpCrossValidation.scala 97.67% <100%> (ø) ⬆️
...p/stages/impl/selector/DefaultSelectorParams.scala 100% <100%> (ø) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b505ff7...365bbd6. Read the comment docs.

log.info(s"Got metric $metric for model $name trained with ${params(i)}.")
metrics(i) = metric
val paramsMetrics = params.map { p =>
Try {
Collaborator:

You might want to consider parallelizing by splits AND models, i.e. just use Futures instead of Tries nested inside Futures.

Collaborator Author:

I tried that, but I think the copy of the model causes some kind of lock - when I had them both in Futures, the first test where the model selectors are run went on for 30 minutes without ever finishing...

Collaborator:

Let's put a TODO there, or perhaps Chris S. can help?

Contributor:

@tovbinm @leahmcguire I was able to replace params with params.par, and although it worked, it was much slower than one would expect. First, does it make sense to fit and evaluate all models in parallel when there is only one Spark context?
BTW, do you have jvisualvm available in Zulu?

Contributor:

You can always get visualvm standalone https://visualvm.github.io/download.html

Contributor @wsuchy commented Jul 29, 2019:

#373 - works fine for me. Though I am not sure about running anything in parallel with the Spark context as a dependency.

Collaborator:

Please refer to the implementation of cross validation in Spark (it uses Futures and there is a good reason why).

Collaborator:

And that’s why I propose to stay with Futures, but just use them correctly.

Contributor:

@tovbinm If you are referring to https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala, then the only difference from my PR is that they are using foldMetricFutures.map(ThreadUtils.awaitResult(_, Duration.Inf)) instead of Future.sequence, so can you elaborate more?
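For context on the two awaiting styles being compared, here is a worksheet-style sketch using plain scala.concurrent (Spark's internal ThreadUtils is not used here); the variable names are illustrative:

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.concurrent.{Await, Future}

val foldMetricFutures: Seq[Future[Double]] = Seq(Future(0.81), Future(0.79))

// Spark's CrossValidator style: await each fold's Future individually.
val metricsA: Seq[Double] = foldMetricFutures.map(Await.result(_, Duration.Inf))

// The style in this PR: combine first, then await once.
val metricsB: Seq[Double] = Await.result(Future.sequence(foldMetricFutures), 1.minute)

With Duration.Inf both styles wait for everything to finish; combining with Future.sequence first makes it straightforward to enforce a single overall bound such as maxWait.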


val data = sc.parallelize(rawData).toDF("label", "features")
data.show()
Collaborator:

No show() in tests.

log.warn(s"Model attempted in model selector failed with following issue: \n${e.getMessage}")
None
}}
val summary = SparkThreadUtils.utils.awaitResult(Future.sequence(summaryOfAttempts), maxWait).flatten.toArray
Collaborator:

Perhaps it's worth logging the maxWait value.
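A hedged sketch of how that could look, reusing the names from the quoted diff (summaryOfAttempts, maxWait, log); the exact wording is illustrative:

// Hypothetical tweak: record how long we are willing to wait so timeouts are easier to diagnose.
log.info(s"Awaiting ${summaryOfAttempts.size} model attempts for at most $maxWait")
val summary = SparkThreadUtils.utils.awaitResult(Future.sequence(summaryOfAttempts), maxWait).flatten.toArray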

Contributor @Jauntbox left a comment:

I don't see any explicit changes to ModelInsights, so what does it look like there when a model fails? Is it easy to parse out a list of failed model types / settings from ModelInsights if we wanted to add these to downstream metrics?

It looks like the summary for a failed model is None, which then disappears when it's flatMap'd over, is that right?

@leahmcguire (Collaborator, Author):

If a model fails, it doesn't appear in model insights - if you wanted to know which models were tried and failed, you would need to compare the model selector params to the sequence of evaluated models.

@Jauntbox (Contributor):

> If a model fails, it doesn't appear in model insights - if you wanted to know which models were tried and failed, you would need to compare the model selector params to the sequence of evaluated models.

Ok, so for our use case we could still get a list of failed models with that comparison.

What about more complicated cases where we have a random search or a data-dependent hyperparameter search? It looks like for random search we'd still have the info stored in the ParamMap in the models argument to ModelSelector, so we'd probably do the same thing for a data-dependent hyperparam search too?

@leahmcguire (Collaborator, Author):

Maybe - I still need to think about how we would implement data-dependent hyperparameter search. I can make it a requirement that the models tried exist somewhere...
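A rough sketch of the comparison described above; the case class and field names below are illustrative, not the actual ModelInsights API:

// Illustrative only: names of models configured on the selector vs. models that produced results.
final case class ValidationResult(modelName: String, metric: Double)

val attempted: Set[String] = Set("OpLogisticRegression", "OpRandomForestClassifier")
val evaluated: Seq[ValidationResult] = Seq(ValidationResult("OpLogisticRegression", 0.83))

// Anything attempted but missing from the evaluated results failed (or never finished).
val failed: Set[String] = attempted -- evaluated.map(_.modelName).toSet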

}


it should "fail when all models fail" in {
Contributor:

Can you explain, in comments or assertions, which model fails and why in each case?

Contributor:

maxWait could be tested explicitly with some short duration.

val dataUse = dataOpt.getOrElse(data)

val theBestEstimator = validator.validate(modelInfo = modelsUse, dataset = dataUse,
label = in1.name, features = in2.name, dag = dag, splitter = splitter
Contributor:

should we use labelColName everywhere for consistency?

@leahmcguire (Collaborator, Author):

Addressed all the comments - can someone approve?

}
setInputSchema(dataset.schema).transformSchema(dataset.schema)
require(!dataset.isEmpty, "Dataset cannot be empty")
val data = dataset.select(in1.name, in2.name)
Contributor:

nit: labelColName?

setInputSchema(dataset.schema).transformSchema(dataset.schema)
require(!dataset.isEmpty, "Dataset cannot be empty")
val data = dataset.select(in1.name, in2.name)
val (BestEstimator(name, estimator, summary), splitterSummary, datasetWithID) = bestEstimator.map{ e =>
Contributor:

nit: whitespace is inconsistent - some `map {` and some `map{`.

val model = estimator.fit(train, p).asInstanceOf[M]
val metric = evaluator.evaluate(model.transform(test, p))
log.info(s"Got metric $metric for model $name trained with $p.")
Some(p -> metric)
Contributor:

nit: we seem to prefer Option(blah) in many places.

metrics(i) = metric

val paramsMetricsF = params.seq.map { p =>
val f = Future {
Contributor:

nit: we can get away without defining f since we don't use it much

Future {
//
}.recover {
//
}

Some(p -> metric)
}
f.recover({ case e: Throwable =>
log.warn(s"Model $name attempted in model selector failed with the following issue: \n${e.getMessage}")
Contributor:

Let us add the raw e as a second arg to warn so we get a stack trace if the logger is configured that way.
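A sketch of the suggested change, reusing f, name, and log from the quoted diff; only the recover block differs:

import scala.util.control.NonFatal

f.recover { case NonFatal(e) =>
  // Passing the exception itself (not just its message) lets the logger emit the stack trace.
  log.warn(s"Model $name attempted in model selector failed with the following issue:", e)
  None
}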


val summaryOfAttempts = summaryFuts.map { f => f.map(Option(_)).recover {
case e: Throwable =>
log.warn(s"Model attempted in model selector failed with following issue: \n${e.getMessage}")
Contributor:

prefer adding raw e as the second arg to logger

parallelism = 4,
seed = 10L,
Contributor:

The args are not in any particular order, so can we put this back in its original spot to eliminate the diff?

parallelism = 4,
seed = 10L,
Contributor:

Let us get rid of that no-op diff too.

summary2.selectedModelInfo.get.validationResults
).forall{ case (v1, v2) =>
println(v1.metricValues.asInstanceOf[SingleMetric].value, v2.metricValues.asInstanceOf[SingleMetric].value)
v1.metricValues.asInstanceOf[SingleMetric].value < v2.metricValues.asInstanceOf[SingleMetric].value
Contributor:

nit: indentation

.setInput(label, features)


intercept[Exception](testEstimator.fit(data))
Contributor:

Let us use more specific exceptions in intercepts, e.g. TimeoutException here.
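A minimal sketch of the more specific intercept, assuming a deliberately short maxWait surfaces as a java.util.concurrent.TimeoutException (testEstimator and data come from the quoted test):

import java.util.concurrent.TimeoutException

// With a deliberately short maxWait, assert the specific failure mode rather than any Exception.
intercept[TimeoutException](testEstimator.fit(data))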

Contributor @Jauntbox left a comment:

LGTM

Contributor @gerashegalov left a comment:

LGTM

@tovbinm merged commit 496174c into master on Aug 2, 2019
@tovbinm deleted the lm/modelFail branch on August 2, 2019 05:02
TuanNguyen27 added a commit that referenced this pull request on Aug 2, 2019
@gerashegalov mentioned this pull request on Sep 8, 2019
gerashegalov added a commit that referenced this pull request on Sep 11, 2019:
Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandadrdScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:   
- Update tika version [#382](#382)
@salesforce-cla:

Thanks for the contribution! Before we can merge this, we need @wsuchy to sign the Salesforce.com Contributor License Agreement.

@salesforce-cla:

Thanks for the contribution! Unfortunately we can't verify the commit author(s): leahmcguire <l***@s***.com> Leah McGuire <l***@s***.com>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.
