Avoid having to have an implicit SparkSession in OpWorkFlow(Model) #468
Conversation
… moving job grouping down to reader/writer
Codecov Report
@@            Coverage Diff             @@
##           master     #468      +/-   ##
==========================================
+ Coverage   81.25%   86.99%   +5.74%
==========================================
  Files         345      345
  Lines       11616    11614       -2
  Branches      376      374       -2
==========================================
+ Hits         9438    10104     +666
+ Misses       2178     1510     -668
Continue to review full report at Codecov.
case Failure(error) => throw new RuntimeException(s"Failed to load Workflow from path '$path'", error)
case Success(wf) => wf
implicit val spark: SparkSession = this.sparkSession
JobGroupUtil.withJobGroup(OpStep.ModelIO) {
nit: we could just pass sparkSession explicitly instead of creating an implicit val.
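A hedged sketch of this suggestion, using only the names visible in the snippets in this thread (it assumes, as the saveImpl snippet below suggests, that JobGroupUtil.withJobGroup takes the SparkSession in a second, implicit-capable parameter list; not runnable outside a Spark project):

```scala
// Option 1 (current PR): introduce an implicit val so the second
// parameter list of withJobGroup is filled in implicitly.
implicit val spark: SparkSession = this.sparkSession
JobGroupUtil.withJobGroup(OpStep.ModelIO) {
  // ... perform the model load ...
}

// Option 2 (the nit): skip the implicit and pass the session explicitly.
JobGroupUtil.withJobGroup(OpStep.ModelIO) {
  // ... perform the model load ...
}(this.sparkSession)
```

Both forms resolve to the same call; the explicit form just avoids introducing a new implicit into the enclosing scope.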
* @param path path to save the model and its stages
* @param overwrite should overwrite the destination
*/
def save(model: OpWorkflowModel, path: String, overwrite: Boolean = true): Unit = {
The scope of this PR did not require it; we should not make breaking changes such as removing public methods.
Oh, I missed that one! Good catch, @gerashegalov.
@nicodv please revert this part of the change.
@gerashegalov / @tovbinm So we are OK with adding an implicit SparkSession to OpWorkflowModel.save()
instead? (And having inconsistent approaches between model loading/saving.)
@nicodv this is not the choice we have to make. We can make it consistent by just changing saveImpl in class OpWorkflowModelWriter:
override protected def saveImpl(path: String): Unit = {
  JobGroupUtil.withJobGroup(OpStep.ModelIO) {
    sc.parallelize(Seq(toJsonString(path)), 1)
      .saveAsTextFile(OpWorkflowModelReadWriteShared.jsonPath(path), classOf[GzipCodec])
  }(sparkSession)
}
If you do this and undo the save method move we should be ok
Ah, I overlooked the saveImpl method. New PR that reverts this: #470
Related issues
N/A
Describe the proposed solution
Avoid having to provide a SparkSession implicit to OpWorkflow.loadModel / OpWorkflowModel.save, to keep backward compatibility.
Describe alternatives you've considered
N/A
Additional context
Small follow-up on #467
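For context, a hypothetical usage sketch of the behavior this issue asks to preserve (method names are taken from the PR discussion above; exact signatures are assumed and would need checking against the TransmogrifAI API):

```scala
// Goal: callers should not need an implicit SparkSession in scope
// just to load or save a model -- the job grouping moved into the
// reader/writer, which already holds a session.
val model: OpWorkflowModel = workflow.loadModel("/models/my-model")  // no implicit required
model.save("/models/my-model-copy")                                  // no implicit required
```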