Release 3.2.3 #6

Merged
tovbinm merged 3 commits into master from mt/3.2.3-release
Feb 6, 2018

tovbinm
Collaborator

@tovbinm tovbinm commented Feb 6, 2018

Changes from 3.2.0 to 3.2.1:

  • Upgraded to Spark 2.2.1

Changes from 3.2.1 to 3.2.2:

  • Numeric bucketizer fixes, including metadata and integration with the sanity checker.
  • Introduced a split inclusion parameter for numeric bucketizers that controls whether the splits are left or right inclusive (numeric.bucketize(splitInclusion) and numeric.autoBucketize(splitInclusion)).
  • Introduced an option for numerical bucketizers to allow tracking invalid values such as NaN, -Inf, Inf or values that fall outside the buckets. (numeric.bucketize(trackInvalid) and numeric.autoBucketize(trackInvalid)).
  • Finalized the unification of vectorization for ALL the OP types, including Text, TextArea, Base64, Phone, URL and Geolocation.
  • Fixed track null behavior of vectorization for MultiPicklistMap to match MultiPicklist
  • OP cli --auto now correctly identifies the input CSV schema
  • OP cli generated code is now prettier
  • Minor bug fixes in vectorizers metadata and ctor args for sanity checker model
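
The split inclusion and invalid tracking options described above can be illustrated with a small, library-independent sketch. This is not the OP implementation: the function name `bucketize`, the parameters `left_inclusive` and `track_invalid`, and the choice to put the invalid indicator in the last column are all assumptions made for illustration.

```python
import math
from bisect import bisect_left, bisect_right


def bucketize(value, splits, left_inclusive=True, track_invalid=False):
    """One-hot encode `value` over the buckets defined by sorted `splits`.

    Hypothetical sketch, not the OP API. With left_inclusive=True the
    buckets are [lo, hi); otherwise they are (lo, hi]. With
    track_invalid=True an extra trailing column flags NaN, +/-Inf and
    values that fall outside every bucket.
    """
    n_buckets = len(splits) - 1
    vec = [0.0] * (n_buckets + (1 if track_invalid else 0))

    idx = None
    if not math.isnan(value):
        if left_inclusive:
            # Count splits <= value, so a value equal to a split
            # lands in the bucket that starts at that split.
            i = bisect_right(splits, value) - 1
        else:
            # Count splits < value, so a value equal to a split
            # lands in the bucket that ends at that split.
            i = bisect_left(splits, value) - 1
        if 0 <= i < n_buckets:
            idx = i

    if idx is not None:
        vec[idx] = 1.0
    elif track_invalid:
        vec[-1] = 1.0  # assumed layout: invalid indicator comes last
    return vec


splits = [0.0, 10.0, 20.0]
left = bucketize(10.0, splits, left_inclusive=True)    # falls in [10, 20)
right = bucketize(10.0, splits, left_inclusive=False)  # falls in (0, 10]
nan_row = bucketize(float("nan"), splits, track_invalid=True)
```

With the same splits, the boundary value 10.0 moves from the second bucket to the first when inclusion switches from left to right, which is the behavior the split inclusion parameter is meant to control.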

Changes from 3.2.2 to 3.2.3:

  • Added null tracking for map vectorizers
  • Added null tracking for hashed text features
  • Sanity Checker: fixed issue in feature removal for sibling features when one has a correlation of NaN with the label
  • Fixes for Decision Tree bucketizer vector metadata so that it works with Sanity Checker
  • Correlation based record level insights
  • All model classes are now made public with private[op] ctors
  • Added inclusion indicators in bucket labels
  • Workflow now creates holdout during fitting after raw data creation and applies eval
  • Simplified OP CLI generated template
  • Added VersionInfo, which allows access to the project version and git info at runtime, and included it in AppMetrics
  • Made most OpWorkflowRunner ctor params optional and deprecated old ctors
  • OpWorkflowRunner.run now requires spark StreamingContext to be present
  • Implemented a new run type allowing streaming score (--run-type=streamingScore)
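
As a rough illustration of what null tracking means for a map vectorizer, the sketch below emits, for each expected key, the value followed by an indicator column that is 1.0 when the key was missing. It is a simplified stand-in, not the OP implementation: the function name `vectorize_numeric_map`, its parameters, and the interleaved column layout are assumptions.

```python
def vectorize_numeric_map(row, keys, track_nulls=True, fill_value=0.0):
    """Flatten a numeric map into a vector over a fixed list of keys.

    Hypothetical sketch, not the OP API. Missing keys (or None values)
    are filled with `fill_value`; with track_nulls=True each key also
    gets a null-indicator column directly after its value column.
    """
    vec = []
    for key in keys:
        value = row.get(key)
        missing = value is None
        vec.append(fill_value if missing else float(value))
        if track_nulls:
            vec.append(1.0 if missing else 0.0)
    return vec


# "b" is absent, so its value is filled and its indicator is set.
features = vectorize_numeric_map({"a": 2}, ["a", "b"])
```

The indicator column lets a downstream model distinguish a genuinely missing entry from one that happens to equal the fill value.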

@tovbinm tovbinm requested a review from Jauntbox February 6, 2018 18:59
Contributor

@Jauntbox Jauntbox left a comment

LGTM!

@tovbinm tovbinm merged commit ac208b0 into master Feb 6, 2018
@tovbinm tovbinm deleted the mt/3.2.3-release branch February 6, 2018 20:17
@tovbinm tovbinm restored the mt/3.2.3-release branch March 2, 2018 18:53
@tovbinm tovbinm deleted the mt/3.2.3-release branch March 2, 2018 18:53
ericwayman pushed a commit that referenced this pull request Feb 8, 2019
Release 3.2.3