0.7.0 release #481

Merged: nicodv merged 70 commits into master from 0.7.0-release on Jun 11, 2020

nicodv (Contributor) commented Jun 11, 2020

Bug fixes:

  • Fix flaky ModelInsight tests #407
  • Remove logging of tokens of text fields #420, #438, #447, #474
  • Add validation prepare call before model selection when no DAG is passed #424, #429
  • Fix Days.daysBetween int overflow #471
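The notes do not describe the root cause of the overflow fixed in #471. As a generic illustration only, the sketch below shows one common way a day-difference computation can overflow: the millisecond difference is narrowed to a signed 32-bit integer before it is divided by the number of milliseconds per day. The function names and the narrowing step are assumptions made for this example and are not taken from the TransmogrifAI or Joda-Time code.

```python
import ctypes

MILLIS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in one day


def to_int32(value: int) -> int:
    """Simulate narrowing to a signed 32-bit int, with wraparound."""
    return ctypes.c_int32(value).value


def days_between_narrow(start_ms: int, end_ms: int) -> int:
    # Hypothetical buggy variant: the difference is squeezed into 32 bits
    # first, so any span above roughly 24.8 days wraps around.
    return to_int32(end_ms - start_ms) // MILLIS_PER_DAY


def days_between_wide(start_ms: int, end_ms: int) -> int:
    # Safer variant: keep the full-width difference, divide, and only then
    # treat the (much smaller) day count as an ordinary integer.
    return (end_ms - start_ms) // MILLIS_PER_DAY


if __name__ == "__main__":
    thirty_days = 30 * MILLIS_PER_DAY
    print("narrow:", days_between_narrow(0, thirty_days))
    print("wide:  ", days_between_wide(0, thirty_days))
```

A 24-day span still fits in 32 bits, so both variants agree on it; a 30-day span does not, which is where the narrow variant goes wrong.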

New features / updates:

  • Downsample the number of training samples to maxTrainingSample for regression #413 and multi-class classification #414
  • Refactor InsightLOCOTest #412
  • Enable more loss types for OpLinearRegression #421
  • Add property-based tests for regression model selection #427
  • Add option to calculate LOCO for dates/texts by leaving out their entire vector #418
  • Add Chinese and Korean examples to TextTokenizerTest #442
  • Add support for ignoring text that looks like IDs in SmartTextVectorizer #448, #455
  • Add a unary estimator for detecting names in text fields and transforming to likely gender #445
  • Allow result features to be removed by raw feature filter #458
  • Metadata changes for sensitive feature information #457
  • Add MinVarianceFilter which checks that computed features have a minimum variance #463, #465
  • Allow TextStats length distribution to be token-based and refactor for testability #464
  • Use Spark job grouping to distinguish steps of the machine learning flow #467, #468, #470
  • Make categorical detection coverage-based in addition to unique-count-based #473
  • Remove duplicate features using the sanity checker's feature-to-feature correlations #476, #479
  • Lift the upper bound on number of hash features #477
  • Enable HTML stripping on text-like features #478
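To make the MinVarianceFilter entry above concrete, here is a minimal sketch of the general idea: drop any computed feature whose variance falls below a threshold. It is not the TransmogrifAI implementation. The function name, the default threshold, the dictionary-of-columns input, and the choice of population variance are all assumptions made for illustration.

```python
from statistics import pvariance
from typing import Dict, List


def min_variance_filter(
    columns: Dict[str, List[float]],
    min_variance: float = 1e-5,  # hypothetical default, not from the release
) -> Dict[str, List[float]]:
    """Keep only the columns whose population variance is >= min_variance."""
    kept = {}
    for name, values in columns.items():
        # pvariance raises StatisticsError on an empty column, so callers
        # are expected to pass non-empty data.
        if pvariance(values) >= min_variance:
            kept[name] = values
    return kept


if __name__ == "__main__":
    features = {
        "constant": [1.0, 1.0, 1.0, 1.0],  # zero variance, dropped
        "varied": [0.0, 1.0, 2.0, 3.0],    # clearly varies, kept
    }
    print(sorted(min_variance_filter(features)))
```

A near-constant feature carries almost no signal for a model, which is the usual motivation for a filter of this kind.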

Dependency updates (#402, #466):

  • Update Apache Spark version to 2.4.5
  • Avro is a built-in data source in Spark 2.4, so no longer using the spark-avro package
  • Avro to 1.8.2
  • XGBoost to 0.90
  • MLeap to 0.14.0
  • json4s to 3.5.3
  • JUnit to 4.12
  • chill to 0.9.3
  • gradle-avro-plugin to 0.16.0

Miscellaneous:

  • Add ROADMAP.md #394

codecov bot commented Jun 11, 2020

Codecov Report

Merging #481 into master will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #481      +/-   ##
==========================================
- Coverage   87.01%   87.00%   -0.01%     
==========================================
  Files         345      345              
  Lines       11680    11680              
  Branches      378      378              
==========================================
- Hits        10163    10162       -1     
- Misses       1517     1518       +1     
Impacted Files                                           Coverage Δ
.../op/features/types/FeatureTypeSparkConverter.scala    98.24% <0.00%> (-0.88%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e48831a...e292501.

leahmcguire (Collaborator) left a comment

LGTM!

tovbinm (Collaborator) commented Jun 11, 2020

🥳 🥳 🥳 🥳 🥳

nicodv merged commit 036d1fc into master on Jun 11, 2020
nicodv deleted the 0.7.0-release branch on June 11, 2020 22:35