Add metadata to OpStandardScaler to allow for descaling #378

Merged: 11 commits merged into master from ec/standardMetadata on Aug 5, 2019

erica-chiu
Contributor

Related issues
N/A

Describe the proposed solution
Add metadata to OpStandardScaler to allow DescalerTransformer to descale.

Describe alternatives you've considered
N/A

Additional context
This change is a temporary fix for the problem that optimal regression hyperparameters are data-dependent: it allows label data to be standardized before training and destandardized afterwards. A longer-term goal is to create hyperparameters that adjust to the data.
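
The standardize/destandardize round trip described above can be illustrated with a minimal sketch. This is not the TransmogrifAI API: the `ScalerMetadata` record and the `standardize` and `descale` functions are hypothetical names, and the use of the population standard deviation is an assumption made for the example. It only shows why the scaler has to record its fitted parameters as metadata so that a later stage can invert the transformation.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import List, Tuple


@dataclass(frozen=True)
class ScalerMetadata:
    """Hypothetical record of the parameters fitted by the scaler."""
    mean: float
    std: float


def standardize(values: List[float]) -> Tuple[List[float], ScalerMetadata]:
    """Scale values to zero mean and unit variance, returning the metadata too."""
    mean = fmean(values)
    std = pstdev(values)  # assumption: population standard deviation
    if std == 0.0:
        std = 1.0  # constant input: avoid dividing by zero
    meta = ScalerMetadata(mean=mean, std=std)
    return [(v - mean) / std for v in values], meta


def descale(scaled: List[float], meta: ScalerMetadata) -> List[float]:
    """Invert standardize using only the recorded metadata."""
    return [v * meta.std + meta.mean for v in scaled]


labels = [100.0, 200.0, 300.0]
scaled_labels, metadata = standardize(labels)
restored = descale(scaled_labels, metadata)
```

Without the metadata, the descaling step would have no way to recover the original label units, which is the gap this pull request addresses for `OpStandardScaler`.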

codecov bot commented Aug 1, 2019

Codecov Report

Merging #378 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
+ Coverage   86.84%   86.84%   +<.01%     
==========================================
  Files         336      336              
  Lines       10943    10948       +5     
  Branches      343      580     +237     
==========================================
+ Hits         9503     9508       +5     
  Misses       1440     1440

| Impacted Files | Coverage Δ | |
|---|---|---|
| ...p/stages/impl/feature/OpScalarStandardScaler.scala | 100% <100%> (ø) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e1bab3b...4674762.

erica-chiu removed their assignment Aug 1, 2019
Contributor

Jauntbox left a comment

Looks great! Just a few small questions and comments.

Collaborator

leahmcguire left a comment

LGTM

Contributor

Jauntbox left a comment

Yay, LGTM!

erica-chiu merged commit 9cf80fd into master Aug 5, 2019
erica-chiu deleted the ec/standardMetadata branch August 6, 2019 00:05
gerashegalov mentioned this pull request Sep 8, 2019
gerashegalov added a commit that referenced this pull request Sep 11, 2019
Bug fixes:
- Ensure correct metrics despite model failures on some CV folds [#404](#404)
- Fix flaky `ModelInsight` tests [#395](#395)
- Avoid creating `SparseVector`s for LOCO [#377](#377)

New features / updates:
- Model combiner [#385](#399)
- Added new sample for HousingPrices [#365](#365)
- Test to verify that custom metrics appear in model insight metrics [#387](#387)
- Add `FeatureDistribution` to `SerializationFormat`s [#383](#383)
- Add metadata to `OpStandardScaler` to allow for descaling [#378](#378)
- Improve json serde error in `evalMetFromJson` [#380](#380)
- Track mean & standard deviation as metrics for numeric features and for text length of text features [#354](#354)
- Making model selectors robust to failing models [#372](#372)
- Use compact and compressed model json by default [#375](#375)
- Descale feature contribution for Linear Regression & Logistic Regression [#345](#345)

Dependency updates:   
- Update tika version [#382](#382)
@salesforce-cla

Thanks for the contribution! Unfortunately we can't verify the commit author(s): Erica Chiu <e***@s***.com> erica-chiu <e***@s***.com>. One possible solution is to add that email to your GitHub account. Alternatively you can change your commits to another email and force push the change. After getting your commits associated with your GitHub account, refresh the status of this Pull Request.

3 participants