
Addresses #237. Provides a way to project the entire dataset into a s… #238

Merged: 1 commit into typelevel:master on Feb 4, 2018


imarios
Contributor

@imarios imarios commented Jan 30, 2018

…ingle column.

@codecov-io

codecov-io commented Jan 30, 2018

Codecov Report

Merging #238 into master will increase coverage by 0.12%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master     #238      +/-   ##
==========================================
+ Coverage   96.84%   96.96%   +0.12%     
==========================================
  Files          52       52              
  Lines         855      858       +3     
  Branches       10       12       +2     
==========================================
+ Hits          828      832       +4     
+ Misses         27       26       -1

Impacted Files                                           Coverage Δ
...ataset/src/main/scala/frameless/TypedDataset.scala    100% <100%> (ø) ⬆️
...c/main/scala/frameless/TypedDatasetForwarded.scala    74.28% <0%> (+2.85%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c6ed57a...2e8f96b.

@OlivierBlanvillain
Contributor

Not convinced about the name. How is this done in vanilla?

@imarios
Contributor Author

imarios commented Jan 30, 2018

@OlivierBlanvillain I'm not a big fan of the name either, but I couldn't think of anything better. Do you have any alternatives we might try?

To the best of my knowledge, vanilla doesn't have this. I can see it being useful, though. Say you want to add an extra column to a dataset: doing it this way, you don't have to resort to a Tuple and you don't have to define a new type. E.g.,

d: TypedDataset[Foo]
d.select(d.toCol, myudf(d('a) * 2)): TypedDataset[(Foo, Long)] // Foo plus the new column we added

@imarios imarios added this to the 0.5-release milestone Jan 30, 2018
@imarios
Contributor Author

imarios commented Feb 1, 2018

@OlivierBlanvillain @frosforever any better name suggestions for this?

@frosforever
Contributor

I can't really think of a better name. The only time I've needed to do something like this, I used groupBy / groupByKey instead. Sorry I'm not much help here.

@imarios imarios merged commit fc3a832 into typelevel:master Feb 4, 2018