[DataFrame] Changing the _default_index fn to a remote function #1617

Merged 1 commit on Feb 27, 2018

kunalgosar
Contributor

What do these changes do?

Moving _default_index to a remote function speeds up creating a new DataFrame. Because _default_index now returns a future instead of a computed value, the call returns immediately and control goes back to the user much sooner. The full computation has not necessarily finished at that point, but the main thread can continue running.
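
To illustrate the pattern described above, here is a minimal sketch. It deliberately uses Python's standard `concurrent.futures` module as a stand-in for a Ray remote function, so it is an analogy rather than the code in this PR. The `default_index` function, its sleep, and the row count are hypothetical and exist only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def default_index(n_rows):
    """Hypothetical stand-in for an index computation; sleeps to simulate work."""
    time.sleep(0.2)
    return list(range(n_rows))


executor = ThreadPoolExecutor(max_workers=1)

start = time.perf_counter()
# submit() hands the work to a worker and returns a Future right away.
index_future = executor.submit(default_index, 5)
submit_elapsed = time.perf_counter() - start

# The caller is free to do other work here while the index is computed.

# result() blocks only at the point where the value is actually needed.
index = index_future.result()
total_elapsed = time.perf_counter() - start

executor.shutdown()
print(f"submit returned after {submit_elapsed:.4f}s, result ready after {total_elapsed:.4f}s")
```

The time spent in `submit` is what the caller waits for up front; the cost of the computation itself is only paid when the result is requested.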

Updated Performance on Query against Pandas

Data: 76 MB of String Data
Machine: 2 Core Macbook
Partitions: 4

Pandas Benchmark:
%timeit pandas_df.query(query_func) # 172 ms

Ray:
%timeit ray_df.query(query_func) # 15.2 ms

@devin-petersohn changed the title from "Changing the _default_index fn to a remote function" to "[DataFrame] Changing the _default_index fn to a remote function" on Feb 27, 2018
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3984/

Member

@devin-petersohn left a comment

Great optimization. Thanks @kunalgosar

@devin-petersohn
Member

Passed the private-travis. OK to merge.

@devin-petersohn merged commit f43328f into ray-project:master on Feb 27, 2018
@kunalgosar deleted the _index branch on March 14, 2018 20:57