Add support for rapidsai - cudf and cuml packages GPU Data Science #594

Closed
jperez999 opened this issue Jul 30, 2019 · 16 comments · Fixed by #898

@jperez999
Contributor

No description provided.

@jperez999
Contributor Author

I have a PR ready for review with passing test cases for both cuml and cudf.

@rosbo
Contributor

rosbo commented Dec 10, 2019

@jperez999

There are a few conflicts that prevent us from adding cuDF and cuML:

cudf=0.10 -> pandas[version='>=0.24.2,<0.25']

We are using pandas 0.25.3, and several of our packages require >= 0.25. Is there a particular reason for this restriction?

cudf=0.10 -> pyarrow=0.14.1

We are using pyarrow 0.15.1, and several of our packages also require >= 0.15. Is there a particular reason for not allowing pyarrow 0.15.x?
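
To make the pandas conflict concrete, here is a minimal sketch that checks an installed version against a comma-separated constraint such as the `>=0.24.2,<0.25` quoted above. The helper names (`parse_version`, `satisfies`) are hypothetical, and it only handles plain dotted integer versions; it does not model conda's real match-spec rules (for example, the fuzzy single `=` used in the pyarrow pin).

```python
import re

def parse_version(text):
    """Turn '0.25.3' into (0, 25, 3). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))

def _pad(a, b):
    """Right-pad the shorter tuple with zeros so '0.25' compares equal to '0.25.0'."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))

# Comparison operators supported by this sketch (an assumption, not conda's full grammar).
_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}

def satisfies(installed, spec):
    """Return True if `installed` meets every comma-separated clause in `spec`."""
    have = parse_version(installed)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, wanted = match.groups()
        left, right = _pad(have, parse_version(wanted))
        if not _OPS[op](left, right):
            return False
    return True

# The image's pandas 0.25.3 falls outside the range cudf 0.10 asks for.
print(satisfies("0.25.3", ">=0.24.2,<0.25"))  # False
print(satisfies("0.24.2", ">=0.24.2,<0.25"))  # True
```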

@jperez999
Contributor Author

We are restricted to pandas < 0.25 because of rapidsai/cudf#3486, and the pyarrow pin is due to a conda-forge conflict: rapidsai/cudf#3318.

@rosbo
Contributor

rosbo commented Jun 4, 2020

Hi @jperez999 and @kkraus14

Are you planning on relaxing the constraint on pyarrow added in rapidsai/cudf#3318? It is causing a cascading slew of downgrades (the arrow-cpp lib, boost-cpp, and so on), which then cause conflicts with other libraries in the Kaggle image.

Our image is currently using pyarrow 1.16 and libboost 1.72.0.

Thanks

@kkraus14

kkraus14 commented Jun 4, 2020

We are planning on upgrading to Arrow 0.17.1 in cudf 0.15 shortly. Unfortunately, a bug introduced in Arrow 0.15.1 made it incompatible with nvcc, which prevented us from upgrading until now.

Note that cudf 0.15 won't be released for a few months, though, as we're currently in the middle of the 0.14 release.

@kkraus14

Hi @rosbo, we've upgraded to Arrow 0.17.1 in our nightlies, but Arrow 1.0.0 has since been released and we're considering upgrading to it. Would that be a blocker for the Kaggle container?

@rosbo
Contributor

rosbo commented Jul 28, 2020

Hi @kkraus14,

It shouldn't be a blocker: Arrow 0.17.1 and 1.0.0 both work with our version of the boost library (the cause of the cascading changes earlier).

Thanks for checking.

@kkraus14

kkraus14 commented Sep 21, 2020

@rosbo Any chance we could give this another shot? The 0.15 release uses Arrow 0.17.1 and boost 1.72 and has been released for a bit now.

rosbo added a commit that referenced this issue Sep 22, 2020
Fixes #594.

BUG=144522678
@rosbo
Contributor

rosbo commented Sep 22, 2020

Kicked off a new build with 0.15 and will see if we get any conflicts: https://github.com/Kaggle/docker-python/tree/add-rapids-ai-0.15

@rosbo
Contributor

rosbo commented Sep 24, 2020

We are in the process of migrating our GPU image to be based on gcr.io/deeplearning-platform-release/tf2-gpu.2-3.

Hitting conflicts again and I am waiting on conda to give me the list of conflicts... Will report hopefully soon.

docker run -it --rm gcr.io/deeplearning-platform-release/tf2-gpu.2-3:latest /bin/bash
$ conda install -c rapidsai -c nvidia -c conda-forge -c defaults rapids=0.15 python=3.7 cudatoolkit=10.1

@rosbo
Contributor

rosbo commented Oct 1, 2020

The conflict resolver did eventually converge after more than 72h and printed a long list of conflicts (longer than my bash terminal history limit setting). I will take a closer look at this once we have migrated to the new base image.
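
One way to avoid losing output that outgrows the terminal scrollback is to send it to a file instead. The following is a minimal sketch, not what was actually run here: `run_and_log` and the log file name are hypothetical, and `CONDA_ARGV` simply restates the install command from the earlier comment, to be executed inside the container.

```python
import subprocess

def run_and_log(argv, log_path):
    """Run `argv`, writing stdout and stderr to `log_path`; return the exit code."""
    with open(log_path, "w", encoding="utf-8") as log:
        # stderr is merged into stdout so the conflict report ends up in one file.
        completed = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
    return completed.returncode

# The install command quoted earlier in this thread, as an argument list.
CONDA_ARGV = [
    "conda", "install",
    "-c", "rapidsai", "-c", "nvidia", "-c", "conda-forge", "-c", "defaults",
    "rapids=0.15", "python=3.7", "cudatoolkit=10.1",
]

# Hypothetical usage inside the container (not executed here):
# exit_code = run_and_log(CONDA_ARGV, "conda-conflicts.log")
```

The full conflict list can then be read or searched from the log file regardless of the shell's history limit.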

@kkraus14

kkraus14 commented Oct 1, 2020

Thanks @rosbo. We're planning on releasing 0.16 in ~2 weeks and it will upgrade Arrow to 1.0.1 and keep boost at 1.72.0 as the main version pinnings. Let me know if there's anything I can help with.

@kkraus14

kkraus14 commented Nov 2, 2020

@rosbo quick ping here. 0.16 has been out for a while, any chance we could give this another shot?

@rosbo
Contributor

rosbo commented Nov 2, 2020

Hi @kkraus14, I will try to give it another shot this week. Thanks for the ping about the new release.

@rosbo
Contributor

rosbo commented Nov 5, 2020

@kkraus14 Great news!

I was able to successfully install cudf 0.16 on our images (no conflicts). I tried with only cudf first to reduce the number of conflicts. Next, I will try with cuml.

The original request was about adding cudf & cuml. I see now that RAPIDS includes several other packages. Should I try installing all of them with conda install rapids, or just install cudf and cuml?

Thank you

@kkraus14

kkraus14 commented Nov 5, 2020

I think we should just do cudf and cuml to start, and then we can expand from there. conda install rapids will pull in things like cuspatial, which will pull in GDAL, which could quickly get us back into dependency hell.
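
As a quick sanity check after installing only those two packages, something like the sketch below could confirm they are at least discoverable in the image. The helper name `find_missing` is hypothetical. It only checks that the modules can be located; it does not import them or verify that a GPU is usable.

```python
import importlib.util

def find_missing(packages):
    """Return the top-level names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

# Check only the two packages discussed here, not the full rapids metapackage.
missing = find_missing(["cudf", "cuml"])
if missing:
    print("not installed:", ", ".join(missing))
else:
    print("cudf and cuml are both present")
```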
