[Model] Add `dgl.nn.CuGraphSAGEConv` model #5137

tingyu66 · 2023-01-10T15:47:18Z

Description

This PR adds a GraphSAGE model, dgl.nn.CuGraphSAGEConv, that uses the accelerated sparse aggregation primitives in cugraph-ops. It requires pylibcugraphops >= 23.02.
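For readers who want to guard against an older install, the version requirement above can be checked with a small helper. This is a minimal sketch, not the check DGL itself performs; `parse_version` and `meets_minimum` are hypothetical names, and the installed version string is assumed to be obtained elsewhere.

```python
def parse_version(version: str) -> tuple:
    """Parse leading numeric fields, e.g. '23.02.00' -> (23, 2, 0)."""
    parts = []
    for field in version.split("."):
        digits = ""
        for ch in field:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first field with no leading digits (e.g. 'dev').
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "23.02") -> bool:
    """Return True if `installed` is at least `minimum` (hypothetical helper)."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("22.12.00", "23.02.00", "23.04.01"):
        print(candidate, meets_minimum(candidate))
```

Tuple comparison is used so that calendar-style versions such as 23.02 and 22.12 order correctly, which plain string comparison does not guarantee in general.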

Checklist

Please feel free to remove inapplicable items for your PR.

The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
I've leveraged the tools to beautify the Python and C++ code.
The PR is complete and small, read the Google eng practice (CL equals to PR) to understand more about small PR. In DGL, we consider PRs with fewer than 200 lines of core code changes to be small (examples, tests, and documentation can be exempted).
All changes have test coverage
Code is well-documented
To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change
Related issue is referred in this PR
If the PR is for a new model/paper, I've updated the example index here.

Changes

New nn.Module: dgl.nn.CuGraphSAGEConv
Test that validates its results against SAGEConv

Notes

Fixes rapidsai/cugraph-ops#177.

dgl-bot · 2023-01-10T15:47:46Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-10T15:47:59Z

Commit ID: 5a7c648

Build ID: 1

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-01-10T15:53:39Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-10T15:53:52Z

Commit ID: cd4c4fa

Build ID: 2

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

Rhett-Ying · 2023-01-12T02:41:54Z

@dgl-bot

dgl-bot · 2023-01-12T04:42:06Z

Commit ID: 5a8066394ee08b3f110deb06044f55949580cb0a

Build ID: 3

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

dgl-bot · 2023-01-19T22:08:27Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-19T22:08:41Z

Commit ID: d0214db9c7c1b081abaf3c5c47da6ea640e01302

Build ID: 4

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-01-19T22:10:42Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-19T22:10:56Z

Commit ID: 42ab8dc2ccda5bda17e3319105805e73ff10f29d

Build ID: 5

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-01-19T22:24:47Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-19T22:25:00Z

Commit ID: 95fe352d3465143e92919e1cdec833ed262678ac

Build ID: 6

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-01-20T05:17:59Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-20T05:18:12Z

Commit ID: e5e306a402ae8567d9db588a68c31ee1d8ca1216

Build ID: 7

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-01-26T17:41:19Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-01-26T17:41:32Z

Commit ID: c878c34f3b2999d86c13eae73ea7d4c1c57d25b1

Build ID: 8

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-02-02T02:20:30Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-02-02T02:20:43Z

Commit ID: 3d7f4f29656a2025b66e29eba30a21e7e013f74d

Build ID: 9

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

mufeili · 2023-02-17T07:40:12Z

python/dgl/nn/pytorch/conv/cugraph_sageconv.py

+
+    def reset_parameters(self):
+        r"""Reinitialize learnable parameters."""
+        self.linear.reset_parameters()


Previously, SAGEConv used Xavier uniform initialization, whereas nn.Linear.reset_parameters uses Kaiming uniform. I'm not sure what effect this difference has.

I think Kaiming is more suitable here, since ReLU is the usual choice of nonlinearity in GNNs; Xavier was designed for the sigmoid function.
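To make the difference concrete, the sampling bounds of the two schemes can be computed by hand. This is a sketch using the standard formulas: Xavier uniform draws from U(-b, b) with b = gain * sqrt(6 / (fan_in + fan_out)), and Kaiming uniform with b = gain * sqrt(3 / fan_in), where gain = sqrt(2 / (1 + a^2)). PyTorch's nn.Linear.reset_parameters uses a = sqrt(5). That SAGEConv applies the ReLU gain sqrt(2) to Xavier uniform is an assumption here, and the feature sizes are illustrative.

```python
import math


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Bound b of U(-b, b) for Xavier (Glorot) uniform initialization."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def kaiming_uniform_bound(fan_in: int, a: float = 0.0) -> float:
    """Bound b of U(-b, b) for Kaiming uniform with leaky-ReLU slope `a`."""
    gain = math.sqrt(2.0 / (1.0 + a * a))
    return gain * math.sqrt(3.0 / fan_in)


# Illustrative sizes (assumption): 256 input and 256 output features.
fan_in, fan_out = 256, 256

# Assumption: Xavier uniform is applied with the ReLU gain sqrt(2).
xavier = xavier_uniform_bound(fan_in, fan_out, gain=math.sqrt(2.0))

# nn.Linear default: Kaiming uniform with a = sqrt(5), i.e. 1 / sqrt(fan_in).
linear_default = kaiming_uniform_bound(fan_in, a=math.sqrt(5.0))

if __name__ == "__main__":
    print(f"Xavier uniform bound:    {xavier:.4f}")
    print(f"nn.Linear default bound: {linear_default:.4f}")
```

Under these assumptions the bounds are roughly 0.153 and 0.0625, so the nn.Linear default starts from noticeably smaller weights at this width.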

mufeili · 2023-02-17T07:43:16Z

python/dgl/nn/pytorch/conv/cugraph_sageconv.py

+        r"""Reinitialize learnable parameters."""
+        self.linear.reset_parameters()
+
+    def forward(self, g, feat, max_in_degree=None):


Another difference: edge_weight is not supported.

dgl-bot · 2023-02-17T08:31:47Z

Commit ID: 4f6fd15

Build ID: 13

Status: ✅ CI test succeeded

Report path: link

Full logs path: link

mufeili · 2023-02-17T08:46:01Z

examples/advanced/cugraph/graphsage.py

@@ -0,0 +1,200 @@
+import argparse


Did you run this script? If so, what performance number did you obtain?

Yes, in terms of pure training time (not including dataloading), SAGEConv takes 2.5s per epoch, while CuGraphSAGEConv takes 2.0s, despite the overhead of coo-to-csc conversion. Test accuracy is also the same.

Edit: added timings for both modes in the example.

| mode | mixed (uva) | pure gpu |
| --- | --- | --- |
| CuGraphSAGEConv | 2.0 s | 1.2 s |
| SAGEConv | 2.5 s | 1.7 s |
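The coo-to-csc conversion mentioned above can be illustrated with a counting-sort style routine. This is a pure-Python sketch of the general technique, not the conversion DGL or cugraph-ops actually runs; `coo_to_csc` is a hypothetical helper name.

```python
def coo_to_csc(src, dst, num_dst_nodes):
    """Convert COO edges (src[i] -> dst[i]) into CSC (offsets, indices).

    The in-neighbors of destination node v are
    indices[offsets[v]:offsets[v + 1]].
    """
    # Count in-edges per destination node, shifted by one slot.
    offsets = [0] * (num_dst_nodes + 1)
    for v in dst:
        offsets[v + 1] += 1
    # Prefix sum turns the counts into start offsets.
    for v in range(num_dst_nodes):
        offsets[v + 1] += offsets[v]
    # Scatter each source id into the next free slot of its destination.
    indices = [0] * len(src)
    cursor = offsets[:-1].copy()
    for u, v in zip(src, dst):
        indices[cursor[v]] = u
        cursor[v] += 1
    return offsets, indices


if __name__ == "__main__":
    offsets, indices = coo_to_csc([0, 1, 2, 0], [1, 2, 1, 2], num_dst_nodes=3)
    print(offsets, indices)
```

The routine makes two passes over the edges plus one pass over the nodes, which is the extra work that the timings above already include for CuGraphSAGEConv.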

mufeili · 2023-02-17T09:00:07Z

examples/advanced/cugraph/graphsage.py

+    def forward(self, blocks, x):
+        h = x
+        for l, (layer, block) in enumerate(zip(self.layers, blocks)):
+            h = layer(block, h, max_in_degree=10)


This seems a bit ugly. Perhaps it's better to pass the argument to SAGE.__init__.

Can you explain what needs to be done here? Are you suggesting unrolling the loop like this?

h = F.relu(self.conv1(g[0], x))
h = F.relu(self.conv2(g[1], h))
...

I meant the specification of max_in_degree.

I see and I do agree that it is not an ideal interface. We did not make max_in_degree an attribute of CuGraphSAGEConv since it is a property of the graph (i.e., block), rather than the model. I have removed it from the example as this flag is optional.
In the meantime, we are improving our aggregation primitives to be more flexible to eventually ditch this option.
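To illustrate why max_in_degree is a property of the block rather than of the model, here is a minimal sketch that derives it from CSC offsets. The helper name and the offsets layout are assumptions for illustration; this is not code from the PR.

```python
def max_in_degree(offsets):
    """Largest number of in-neighbors of any destination node.

    `offsets` is a CSC offsets array with one more entry than there are
    destination nodes, so the in-degree of node v is
    offsets[v + 1] - offsets[v].
    """
    if len(offsets) < 2:
        return 0  # no destination nodes
    return max(offsets[v + 1] - offsets[v] for v in range(len(offsets) - 1))


if __name__ == "__main__":
    # Three destination nodes with in-degrees 0, 2 and 2.
    print(max_in_degree([0, 0, 2, 4]))
```

With fanout-based neighbor sampling the value is bounded by the fanout of the corresponding layer, so it can differ from block to block while the model weights stay the same.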

mufeili · 2023-02-17T09:13:12Z

examples/advanced/cugraph/graphsage.py

+        default="mixed",
+        choices=["cpu", "mixed", "puregpu"],
+        help="Training mode. 'cpu' for CPU training, 'mixed' for CPU-GPU mixed training, "
+        "'puregpu' for pure-GPU training.",


This is automatically formatted by lintrunner. I removed the cpu mode, as it is not supported by the model.

Changes pushed.

mufeili

done a pass

dgl-bot · 2023-02-17T22:03:30Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-02-17T22:03:43Z

Commit ID: 55fe4fe57f5efb20abb76e3fcc0f5428c13abc5e

Build ID: 14

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-02-21T21:39:01Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-02-21T21:39:14Z

Commit ID: 50aa20a2f4a63898d0b2f3d065157db8acd9cbeb

Build ID: 15

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

tingyu66 · 2023-02-21T22:05:33Z

Thank you @mufeili for the review. Here is a list of disparities between CuGraphSAGEConv and SAGEConv:

SAGEConv allows different feature dimensions for source and destination nodes
They cover different aggregation types
CuGraphSAGEConv does not support edge weights
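As an illustration of the edge-weight disparity, the sketch below implements mean aggregation over a CSC graph with an optional per-edge weight. It assumes that weights scale each neighbor message before the mean is taken over the in-degree; that convention, the function name, and the list-based data layout are assumptions for illustration and are not taken from either implementation.

```python
def mean_aggregate(offsets, indices, feats, edge_weights=None):
    """Mean-aggregate neighbor features for every destination node.

    offsets, indices: CSC structure (in-neighbors of v are
        indices[offsets[v]:offsets[v + 1]]).
    feats: list of source-node feature vectors (lists of floats).
    edge_weights: optional per-edge weights aligned with `indices`.
    """
    dim = len(feats[0]) if feats else 0
    out = []
    for v in range(len(offsets) - 1):
        start, end = offsets[v], offsets[v + 1]
        acc = [0.0] * dim
        for e in range(start, end):
            # Without weights every message counts fully.
            w = 1.0 if edge_weights is None else edge_weights[e]
            for k in range(dim):
                acc[k] += w * feats[indices[e]][k]
        deg = end - start
        # Isolated nodes keep a zero vector.
        out.append([a / deg for a in acc] if deg else acc)
    return out


if __name__ == "__main__":
    offsets, indices = [0, 0, 2, 4], [0, 2, 1, 0]
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(mean_aggregate(offsets, indices, feats))
    print(mean_aggregate(offsets, indices, feats, edge_weights=[1.0, 0.0, 1.0, 1.0]))
```

Passing edge_weights=None corresponds to the unweighted case, which is the only one the list above says CuGraphSAGEConv covers.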

Some preliminary performance numbers using the included example:

| mode | mixed (uva) | pure gpu |
| --- | --- | --- |
| CuGraphSAGEConv | 2.0 s | 1.2 s |
| SAGEConv | 2.5 s | 1.7 s |

(copied over from the review comment above for better visibility)
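For a quick read of these numbers, the relative speedup per mode can be computed directly from the reported per-epoch times. The script below only restates the figures above; the dictionary layout and function name are illustrative.

```python
# Per-epoch training times in seconds, as reported above.
timings = {
    "mixed (uva)": {"CuGraphSAGEConv": 2.0, "SAGEConv": 2.5},
    "pure gpu": {"CuGraphSAGEConv": 1.2, "SAGEConv": 1.7},
}


def speedup(baseline: float, candidate: float) -> float:
    """How many times faster `candidate` is than `baseline`."""
    return baseline / candidate


speedups = {
    mode: speedup(t["SAGEConv"], t["CuGraphSAGEConv"])
    for mode, t in timings.items()
}

if __name__ == "__main__":
    for mode, ratio in speedups.items():
        print(f"{mode}: {ratio:.2f}x")
```

This works out to roughly 1.25x in mixed mode and 1.42x in pure GPU mode for this example.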

dgl-bot · 2023-02-21T22:10:26Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

dgl-bot · 2023-02-21T22:10:38Z

Commit ID: 8d2f6c5

Build ID: 16

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-02-22T04:29:11Z

Not authorized to trigger CI. Please ask core developer to help trigger via issuing comment:

@dgl-bot

mufeili · 2023-02-22T04:29:19Z

@dgl-bot

dgl-bot · 2023-02-22T04:29:24Z

Commit ID: f995b11

Build ID: 17

Status: ❌ CI test failed in Stage [Authentication].

Report path: link

Full logs path: link

dgl-bot · 2023-02-22T05:21:22Z

Commit ID: f995b11

Build ID: 18

Status: ✅ CI test succeeded.

Report path: link

Full logs path: link

* add CuGraphSAGEConv model
* fix lint issues
* update model to reflect changes in make_mfg_csr(), move max_in_degree to forward()
* lintrunner
* allow reset_parameters()
* remove norm option, simplify test
* allow full graph fallback option, add example
* address comments
* address reviews

---------

Co-authored-by: Mufei Li <[email protected]>

add CuGraphSAGEConv model

5a7c648

tingyu66 marked this pull request as draft January 10, 2023 15:50

fix lint issues

cd4c4fa

update model to reflect changes in make_mfg_csr(), move max_in_degree to forward()

d4e9688

lintrunner

310d5b2

allow reset_parameters()

59c0d87

tingyu66 mentioned this pull request Jan 20, 2023

Add/Update cugraph-ops models #5218

Closed

tingyu66 marked this pull request as ready for review January 20, 2023 03:56

remove norm option, simplify test

9e729a3

tingyu66 changed the title ~~[DO NOT MERGE][Model] Add dgl.nn.CuGraphSAGEConv model~~ [Model] Add dgl.nn.CuGraphSAGEConv model Feb 2, 2023

tingyu66 added 2 commits February 16, 2023 21:39

Merge remote-tracking branch 'upstream/master' into cugraphops-sageconv

86a0bcd

allow full graph fallback option, add example

a528321

mufeili reviewed Feb 17, 2023

mufeili reviewed Feb 17, 2023

mufeili reviewed Feb 17, 2023

address comments

796d73a

mufeili reviewed Feb 21, 2023

address reviews

450c533

Merge branch 'master' into cugraphops-sageconv

8d2f6c5

Merge branch 'master' into cugraphops-sageconv

f995b11

mufeili approved these changes Feb 22, 2023

mufeili merged commit bcf9923 into dmlc:master Feb 22, 2023

tingyu66 deleted the cugraphops-sageconv branch February 22, 2023 15:10
