
Make k2 ragged tensor more PyTorch-y like. #812

Merged: csukuangfj merged 29 commits from the any branch into k2-fsa:master on Sep 7, 2021
Conversation

csukuangfj
Collaborator

This pull request aims to kill k2.RaggedInt and k2.RaggedFloat, replacing them with a single k2.ragged.Tensor class.


Usage example

#!/usr/bin/env python3

import torch
import k2
import _k2

# TODO: will move _k2.ragged to k2.ragged

a = _k2.ragged.tensor([[1, 2], [3, 4.0]])
assert a.dtype == torch.float32

a = _k2.ragged.tensor([[1, 2], [3, 4]])
assert a.dtype == torch.int32

a = _k2.ragged.tensor([[1, 2], [3, 4]], dtype=torch.float64)
assert a.dtype == torch.float64
assert a.device == torch.device("cpu")

a = a.to(torch.device("cuda", 0))
assert a.device == torch.device("cuda", 0)

b = a.to(torch.int32)
assert b.dtype == torch.int32

a = b.to(torch.device("cpu")).to(torch.int64)
assert a.device == torch.device("cpu")
assert a.dtype == torch.int64

assert isinstance(a, k2.ragged.Tensor)

Will create a new class in C++ to wrap Ragged<Any>.
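
For concreteness, here is a minimal sketch of what such a wrapper could look like. It is illustrative only: the header path, member name, and method set are assumptions, not the final k2 API.

```c++
#include <utility>

#include "k2/csrc/ragged.h"  // assumed header providing Ragged<Any>
#include "torch/torch.h"

namespace k2 {

// A single C++ class holds a type-erased Ragged<Any>, so Python sees one
// k2.ragged.Tensor class whose dtype is a runtime property, instead of a
// separate RaggedInt/RaggedFloat class per element type.
struct RaggedAny {
  Ragged<Any> any_;  // the underlying ragged data; dtype stored at runtime

  explicit RaggedAny(Ragged<Any> src) : any_(std::move(src)) {}

  // Conversions mirroring torch.Tensor.to(); implementations omitted here.
  RaggedAny To(torch::ScalarType dtype) const;
  RaggedAny To(torch::Device device) const;
};

}  // namespace k2
```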

@csukuangfj
Collaborator Author

A preview of the documentation can be found at
https://k3.readthedocs.io/en/latest/python_api/tensor.html

Will add usage examples to it once the code is finished.

@danpovey
Collaborator

Cool!
Yes I think this is a good plan.
For others' clarity: the C++ class to wrap Ragged<Any> is so that we can support Torch-compatible backprop in C++. (We also have the option to do this in Python, but C++ will probably be more efficient and more future-proof when we want to do production stuff.)

Likely this backprop would not be used for anything except float, double and half; we'd still do backprop for Ragged<Arc> in Python, I assume, since it's probably better to pass around the gradients for the scores only, not the whole Arc.
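
For reference, here is a minimal, self-contained sketch of what "Torch-compatible backprop in C++" means, using torch::autograd::Function; the SumFunction in this PR follows the same pattern, but this toy version is not the k2 code.

```c++
#include "torch/torch.h"

using torch::autograd::AutogradContext;
using torch::autograd::tensor_list;

// Toy custom autograd function: sums a tensor and propagates the gradient
// back to every element. A k2 version would call SumPerSublist() in
// forward() and scatter grad_output according to the ragged shape in
// backward().
struct ToySumFunction : public torch::autograd::Function<ToySumFunction> {
  static torch::Tensor forward(AutogradContext *ctx, torch::Tensor x) {
    ctx->save_for_backward({x});
    return x.sum();
  }

  static tensor_list backward(AutogradContext *ctx, tensor_list grad_outputs) {
    torch::Tensor x = ctx->get_saved_variables()[0];
    // d(sum)/dx_i == 1, so the incoming gradient is broadcast to x's shape.
    return {grad_outputs[0].expand_as(x)};
  }
};

// Usage:
//   torch::Tensor x = torch::rand({5}, torch::requires_grad());
//   torch::Tensor y = ToySumFunction::apply(x);
//   y.backward();  // x.grad() is now a tensor of ones
```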


// AnyTensor is introduced to support backward propagations on
// RaggedAny since there has to be a tensor involved during backprop
class AnyTensor {
Collaborator Author

This class is still WIP.

Collaborator

I think RaggedAny would be clearer, since AnyTensor doesn't show a connection to "ragged"; it looks like it could be a non-ragged tensor.


namespace k2 {

template <typename T>
Collaborator Author

@danpovey

This shows how to do autograd in C++. The current implementation does not yet give the correct gradient, but it produces a gradient, at least.

Still WIP.


The following screenshot shows some tests for the current commit:

[Screenshot: test output, Aug 26, 2021]


template <typename T>
class SumFunction : public torch::autograd::Function<SumFunction<T>> {
static_assert(std::is_floating_point<T>::value);
Collaborator

I discussed this in person with @csukuangfj ... I feel that we are pushing the templating "too far out" here: this SumFunction does not need to be templated, and we can push the dispatching (i.e. if(float) {...} else if(double) {...}) further inside by overloading the implementation of SumPerSublist for the type Any (that overload would do the dispatching, likely via some macro).

@csukuangfj was concerned that sometimes these implementation wrappers will need to use K2_EVAL(...) for this or that purpose. My response was that we can probably wrap K2_EVAL(..) in some dispatching macro in most cases, and if that turns out to be difficult it's OK to push the dispatching further out, but I don't want to get into the habit of doing the dispatching too far out from the actual implementation code, because:
(i) it will lead to a lot of unnecessary binary code duplication, i.e. the compiler has to create many copies of the functions that aren't meaningfully different, and
(ii) in case we ever merge more closely with the Torch codebase, it will be better to stick to patterns more similar to what they use, and the Torch codebase pushes the dispatching very far in, to directly where the actual data processing happens.

So we agreed that we'll push the dispatching further in, by overloading SumPerSublist for type Any in this case; but I was open to having the dispatching further out on a case-by-case basis, in case this pattern proves to be hard to use in specific cases.
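
As a toy, self-contained illustration of "pushing the dispatching in" (this is not k2 code; all names here are made up): the public entry point is not templated, and the switch over dtypes sits right next to the typed implementation, so the surrounding code is compiled only once.

```c++
#include <cstdint>
#include <stdexcept>

enum class Dtype { kInt32, kFloat32, kFloat64 };

// Typed implementation: the only code that really differs per dtype.
template <typename T>
void SumImpl(const void *data, int64_t n, double *out) {
  const T *p = static_cast<const T *>(data);
  double s = 0;
  for (int64_t i = 0; i != n; ++i) s += p[i];
  *out = s;
}

// Non-templated wrapper: dispatching happens here, next to the
// implementation, so callers (e.g. an autograd Function) need no template.
void Sum(Dtype dtype, const void *data, int64_t n, double *out) {
  switch (dtype) {
    case Dtype::kInt32:   return SumImpl<int32_t>(data, n, out);
    case Dtype::kFloat32: return SumImpl<float>(data, n, out);
    case Dtype::kFloat64: return SumImpl<double>(data, n, out);
  }
  throw std::runtime_error("Unsupported dtype");
}
```

The "far out" alternative would instead template Sum() itself and make every caller do the dtype switch, duplicating the surrounding code per dtype in the binary.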


SumPerSublist<T>(any.any_.Specialize<T>(), initial_value, &values);
return ToTorch(values);
FOR_REAL_AND_INT32_TYPES(t, T, {
Collaborator Author
@csukuangfj Aug 27, 2021

The dispatching decision is now made here.

As SumPerSublist calls the template SegmentedReduce,
maybe we should replace the template SegmentedReduce with a non-templated
version accepting RaggedAny and do dispatching inside it.

That could reduce compilation time and the size of the shared library, I think.

const T *grad_output_data = grad_output.data_ptr<T>();
T *ans_data = ans.data_ptr<T>();

K2_EVAL(
Collaborator Author

Calling a macro inside another macro is not that ugly for this specific case, I think.

namespace k2 {

static constexpr const char *kRaggedAnyInitDataDoc = R"doc(
Create a ragged tensor with two axes.
Collaborator Author

I am trying to put the documentation in C++ headers.
Users can view the help information in the usual way, i.e.,

>>> import k2.ragged
>>> help(k2.ragged.Tensor.__init__)

The output is given below:
[Screenshot: help() output for k2.ragged.Tensor.__init__, Aug 28, 2021]


Also, the doc is going to be hosted at
https://k2-fsa.github.io/k2/index.html

The reason is that the doc now depends on the C++ source code, and we have to compile _k2 to generate it.


A preview is available at
https://csukuangfj.github.io/k2/python_api/tensor.html#k2.ragged.Tensor

Collaborator Author

The doc is easier to discover as it is bound to the actual class. There is no need to create a fake class just for documentation purposes.
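
A sketch of the binding pattern being described, assuming the kRaggedAnyInitDataDoc constant shown in the diff above; the function name, constructor signature, and lambda body are hypothetical, not the merged code:

```c++
#include "pybind11/pybind11.h"

namespace py = pybind11;

namespace k2 {

// The doc string lives in a C++ header and is passed straight to pybind11's
// .def(), so help(k2.ragged.Tensor.__init__) shows it on the real class;
// no doc-only "fake" Python class is needed.
static void PybindRaggedTensor(py::module &m) {  // hypothetical name
  py::class_<RaggedAny> any(m, "Tensor");
  any.def(py::init([](py::list data, py::object dtype) {
            return RaggedAny(data, dtype);  // hypothetical constructor
          }),
          py::arg("data"), py::arg("dtype") = py::none(),
          kRaggedAnyInitDataDoc);
}

}  // namespace k2
```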

@danpovey
Collaborator

Great!

it throws.

>>> import torch
impor>>> import k2.ragged as k2r
Collaborator

delete prefix 'impor'.


An example string for a 3-axis ragged tensor is given below::

[ [[1] [2 3]] [[2] [] [3, 4,]] ]
Collaborator

No comma: change [ [[1] [2 3]] [[2] [] [3, 4,]] ] to [ [[1] [2 3]] [[2] [] [3 4]] ].


Caution:
Currently, only support for dtypes ``torch.int32``, ``torch.float32``, and
``torch.float64`` are implemented. We can support other types if needed.
Collaborator

delete "are implemented"

>>> a = k2r.Tensor([[1], [], [3, 4, 5, 6]])
>>> a.numel()
5
>>> b = k2r.Tensor('[ [[1] [] []] [[2 3]]]')
Collaborator

It is better to use the same constructor in one example.

Collaborator Author

Thanks, fixed.

any.def("remove_values_eq", &RaggedAny::RemoveValuesEq, py::arg("target"));
any.def("argmax_per_sublist", &RaggedAny::ArgMaxPerSublist,
py::arg("initial_value"));
any.def("max_per_sublist", &RaggedAny::MaxPerSublist,
Collaborator

To make the interface more torch-like, I think we should change this to max and add one more parameter like axis, as Dan suggested before. The same applies to the other per-sublist operators.

Of course, we could keep both interfaces if backward compatibility is needed.

Collaborator

Agreed, I think that would be a good idea.
It's OK to have an arg for axis, but only support -1 (or num_axes-1) for now.
Incidentally, in torch, axis is called dim. We should probably stay consistent within k2 for now though.

Collaborator Author

I agree to remove the per-sublist part.

All operations on the values are on the last axis, I think. Is there a need to add an extra axis argument?
How is it supposed to be used by users?

Collaborator
@pkufool Sep 1, 2021

I remember Dan said we have a plan to support operations on other axes; we could add this argument and set its default value to -1. Or we could add the argument when it is needed.
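
A sketch of what the discussed change could look like in the bindings, based on the max_per_sublist binding above; the lambda, the float type of initial_value, and the axis check are illustrative only, not the merged code:

```c++
// Rename "max_per_sublist" to "max" and accept an `axis` argument; anything
// other than the last axis is rejected for now (sketch, not the merged code).
any.def(
    "max",
    [](RaggedAny &self, float initial_value, int axis) {
      K2_CHECK_EQ(axis, -1) << "Only the last axis is supported for now";
      return self.MaxPerSublist(initial_value);
    },
    py::arg("initial_value"), py::arg("axis") = -1);
```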

@@ -686,72 +699,93 @@ static void PybindReplaceFsa(py::module &m) {
}

static void PybindCtcGraph(py::module &m) {
m.def(
"ctc_graph",
[](RaggedAny &symbols, torch::optional<torch::Device> = {},
Collaborator

I think we don't need the device argument here.

Collaborator Author

The purpose is to make the call to k2.linear_fsa in Python as uniform as possible: you don't need to know the type of labels.

ragged_arc = _k2.linear_fsa(labels, device)

context = GetCpuContext();
else
context = GetCudaContext(gpu_id);
torch::optional<torch::Device> device = {}, bool modified = false,
Collaborator

What's the benefit of passing the device argument as both torch::Device and std::string? Why not use py::object to avoid implementing it twice?

We can get the context as in the code below, whether the argument passed in is "cuda:0" or a torch.device.

ContextPtr GetContext(py::object device_obj) {
  auto device = torch::Device(static_cast<py::str>(device_obj));
  if (device.type() == torch::kCPU) {
    return GetCpuContext();
  } else if (device.type() == torch::kCUDA) {
    return GetCudaContext(device.index());
  } else {
    K2_LOG(FATAL) << "Unsupported device: " << device.str();
    return GetCpuContext();   // unreachable code
  }
}

Collaborator Author

Yes, your approach would work, but it produces less beautiful documentation.

The current implementation produces the following doc

import _k2
help(_k2.linear_fsa)
linear_fsa(...) method of builtins.PyCapsule instance
    linear_fsa(*args, **kwargs)
    Overloaded function.

    1. linear_fsa(labels: k2::RaggedAny, device: Optional[torch::Device] = None) -> k2::Ragged<k2::Arc>

    2. linear_fsa(labels: List[int], device: Optional[torch::Device] = None) -> k2::Ragged<k2::Arc>

    3. linear_fsa(labels: List[int], device: Optional[str] = None) -> k2::Ragged<k2::Arc>

    4. linear_fsa(labels: List[List[int]], device: Optional[torch::Device] = None) -> k2::Ragged<k2::Arc>

    5. linear_fsa(labels: List[List[int]], device: Optional[str] = None) -> k2::Ragged<k2::Arc>

Collaborator

Cool! Thanks for the explanation!

@pkufool mentioned this pull request on Sep 7, 2021
@csukuangfj changed the title from "WIP: Make k2 ragged tensor more PyTorch-y like." to "Make k2 ragged tensor more PyTorch-y like." on Sep 7, 2021
@csukuangfj
Collaborator Author

Ready for review and merge.

I know it contains lots of changes, though most of them are documentation (several thousand lines of documentation).

All existing test cases pass. No test case was removed.


We need to add more tests to k2/python/tests/ragged_tensor_test.py.

Help is wanted (even a single test function added to that file is appreciated).

@csukuangfj
Collaborator Author

Will add more tutorials to docs/source/python_tutorials/ragged when I have time.

@danpovey
Collaborator

danpovey commented Sep 7, 2021

Great work!
I looked it over briefly and it looks great.
I think it's OK to merge.

@csukuangfj
Collaborator Author

Merging.

Will release a new version tomorrow.

The documentation of this pull request can be found at
https://k2-fsa.github.io/k2/python_api/api.html#k2-ragged

It contains usage examples for almost every API.
[Screenshot: documentation preview, Sep 7, 2021]

@csukuangfj merged commit fbb10a0 into k2-fsa:master on Sep 7, 2021
@csukuangfj deleted the any branch on September 7, 2021, 12:32