Bidirectional RNN #708

Open
NeroBlackstone wants to merge 12 commits into main from Bidirectional

Conversation

NeroBlackstone
Contributor

issue #687

Please confirm whether the interface meets the requirements. Thank you.

codecov bot commented Jun 16, 2024

Codecov Report

Attention: Patch coverage is 0% with 6 lines in your changes missing coverage. Please review.

Project coverage is 76.81%. Comparing base (1a61165) to head (e7ae725).

Files                      Patch %    Lines
src/layers/recurrent.jl    0.00%      6 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (1a61165) and HEAD (e7ae725).

HEAD has 25 fewer uploads than BASE: 38 uploads for BASE (1a61165) vs. 13 for HEAD (e7ae725).
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #708       +/-   ##
===========================================
- Coverage   96.40%   76.81%   -19.60%     
===========================================
  Files          54       54               
  Lines        2726     2730        +4     
===========================================
- Hits         2628     2097      -531     
- Misses         98      633      +535     


NeroBlackstone changed the title from [WIP] Bidirectional RNN to Bidirectional RNN on Jun 17, 2024
if mode != "AMDGPU"
# gradients test failed after vcat
# __f = p -> sum(Base.Fix1(sum, abs2), first(bi_rnn(x, p, st)))
# @eval @test_gradients $__f $ps atol=1e-2 rtol=1e-2 gpu_testing=$ongpu
Contributor Author

The gradient test fails for the matrix produced after vcat. I'm not sure whether the gradient-test code is written incorrectly or whether my Bidirectional implementation is incorrect.

Member

Which backend was failing? The basic pipeline is: compute with Zygote, then compute with FiniteDifferences, ReverseDiff, Tracker, ..., and compare those against Zygote.
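
For reference, a minimal sketch of that comparison pipeline (an editorial illustration, not code from the PR; it assumes bi_rnn, x, ps, and st are set up as in the commented-out test above):

using Zygote, FiniteDifferences

__f = p -> sum(Base.Fix1(sum, abs2), first(bi_rnn(x, p, st)))
gs_zygote = only(Zygote.gradient(__f, ps))
gs_fdm = only(FiniteDifferences.grad(central_fdm(5, 1), __f, ps))
# Compare the two gradient structures field by field with isapprox(...; atol=1e-2, rtol=1e-2);
# this is roughly what the @test_gradients macro automates across backends.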

@NeroBlackstone
Contributor Author

@avik-pal Hi! I have completed the implementation of Bidirectional and written test code, trying to keep it as close to Keras' API as possible. Please review the changes, thank you!
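
For context, a hypothetical usage sketch of the proposed layer (the constructor form, input shape, and output layout are assumptions inferred from the diff hunks quoted later in this thread, not confirmed API):

using Lux, Random

rng = Random.default_rng()
bi_rnn = BidirectionalRNN(RNNCell(3 => 4))   # backward cell defaults to a copy of the forward cell
ps, st = Lux.setup(rng, bi_rnn)
x = rand(rng, Float32, 3, 5, 2)              # assumed (in_dims, timesteps, batch_size) for BatchLastIndex ordering
y, st_ = bi_rnn(x, ps, st)                   # with merge_mode=vcat, each per-timestep output has 4 + 4 features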

@NeroBlackstone
Contributor Author

😭😭😭 Any suggestions? I will fix it today.

src/layers/recurrent.jl (outdated)

## Parameters

- Same as `cell` and `backward_cell`.
Member

It should be a NamedTuple with cell and backward_cell.

src/layers/recurrent.jl (outdated)
backward_cell::Union{AbstractRecurrentCell, Nothing}=nothing,
merge_mode::Union{Function, Nothing}=vcat,
ordering::AbstractTimeSeriesDataBatchOrdering=BatchLastIndex())
if !isnothing(backward_cell) && cell.in_dims != backward_cell.in_dims
Member

Check via backward_cell !== nothing; there used to be a performance consideration for not using isnothing (not sure whether that still holds on current Julia versions, but better to be safe).
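
In other words (keeping the rest of the condition unchanged; the error branch is not shown in the diff, so it is omitted here as well):

if backward_cell !== nothing && cell.in_dims != backward_cell.in_dims
    # ... same error handling as before ...
end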

src/layers/recurrent.jl (outdated)
backward_rnn_layer = isnothing(backward_cell) ? deepcopy(layer) :
Recurrence(backward_cell; return_sequence=true, ordering)
fuse_op = isnothing(merge_mode) ? nothing : Broadcast.BroadcastFunction(merge_mode)
return Parallel(
Member

I was thinking maybe we create a struct

struct BidirectionalRNN <: ....
    model
end

(rnn::...)(...) = rnn.model(....)

This is mostly to be consistent with other wrapper layers in Lux, which return their own type rather than Parallel (as happens here).
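
A slightly more complete version of that sketch (the supertype is an assumption; Lux's container layers at the time subtyped AbstractExplicitContainerLayer parameterized by the wrapped field names):

struct BidirectionalRNN <: AbstractExplicitContainerLayer{(:model,)}
    model::Parallel
end

(rnn::BidirectionalRNN)(x, ps, st::NamedTuple) = rnn.model(x, ps, st)

The constructor can then keep building the Parallel exactly as before and simply return it wrapped in BidirectionalRNN; this matches the model::Parallel hunk quoted further down in the thread.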

avik-pal linked an issue on Jun 19, 2024 that may be closed by this pull request
y_, st__ = bi_rnn_no_merge(x, ps, st)

@jet bi_rnn(x, ps, st)
@jet bi_rnn_no_merge(x, ps, st)
Contributor Author

NeroBlackstone commented Jun 20, 2024

@avik-pal Hi, I have updated the implementation, but the @jet test fails and I can't find the reason. Could you please take a look at it when you have time? 😭😭😭

Member

Yeah, the tests won't really tell you much here. Install JET.jl and run @report_call bi_rnn_no_merge(x, ps, st) target_modules=[Lux, LuxLib, LuxCore] and @report_opt bi_rnn_no_merge(x, ps, st) target_modules=[Lux, LuxLib, LuxCore]. This will give you a list of the function calls that cause the test to fail.

Contributor Author

@avik-pal Hi, I tried @report_call bi_rnn_no_merge(x, ps, st) target_modules=[Lux, LuxLib, LuxCore], but it reported that @report_call expects only one non-keyword argument, so I removed target_modules=[Lux, LuxLib, LuxCore]. I also replaced @jet with @report_call and @report_opt; all tests passed and no errors were thrown.

Is this a LuxTestUtils bug? Could you please check this branch locally? Thank you very much!!

Member

No, don't replace @jet with the report_* functions; you need to run those two in the REPL / VSCode, and they will point to the correct line. The keyword options go before the call expression: @report_call target_modules=[Lux, LuxLib, LuxCore] bi_rnn_no_merge(x, ps, st). See https://aviatesk.github.io/JET.jl/dev/tutorial/#Analyse-methods-with-@report_call

@NeroBlackstone
Contributor Author

@avik-pal 🥹🥹 Hi, could you please help me review the PR and find the reason the @jet test failed? I have no idea what is causing it.

Thank you very much ♥️♥️♥️♥️

model::Parallel
end

(rnn::BidirectionalRNN)(x, ps, st::NamedTuple) = rnn.model(x, ps, st)
Contributor Author

@avik-pal Hi! JET.jl @report_opt bi_rnn(x, ps, st) reports:

(::BidirectionalRNN)(x::Array{…}, ps::@NamedTuple{…}, st::@NamedTuple{…}) @ Lux ./Lux.jl/src/layers/recurrent.jl:690
│ runtime dispatch detected: %1::Parallel(x::Array{Float32, 3}, ps::@NamedTuple{layer_1::@NamedTuple{…}, layer_2::@NamedTuple{…}}, st::@NamedTuple{layer_1::@NamedTuple{…}, layer_2::@NamedTuple{…}})::Any

I have no idea why it says runtime dispatch here... Chain has the same kind of method signature: (c::Chain)(x, ps, st::NamedTuple) = applychain(c.layers, x, ps, st)

Member

It should have more information in the report

Contributor Author

🥲 I'm sorry, the report has only the two lines I posted above.

@NeroBlackstone
Contributor Author

@avik-pal Sorry to bother you. I still don't know how to solve the runtime dispatch error for this Julia code...

(rnn::BidirectionalRNN)(x, ps, st::NamedTuple) = rnn.model(x, ps, st)
julia> @report_opt bi_rnn(x, ps, st)
═════ 1 possible error found ═════
┌ (::BidirectionalRNN)(x::Array{…}, ps::@NamedTuple{…}, st::@NamedTuple{…}) @ Lux ./Lux.jl/src/layers/recurrent.jl:690
│ runtime dispatch detected: %1::Parallel(x::Array{Float32, 3}, ps::@NamedTuple{layer_1::@NamedTuple{…}, layer_2::@NamedTuple{…}}, st::@NamedTuple{layer_1::@NamedTuple{…}, layer_2::@NamedTuple{…}})::Any
└────────────────────
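
One plausible explanation (an editorial note, not something stated in this thread): Parallel is itself a parametric type, so a field declared as model::Parallel is abstractly typed and the call rnn.model(x, ps, st) cannot be resolved statically; Chain avoids this because its layers are stored in a type parameter. Parameterizing the field keeps it concrete, for example (same supertype assumption as in the earlier sketch):

struct BidirectionalRNN{M <: Parallel} <: AbstractExplicitContainerLayer{(:model,)}
    model::M
end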

@avik-pal
Member

I will take a look on the weekend

@NeroBlackstone
Contributor Author

I will take a look on the weekend

😥 Hi, could you help me review this PR?

avik-pal force-pushed the Bidirectional branch 5 times, most recently from b97e29a to e7ae725 on June 30, 2024 19:15
@avik-pal
Member

Fix the gradient tests and it should be fine. The failures probably originate from the lazy-reverse rrules for Zygote not being defined for GPU arrays.

@NeroBlackstone
Contributor Author

Fix the gradient tests and it should be fine. The failures probably originate from the lazy-reverse rrules for Zygote not being defined for GPU arrays.

Thank you for your help!

Some gradient tests still fail here; I have no idea how to define the lazy-reverse rrules... I guess the rest of the implementation may have to be left to you, thanks!

19:22:03 | maxrss 20.0% | mem 67.2% | DONE  (1/1) test item "Bidirectional" 112.9 secs (68.5% compile, 0.1% recompile, 6.2% GC), 188.79 M allocs (13.948 GB)
Test Summary:                          | Pass  Error  Total     Time
ReTestItem Tests                       |  120     12    132  2m08.7s
  Bidirectional                        |   60      6     66  2m03.7s
    cpu                                |   30      3     33  1m08.8s
      cell: RNNCell                    |   10      1     11    52.2s
      cell: LSTMCell                   |   10      1     11     8.0s
      cell: GRUCell                    |   10      1     11     8.6s
    cuda                               |   30      3     33    43.5s
      cell: RNNCell                    |   10      1     11    27.5s
      cell: LSTMCell                   |   10      1     11     7.8s
      cell: GRUCell                    |   10      1     11     8.3s
  Lux                                  |   60      6     66  2m05.6s
    test                               |   60      6     66         
      test/layers                      |   60      6     66         
        test/layers/recurrent_tests.jl |   60      6     66         
          Bidirectional                |   60      6     66  2m03.7s
            cpu                        |   30      3     33  1m08.8s
              cell: RNNCell            |   10      1     11    52.2s
              cell: LSTMCell           |   10      1     11     8.0s
              cell: GRUCell            |   10      1     11     8.6s
            cuda                       |   30      3     33    43.5s
              cell: RNNCell            |   10      1     11    27.5s
              cell: LSTMCell           |   10      1     11     7.8s
              cell: GRUCell            |   10      1     11     8.3s
ERROR: LoadError: Some tests did not pass: 120 passed, 0 failed, 12 errored, 0 broken.
in expression starting at /home/nero/Documents/github/Lux.jl/test/runtests.jl:75

@NeroBlackstone
Contributor Author

@avik-pal Hi, I'm sorry to bother you again. I want to keep pushing this PR forward, but it seems there is a limit to what I can do. Are the "lazy reverse rrules" a feature that Zygote is missing? Do I need to open an issue for Zygote?

@avik-pal
Member

avik-pal commented Jul 4, 2024

Kind of, but not worth opening a Zygote issue for this.

If you look at https://buildkite.com/julialang/lux-dot-jl/builds/2994#01906a93-9e6b-4704-a6a1-d3b8e82bb694/350-1748, it is saying that we are doing a broadcasted vcat of a Vector{<:CuArray} and an Iterators.Reverse{<:CuArray}. Zygote doesn't have a rule for that.

The easiest way to resolve this would be to turn the Iterators.Reverse into a Vector; since that effectively materializes a vector of pointers, it is not expensive either.
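
A minimal runnable sketch of that change (the data here are CPU stand-ins for the per-timestep outputs; in the PR they would be CuArrays produced by the forward and backward recurrences):

# stand-ins for the forward outputs and the lazily reversed backward outputs
ys_forward       = [rand(Float32, 4, 2) for _ in 1:5]
ys_backward_lazy = Iterators.reverse([rand(Float32, 4, 2) for _ in 1:5])

ys_backward = collect(ys_backward_lazy)   # materialize: copies references to the arrays, not the array data
y = vcat.(ys_forward, ys_backward)        # broadcasted vcat now sees two plain Vectors, avoiding the missing rule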

@NeroBlackstone
Contributor Author

make the Iterators.Reverse into a Vector,

@avik-pal Thank you very much for pointing me in the right direction, but I don't really understand "make the Iterators.Reverse into a Vector" here; I think Iterators.Reverse already supports vectors:

julia> vec = [1,2,3,4,5]
julia> foreach(println, Iterators.reverse(vec))
5
4
3
2
1

Could you give me a few example inputs and outputs, and the signature of the function to define? Then I can implement it.

Successfully merging this pull request may close these issues.

Feature request: Bidirectional for RNN layer.