
added GlobalMaxPool, GlobalMeanPool, and flatten layers #950

Merged Mar 8, 2020 · 4 commits

Conversation

@gartangh (Contributor) commented Dec 2, 2019

No description provided.
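(Since the description is empty, here is a rough sketch of what the two new global pooling layers in the title do. Shapes follow the (w,h,c,b) convention used in the diff below; this is an illustration, not the PR's code.)

using Flux

x = rand(Float32, 6, 6, 32, 10)  # (w, h, c, b)

# Global pooling collapses the full w×h window to 1×1 per channel:
size(GlobalMeanPool()(x))  # (1, 1, 32, 10)
size(GlobalMaxPool()(x))   # (1, 1, 32, 10)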

"""
Transforms (w,h,c,b)-shaped input into (w*h*c,b)-shaped output,
by linearizing all values for each element in the batch.
"""
struct Flatten end
Contributor:
Does this actually have to be a struct, or could it just as well be a function?

Member:

About to say the same: we don't really need a struct here, a plain function should be sufficient.

@gartangh (Contributor Author) commented Dec 3, 2019:

The reason I did this is that I am treating flattening as a layer (like Conv and Pool) and not as a function (like softmax). Technically, it could also have an activation function as a property.

struct Flatten{F}
  σ::F
  function Flatten(σ::F = identity) where {F}
    return new{F}(σ)
  end
end

function (f::Flatten)(x::AbstractArray)
  σ = f.σ
  # Flatten all but the batch dimension, then apply the activation:
  σ(reshape(x, :, size(x)[end]))
end

Keras also has flattening implemented as a layer btw.
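For illustration, the struct version above would be used like this (my sketch, assuming the definition above and Flux's softmax; not code from this PR):

using Flux: softmax

f = Flatten(softmax)           # activation stored on the layer
x = rand(Float32, 4, 4, 3, 2)  # (w, h, c, b)
y = f(x)                       # reshape to (48, 2), then softmax per batch column
size(y)                        # (48, 2)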

I can also change the whole part on flattening to:

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

Would that be the way to go?
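Either way, the shape behavior is the same. A quick check of the one-liner (a sketch):

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

x = rand(Float32, 28, 28, 1, 32)  # (w, h, c, b)
size(flatten(x))                  # (784, 32): w*h*c values per batch element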

Contributor:

A layer is a function. Since this layer does not need to store any parameters, a regular function is enough. Layers with internal parameters are callable structs, i.e., they are functions as well.

> flatten(x::AbstractArray) = reshape(x, :, size(x)[end])
> Would that be the way to go?

I would say so :)

Contributor Author:

I still tend more towards the solution with the struct.
The Flatten layer could have properties like an activation function, the number of parameters, a name, the dimensions it flattens, the data format, ...

Contributor:

It could, but your implementation does not. An activation function is just the next function in the chain, or alternatively the activation function of the preceding layer, as flatten is a linear operation. No other layers in Flux have names.

Contributor Author:

On the other hand, the Conv layers do have their activation function as a property.
You can thus specify Conv((3,3), 32=>64, softmax), which is a bit cleaner than splitting the operation into a layer and a function.
Maybe we need both? A flatten function that can be called whenever necessary, and a Flatten layer that calls that function but also carries other properties?
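For context, the two styles under discussion look like this (a hedged sketch using relu as the activation, since Flux broadcasts the stored activation elementwise):

using Flux

Conv((3, 3), 32 => 64, relu)                  # activation stored as a property of the layer
Chain(Conv((3, 3), 32 => 64), x -> relu.(x))  # activation as the next function in the chain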

@gartangh (Contributor Author) commented Dec 5, 2019:

> It could, but your implementation does not.

I would commit the changes where the activation function is used as a property of the Flatten layer first, before this gets merged.

@DhairyaLGandhi (Member) commented Dec 5, 2019:

Under Flux, there is nothing special about a layer and the way it's defined; a function or any other transform is treated exactly the same. For the layers here, since we don't really have any parameters, we should be good with a simple layer (see the stateless layers as an example). The function can be an input, I feel.

EDIT: If we were to add the parameters, making it a closure makes sense, and therefore the struct.
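For context, a plain function used as a stateless layer in a Chain looks like this (a hedged sketch; the model shape is hypothetical and assumes 28×28 single-channel input):

using Flux

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

model = Chain(
  Conv((3, 3), 1 => 16, relu),  # 28×28×1 → 26×26×16
  MaxPool((2, 2)),              # → 13×13×16
  flatten,                      # stateless: a plain function, no struct needed
  Dense(13 * 13 * 16, 10),
  softmax,
)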

Contributor Author:

I have added a flatten function that is called in the Flatten layer, and added support for activation functions in that layer (see the second commit).
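(A sketch of the shape of that change, not the exact diff:)

flatten(x::AbstractArray) = reshape(x, :, size(x)[end])

struct Flatten{F}
  σ::F
end

Flatten() = Flatten(identity)

# The layer defers to the flatten function, then applies its activation:
(f::Flatten)(x::AbstractArray) = f.σ(flatten(x))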

@gartangh (Contributor Author) left a comment:

Is this ready to be merged now?

@MikeInnes (Member):

I think a flatten that respects batches is useful, but the layer seems unnecessary. Haven't reviewed everything yet.

@CarloLucibello (Member):

Hi @gartangh, this PR looks good; could you rebase?

Also, maybe you could remove the Flatten struct, as that seems to be the majority's opinion, and document and export the flatten function.
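(For reference, documenting and exporting the function would look roughly like this, reusing the docstring from the diff above; a sketch, not the PR's exact code:)

export flatten

"""
    flatten(x::AbstractArray)

Transforms (w,h,c,b)-shaped input into (w*h*c,b)-shaped output,
by linearizing all values for each element in the batch.
"""
flatten(x::AbstractArray) = reshape(x, :, size(x)[end])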

@gartangh (Contributor Author):

Hi @CarloLucibello, I rebased, removed the Flatten struct, and exported the flatten function as you asked.

@DhairyaLGandhi (Member):

Could you perhaps rebase on master? I just updated the environment, since we had a fix land in Zygote.

@CarloLucibello CarloLucibello changed the title added GlobalMaxPool, GlobalMeanPool, and Flatten layers added GlobalMaxPool, GlobalMeanPool, and flatten layers Feb 27, 2020
MeanPool
GlobalMeanPool
DepthwiseConv
ConvTranspose
CrossCor
Member:

Add flatten here? I'm not sure if it is the right place; maybe we could use the NNlib section for the functional alternatives to struct layers, but I think adding flatten here is OK for the time being.

Contributor Author:

@CarloLucibello, I rebased and added the flatten documentation tag.

@DhairyaLGandhi (Member):

LGTM

@CarloLucibello (Member):

Nice, thanks!

bors r+

@bors (bot) commented Mar 8, 2020:

Build succeeded

bors bot merged commit d4cf143 into FluxML:master on Mar 8, 2020.