Model structure redundancy #64

Open
grig-guz opened this issue Aug 10, 2020 · 2 comments

grig-guz commented Aug 10, 2020

Hi,

The span width embedding over here:

span_width_emb = tf.get_variable("span_width_prior_embeddings", [self.config["max_span_width"], self.config["feature_size"]], initializer=tf.truncated_normal_initializer(stddev=0.02)) # [W, emb]

seems largely redundant with the span width embedding below, since that width embedding is concatenated to the other span embeddings and then passed through a linear layer:

span_width_emb = tf.gather(tf.get_variable("span_width_embeddings", [self.config["max_span_width"], self.config["feature_size"]], initializer=tf.truncated_normal_initializer(stddev=0.02)), span_width_index) # [k, emb]
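
To make the redundancy argument concrete, here is a minimal pure-Python sketch. It assumes the scorer is a single linear layer with no bias and no hidden layers, which may not match the repository's actual scorer; all names and sizes (`MAX_SPAN_WIDTH`, `FEATURE_SIZE`, `CONTENT_SIZE`, `score_concat`, `score_split`) are hypothetical and chosen only for illustration. Under that assumption, scoring the concatenation `[content; width_emb]` splits exactly into a content term plus one scalar per width, which is what a separate width prior would contribute.

```python
import random

random.seed(0)

MAX_SPAN_WIDTH = 4  # hypothetical stand-in for config["max_span_width"]
FEATURE_SIZE = 3    # hypothetical stand-in for config["feature_size"]
CONTENT_SIZE = 5    # hypothetical size of the non-width part of a span embedding


def rand_vec(n):
    """Small random vector, loosely mimicking a stddev=0.02 initializer."""
    return [random.gauss(0.0, 0.02) for _ in range(n)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# One embedding row per span width (the table indexed by span_width_index).
width_table = [rand_vec(FEATURE_SIZE) for _ in range(MAX_SPAN_WIDTH)]

# ASSUMPTION: a single linear scoring layer over [content; width_emb].
weights = rand_vec(CONTENT_SIZE + FEATURE_SIZE)
w_content, w_width = weights[:CONTENT_SIZE], weights[CONTENT_SIZE:]


def score_concat(content, width_index):
    """Score the concatenated span embedding with one linear layer."""
    return dot(weights, content + width_table[width_index])


# The width-only part of that score: one scalar per width, like a learned prior.
implied_width_prior = [dot(w_width, row) for row in width_table]


def score_split(content, width_index):
    """The same score, written as a content term plus a per-width scalar."""
    return dot(w_content, content) + implied_width_prior[width_index]


content = rand_vec(CONTENT_SIZE)
for width_index in range(MAX_SPAN_WIDTH):
    diff = score_concat(content, width_index) - score_split(content, width_index)
    assert abs(diff) < 1e-12
```

If the scorer has hidden layers with a nonlinearity, this exact decomposition no longer holds, so the two embedding tables would not be strictly interchangeable in that case.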

I am trying to reimplement your model in PyTorch, so I was wondering whether there is any rationale for using two sets of span width embeddings.

Thank you.

Fantabulous-J commented Aug 13, 2020

Hi @grig-guz! I have also implemented this model in PyTorch, but my results are consistently around 1.2 F1 below the official results reported in the paper. How is your implementation going? Maybe we could share ideas and experiences with each other.

grig-guz (Author) commented

Hi @Fantabulous-J, sure. I've got around 74 F1 on the dev set with SpanBERT-Base; I haven't run it on the test set yet. My email is on my GitHub page, so you can write to me there.
