
What is the principle behind BERT-flow? I still don't get it. #20

Open
guotong1988 opened this issue May 24, 2023 · 3 comments

Comments

@guotong1988

Thanks a lot.

@iweirman

Just took a look, so here is a quick, shallow answer; if anything is off, please go easy on me.

Paper: https://arxiv.org/pdf/2011.05864.pdf

To understand the paper, the key parts are Figure 1 and Equation 4. They state it clearly: BERT sentence embeddings are mapped into a standard Gaussian latent space.
Look at the code in these two places: objective_tower and top_prior.
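
For intuition, here is a minimal runnable sketch of that mapping, using the standard normalizing-flow change-of-variables objective that the paper's equation has. This is not the repo's code; the elementwise affine flow and all names below are made up for illustration:

import numpy as np

# Toy invertible "flow": u = exp(s) * z + b elementwise, so the inverse is
# z = (u - b) * exp(-s), and log|det dz/du| = -sum(s).
def flow_log_likelihood(u, s, b):
    z = (u - b) * np.exp(-s)  # pull the embedding back into the latent space
    # log-density of z under a standard Gaussian prior, log N(z; 0, I)
    log_prior = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2.0 * np.pi)
    log_det = -np.sum(s)        # log-det-Jacobian of the inverse map
    return log_prior + log_det  # log p(u); training maximizes this

u = np.random.randn(768)  # stand-in for a BERT sentence embedding
print(flow_log_likelihood(u, s=np.zeros(768), b=np.zeros(768)))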

For the unsupervised case, mainly look at model_fn_builder.

@guotong1988
Author

What are the training objectives for the supervised and the unsupervised settings, respectively?

@iweirman

iweirman commented Aug 18, 2023

The objective is the same. The difference between unsupervised and supervised here is just whether the supervised loss is added in; see model_fn_builder and the sketch below. What gets trained is a mapping function between vector spaces; assuming that function is invertible, a flow is added to learn it. (The previous comment already covered this.)
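
Schematically, the toggle looks like this (a hypothetical sketch, not the actual model_fn_builder; all names are made up):

def total_loss(nll_loss, supervised_loss, use_supervised):
    # The flow's negative log-likelihood is always present; the supervised
    # loss is simply added on top when labeled data is used.
    return nll_loss + supervised_loss if use_supervised else nll_loss

print(total_loss(2.0, 0.5, use_supervised=True))   # 2.5
print(total_loss(2.0, 0.5, use_supervised=False))  # 2.0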

What you are really asking is probably what the unsupervised setting actually learns; that lives in the objective_tower function.

with arg_scope(ops, init=init):
    encoder = glow_ops.encoder_decoder

    # Forward pass through the flow: map x to the latent z, accumulating
    # the flow's log-det-Jacobian terms in encoder_objective.
    self.z, encoder_objective, self.eps, _, _ = encoder(
        "flow", x, self.hparams, eps=None, reverse=False)
    objective += encoder_objective

    # Score z under the top prior (a Gaussian) and add its log-probability.
    self.z_top_shape = get_shape_list(self.z)
    prior_dist = self.top_prior()
    prior_objective = tf.reduce_sum(
        prior_dist.log_prob(self.z), axis=[1, 2, 3])
    objective += prior_objective

# bits per pixel: negate the log-likelihood (nats), convert to bits by
# dividing by log(2), and average over the h * w * c dimensions.
_, h, w, c = get_shape_list(x)
objective = -objective / (np.log(2) * h * w * c)

These are the main lines to look at.
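
As a sanity check on that last line: it negates the log-likelihood (so it becomes a loss to minimize), converts nats to bits by dividing by log(2), and averages over the h * w * c dimensions. For sentence embeddings, treating the vector as a 1 x 1 x 768 "image" is my assumption, not something I verified in the repo:

import numpy as np

objective_nats = -1000.0   # hypothetical summed log-likelihood, in nats
h, w, c = 1, 1, 768        # assumed shape: embedding viewed as a 1x1x768 "image"
loss = -objective_nats / (np.log(2) * h * w * c)
print(loss)                # ~1.88 bits per dimension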
