Fix TODO in sample.sample_sequences - Avoid 'leaving last token calculation to while loop' #119

Merged: 6 commits into openai:master on May 31, 2019

Conversation

@albertwujj (Contributor) commented Apr 12, 2019

Hi,

This change runs the initial model step on the full context by calling the body() function. I added a 'first' parameter, defaulting to False, to allow this.

@bladedsupernova

Are you saying the AI was not considering all my context? Whatttt... huh? When I tested it, it was great. What is the new change, in plain English?

@albertwujj (Contributor Author)

Run the initial prompt through the model all at once, instead of feeding the final token to it separately. It's just a tiny performance improvement.

@bladedsupernova

How did you find it? Are you a collaborator, or did you simply analyze the GPT-2 code in depth? I'm interested in paying someone a lot to help me understand GPT-2... let me know...

@WuTheFWasThat (Contributor) left a comment


thanks for this! had some suggestions for making it nicer

src/sample.py (Outdated)
-        def body(past, prev, output):
-            next_outputs = step(hparams, prev[:, tf.newaxis], past=past)
+        def body(past, prev, output, first=False):
+            next_outputs = step(hparams, prev if first else prev[:, tf.newaxis], past=past)
@WuTheFWasThat (Contributor)

how about we always just make it prev, and don't do the later squeeze?

src/sample.py (Outdated)
             logits = next_outputs['logits'][:, -1, :] / tf.to_float(temperature)
             logits = top_k_logits(logits, k=top_k)
             samples = tf.multinomial(logits, num_samples=1, output_dtype=tf.int32)
             return [
-                tf.concat([past, next_outputs['presents']], axis=-2),
+                next_outputs['presents'] if first else tf.concat([past, next_outputs['presents']], axis=-2),
@WuTheFWasThat (Contributor)

i think this should probably be if past is not None instead of if first

src/sample.py (Outdated)
             ]

+        past, prev, output = body(None, context, None, first=True)
@WuTheFWasThat (Contributor)

i would remove the first=True flag, and change output to just be tf.zeros([batch_size, 0]) or something like that

@albertwujj (Contributor Author)

Got it, I was considering that. Mine is simpler to read, but yours is just better.

@albertwujj (Contributor Author)

OK, so this was actually the source of the bug: we need to pass the context as the output.
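For context, the calling scripts (e.g. interactive_conditional_samples.py) slice the prompt length back off the returned tokens, so the loop's output variable has to start as the context. A minimal toy sketch of what goes wrong otherwise (toy code, not the real sample.py; the fake token 99 stands in for whatever the model would sample):

```python
import tensorflow as tf  # assumes TensorFlow 1.x

batch_size, prompt_len = 1, 4
context = tf.constant([[10, 11, 12, 13]], dtype=tf.int32)  # dummy prompt tokens

def toy_step(output):
    # Pretend the model deterministically samples token 99 and append it.
    samples = tf.fill([batch_size, 1], 99)
    return tf.concat([output, samples], axis=1)

tokens_ok = toy_step(context)                                     # output starts as the context
tokens_bad = toy_step(tf.zeros([batch_size, 0], dtype=tf.int32))  # output starts empty (the bug)

with tf.Session() as sess:
    ok, bad = sess.run([tokens_ok, tokens_bad])
    print(ok[:, prompt_len:])   # [[99]] -- the generated token survives the prompt slice
    print(bad[:, prompt_len:])  # []     -- the generated token is sliced away with the "prompt"
```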

@albertwujj (Contributor Author) commented May 24, 2019

In order to set the seed, I made this single change (not sure how Fire works), then just ran 'python interactive_conditional_samples.py' on the 117M model.
However, it seems that this does not produce the same output for the same prompt. This is from the original GPT-2 repo that I just forked:
image

And this is from my current fork (which has pulled from upstream):
image

Are there more random seeds to be fixed?

@WuTheFWasThat (Contributor)

tensorflow seeds are dependent on the graph :( silly design
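In case it helps, a small sketch of what graph-dependent seeding means in TF 1.x (my own toy example, not from this repo): the graph-level seed combines with each op's position in the graph, so building a slightly different graph changes the per-op seeds, and therefore the samples, even with the same set_random_seed call.

```python
import tensorflow as tf  # assumes TensorFlow 1.x

def sample_with(extra_op_first):
    g = tf.Graph()
    with g.as_default():
        tf.random.set_random_seed(1957)
        if extra_op_first:
            _ = tf.random.uniform([1])  # an unrelated random op created earlier in the graph
        x = tf.random.uniform([3])      # the op whose samples we compare
    with tf.Session(graph=g) as sess:
        return sess.run(x)

print(sample_with(False))
print(sample_with(True))   # different values, despite the identical graph-level seed
```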

@albertwujj (Contributor Author)

hmm, how do i fix that?

@WuTheFWasThat (Contributor)

I would just test with k=1 or something. also, your outputs definitely look buggy

@albertwujj (Contributor Author)

Maybe, though not 'buggy', as I think I did exactly what the TODO requested. I'll take a look at the output and re-check whether the TODO change really should not change the output, though.

@albertwujj (Contributor Author) commented May 24, 2019

Remember, this is 117M. The second output looks fine.

@albertwujj (Contributor Author)

Actually, is there anyone you can ask about the TODO request? I do not think it would change the output.

@albertwujj (Contributor Author) commented May 24, 2019

Oh yeah you do mean top_k, nvm. Thanks for the tip!

@albertwujj (Contributor Author)

To put it very briefly, the TODO fix does not change the output because all decoder states from past iterations are passed along to the current iteration either way.

@albertwujj (Contributor Author) commented May 24, 2019

Close but not quite.
Original:
image

My fork:
image

Same prompt as above. Seems like just a bug, or at least something fixable: the output is missing at the beginning, but the rest repeats the original exactly. I'll run more tests.

@albertwujj (Contributor Author) commented May 24, 2019

Wait, about your 'depends on the graph' comment: I have not made your suggested change yet. I am testing with this; see my fork for the exact repo I am using. Aside from the top of the file, my version is here:
image

@albertwujj (Contributor Author) commented May 24, 2019

OK yes, it seems like my fork is completely correct except that it is missing the non-repeating part of the output.
Original:
image
My fork:
image

The prompt:
image

@albertwujj (Contributor Author) commented May 24, 2019

Damn, the model took the next few words right out of the article: https://www.theverge.com/2019/5/2/18525323/detective-pikachu-review-pokemon (jk, obviously the article was released afterwards).

I'll get to looking into the bug with my fork.

@albertwujj (Contributor Author) commented May 24, 2019

Testing one more time with the prompt 'hi':

orig
image
my fork
image
Missing the comma.

@albertwujj (Contributor Author) commented May 24, 2019

Juuuuust in case, tested on Unconditional, with the same seed fixes at the top of the file:
orig
image
my fork
image
Missing a newline!
Really mysterious why this happens. I feel like it's gonna be obvious.

@albertwujj (Contributor Author) commented May 24, 2019

Certainly let me know if you or someone at OpenAI finds out why! I'm working on it too.

@albertwujj (Contributor Author) commented May 24, 2019

To summarize the above: I am testing my fork (with the TODO fixed) against the actual repo, with the seeds fixed and top_k=1. It seems to match exactly, except for missing some of the output at the beginning.

@albertwujj (Contributor Author)

Ok, comparing my code with the original, I think I see where I had a bug. Will fix and test.

@albertwujj (Contributor Author)

I will work on removing the 'first' parameter.

@albertwujj (Contributor Author) commented May 24, 2019

OK, tested after removing the first param; same result:
image
Bug fixed! Let me know if you want me to test on other inputs. I will also test on the Detective Pikachu article quote, 'hi', and Unconditional.

P.S. After testing on the Detective Pikachu quote, I noticed that I got a different result because I had added a space to the end of the prompt. Without the space it actually gives a better result.

@albertwujj (Contributor Author) commented May 24, 2019

Tested on the 3 prompts, plus Unconditional. The results of this fork are exactly the same as the actual repo's.

top_k set to 1 and seeded with 1957, in addition to adding

tf.random.set_random_seed(1957)
np.random.seed(1957)

to the top of the script just in case.
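For reference, a minimal sketch of the seeding described above (the placement at the top of interactive_conditional_samples.py is my own setup, not part of this PR); combined with top_k=1 the sampling is effectively greedy, which is what makes the runs comparable:

```python
import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x

np.random.seed(1957)             # NumPy RNG, in case anything samples on the Python side
tf.random.set_random_seed(1957)  # graph-level TF seed; ops created afterwards derive their seeds from it
```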

@albertwujj (Contributor Author)

The fork is complete, barring any further tests you may have. The output matches exactly.

@albertwujj (Contributor Author)

Just curious, might some of the changes (like the shape invariants) cause complications for library users? Or have you just not gotten around to checking yet?

@albertwujj (Contributor Author)

I can write a test script with the two versions of sample.py. Where should I put it? Do you have an exact way you want me to go about it?

@WuTheFWasThat (Contributor)

ah sorry, didn't get around to looking for a while. looks good!

@WuTheFWasThat merged commit c0859d7 into openai:master on May 31, 2019
nshepperd pushed a commit to nshepperd/gpt-2 that referenced this pull request on Oct 31, 2022: Fix TODO in sample.sample_sequences - Avoid 'leaving last token calculation to while loop' (openai#119)

* do initial run on full context

* decrement while loop iterations

* add context to output

* remove first param

* removing first param: change shape invariant
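Putting the commits above together, here is a hedged sketch of the sampling control flow after this PR: body() runs once on the whole prompt, output starts as the context, and the while loop only generates the remaining length - 1 tokens. The toy step() below stands in for the real transformer call, and the shapes, vocabulary size, and shape invariants are simplified assumptions rather than the exact merged sample.py (which uses model.past_shape, temperature, and top_k_logits):

```python
import tensorflow as tf  # assumes TensorFlow 1.x

batch_size, n_vocab, length = 1, 50, 8
context = tf.constant([[3, 1, 4, 1, 5]], dtype=tf.int32)  # dummy prompt tokens

def step(tokens, past=None):
    # Toy stand-in for the transformer: random logits plus a fake per-token
    # "presents" cache (past is ignored here; the real model attends over it).
    seq = tf.shape(tokens)[1]
    logits = tf.random.uniform(tf.stack([batch_size, seq, n_vocab]))
    logits.set_shape([batch_size, None, n_vocab])
    presents = tf.random.uniform(tf.stack([batch_size, 2, seq, 4]))
    presents.set_shape([batch_size, 2, None, 4])  # mirrors the set_shape call in the real code
    return {'logits': logits, 'presents': presents}

def body(past, prev, output):
    next_outputs = step(prev, past=past)
    logits = next_outputs['logits'][:, -1, :]
    samples = tf.multinomial(logits, num_samples=1, output_dtype=tf.int32)
    return [
        # On the first call there is no past yet, so the presents become the initial cache.
        next_outputs['presents'] if past is None
        else tf.concat([past, next_outputs['presents']], axis=-2),
        samples,
        tf.concat([output, samples], axis=1),
    ]

# Initial run on the full context; output starts as the context itself.
past, prev, output = body(None, context, context)

_, _, tokens = tf.while_loop(
    cond=lambda *args: True,
    body=body,
    maximum_iterations=length - 1,  # the initial call already produced one token
    loop_vars=[past, prev, output],
    shape_invariants=[
        tf.TensorShape([batch_size, 2, None, 4]),  # the cache grows along the sequence axis
        tf.TensorShape([batch_size, None]),
        tf.TensorShape([batch_size, None]),
    ],
    back_prop=False,
)

with tf.Session() as sess:
    print(sess.run(tokens))  # prompt tokens followed by `length` sampled toy tokens
```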