-
Notifications
You must be signed in to change notification settings - Fork 910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Word2Vec tests #173
Word2Vec tests #173
Conversation
- Fixes a weird bug where tensors were spuriously disposed by tfjs in testing environment due to there being multiple tfjs instances present on the page from the different builds - Improves stack traces to make debugging easier by disabling minification in karma webpack. - Only webpack the main package, this means that the test files have to use the global ml5 object.
- validates that there are no leaked tensors - validates basic functionality of nearest - validates that add, subtract and average return things - Fixes memory leaks in add, subtract, average, and addOrSubtract functions - Adds a general dispose to the Word2Vec class
describe('subtract', () => { | ||
it('returns a value', () => { | ||
let word1 = word2vecInstance.getRandomWord(); | ||
let word2 = word2vecInstance.getRandomWord(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it technically possible the same word could be picked twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, but the probability is 0.0001%
Thank you @meiamsome, amazing! ❤️ |
FYI I made a "tests" label which we can add to issues and pull requests related to unit testing! |
'src/**/*_test.js', | ||
], | ||
preprocessors: { | ||
'src/**/*_test.js': ['webpack'], | ||
'src/index.js': ['webpack'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had this config wrong the whole time? Why aren't the *_test.js
files being processed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what was happening is each *_test.js
file was including all of the dependencies and webpacking that. That would probably be fine, but for some reason tfjs was struggling with that - somehow the Tensors from one instance were going through functions from the other instance and not being checked correctly, as they wouldn't be an instance of the Tensor class from that version of tfjs. This just builds the dependencies of all test suites (aka the whole library) and includes that first, so you only get one version of all the deps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you pass a Tensor through multiple instances of TensorFlow.js, we fail when doing input validation for op calls because "instanceof Tensor" refers to the wrong Tensor instance (from the other instantiation of TensorFlow.js). This problem is artificial and not actually fundamental -- I will send an PR to fix this in TensorFlow.js core, however it's less pressing now that you've solved it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing, thanks for the detailed description @meiamsome!
Thanks @meiamsome! this is great! This will help get a bunch of PR merged.
@nsthorat, I think this solves the issues we were discussing. |
--single-run is usually not what you want. Two ways to just run one test:
|
@nsthorat, I'm just curious why is it better to run single tests instead of a |
You can run --single-run, but the test suite dies after running the single test. Totally up to you, but it slows things down to have to continually restart a browser. |
Also nice work on this PR! I was going to make the change in tfjs-core to allow multiple instances of tensorflowjs, but sharing a ML5 and transitively a tensorflowjs instance is what you want to do. |
src/Word2vec/index.js
Outdated
return Word2Vec.nearest(this.model, sum, inputs.length, inputs.length + max); | ||
return tf.tidy(() => { | ||
const sum = Word2Vec.addOrSubtract(this.model, inputs, 'ADD'); | ||
console.log(sum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot the remove this log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch :)
Re: |
Sure, regarding Totally your call. |
I think this is ready to merge! Let's just add the |
@cvalenzuela Ok, I added that for you |
awesome, thanks @meiamsome! |
woweeee amazing work @meiamsome @cvalenzuela @nsthorat! |
This tweaks the karma config to webpack the ml5 package similarly to the standard webpack outputs. This is because there is some weird behaviour in tfjs (I think, not 100%) that results in tensors being returned from functions like
tf.add
being disposed before being returned when there is more than one instance on the page. This means that the testing files can no longer import the files they are testing, instead they have to rely on the globalml5
variable. This is likely the issue seen in #151 where Tensors aren't being correctly type-checked in tfjs, as that's what I narrowed the weird behaviour that caused the premature disposal I saw here to as well.I then added a basic test suite for Word2Vec, which flagged up the fact that there were a lot of leaking tensors, so I fixed those.