This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Simplifications and some fun stuff for the MNIST Gluon tutorial #13094

Merged
merged 10 commits into apache:master from kohr-h:gluon_mnist
Dec 6, 2018

Conversation

kohr-h
Contributor

@kohr-h kohr-h commented Nov 2, 2018

Description

This PR removes some of the (IMO) unnecessary complexity from the tutorial (one of the first that new users click on) by always using a single GPU or CPU. This makes the MLP training code much more straightforward (no data splitting, no inner loop over devices).
I also think that part had too many staccato comments stating the obvious, while the truly non-obvious parts were left uncommented (e.g., what is ag.record()?). I tried to straighten that part up a bit, although I think some comments are still a bit on the obvious side.
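
For illustration, the simplified single-device training loop looks roughly like this (a minimal sketch, not the tutorial's exact code; the layer sizes and hyperparameters are placeholders):

import mxnet as mx
from mxnet import autograd as ag, gluon
from mxnet.gluon import nn

ctx = mx.cpu(0)  # or mx.gpu(0): one device, so no data splitting and no inner device loop

batch_size = 64
to_tensor = mx.gluon.data.vision.transforms.ToTensor()
train_data = mx.gluon.data.DataLoader(
    mx.gluon.data.vision.MNIST(train=True).transform_first(to_tensor),
    batch_size=batch_size, shuffle=True)

net = nn.Sequential()
net.add(nn.Dense(128, activation='relu'),
        nn.Dense(64, activation='relu'),
        nn.Dense(10))
net.initialize(mx.init.Xavier(), ctx=ctx)

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

for epoch in range(10):
    for data, label in train_data:
        data, label = data.as_in_context(ctx), label.as_in_context(ctx)
        with ag.record():  # ag.record() records the forward pass so backward() can compute gradients
            loss = loss_fn(net(data), label)
        loss.backward()
        trainer.step(data.shape[0])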

Furthermore, I added a small section where some of the mislabeled (apparently it's one "l", need to fix that) images are plotted together with the predicted and true labels. I think that's more fun than just looking at an accuracy score.
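
A sketch of what that plotting section could look like (assuming a trained net and the ctx from the sketch above; the figure layout is just an example):

import matplotlib.pyplot as plt
import mxnet as mx

# `net` and `ctx` are assumed to come from the training sketch above.
to_tensor = mx.gluon.data.vision.transforms.ToTensor()
test_data = mx.gluon.data.DataLoader(
    mx.gluon.data.vision.MNIST(train=False).transform_first(to_tensor),
    batch_size=64, shuffle=False)

# Collect a few test images that the trained net gets wrong,
# together with the predicted and the true label.
wrong = []
for data, label in test_data:
    preds = net(data.as_in_context(ctx)).argmax(axis=1).asnumpy().astype(int)
    for img, p, t in zip(data, preds, label.asnumpy().astype(int)):
        if p != t:
            wrong.append((img.asnumpy().squeeze(), p, t))
    if len(wrong) >= 8:
        break

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for ax, (img, p, t) in zip(axes.ravel(), wrong):
    ax.imshow(img, cmap='gray')
    ax.set_title('predicted: {}, true: {}'.format(p, t))
    ax.axis('off')
plt.show()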

I'm not finished with the whole tutorial yet, so it's WIP, but I'd like to know what you think before I put more work into this PR.

Note: I've included the 2 new images in the same directory as the tutorial for convenience; they would obviously need to go somewhere else.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)

@kohr-h kohr-h requested a review from szha as a code owner November 2, 2018 19:30
@ankkhedia
Contributor

@kohr-h Thanks for your contribution!

@mxnet-label-bot [pr-work-in-progress, Gluon]

@marcoabreu marcoabreu added the Gluon and pr-work-in-progress (PR is still work in progress) labels Nov 2, 2018
Contributor

@ThomasDelteil ThomasDelteil left a comment

Thanks for the much needed update @kohr-h, can I bother you to update the data loading to be pure Gluon rather than using the old DataIter API? That means there is no need to reset the iterator between epochs, and the data comes out of the iterator as a (data, label) tuple rather than as batch.data and batch.label:

import mxnet as mx

# batch_size is assumed to be defined; ToTensor converts the images to float CHW tensors.
train_dataset = mx.gluon.data.vision.MNIST(train=True).transform_first(mx.gluon.data.vision.transforms.ToTensor())
test_dataset = mx.gluon.data.vision.MNIST(train=False).transform_first(mx.gluon.data.vision.transforms.ToTensor())

train_data = mx.gluon.data.DataLoader(train_dataset, shuffle=True, last_batch='rollover', batch_size=batch_size, num_workers=2)
test_data = mx.gluon.data.DataLoader(test_dataset, shuffle=False, last_batch='rollover', batch_size=batch_size, num_workers=2)
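
An epoch is then just a plain Python loop over the loader, with no reset() call, and each batch arrives as a (data, label) tuple (sketch; epochs is a placeholder):

epochs = 10  # placeholder
for epoch in range(epochs):
    for data, label in train_data:  # fresh iteration each epoch, no reset() needed
        pass  # forward/backward pass on this batch goes here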

(inline review comments on docs/tutorials/gluon/mnist.md, resolved)
@kohr-h
Contributor Author

kohr-h commented Nov 6, 2018

Thanks for the much needed update @kohr-h, can I bother you to update the data loading to be pure Gluon rather than using the old DataIter API? That means there is no need to reset the iterator between epochs, and the data comes out of the iterator as a (data, label) tuple rather than as batch.data and batch.label.

No problem. I've started doing that, more to come.

@kalyc
Contributor

kalyc commented Nov 13, 2018

Thanks for your contribution @kohr-h
Could you please update the PR as per discussions above?

@kohr-h
Contributor Author

kohr-h commented Nov 14, 2018

Thanks for the reminder @kalyc. I've got too many balls in the air right now. I will get this done ASAP.

@kohr-h
Contributor Author

kohr-h commented Nov 28, 2018

I'm done with the rewording and restructuring of the example.

Note that I also made the CNN part more basic by moving from a custom nn.Block subclass to a nn.Sequential model (see the sketch below). I don't think the custom subclass added any value here, and there are other tutorials that explain subclassing in more detail.
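
Concretely, the nn.Sequential version of the CNN looks roughly like this (a sketch; the layer sizes are illustrative, not necessarily the ones used in the tutorial):

from mxnet.gluon import nn

net = nn.Sequential()
net.add(nn.Conv2D(channels=20, kernel_size=5, activation='relu'),
        nn.MaxPool2D(pool_size=2, strides=2),
        nn.Conv2D(channels=50, kernel_size=5, activation='relu'),
        nn.MaxPool2D(pool_size=2, strides=2),
        nn.Flatten(),
        nn.Dense(500, activation='relu'),
        nn.Dense(10))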

Ready for review.

Contributor

@ThomasDelteil ThomasDelteil left a comment

A lot of nit-picking on the formatting, but overall, once these are addressed, I think we should be good to merge. Great work on the wording and explanations. Thanks for the contribution 👍

(inline review comments on docs/tutorials/gluon/mnist.md, resolved)
@kohr-h kohr-h changed the title from "WIP: Simplifications and some fun stuff for the MNIST Gluon tutorial" to "Simplifications and some fun stuff for the MNIST Gluon tutorial" Nov 29, 2018
- Apply hybrid blocks
- Move outputs outside of code blocks and mark for notebooks
  to ignore
- Remove images, use external link
- Fix a few formulations
@kohr-h
Contributor Author

kohr-h commented Nov 29, 2018

I think I addressed all the comments from the review.

@vandanavk
Contributor

@sandeep-krishnamurthy @stu1130 for review

Contributor

@sandeep-krishnamurthy sandeep-krishnamurthy left a comment

Looks great. Thank you.
One question: why is there an activation function in the last layer?

(inline review comments on docs/tutorials/gluon/mnist.md, resolved)
@kohr-h
Contributor Author

kohr-h commented Dec 4, 2018

One question: why is there an activation function in the last layer?

No real reason. I don't think I changed the original logic, but I can check again. If a different activation for the last layer is more reasonable for this tutorial, I can put it in, no problem.

@sandeep-krishnamurthy
Contributor

Sigmoid activation is a better fit here for outputting probabilities for the 10 digits.

@kohr-h
Contributor Author

kohr-h commented Dec 5, 2018

@sandeep-krishnamurthy I switched both nets to sigmoid activation. Note though that they perform worse with the same settings. I've tested a few learning rates, and the current ones improve things a bit. But maybe it's okay for the scope of the tutorial that the training isn't perfectly fine-tuned.
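
For reference, the sigmoid variant of the MLP would look roughly like this (a sketch with illustrative layer sizes; the output layer is left without an activation since SoftmaxCrossEntropyLoss applies the softmax internally, matching the later cleanup commit):

from mxnet.gluon import nn

net = nn.Sequential()
net.add(nn.Dense(128, activation='sigmoid'),
        nn.Dense(64, activation='sigmoid'),
        nn.Dense(10))  # no activation here; SoftmaxCrossEntropyLoss handles the softmax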

(inline review comments on docs/tutorials/gluon/mnist.md, resolved)
@kohr-h
Contributor Author

kohr-h commented Dec 5, 2018

Okay, now we're in the right ballpark with the results. Thanks @ThomasDelteil for the catch.

@ThomasDelteil ThomasDelteil merged commit 8bbac82 into apache:master Dec 6, 2018
TaoLv added a commit that referenced this pull request Dec 6, 2018
Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file" (#13558)

* Revert "Chi_square_check for discrete distribution fix (#13543)"

This reverts commit cf6e8cb.

* Revert "Updated docs for randint operator (#13541)"

This reverts commit e0ff3c3.

* Revert "Simplifications and some fun stuff for the MNIST Gluon tutorial (#13094)"

This reverts commit 8bbac82.

* Revert "Fix #13521 (#13537)"

This reverts commit f6b4665.

* Revert "Add a retry to qemu_provision (#13551)"

This reverts commit f6f8401.

* Revert "[MXNET-769] Use MXNET_HOME in a tempdir in windows to prevent access denied due t… (#13531)"

This reverts commit bd8e0f8.

* Revert "[MXNET-1249] Fix Object Detector Performance with GPU (#13522)"

This reverts commit 1c8972c.

* Revert "Fixing a 404 in the ubuntu setup doc (#13542)"

This reverts commit cb0db29.

* Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file (#13478)"

This reverts commit 40db619.
@kohr-h kohr-h deleted the gluon_mnist branch December 7, 2018 10:00
zhaoyao73 pushed a commit to zhaoyao73/incubator-mxnet that referenced this pull request Dec 13, 2018
Simplifications and some fun stuff for the MNIST Gluon tutorial (apache#13094)

* Simplify mnist Gluon tutorial and add mislabelled sample plotting

* Add mnist Gluon tutorial images

* Gluon MNIST tutorial: Use modern Gluon constructs, fix some wordings

* [Gluon] Move to data loaders and improve wording in MNIST tutorial

* Fix broken links

* Fix spelling of mislabeled

* Final rewordings and code simplifications

* Fix things according to review

- Apply hybrid blocks
- Move outputs outside of code blocks and mark for notebooks
  to ignore
- Remove images, use external link
- Fix a few formulations

* Change activations to sigmoid in MNIST tutorial

* Remove superfluous last layer activations in MNIST tutorial
zhaoyao73 pushed a commit to zhaoyao73/incubator-mxnet that referenced this pull request Dec 13, 2018
Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file" (apache#13558)

* Revert "Chi_square_check for discrete distribution fix (apache#13543)"

This reverts commit cf6e8cb.

* Revert "Updated docs for randint operator (apache#13541)"

This reverts commit e0ff3c3.

* Revert "Simplifications and some fun stuff for the MNIST Gluon tutorial (apache#13094)"

This reverts commit 8bbac82.

* Revert "Fix apache#13521 (apache#13537)"

This reverts commit f6b4665.

* Revert "Add a retry to qemu_provision (apache#13551)"

This reverts commit f6f8401.

* Revert "[MXNET-769] Use MXNET_HOME in a tempdir in windows to prevent access denied due t… (apache#13531)"

This reverts commit bd8e0f8.

* Revert "[MXNET-1249] Fix Object Detector Performance with GPU (apache#13522)"

This reverts commit 1c8972c.

* Revert "Fixing a 404 in the ubuntu setup doc (apache#13542)"

This reverts commit cb0db29.

* Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file (apache#13478)"

This reverts commit 40db619.
zhaoyao73 added a commit to zhaoyao73/incubator-mxnet that referenced this pull request Dec 13, 2018
* upstream/master: (54 commits)
  Add notes about debug with libstdc++ symbols (apache#13533)
  add cpp example inception to nightly test (apache#13534)
  Fix exception handling api doc (apache#13519)
  fix link for gluon model zoo (apache#13583)
  ONNX import/export: Size (apache#13112)
  Update MXNetTutorialTemplate.ipynb (apache#13568)
  fix the situation where idx didn't align with rec (apache#13550)
  Fix use-before-assignment in convert_dot (apache#13511)
  License update  (apache#13565)
  Update version to v1.5.0 including clojure package (apache#13566)
  Fix flaky test test_random:test_randint_generator (apache#13498)
  Add workspace cleaning after job finished (apache#13490)
  Adding test for softmaxoutput (apache#13116)
  apache#13441 [Clojure] Add Spec Validations for the Random namespace (apache#13523)
  Revert "Bumped minor version from 1.4.0 to 1.5.0 on master, updated License file" (apache#13558)
  Chi_square_check for discrete distribution fix (apache#13543)
  Updated docs for randint operator (apache#13541)
  Simplifications and some fun stuff for the MNIST Gluon tutorial (apache#13094)
  Fix apache#13521 (apache#13537)
  Add a retry to qemu_provision (apache#13551)
  ...
Labels: Gluon, pr-work-in-progress (PR is still work in progress)
Projects: none
Linked issues: none
7 participants