Upgrade to Tensorflow 2 #1596

Merged
merged 3 commits into from
Jun 30, 2021

@joshua-cogliati-inl (Contributor) commented Jun 28, 2021


Pull Request Description

What issue does this change request address?

Closes #1595

What are the significant changes in functionality due to this change request?

Upgrades the TensorFlow version from 1.15 to 2.0.

This mainly requires removing uses of the old Session API, which TensorFlow 2 no longer provides outside of tf.compat.v1.
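
As a rough illustration of what removing the Session API involves (a minimal sketch, not the code changed in this PR), the snippet below runs a small matrix product eagerly, with the TensorFlow 1 equivalent kept in comments. The function name `eager_matmul` is hypothetical, and the guarded import is only there so the snippet still loads on machines without TensorFlow.

```python
try:
    import tensorflow as tf  # assumption: TensorFlow 2.x is installed
except ImportError:  # keep the sketch importable where TensorFlow is absent
    tf = None


def eager_matmul():
    """Multiply a 1x2 row by a 2x1 column without creating a Session."""
    # TensorFlow 1.x needed a graph plus an explicit Session:
    #   x = tf.placeholder(tf.float32, shape=(1, 2))
    #   y = tf.matmul(x, w)
    #   with tf.Session() as sess:
    #       result = sess.run(y, feed_dict={x: [[1.0, 2.0]]})
    # TensorFlow 2.x executes operations eagerly, so the value is
    # available as soon as the op has been called.
    x = tf.constant([[1.0, 2.0]])
    w = tf.constant([[3.0], [4.0]])
    return tf.matmul(x, w).numpy().tolist()


if tf is not None:
    print(eager_matmul())
```

Code that still depends on graphs and sessions can usually be kept alive through `tf.compat.v1`, but dropping the Session calls, as this PR does, avoids carrying that compatibility layer.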


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, tests must be added to check it. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added that sets the <internalParallel> node to True in the XML block.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed or added, is the analytic documentation updated or added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) has been changed, the associated documentation must be reviewed to ensure the text matches the example.

@moosebuild

Job Test Fedora 32 on f8b90f4 : invalidated by @joshua-cogliati-inl

restarted civet

Switching dependency to tensorflow 2.0

Removed unneeded Session code.

acc renamed to accuracy

Switching to tensorflow random.set_seed.

Removing old code needed for tensorflow 1
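
The "acc renamed to accuracy" commit above most likely reflects the Keras training-history key, which TensorFlow 2 reports as `accuracy` where TensorFlow 1 reported `acc` for `metrics=['accuracy']`. That reading is an assumption, since the diff itself is not shown here. A minimal sketch of code that tolerates both spellings, using a hypothetical helper `final_accuracy` and plain dicts in place of `History.history`:

```python
def final_accuracy(history):
    """Return the last recorded training accuracy from a Keras-style history dict.

    Accepts both the TensorFlow 2 key ('accuracy') and the older 'acc' key.
    """
    for key in ("accuracy", "acc"):
        if key in history:
            return history[key][-1]
    raise KeyError("no accuracy metric found in history")


# Plain dicts standing in for History.history (made-up values).
print(final_accuracy({"loss": [0.9, 0.4], "accuracy": [0.6, 0.8]}))
print(final_accuracy({"loss": [0.9, 0.4], "acc": [0.6, 0.7]}))
```
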
@joshua-cogliati-inl (Contributor Author)

@wangcj05 and @dylanjm This is ready for review.

@wangcj05 (Collaborator) left a comment

Thank you, Josh! Just one minor comment for your consideration.

@@ -40,7 +40,7 @@ Note all install methods after "main" take
 <matplotlib>3.2</matplotlib>
 <statsmodels/>
 <cloudpickle>1.6</cloudpickle>
-<tensorflow>1.15</tensorflow>
+<tensorflow>2.0</tensorflow>
@wangcj05 (Collaborator)

Can we loosen the constraint on the version?

@joshua-cogliati-inl (Contributor Author)

I'll try that.

@joshua-cogliati-inl (Contributor Author)

(With the caveat that unless we switch our default to the conda-forge channel, it won't make much difference, except for pip installs, since conda's regular channel doesn't have anything newer than 2.0.)

@joshua-cogliati-inl (Contributor Author)

Hm, TensorFlow 2.3 gives different classification results, which causes a diff error: https://civet.inl.gov/job/768524/

@joshua-cogliati-inl (Contributor Author)

TensorFlow 2.1 also gives different values: https://civet.inl.gov/job/768517/
It looks like TensorFlow 2.1 and 2.3 give the same results.
So once we can move everything to something newer than 2.0 we should be fine, but for now I think I will revert to 2.0, since most of our conda machines are stuck at that version.

@wangcj05 (Collaborator)

That is good. Thanks.
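
Because the thread above found that results differ between TensorFlow 2.0 and 2.1/2.3, it can help to check which version is actually installed before comparing against gold files. The following is a minimal sketch using only the standard library (Python 3.8+); the helper names `matches_pin` and `installed_tensorflow_version` are hypothetical and not part of RAVEN.

```python
from importlib import metadata


def matches_pin(installed, pin):
    """Return True if `installed` starts with the same dotted components as `pin`.

    For example, '2.0.4' matches the pin '2.0', while '2.3.1' does not.
    """
    installed_parts = installed.split(".")
    pin_parts = pin.split(".")
    return installed_parts[:len(pin_parts)] == pin_parts


def installed_tensorflow_version():
    """Return the installed tensorflow distribution version, or None if absent."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


version = installed_tensorflow_version()
if version is None:
    print("tensorflow is not installed")
elif matches_pin(version, "2.0"):
    print("tensorflow", version, "matches the 2.0 pin")
else:
    print("tensorflow", version, "differs from the 2.0 pin; results may differ")
```

Comparing dotted components rather than raw string prefixes keeps a version such as 2.10 from being mistaken for 2.1.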

@moosebuild

Job Test qsubs on 90ebf62 : invalidated by @joshua-cogliati-inl

failed tests/cluster_tests/RavenRunsRaven/ROM

@joshua-cogliati-inl (Contributor Author) commented Jun 29, 2021

I tried <tensorflow source='forge'>2.4</tensorflow>, but that results in conflicts unless we upgrade a bunch of libraries and switch to conda-forge by default.

wangcj05 previously approved these changes Jun 29, 2021

@wangcj05 (Collaborator)

Hi @joshua-cogliati-inl, what do you think about moving our default conda channel to conda-forge (not for this PR, but in the future)? I see the following benefits:

  1. More choices for libraries.
  2. I have tried it before, and it seems much easier to resolve library conflicts, with much less installation time.

@joshua-cogliati-inl (Contributor Author)

I think that would make sense. It would let us use TensorFlow 2.4 and Python 3.8, both of which are held back by the default conda channel right now.

@moosebuild

Job Mingw Test on 1955050 : invalidated by @joshua-cogliati-inl

mingw was busy

@wangcj05 (Collaborator) left a comment

Changes are good.

@wangcj05 (Collaborator)

Checklist is satisfied, and all tests passed.

@wangcj05 wangcj05 merged commit 44bbfd3 into devel Jun 30, 2021
@wangcj05 wangcj05 deleted the cogljj/upgrade_tensorflow branch June 30, 2021 15:28
@wangcj05 (Collaborator)

@joshua-cogliati-inl I have merged your PR. Nice work and thanks.

@moosebuild

Job Test CentOS 7 on 1955050 : invalidated by @joshua-cogliati-inl

checking civet.

Successfully merging this pull request may close these issues.

[TASK] Upgrade to Tensorflow 2