Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added orca log file reading capability #47

Merged
merged 6 commits into from
Mar 14, 2019
Merged

Conversation

evohringer
Copy link
Contributor

This adds a orca log output file reader which reads in the SCF energy and atomic numbers and coordinates.
This new function is tested in test_orca for which a new log file was created in the data directory.
I also changed the horton-convert script.

Please provide feedback since this would be a draft for other QC output file readers.

Refs: #43

@codecov
Copy link

codecov bot commented Mar 7, 2019

Codecov Report

Merging #47 into master will increase coverage by 0.08%.
The diff coverage is 96.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #47      +/-   ##
==========================================
+ Coverage   92.58%   92.67%   +0.08%     
==========================================
  Files          33       35       +2     
  Lines        3114     3179      +65     
  Branches      386      392       +6     
==========================================
+ Hits         2883     2946      +63     
- Misses        156      158       +2     
  Partials       75       75
Impacted Files Coverage Δ
iodata/orca.py 100% <100%> (ø)
iodata/test/test_orca.py 93.33% <93.33%> (ø)
iodata/test/test_log.py
iodata/log.py
iodata/gaussianlog.py 98.07% <0%> (ø)
iodata/test/test_gaussianlog.py 96.82% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 738b146...bf49b8e. Read the comment docs.

@tovrstra
Copy link
Member

tovrstra commented Mar 7, 2019

I'm going to copy-paste a few of the test outputs for convenience, so you can see what it complains about:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
###                               pycodestyle                                ###
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
USING              : 2.5.0
RUNNING            : pycodestyle iodata/orca.py iodata/test/test_orca.py --config=.pycodestylerc
68:24     iodata/orca.py  E225 missing whitespace around operator
71:13     iodata/orca.py  E265 block comment should start with '# '
79:1      iodata/orca.py  E302 expected 2 blank lines, found 1
103:1     iodata/orca.py  E303 too many blank lines (3)
129:22    iodata/orca.py  E231 missing whitespace after ','
130:22    iodata/orca.py  E231 missing whitespace after ','
131:22    iodata/orca.py  E231 missing whitespace after ','
137:22    iodata/orca.py  E231 missing whitespace after ','
138:22    iodata/orca.py  E231 missing whitespace after ','
139:22    iodata/orca.py  E231 missing whitespace after ','
50:1      iodata/test/test_orca.py  E302 expected 2 blank lines, found 1
56:1      iodata/test/test_orca.py  E302 expected 2 blank lines, found 1
68:1      iodata/test/test_orca.py  E302 expected 2 blank lines, found 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
###                                pydocstyle                                ###
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
USING              : 3.0.0
RUNNING            : pydocstyle iodata/orca.py iodata/test/test_orca.py
39:-      iodata/orca.py  D414 Section has no content ('Notes')
80:-      iodata/orca.py  D208 Docstring is over-indented
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
###                                whitespace                                ###
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
115:-     iodata/test/data/water_orca.out  trailing whitespace
201:-     iodata/test/data/water_orca.out  trailing whitespace
290:-     iodata/test/data/water_orca.out  trailing whitespace
291:-     iodata/test/data/water_orca.out  trailing whitespace
339:-     iodata/test/data/water_orca.out  trailing whitespace
340:-     iodata/test/data/water_orca.out  trailing whitespace
341:-     iodata/test/data/water_orca.out  trailing whitespace
342:-     iodata/test/data/water_orca.out  trailing whitespace
343:-     iodata/test/data/water_orca.out  trailing whitespace
344:-     iodata/test/data/water_orca.out  trailing whitespace
345:-     iodata/test/data/water_orca.out  trailing whitespace
346:-     iodata/test/data/water_orca.out  trailing whitespace
410:-     iodata/test/data/water_orca.out  trailing whitespace
416:-     iodata/test/data/water_orca.out  trailing whitespace

and

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
###                                  pylint                                  ###
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
USING              : pylint 2.3.1astroid 2.2.4
RUNNING            : pylint iodata/orca.py iodata/test/test_orca.py --jobs=2 --output-format=json
29:-      iodata/orca.py  unused-import Unused set_four_index_element imported from utils
68:22     iodata/orca.py  bad-whitespace Exactly one space required after assignment
                words =line.split()
                      ^
129:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,0] = float(words[5])
                     ^
130:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,1] = float(words[6])
                     ^
131:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,2] = float(words[7])
                     ^
134:4     iodata/orca.py  unreachable Unreachable code
137:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,0] = float(words[5])
                     ^
138:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,1] = float(words[6])
                     ^
139:21    iodata/orca.py  bad-whitespace Exactly one space required after comma
        coordinates[i,2] = float(words[7])
                     ^
25:-      iodata/test/test_orca.py  unused-import Unused import os
56:-      iodata/test/test_orca.py  missing-docstring Missing function docstring
72:-      iodata/test/test_orca.py  superfluous-parens Unnecessary parens after 'assert' keyword

@tovrstra
Copy link
Member

tovrstra commented Mar 7, 2019

@evohringer Can you also add an exclusion for data files to the whitespace linter? This can be done in the following place:
https://github.com/theochem/iodata/blob/master/.cardboardlint.yml#L11

The following should make more sense:

- whitespace:
    filefilter: ['- iodata/test/data/*']

@tovrstra
Copy link
Member

tovrstra commented Mar 8, 2019

The tests bumped into a few more things. See https://travis-ci.org/theochem/iodata/jobs/503215662#L912

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
###                                  pylint                                  ###
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
USING              : pylint 2.3.1astroid 2.2.4
RUNNING            : pylint iodata/orca.py iodata/test/test_orca.py --jobs=2 --output-format=json
29:-      iodata/orca.py  unused-import Unused set_four_index_element imported from utils
131:4     iodata/orca.py  unreachable Unreachable code
25:-      iodata/test/test_orca.py  unused-import Unused import os
58:-      iodata/test/test_orca.py  missing-docstring Missing function docstring
75:-      iodata/test/test_orca.py  superfluous-parens Unnecessary parens after 'assert' keyword

@evohringer
Copy link
Contributor Author

evohringer commented Mar 8, 2019

Sorry @tovrstra to bother you with this. Is there a way I can check this before making a pull request?

@tovrstra
Copy link
Member

tovrstra commented Mar 8, 2019

Good point. Yes, but it is not as easy as it should be. (I'm doing something about it.) You can run the following:

pip install --upgrade git+https://github.com/theochem/cardboardlint.git@master#egg=cardboardlint
pip install --upgrade pylint codecov pycodestyle pydocstyle
python setup.py build_ext -i
cardboardlinter -r origin/master

The last line will do the actual work, with the first two lines installing software, which you may already have. The third line does an in-place build, which is needed to keep pylint happy with the Python extension in iodata.

Can you give this a try? If something breaks, please let me know.

@evohringer
Copy link
Contributor Author

It works , great. Thanks.

but it does not print out any errors:
`RUNNING : git diff -U0 origin/master --relative

###                                  import                                  ###
###                                namespace                                 ###
###                                  pylint                                  ###

USING : pylint 1.6.4, astroid 2.2.4

###                               pycodestyle                                ###

USING : 2.5.0

###                                pydocstyle                                ###

USING : 3.0.0

###                                whitespace                                ###

@tovrstra
Copy link
Member

tovrstra commented Mar 8, 2019

That probably is probably because you used the master branch (and also origin/master) for your development. Do you still have a reference in your repo to the master branch of theochem/iodata? That would be needed to get the same test output locally.

It is in general a good idea to use the master branch of cloned repos only to follow up the upstream commits. For clarity, it is recommended to make a new "feature" branch in which you make your changes. This way, you can always easily compare with what you started from, or update the master branch with new upstream developments.

This is a quick summary of that workflow, assuming you would start from scratch, just to sketch the idea:

# Clone the primary repo first
git clone [email protected]:theochem/iodata.git
# Now `origin` refers to theochem/iodata
# Add your fork as the second remote (after making the fork on github.com)
cd iodata
git remote add qcmm [email protected]:qcmm/iodata.git
# Make a feature branch, in which a new feature will be developed
git checkout -b orca
# After some changes commit these:
git add ...
git commit ...
# Push the feature branch to  your repo
git push qcmm feature
# A link will be printed on your terminal to make a pull request
# If needed make more commits in the feature branch and push them to qcmm/iodata.
# The pull request will be updated automatically.

With this setup origin/master still refers to the one from theochem/iodata.

You can check your current remotes as follows, with the output I typically get as comments:

git remote show
# origin
# tovrstra
git remote show origin
#* remote origin
#  Fetch URL: [email protected]:theochem/iodata.git
#  Push  URL: [email protected]:theochem/iodata.git
#  HEAD branch: master
#  Remote branches:
#    master            tracked
#  Local branch configured for 'git pull':
#    master rebases onto remote master
#  Local ref configured for 'git push':
#    master pushes to master (up to date)
git remote show tovrstra
#* remote tovrstra
#  Fetch URL: [email protected]:tovrstra/iodata.git
#  Push  URL: [email protected]:tovrstra/iodata.git
#  HEAD branch: master
#  Remote branches:
#    master  new (next fetch will store in remotes/tovrstra)
#  Local refs configured for 'git push':
#    master pushes to master (fast-forwardable)

(I removed some branches for clarity.)
With this setup, I only pull commits from origin (short name for theochem/iodata), never push to it, so I don't even need write access to it. It can pull and push with the second remote, i.e. tovrstra. I guess you now have

git remote show origin
#* remote origin
#  Fetch URL: git:https://github.com/QCMM/iodata.git
#  Push  URL: git:https://github.com/QCMM/iodata.git
#  HEAD branch: master
#  Remote branches:
#    master            tracked
#  Local ref configured for 'git push':
#    master pushes to master (local out of date)

The following commands will show how to switch in a new clone of your fork. A verbose command prompt helps a lot to see on which branch you are working. (See e.g. https://gist.github.com/kevin-smets/8568070 for an OSX example.)

# start somewhere outside a git repo.
git clone https://github.com/QCMM/iodata.git
# rewire the remotes
git remote rename origin qcmm
git remote add origin [email protected]:theochem/iodata.git
# rename your development branch
git branch -m master orca
# get back the master branch from theochem/iodata
git fetch origin master:master
# run the linters
./setup.py build_ext -i
cardboardlinter -r master
# You can commit fixes now.
# force-push the master to your repo
git push qcmm master:master -f
# Now you can close this PR.
# Time to push the feature branch to your fork:
git push qcmm orca
# You should get a link on your terminal to make a PR for this branch.

@@ -36,7 +36,7 @@ def parse_args():
' This only works if the input contains sufficient'
' data for the output')
parser.add_argument('-V', '--version', action='version',
version="%%(prog)s (HORTON version %s)" % version.__version__)
version="%%(prog)s (HORTON version 2.0)" )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used this in the past so you could define version numbers using git tags. It was too easy to neglect to update a line somewhere and have inconsistent version numbers in the program.

The downside is that if you didn't use the travisCI infrastructure, the version.py file didn't get generated. It's probably better to use a try: ... except ImportError ... fallback to get around this instead of hard coding the number.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should make this easier. I'm working on something (roberto), but is not quite ready yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did add a try: ... except ImporError as suggested by Matt. Hope this is fine. Thanks @matt-chan

@evohringer
Copy link
Contributor Author

evohringer commented Mar 11, 2019

Thanks Toon for the nice guide. If this is ok I would try it for the next implementation and then I try from scratch as you suggested.

One thing which is bothering me is that installing IOData locally does not allow me to run the tests in the installed version because the "path" of iodata is not set. I guess this is done automatically in travis right?? Is there a way to make two steps locally before submitting:
1.) run nosetests
2.) run cardboarlint (for this I now know how to do that and will do it for the next implementation)

Sorry to bother with this stupid things. I hope that conversation also help others implementing new features in iodata.

@tovrstra
Copy link
Member

Sure, no problem.

I don't fully understand your question regarding the path, but I'll try to give some answer. At the moment, the following steps are needed to run the tests in the source tree. The instructions are given for a fresh clone of the repo, which I just tested:

git clone [email protected]:theochem/iodata.git
cd iodata
python3 tools/gitversion.py python > iodata/version.py
python3 -m pip install pytest cython --user 
python3 setup.py build_ext -i
python3 -m pytest iodata

I agree, this is not intuitive at all, and this should become easier, which I'm working on. Let me clarify a few things, so you can get around it until we have a better way:

  • On OSX, python scripts installed with pip with the --user option, tend to get installed in a directory that is not present in the PATH variable. You either have to modify PATH to fix this or you can run most programs as python -m [name_of_module] [args]. I used that trick above to increase the chances that it will work on OSX. I don't know which directory you would have to add to the PATH because I have no access to OSX. This is an example of that problem, but not sure if it is representative: https://travis-ci.com/theochem/roberto/jobs/183614464#L103 (line 103)

  • If python3 is already the default python on your system, you may just use python instead of python3.

  • We recently started using pytest instead of nosetests. It is a more up-to-date testing framework compared to nosetests.

P.S. I'll take the quote of my message out of your post for clarity.

Copy link
Member

@tovrstra tovrstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just went through it and found some small things to comment on, but nothing dramatic. Code looks good.

- pycodestyle:
config: .pycodestylerc
- pydocstyle:
- whitespace:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this file? I think it is a backup copy that accidentally committed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

try:
version="%%(prog)s (HORTON version %s)" % version.__version__
except ImportError:
raise ImportError('No version.py file found'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version.py should normally be present. (See my last post on how to add it.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is ok I will leave for people who just try it out after installation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, until our build and development workflow is simplified, we can have this. It is convenient.

iodata/orca.py Show resolved Hide resolved
def test_load_water_number():
# test if IOData has atomic numbers
with path('iodata.test.data', 'water_orca.out') as fn_xyz:
mol = IOData.from_file(str(fn_xyz))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str function may not be needed. I've seen it before but normally path objects get converted to strings when needed, safe for some corner cases maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted str

mol : IOData
IOdata dictionary.

Returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to document return if nothing is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted Return

"""
assert mol.numbers[0] == 8
assert mol.numbers[1] == 1
assert mol.numbers[2] == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again a very minor suggestion, mainly sharing it because you might like the trick:

# The following line does not start with assert, because this is done inside the assert_equal function.
np.testing.assert_equal(mol.numbers, [8, 1, 1])

The module np.testing contains a lot of convenient tricks to write assertions for arrays. See https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.testing.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to np.testing. This is a nice feature. I will have a look.

def check_water(mol):
"""Checks if atomic numbers and coordinates obtained from orca out file are correct.

Parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove one space of indentation from this and following lines in the docstring. (travis tests fail on this)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please replace Checks with Check? First line should be in imperative mood.

@FarnazH
Copy link
Member

FarnazH commented Mar 12, 2019

A few small things:

  1. Can you please use iodata HEADER for orca.py and horton-convert. It can be found here: https://github.com/theochem/iodata/blob/master/HEADER
  2. Probably horton-convert should be renamed as iodata-convert.

@evohringer
Copy link
Contributor Author

evohringer commented Mar 12, 2019

A few small things:
1) Can you please use iodata HEADER for orca.py and horton-convert. It can be found here: https://github.com/theochem/iodata/blob/master/HEADER
2) Probably horton-convert should be renamed as iodata-convert.

All done. Thanks for checking.

@tovrstra
Copy link
Member

LGTM. Thanks!

@tovrstra
Copy link
Member

@evohringer I'll let you merge. (You should see a green button.) It is generally better to let the author of PR perform the merge, because he/she can decide best when the PR is final.

@evohringer
Copy link
Contributor Author

No green button to merge. Everything is green but I can not find the button.

The last part says:
This branch has no conflicts with the base branch
Only those with write access to this repository can merge pull requests.

@tovrstra
Copy link
Member

OK that's right, only few have write access to avoid unfortunate accidents. I'll do it.

@tovrstra tovrstra merged commit 99cff80 into theochem:master Mar 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants