Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuring discovery for paths with namespace packages #5147

Closed
0cjs opened this issue Apr 19, 2019 · 21 comments
Closed

Configuring discovery for paths with namespace packages #5147

0cjs opened this issue Apr 19, 2019 · 21 comments
Labels
topic: collection related to the collection phase

Comments

@0cjs
Copy link
Contributor

0cjs commented Apr 19, 2019

The structure of Python namespace packages cannot be determined just by looking at the filesystem. Consider the following directory tree:

src/
    core.py
    aaa/
        test_core.py
    bbb/
        core.py
        test_core.py

If src/ (but nothing underneath it) is in sys.path, we have the following modules:

core
aaa
aaa.test_core
bbb
bbb.core
bbb.test_core

where aaa and bbb are namespace modules. We might also have other parts of the namespace modules outside of this tree, e.g. by adding othersrc/ to sys.path so that import bbb.other will work if othersrc/bbb/other.py exists.)

It's easy to configure pytest to have src in the path, but when discovering and loading src/aaa/test_core.py, as described here, it adds src/aaa to sys.path and imports src/aaa/test_core.py using a different module structure than the programmer may have intended: import test_core. Further, the same thing happens again with src/bbb/test_core.py and now we get a conflict because there's already a test_core module.

It would be nice to be able to configure pytest so that we could tell it that src/ may contain namespace packages and, if it does, they are all intended to be rooted at src/. That is, when discovering src/{aaa,bbb}/test_core.py do not add anything to sys.path but instead generate the fully qualified module name based on the file's position in the hierarchy relative to src/.

Another, more aggressive, option might be to see if the file is under any of the paths already in sys.path and determine the fully qualified module name by the hierarchy relative to the first path in sys.path under which the file falls.

The Stack Overflow question How do I Pytest a project using PEP 420 namespace packages? also discusses the issue.

The way I put these here makes this sound pretty simple (at least as far as description, if not implementation), but there may be other subtleties I'm missing here, about which I welcome discussion.

@asottile
Copy link
Member

I'd like to see pytest work better in this situation as well. I think once we can leave python2 behind this becomes even easier (and more common at the same time) as __init__.py files become entirely optional.

I'm not quite sure the best way to get to that state without breaking everything though 🤔

In my opinion ideally the "namespace package" situation would be the default, and the class of errors around test filename collision would disappear. Though I haven't put any thought or effort into seeing how this would work in code, @nicoddemus would know better

@asottile asottile added the topic: collection related to the collection phase label Apr 20, 2019
@0cjs
Copy link
Contributor Author

0cjs commented Apr 21, 2019

I've done a test implementation based on hoefling's answer to the SO question above except instead of having an independently-specified list of namespace package dirs I simply use sys.path, returning the top-level package or dir under the first entry from sys.path that is above the package in question. If no entry in sys,path is above the package we're trying to load (say, becuase someone's asked pytest to discover tests in a directory outside of any in sys.path, I simply pass the path on to the old code and let it do its thing.

I've only given it brief testing, but it seems to be working pretty well. I will report back again after further testing.

I don't know why I didn't think of this before, because it actually seems to be the most obvious and sensible solution (or at least partial solution) to the issue since, when the module is under sys.path we're generating a name that would load that exact module if pytest weren't running. In somewhat pathological situations it might not be the name that the developer would use, but it will definitely not be a name that could ever be used by another module. And it also, when the module can be found in the path, avoids the confusing and shadowing created by adding more locations to the path. (I use --import-mode=append, but even I've been bitten by a module unexpectedly loading in tests when it wouldn't in production.)

My current thinking on the matter is that py.path.LocalPath.pypkgpath() should take an optional second argument (defaulting to an empty list when not provided) that would be the list of directories to consider potential roots of namespace modules. You could pass in nothing to preserve the current behaviour, sys.path to get the (I think sensible) behaviour described here, or any other paths you care to configure. Getting that argument down to it probably means also adding it to pyimport() in the same module, but after that we're out of py-land and in pytest code so the rest is a matter of adding new configuration.

While using sys.path in this way actually seems like sensible default behaviour to me, it might be something we'd want either to phase in gradually (add the option first, and make it the default a few versions later) or just leave with a seemingly-poor default in the name of backward compatibility.

@girtsf
Copy link

girtsf commented May 6, 2019

I just ran into the same thing. Something like a command line option that could be added to pytest.ini to force package imports from rootdir sounds good to me.

@0cjs
Copy link
Contributor Author

0cjs commented May 6, 2019

I don't think that using rootdir for this is a good idea: rootdir is about configuration, not import path, and can even change from run to run depending on your current working directory and what paths you give to pytest for discovery.

BTW, for those wondering where I'm at with all of this, my hack above seems to have worked well over the initial week since I implemented it, but the last ten days have been a long holiday in Japan (Golden Week) and so all work on the project using this has stopped.

Assuming my change goes for another week or so without any issues, I will look at trying to find the time to code up a PR for this. (It's unfortunately not trivial in part because it requires changes to both py and pytest.) If anybody's got any special thoughts on how to go about doing this beyond what any decent developer would do, let me know.

@girtsf
Copy link

girtsf commented Nov 22, 2019

I would love to help get a PR out if somebody pointed me in the right direction on what the recommended approach is.

@RonnyPfannschmidt
Copy link
Member

we have a concept of "collection roots" floating around, however its not yet formalized

for me personally thats something i'd like to tackle sometime after sorting out the collection tree structure

many of the details are currently unclear to me

@Traktormaster
Copy link

Hi!
I've recently found out about pytest not respecting PEP420 namespaces.

To circumvent the issue I've patched the package discovery to treat paths that are already included in sys.path as namespaces. (essentially forcing import with fully qualified module name from a sys.path entry)
You can read my initial thoughts on it in detail here.

In my opinion it would be favorable to have valid PEP420 namespaces working as expected by default. With my patch, this would (I think) always be the case unless the sys.path is not set up correctly. By correctly, I mean that all places where one would import such packages are added to PYTHONPATH or sys.path directly.
One improvement on my patch would be to check if the directories in question have no files in them at all indeed. (make the PEP420 namespace check a little more correct)

@0cjs
Copy link
Contributor Author

0cjs commented Apr 1, 2020

Using existing paths in sys.path seems a good solution for Python ≤3.4, where modules can be loaded only "through" a path. However, for 3.5 and above I would prefer to see discovery use the newer loading mechanisms that allow discovered test files to be loaded as "free-floating" modules (i.e., not entered at all in sys.modules) without them needing to go through a path to generate a module name for use in sys.modules, as I did in the code in #6966. This should avoid all conflicts and preserve the path that the tests were started with, rather than runnning tests in a system with modified path. (If tests were relying on the path being changed, the test system should be changed to make explicit that those paths are part of the test environment.)

@blueyed
Copy link
Contributor

blueyed commented Apr 2, 2020

@0cjs
FWIW: it is also possible to restore sys.path after importing - but I agree that it would be better to skip/replace pylib's method here completely maybe.

@0cjs
Copy link
Contributor Author

0cjs commented Apr 2, 2020

FWIW: it is also possible to restore sys.path after importing....

@blueyed That hadn't occurred to me! I wouldn't want to do that for Python ≥3.5, where we have better methods of dealing with that, but it seems to me a perfectly reasonable idea to consider when running earlier versions of Python, in the hope of maintaining more compatibility between pytest on ≤3.4 and ≥3.5.

I suppose this would also involve removing the module from sys.modules after loading it as well, to also emulate the behaviour of "nameless" modules, and avoid accidental "'loaded' from cache" for subsequent (but different) files that map to the same module name.

This would require a bit of thought, but possibly might work quite well.

@blueyed
Copy link
Contributor

blueyed commented Apr 2, 2020

@0cjs
As for sys.path (from one of my stashes):

diff --git a/src/_pytest/python.py b/src/_pytest/python.py
index 372631f10..265bf0689 100644
--- a/src/_pytest/python.py
+++ b/src/_pytest/python.py
@@ -515,6 +515,7 @@ def _importtestmodule(self) -> ModuleType:
         # we assume we are only called once per module
         importmode = self.config.getoption("--import-mode")
         fspath = self.fspath
+        old_sys_path = sys.path[:] if importmode != "importlib" else None
         try:
             mod = fspath.pyimport(ensuresyspath=importmode)
         except SyntaxError:
@@ -554,6 +555,9 @@ def _importtestmodule(self) -> ModuleType:
                 "or @pytest.mark.skipif decorators instead, and to skip a "
                 "module use `pytestmark = pytest.mark.{skip,skipif}."
             )
+        finally:
+            if old_sys_path:
+                sys.path = old_sys_path
         self.config.pluginmanager.consider_module(mod)
         return mod

Feel free to pick it up into a PR.
Tested in blueyed#338 already (with some required test adjustments).

@blueyed
Copy link
Contributor

blueyed commented Apr 2, 2020

As for pytest you do not have to consider anything before Python 3.5 btw.

@0cjs
Copy link
Contributor Author

0cjs commented Apr 2, 2020

@blueyed Is the above code for pytest 4.x? Because, as you pointed out to me also above (thanks!), pytest 5.x no longer supports Python 3.4 or 2.7.

For more modern versions of Python, wouldn't it make sense to use the more modern important mechanisms that neither require nor touch sys.path or sys.modules, as I mention above and implement here (along with some other stuff that may not be needed or wanted, as discussed in #6966). I'd appreciate your comments on the use of importlib in that code.

@asottile
Copy link
Member

asottile commented Apr 2, 2020

I'm fairly certain you need to touch sys.modules at least as tests can and do import from other tests

blueyed added a commit to blueyed/pytest that referenced this issue Apr 3, 2020
blueyed added a commit to blueyed/pytest that referenced this issue Apr 3, 2020
blueyed added a commit to blueyed/pytest that referenced this issue Apr 3, 2020
@0cjs
Copy link
Contributor Author

0cjs commented Apr 3, 2020

@asottile writes:

I'm fairly certain you need to touch sys.modules at least as tests can and do import from other tests

Importing other tests as modules isn't a problem if you do it in the normal way. Say you have src/foo/bar_test.py in your tree, and you do not have src/foo/__init__.py. Note that bar_test.py is not a module (and has no module name), it's just a file. (It becomes a module and is assigned a module name only when loaded into a Python interpreter through an import statement or similar import mechanism.)

So with src/ in your sys.path, the normal way of importing and using a property value assigned as a global in that imported namespace module (with value = 1 or whatever at the top level in the file) is:

import foo.bar_test
assert 1 == foo.bar_test.value

This works fine currently, and will continue to work with all variants my proposed changes.

EDIT: The following is not correctly expressing what I'm trying to say, nor is the code correct; see further discussion below for details.

What will not work in my proposal is relying on src/foo/bar_test.py having been automatically imported by discovery as (top-level) module bar_test:

# No import of bar_test up to this point.
assert 1 == bar_test.value    # Fails with: NameError: name 'bar_test' is not defined

I (and I think most sensible people) would argue that that's absolutely fine. If you want to use it, import it explicitly, rather than relying on magic that breaks other things. (E.g., in the current code the above will—sometimes!—break (and often in mysterious ways) if you also have an src/bar/bar_test.py file.

@asottile
Copy link
Member

asottile commented Apr 3, 2020

I (and I think most sensible people)

please don't go after intelligence to try and make your point, it's not productive to conversation


right, but it's a major breaking change to adjust the module hierarchy here

that is currently importable as from bar_test import ... (all pytest tests are floated up to "top-level" namespace unless the directories are flushed with __init__.py)

your proposal would break existing imports. large breaking changes need very good reason to move and a migration / deprecation period.

your example where bar_test is a NameError is (as far as I know) already the case today (you'd have to stub into builtins to make that not the case) so I'm not sure what's being demonstrated there ?

@0cjs
Copy link
Contributor Author

0cjs commented Apr 5, 2020

@asottile I'm not sure we have any real disagreement here; I think that perhaps taking a more charitable interpretation of my remarks when there's ambiguity or potential errors may help keep conversations working well. You can start with taking my "most sensible people" comment as, ""people who prefer well-defined behaviour that's in line with what the Python interpreter does."

As I've mentioned several times previously, both here and in #6966, I'm well aware that what I'm proposing breaks certain kinds of changes that pytest makes to the module hierarchy, such that code is different when it runs under pytest as opposed to when it's not.

that is currently importable as from bar_test import ... (all pytest tests are floated up to "top-level" namespace unless the directories are flushed with __init__.py)

Exactly. I argue that this is behaviour that's worth changing. I believe that it's not well-defined (in particular, selecting different subsets of tests to collect can change what module is made available under a given name), and I see no indication that this was ever desired behaviour, rather than an unfortunate side-effect of the implementation (in large part due to what was available in earlier Python APIs).

your proposal would break existing imports.

Some existing imports, yes. I think that the long-term (and even short-term) benefits to the community outwigh the cost of the change.

large breaking changes need very good reason to move and a migration / deprecation period.

Yes, we are in full agreement here. (I had hoped that would be clear from my comments about backward compatibility here and in #6966, but perhaps it was not.)

your example where bar_test is a NameError is (as far as I know) already the case today (you'd have to stub into builtins to make that not the case) so I'm not sure what's being demonstrated there ?

Sorry, I got my explanation wrong in the post above. That second example would require an import bar_test. The problem I'm trying to deal with is where import bar_test would fail in the production environment, but (whether in test code or code under test) imports something in pytest.

@Traktormaster
Copy link

Today I updated pytest to 6.0.1 and it broke my testing. After a small dig I've found the culprit. The new method _pytest.pathlib.resolve_package_path does not respect the PEP420 namespaces.

Quick example:

# These are the import paths initially setup by environment and IDE, with packages added in comments.
sys.path = [
  "/myproj/component_a",
    # myproj (PEP420)
    #   abc (normal package)
    # tests (PEP420)
    #   myproj (PEP420)
    #     abc (normal package with conftest inside)
  "/python_venv/site-packages",
    # abc (installed dependency)
]

I have imports in my code like from abc import x which is supposed to import from the dependency package. When pytest imports the conftest it adds its directory to the import paths:

  "/myproj/component_a/tests/myproj",
  "/myproj/component_a",
  "/python_venv/site-packages",
]

Now the same import from abc import x will find my package called abc instead of the dependency.

As a workaround I've patched the mentioned function. It's very similar to how I've fixed PEP420 before pytest 6.0. #6966 (comment)

@john-bodley
Copy link

john-bodley commented Apr 19, 2022

I wonder if this issue is somewhat related to a problem I have when trying to test doctest modules (via the --doctest-modules argument) for a namespace package (foo.bar). The repo is of the form,

src/
    foo/
        bar/
            __init__.py
            utils.py
            baz.py
            ...
tests/
    foo/
        bar/
            conftest.py
            test_baz.py
            ...     

where utils.py contains doc tests. If I simply run,

pytest --doctest-modules --collect-only 

the src/foo/bar/utils.py file is not included and I have to explicitly run either:

pytest --doctest-modules --collect-only .

or

pytest --doctest-modules --collect-only src tests

though this actually results in certain tests failing—likely because of the import structure.

edgarrmondragon added a commit to meltano/sdk that referenced this issue Mar 15, 2023
<h2>Rationale</h2>

<p>Implicit namespace packages are directories of Python files without an <code>__init__.py</code>.
    They’re valid and importable, but they break <em>many</em> tools, such as:</p>

<ul>
    <li><p><a href="https://bugs.python.org/issue23882" rel="">unittest test discovery</a> (and by extension, Django’s test runner)</p></li>
    <li><p><a href="nedbat/coveragepy#1024" rel="">Coverage.py</a></p></li>
    <li><p>Mypy without its <a href="https://mypy.readthedocs.io/en/latest/command_line.html#import-discovery" rel="">–namespace-packages option</a></p></li>
    <li><p><a href="pytest-dev/pytest#5147" rel="">pytest</a></p></li>
</ul>

https://pypi.org/project/flake8-no-pep420/
edgarrmondragon added a commit to meltano/sdk that referenced this issue Mar 16, 2023
<h2>Rationale</h2>

<p>Implicit namespace packages are directories of Python files without an <code>__init__.py</code>.
    They’re valid and importable, but they break <em>many</em> tools, such as:</p>

<ul>
    <li><p><a href="https://bugs.python.org/issue23882" rel="">unittest test discovery</a> (and by extension, Django’s test runner)</p></li>
    <li><p><a href="nedbat/coveragepy#1024" rel="">Coverage.py</a></p></li>
    <li><p>Mypy without its <a href="https://mypy.readthedocs.io/en/latest/command_line.html#import-discovery" rel="">–namespace-packages option</a></p></li>
    <li><p><a href="pytest-dev/pytest#5147" rel="">pytest</a></p></li>
</ul>

https://pypi.org/project/flake8-no-pep420/
wosc added a commit to ZeitOnline/vivi that referenced this issue Jul 27, 2023
@dalleyg
Copy link

dalleyg commented Mar 14, 2024

I've also been having trouble with this, in particular that pytest does not run doctests for packages that use PEP-420 namespaces. My current workaround is to use the older pkgutils-style namespaces, despite their downsides.

@nicoddemus
Copy link
Member

nicoddemus commented Mar 15, 2024

I've also been having trouble with this, in particular that pytest does not run doctests for packages that use PEP-420 namespaces.

It should be supported in 8.1.1, in fact we have explicit tests for this case now:

pytest/testing/test_pathlib.py

Lines 1011 to 1040 in 2e5da5d

def test_import_using_normal_mechanism_first_integration(
self, monkeypatch: MonkeyPatch, pytester: Pytester, ns_param: bool
) -> None:
"""
Same test as above, but verify the behavior calling pytest.
We should not make this call in the same test as above, as the modules have already
been imported by separate import_path() calls.
"""
core_py, test_path1, test_path2 = self.create_installed_doctests_and_tests_dir(
pytester.path, monkeypatch
)
result = pytester.runpytest(
"--import-mode=importlib",
"-o",
f"consider_namespace_packages={ns_param}",
"--doctest-modules",
"--pyargs",
"app",
"./.tests",
)
result.stdout.fnmatch_lines(
[
f"{core_py.relative_to(pytester.path)} . *",
f"{test_path1.relative_to(pytester.path)} . *",
f"{test_path2.relative_to(pytester.path)} . *",
"* 3 passed*",
]
)

@dalleyg feel free to open a new issue with a MWE if this is not working for you yet as expected.

I will close this issue for now, this use case should be supported in 8.1+.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: collection related to the collection phase
Projects
None yet
Development

No branches or pull requests

9 participants