Running graphein with pymol in parallel #261

Closed
OliviaViessmann opened this issue Feb 9, 2023 · 13 comments

Comments

@OliviaViessmann

Describe the bug
I would like to run Graphein's create_mesh() function in parallel on multiple workers. I assume that I need to spin up a separate pymol session (MolViewer()) for each worker, each with a dedicated PORT. However, I am not sure how to set this from the "outside" -- this might actually be a feature request.

To Reproduce
Steps to reproduce the behavior:
trying to run something like
Parallel(n_jobs=8)(delayed(create_mesh)(pdb) for pdb in pdbs)
This gets stuck if run naively.

Expected behavior
Want to specify ports for each worker so that I can run pymol sessions on each of them

  • OS: Ubuntu 20.04.4 LTS
  • Python Version 3.8.16
  • Graphein Version [e.g. 22] & how it was installed
    git pull + pip install, version: 1.5.2
@a-r-j
Owner

a-r-j commented Feb 9, 2023

Hi @OliviaViessmann I'm trying to take a look at this but I'm struggling to get Pytorch3d working on my dev machine.

Did you run into any issues?

facebookresearch/pytorch3d#1406

@OliviaViessmann
Author

OliviaViessmann commented Feb 9, 2023

Nope, I didn't. It runs fine for me. No issues with Pytorch3d on my end.

@a-r-j
Owner

a-r-j commented Feb 9, 2023

If you check out the PR, the ports should now be configurable via the ProteinMeshConfig. I suppose you could zip the configs (each with its own port) with the PDBs and pass both as args to create_mesh.
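The zipping idea can be sketched roughly as follows. This is only an illustration: `assign_ports` and the starting port 9200 are hypothetical names and values chosen here, not part of Graphein. The Graphein call itself is left as a comment, using the `pymol_port` parameter named later in this thread.

```python
from typing import List, Sequence, Tuple

# Hypothetical starting port for illustration; not a Graphein or PyMOL default.
BASE_PORT = 9200


def assign_ports(pdb_files: Sequence[str], base_port: int = BASE_PORT) -> List[Tuple[str, int]]:
    """Pair every PDB path with its own port, counting up from base_port."""
    return [(pdb, base_port + i) for i, pdb in enumerate(pdb_files)]


pairs = assign_ports(["1abc.pdb", "2xyz.pdb", "3foo.pdb"])
for pdb, port in pairs:
    # With Graphein installed, each pair could be turned into a call such as:
    #   create_mesh(pdb_file=pdb, config=ProteinMeshConfig(pymol_port=port))
    print(pdb, port)
```

Giving every task a unique port is the simplest scheme; it assumes the range starting at the base port is otherwise unused on the machine.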

@OliviaViessmann
Author

Hi a-r-j,
thanks a ton for looking into this and making the adaptations. I am running on the new PR and configured a port, but I think PORT=9123 is still hard-coded somewhere. Did it work for you? Am I missing anything?
Here is the code snippet I use:
pymol_commands = {"pymol_commands": ["set surface_quality, 2", "show surface"]}
pymol_config = ProteinMeshConfig(**pymol_commands, pymol_port=9999)
verts_x, faces_x, aux = create_mesh(pdb_file=pdb_file_x, config=pymol_config)
I put a print statement in get_obj_file() to double check the port is set, but somewhere it spins up a pymol session with default setting, because I get:

xml-rpc server running on host localhost, port 9123
A PyMOL RPC server is already running.
xml-rpc server running on host localhost, port 9123

a-r-j added a commit that referenced this issue Feb 10, 2023
@a-r-j
Owner

a-r-j commented Feb 10, 2023

Yep, I missed a spot!

@OliviaViessmann
Author

Thanks!!

@a-r-j
Owner

a-r-j commented Feb 14, 2023

Has this resolved the issue @OliviaViessmann? If so, I will merge the PR shortly.

Also, if you could share a short snippet I can turn into a test that would be super helpful :)

@OliviaViessmann
Author

OliviaViessmann commented Feb 14, 2023

It is working 50/50. It now runs in parallel, but not on the ports specified; instead the ports increment from 9123 up.
Here is a minimal snippet with port printouts:

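import random
import socket

from joblib import Parallel, delayed
from graphein.protein.config import ProteinMeshConfig
from graphein.protein.meshes import create_mesh


def is_port_in_use(port: int) -> bool:
    # connect_ex returns 0 when something is already listening on the port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(("localhost", port)) == 0


def func(pdb_file: str):
    pymol_commands = {"pymol_commands": ["show surface"]}
    # draw random ports until one is found that nothing is listening on
    port = random.randint(1025, 65535)
    while is_port_in_use(port=port):
        port = random.randint(1025, 65535)
    print(port)
    pymol_config = ProteinMeshConfig(**pymol_commands, port=port)
    verts, faces, aux = create_mesh(pdb_file=pdb_file, config=pymol_config)
    return verts


def main():
    # pdb_files: list of paths to PDB files, defined elsewhere
    results = Parallel(n_jobs=8)(
        delayed(func)(pdb_file) for pdb_file in pdb_files
    )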

Here is an example printout of ports and pymol outputs:

6379
xml-rpc server running on host localhost, port 9124
xml-rpc server running on host localhost, port 9125
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9127
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
9004
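
As an aside on picking ports: instead of drawing random numbers and probing them, one option is to ask the operating system for a free port by binding to port 0. The sketch below uses only the standard library; `find_free_port` is a name chosen here for illustration. Note that the port is released again before it is returned, so another process could in principle grab it in the meantime.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Binding to port 0 lets the OS choose any free port.
        s.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return s.getsockname()[1]


port = find_free_port()
print(port)
```

This does not change which port PyMOL ends up listening on; it only makes the choice of candidate port more reliable.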

@a-r-j
Owner

a-r-j commented Feb 14, 2023

Thanks!! Hmm, I'll try to check it out this week. A quick heads up though: the config param added in #262 is pymol_port rather than port

@OliviaViessmann
Author

Sorry, yes, mistake on my end. I have the correct version running with pymol_port = port -- just did a crappy job at copy/pasting with manual edit...
I am also printing the port inside the graphein create_mesh() function with print("pymol port: ", config.pymol_port) and it is correctly set in there, but it still ramps up servers on the 912x ports

pymol port:  34873
xml-rpc server could not be started
pymol port:  9007
xml-rpc server running on host localhost, port 9125
pymol port:  9124
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9125

Thanks for looking into it!

@a-r-j
Owner

a-r-j commented Feb 14, 2023

I did some digging and this looks like a pymol limitation, rather than a graphein limitation:

https://github.com/schrodinger/pymol-open-source/blob/d0a3380636e3d4079a0320b372a330dcf797d660/modules/pymol/rpc.py#L23

We need to be able to set the port on the pymol listener and, sadly, we don't have easy access to it. Also, the max retries limits the number of servers you can run.
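
To illustrate why the logs above look the way they do, here is a small simulation of a listener that starts at a fixed port and tries a limited number of successive ports. It is not PyMOL's code: the function names are made up for this sketch, and the retry count of 5 is an assumption chosen because it reproduces the pattern in the logs (four servers on 9124 to 9127, then four failures).

```python
from typing import Iterable, List, Optional, Set


def first_free_port(base: int, n_to_try: int, occupied: Set[int]) -> Optional[int]:
    """Return the first port in [base, base + n_to_try) not in `occupied`, else None."""
    for offset in range(n_to_try):
        candidate = base + offset
        if candidate not in occupied:
            return candidate
    return None


def simulate_workers(
    n_workers: int,
    base: int = 9123,          # hardcoded default port mentioned in the thread
    n_to_try: int = 5,         # assumed retry limit, see lead-in
    occupied: Iterable[int] = (),
) -> List[Optional[int]]:
    """Start n_workers listeners one after another; None means 'could not be started'."""
    taken = set(occupied)
    results: List[Optional[int]] = []
    for _ in range(n_workers):
        port = first_free_port(base, n_to_try, taken)
        if port is not None:
            taken.add(port)
        results.append(port)
    return results


# Eight workers while a server is already running on 9123:
print(simulate_workers(8, occupied={9123}))
# → [9124, 9125, 9126, 9127, None, None, None, None]
```

Under these assumptions the port requested in the config never enters the picture, which matches the observation that the configured port is printed correctly but the servers still come up on 912x.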

I suppose one way to go is to patch your local pymol install. You could for example set the port via an env var that pymol would read instead of the hardcoded 9123 and make the following modification to the Graphein viewer class:

class MolViewer(object):
    def __init__(self, host=HOST, port=PORT):
        self.host = host
        self.port = int(port)
        self._process = None

    def __del__(self):
        self.stop()

    def __getattr__(self, key):
        if not self._process_is_running():
            self.start(["-cKQ"])

        return getattr(self._server, key)

    def _process_is_running(self):
        return self._process is not None and self._process.poll() is None

    def start(self, args=("-Q",), exe="pymol"):
        """Start the PyMOL RPC server and connect to it
        Start simple GUI (-xi), suppress all output (-Q):
            >>> viewer.start(["-xiQ"])
        Start headless (-cK), with some output (-q):
            >>> viewer.start(["-cKq"])
        """
        if self._process_is_running():
            print("A PyMOL RPC server is already running.")
            return

        assert isinstance(args, (list, tuple))

        ########################## CHANGE HERE
        # pass the port to the PyMOL subprocess via an env var
        env = os.environ.copy()
        env["PYMOL_XMLRPC_PORT"] = str(self.port)
        self._process = subprocess.Popen([exe, "-R"] + list(args), env=env)
        ########################## END CHANGE

        # the PyMOL XML-RPC server speaks plain HTTP
        self._server = Server(uri="http://%s:%d/RPC2" % (self.host, self.port))

        # wait for the server
        while True:
            try:
                self._server.bg_color("white")
                break
            except IOError:
                time.sleep(0.1)

    def stop(self):
        if self._process_is_running():
            self._process.terminate()

    def display(self, width=0, height=0, ray=False, timeout=120):
        """Display PyMol session
        :param width: width in pixels (0 uses current viewport)
        :param height: height in pixels (0 uses current viewport)
        :param ray: use ray tracing (if running PyMOL headless, this parameter
        has no effect and ray tracing is always used)
        :param timeout: timeout in seconds
        Returns
        -------
        fig : IPython.display.Image
        """
        from IPython.display import Image, display
        from ipywidgets import IntProgress

        progress_max = int((timeout * 20) ** 0.5)
        progress = None
        filename = tempfile.mktemp(".png")

        try:
            self._server.png(filename, width, height, -1, int(ray))

            for i in range(1, progress_max):
                if os.path.exists(filename):
                    break

                if progress is None:
                    progress = IntProgress(min=0, max=progress_max)
                    display(progress)

                progress.value += 1
                time.sleep(i / 10.0)

            if not os.path.exists(filename):
                raise RuntimeError("timeout exceeded")

            return Image(filename)
        finally:
            if progress is not None:
                progress.close()

            try:
                os.unlink(filename)
            except OSError:
                pass
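
On the PyMOL side, the patch would need to read that variable in place of the hardcoded 9123. A minimal sketch of such a lookup is below. The helper `resolve_port` is hypothetical and not part of PyMOL; only the variable name `PYMOL_XMLRPC_PORT` and the default 9123 come from this thread, and where exactly it would be wired into pymol's rpc.py is not shown here.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 9123              # hardcoded default mentioned in the thread
ENV_VAR = "PYMOL_XMLRPC_PORT"    # name used in the MolViewer change above


def resolve_port(environ: Optional[Mapping[str, str]] = None,
                 default: int = DEFAULT_PORT) -> int:
    """Return the port from the environment, falling back to the default."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None or raw == "":
        return default
    port = int(raw)  # raises ValueError for non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"{ENV_VAR} out of range: {port}")
    return port


print(resolve_port({}))                  # falls back to 9123
print(resolve_port({ENV_VAR: "9999"}))   # uses the value from the environment
```

Passing the environment in as a mapping keeps the lookup easy to test without touching the real process environment.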

Alternatively, I also came across this which seems to be a similar RPC component that reads from an env var.

@OliviaViessmann
Author

Ahhhh, the lines you sent totally explain why the ports end up being set between 9123 and 9128.
Ok, I might try out the local pymol patch. Or give up and manually throw this on a bunch of batch machines. Not sure what will end up being faster :)
Thanks for the workaround suggestion -- if I end up trying it I will report back on it!

Want me to close this as "not planned"?

@a-r-j
Owner

a-r-j commented Feb 14, 2023

Keen to hear how it goes :)

@a-r-j closed this as not planned on Feb 14, 2023