Running graphein with pymol in parallel #261

OliviaViessmann · 2023-02-09T16:06:59Z

Describe the bug
I would like to run Graphein's create_mesh() function in parallel on multiple workers. I assume that each worker needs its own PyMOL session (MolViewer()) with a dedicated PORT. However, I am not sure how to set this from the "outside", so this might actually be a feature request.

To Reproduce
Steps to reproduce the behavior:
Trying to run something like
Parallel(n_jobs=8)(delayed(create_mesh)(pdb) for pdb in pdbs)
This gets stuck if run naively.

Expected behavior
Want to specify ports for each worker so that I can run pymol sessions on each of them

OS: Ubuntu 20.04.4 LTS
Python Version 3.8.16
Graphein Version: 1.5.2 (installed via git pull + pip install)


a-r-j · 2023-02-09T20:53:51Z

Hi @OliviaViessmann, I'm trying to take a look at this but I'm struggling to get PyTorch3D working on my dev machine.

Did you run into any issues?

facebookresearch/pytorch3d#1406

OliviaViessmann · 2023-02-09T21:59:30Z

Nope, I didn't. It runs fine for me. No issues with PyTorch3D on my end.

a-r-j · 2023-02-09T22:54:34Z

If you check out the PR, the ports should now be configurable via ProteinMeshConfig. I suppose you could zip the configs (each with its own port) with the PDBs and pass both as args to create_mesh.
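
A minimal sketch of that pairing, assuming one distinct port per job. `assign_ports`, `mesh_job` and `base_port` are hypothetical names used for illustration only. The `ProteinMeshConfig(pymol_port=...)` and `create_mesh(pdb_file=..., config=...)` calls are the ones used in this thread, and they are left as comments so the sketch runs without Graphein installed. Whether PyMOL actually honours the configured port is a separate question that comes up further down.

```python
from typing import List, Tuple


def assign_ports(pdb_files: List[str], base_port: int = 9200) -> List[Tuple[str, int]]:
    """Pair every PDB file with its own port, counting up from base_port."""
    return [(pdb, base_port + i) for i, pdb in enumerate(pdb_files)]


def mesh_job(pdb_file: str, port: int) -> Tuple[str, int]:
    # With Graphein installed, the body would be along these lines:
    #   config = ProteinMeshConfig(pymol_port=port)
    #   return create_mesh(pdb_file=pdb_file, config=config)
    # Here we only echo the arguments so the pairing can be inspected.
    return pdb_file, port


jobs = assign_ports(["a.pdb", "b.pdb", "c.pdb"])
results = [mesh_job(pdb, port) for pdb, port in jobs]
```

The list of `(pdb, port)` pairs can be handed to any parallel runner, since each job carries its own port rather than relying on shared state.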

OliviaViessmann · 2023-02-10T15:46:56Z

Hi a-r-j,
thanks a ton for looking into this and making the adaptations. I am running on the new PR and configured a port, but I think PORT=9123 is still hard-coded somewhere. Did it work for you? Am I missing anything?
Here is the code snippet I use:
pymol_commands = {"pymol_commands": ["set surface_quality, 2", "show surface"]}
pymol_config = ProteinMeshConfig(**pymol_commands, pymol_port=9999)
verts_x, faces_x, aux = create_mesh(pdb_file=pdb_file_x, config=pymol_config)
I put a print statement in get_obj_file() to double-check that the port is set, but somewhere it spins up a PyMOL session with the default settings, because I get:

xml-rpc server running on host localhost, port 9123
A PyMOL RPC server is already running.
xml-rpc server running on host localhost, port 9123

a-r-j · 2023-02-10T18:11:35Z

Yep, I missed a spot!

OliviaViessmann · 2023-02-11T20:30:45Z

Thanks!!

a-r-j · 2023-02-14T00:34:19Z

Has this resolved the issue @OliviaViessmann? If so, I will merge the PR shortly.

Also, if you could share a short snippet I can turn into a test that would be super helpful :)

OliviaViessmann · 2023-02-14T15:30:47Z

It is half working. It now runs in parallel, but not on the ports specified; instead the ports increment from 9123 up.
Here is a minimal snippet with port printouts:

import random
import socket

from joblib import Parallel, delayed
from graphein.protein.config import ProteinMeshConfig
from graphein.protein.meshes import create_mesh

def is_port_in_use(port: int) -> bool:
    # connect_ex returns 0 when something is already listening on the port
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return s.connect_ex(("localhost", port)) == 0

def func(pdb_file: str):
    pymol_commands = {"pymol_commands": ["show surface"]}
    port = random.randint(1025, 65535)
    # keep drawing random ports until one is free
    while is_port_in_use(port=port):
        port = random.randint(1025, 65535)
    print(port)
    pymol_config = ProteinMeshConfig(**pymol_commands, port=port)
    verts, faces, aux = create_mesh(pdb_file=pdb_file, config=pymol_config)
    return verts

def main():
    parallel_iter = Parallel(n_jobs=8)(
        delayed(func)(pdb_file) for pdb_file in pdb_files
    )

Here is an example printout of ports and PyMOL outputs:

6379
xml-rpc server running on host localhost, port 9124
xml-rpc server running on host localhost, port 9125
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9127
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
xml-rpc server could not be started
9004
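
As an aside on picking the port itself: instead of probing random port numbers, a common alternative is to bind a socket to port 0 and let the OS choose an unused port. The sketch below uses only the standard library; `find_free_port` is a hypothetical helper name and is not part of Graphein or PyMOL.

```python
import socket


def find_free_port(host: str = "localhost") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 means "pick any free port"
        return s.getsockname()[1]  # the port the OS actually assigned


port = find_free_port()
```

The socket is closed before the port is reused, so another process could in principle grab the port in between. For a handful of local workers this is usually acceptable, but it is not a guarantee.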

a-r-j · 2023-02-14T15:39:50Z

Thanks!! Hmm, I'll try to check it out this week. A quick heads-up though: the config param added in #262 is pymol_port rather than port.

OliviaViessmann · 2023-02-14T16:01:24Z

Sorry, yes, mistake on my end. I have the correct version running with pymol_port = port -- just did a crappy job of copy/pasting with manual edits...
I am also printing the port inside the graphein create_mesh() function with print("pymol port: ", config.pymol_port) and it is correctly set in there, but it still spins up servers on the 912x ports:

pymol port:  34873
xml-rpc server could not be started
pymol port:  9007
xml-rpc server running on host localhost, port 9125
pymol port:  9124
xml-rpc server running on host localhost, port 9126
xml-rpc server running on host localhost, port 9125

Thanks for looking into it!

a-r-j · 2023-02-14T17:16:45Z

I did some digging and this looks like a pymol limitation, rather than a graphein limitation:

https://github.com/schrodinger/pymol-open-source/blob/d0a3380636e3d4079a0320b372a330dcf797d660/modules/pymol/rpc.py#L23

We need to be able to set the port on the pymol listener and, sadly, we don't have easy access to it. Also, the maximum number of retries limits the number of servers you can run.
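
To illustrate why this caps the number of servers, here is a toy model of a "start at a fixed port and retry on the next one" scheme. It is not PyMOL's code. `pick_port` is a hypothetical helper, and the values `first_port=9123` and `attempts=5` are assumptions chosen to match the logs in this thread; the linked rpc.py is the authority on the real values.

```python
from typing import List, Optional, Set


def pick_port(in_use: Set[int], first_port: int = 9123, attempts: int = 5) -> Optional[int]:
    """Return the first free port in [first_port, first_port + attempts), else None."""
    for port in range(first_port, first_port + attempts):
        if port not in in_use:
            return port
    return None  # corresponds to "xml-rpc server could not be started"


taken: Set[int] = set()
started: List[Optional[int]] = []
for _ in range(8):  # eight workers, as in the report above
    port = pick_port(taken)
    started.append(port)
    if port is not None:
        taken.add(port)
```

Under these assumptions only the first five workers get a port and the remaining three get `None`, regardless of which port each worker asked for.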

I suppose one way to go is to patch your local pymol install. You could for example set the port via an env var that pymol would read instead of the hardcoded 9123 and make the following modification to the Graphein viewer class:

class MolViewer(object):
    def __init__(self, host=HOST, port=PORT):
        self.host = host
        self.port = int(port)
        self._process = None

    def __del__(self):
        self.stop()

    def __getattr__(self, key):
        if not self._process_is_running():
            self.start(["-cKQ"])

        return getattr(self._server, key)

    def _process_is_running(self):
        return self._process is not None and self._process.poll() is None

    def start(self, args=("-Q",), exe="pymol"):
        """Start the PyMOL RPC server and connect to it
        Start simple GUI (-xi), suppress all output (-Q):
            >>> viewer.start(["-xiQ"])
        Start headless (-cK), with some output (-q):
            >>> viewer.start(["-cKq"])
        """
        if self._process_is_running():
            print("A PyMOL RPC server is already running.")
            return

        assert isinstance(args, (list, tuple))

        ########################## CHANGE HERE
        env = os.environ.copy()
        env["PYMOL_XMLRPC_PORT"] = str(self.port)
        self._process = subprocess.Popen([exe, "-R"] + list(args), env=env)
        ########################## END CHANGE

        self._server = Server(uri="http://%s:%d/RPC2" % (self.host, self.port))

        # wait for the server
        while True:
            try:
                self._server.bg_color("white")
                break
            except IOError:
                time.sleep(0.1)

    def stop(self):
        if self._process_is_running():
            self._process.terminate()

    def display(self, width=0, height=0, ray=False, timeout=120):
        """Display PyMol session
        :param width: width in pixels (0 uses current viewport)
        :param height: height in pixels (0 uses current viewport)
        :param ray: use ray tracing (if running PyMOL headless, this parameter
        has no effect and ray tracing is always used)
        :param timeout: timeout in seconds
        Returns
        -------
        fig : IPython.display.Image
        """
        from IPython.display import Image, display
        from ipywidgets import IntProgress

        progress_max = int((timeout * 20) ** 0.5)
        progress = None
        filename = tempfile.mktemp(".png")

        try:
            self._server.png(filename, width, height, -1, int(ray))

            for i in range(1, progress_max):
                if os.path.exists(filename):
                    break

                if progress is None:
                    progress = IntProgress(min=0, max=progress_max)
                    display(progress)

                progress.value += 1
                time.sleep(i / 10.0)

            if not os.path.exists(filename):
                raise RuntimeError("timeout exceeded")

            return Image(filename)
        finally:
            if progress is not None:
                progress.close()

            try:
                os.unlink(filename)
            except:
                pass
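
On the PyMOL side, the patched install would then need to read that variable instead of using the hardcoded value. A minimal sketch of such a lookup, assuming the variable name `PYMOL_XMLRPC_PORT` from the snippet above: `resolve_xmlrpc_port` is a hypothetical helper, not an existing PyMOL function, and where exactly it would be wired into pymol/rpc.py is left open.

```python
import os
from typing import Mapping, Optional

DEFAULT_PORT = 9123  # the default port seen in the logs in this thread


def resolve_xmlrpc_port(env: Optional[Mapping[str, str]] = None,
                        default: int = DEFAULT_PORT) -> int:
    """Return the port from PYMOL_XMLRPC_PORT, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("PYMOL_XMLRPC_PORT")
    if raw is None:
        return default
    port = int(raw)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port
```

Passing the environment in as an argument keeps the helper easy to test; in the patched module it would be called without arguments so that it reads `os.environ`.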

Alternatively, I also came across this which seems to be a similar RPC component that reads from an env var.

OliviaViessmann · 2023-02-14T17:27:40Z

Ahhhh, the lines you sent totally explain why the ports end up between 9123 and 9128.
OK, I might try out the local pymol patch. Or give up and manually throw this on a bunch of batch machines. Not sure which will end up being faster :)
Thanks for the workaround suggestion -- if I end up trying it, I will report back!

Want me to close this as "not planned"?

a-r-j · 2023-02-14T17:52:14Z

Keen to hear how it goes :)

a-r-j mentioned this issue Feb 9, 2023: make pymol host and port configurable via config #261 (#262, closed)

a-r-j added a commit that referenced this issue Feb 10, 2023: add config to session configuration #261 (5284cde)

a-r-j closed this as not planned Feb 14, 2023
