slow in my environment #15

TinoPlayStuff · 2022-06-29T10:40:58Z

hi,

I tried this API. It's very convenient but relatively slow.
So I modified it to use a "session":

Add one line under import requests:

import requests
re_session = requests.Session()

Then modify another line:

response: requests.models.Response = getattr(requests, method)(
=> response: requests.models.Response = getattr(re_session, method)(

It's much faster now, but still slower than using subprocess.Popen to call curl for the same thing.

Is this a limitation of the Python requests module, or did I do something wrong?
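For reference, the change described above can be sketched in isolation as follows. This is a minimal sketch, not joppy's actual code: `call_api` is a hypothetical wrapper name, and only the `getattr(re_session, method)` dispatch comes from this post.

```python
import requests

# Module-level session, mirroring the one-line addition described above.
re_session = requests.Session()


def call_api(method: str, url: str, **kwargs) -> requests.models.Response:
    """Dispatch an HTTP call through the shared session (hypothetical helper)."""
    # getattr(re_session, "get") resolves to Session.get, which can reuse
    # pooled connections. getattr(requests, "get") creates a fresh Session,
    # and therefore a fresh connection, on every call.
    response: requests.models.Response = getattr(re_session, method)(url, **kwargs)
    return response
```

The speedup, where there is one, comes from connection reuse, so it should matter most when many small requests go to the same host.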


marph91 · 2022-06-29T11:23:56Z

Hi @TinoPlayStuff,

thanks for the feedback. I haven't thought about speed so far. In general I would expect requests to be almost as fast as curl. It seems to be an issue/limitation of joppy.

Could you add a reproducer for your issue, i.e. some sample requests and how fast they are on your machine with curl and joppy? I couldn't observe a big difference when using sessions in the testsuite, but this could be due to the testsuite structure.
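A reproducer along these lines could be as small as the following sketch. The helper names `time_calls` and `compare` are made up for illustration, and the default of 40 repeats is arbitrary.

```python
import time
from typing import Callable, Dict

import requests


def time_calls(call: Callable[[], object], repeats: int) -> float:
    """Return the wall-clock seconds needed to run `call` `repeats` times."""
    start = time.perf_counter()
    for _ in range(repeats):
        call()
    return time.perf_counter() - start


def compare(url: str, repeats: int = 40) -> Dict[str, float]:
    """Time the same GET request with and without a shared session."""
    with requests.Session() as session:
        with_session = time_calls(lambda: session.get(url), repeats)
    without_session = time_calls(lambda: requests.get(url), repeats)
    return {"session": with_session, "no_session": without_session}
```

Calling something like `compare("http://localhost:41184/ping")` would exercise it, assuming a Joplin API server is listening at that address. The same `time_calls` helper can wrap a curl invocation for a like-for-like comparison.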

TinoPlayStuff · 2022-06-29T17:01:02Z

run_py.zip contains two Python scripts, run.py and run_requests.py.
They do basically the same thing. The main difference is that run.py is entirely based on curl, while in run_requests.py much of the HTTP handling has been replaced with the joppy API.

To run them, put your Joplin token in run.py.tok
and edit the settings in run.py or run_requests.py:

# <- setting
TOK_FILE = "run.py.tok"  # file contains joplin token
PUBTAG = "published"  # note with this tag will be extracted
TAGHIDE = {"published", "publishedx"} # test tag
N_FDR = "./_posts"  # where to put the exported posts
R_FDR = "./_resources"  # where to put resource files (.jpg, .png, ...)
URL = "http://localhost:41184/"
# -> setting

With these settings, all notes with the tag "published" (defined as PUBTAG) and their related resource files will be put into ./_posts and ./_resources (defined as N_FDR and R_FDR respectively).
The scripts report their start and end times.
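How the token file and URL settings might be combined is sketched below. `load_token` and `build_url` are hypothetical helpers, not functions from the attached scripts. The sketch assumes the token is passed as a `token` query parameter, which is how the Joplin data API expects it.

```python
from pathlib import Path
from urllib.parse import urlencode, urljoin


def load_token(tok_file: str) -> str:
    """Read the API token from a file, dropping surrounding whitespace."""
    return Path(tok_file).read_text(encoding="utf-8").strip()


def build_url(base_url: str, path: str, token: str, **query) -> str:
    """Join the base URL and path, then append the token and extra query parameters."""
    params = {"token": token, **query}
    return urljoin(base_url, path) + "?" + urlencode(params)
```

For example, `build_url(URL, "notes", load_token(TOK_FILE))` would produce the address for listing notes with the settings above.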

The joppy in this zip is modified to use requests.Session.
In my environment, with a test set of forty notes, the curl version is 20% faster.
With the original joppy it took too long and I didn't wait for it to finish.

run_py.zip

marph91 · 2022-06-30T06:28:15Z

Unfortunately I couldn't run your script out of the box. It's recommended to pass the Popen arguments as a sequence instead of a string (https://docs.python.org/3/library/subprocess.html#subprocess.Popen) to avoid OS-dependent problems.
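As an illustration of the sequence form, a minimal sketch follows. The helper names are hypothetical, and the curl flags are only an example, not the ones used in run.py.

```python
import subprocess
from typing import List


def build_curl_command(url: str) -> List[str]:
    """Build the curl invocation as an argument list rather than one shell string."""
    # -s silences curl's progress meter; each argument is its own list element,
    # so no shell quoting is involved.
    return ["curl", "-s", url]


def run_command(args: List[str]) -> str:
    """Run a command given as a sequence and return its standard output as text."""
    with subprocess.Popen(args, stdout=subprocess.PIPE, text=True) as process:
        stdout, _ = process.communicate()
    return stdout
```

With a list, the arguments reach the program unchanged on both Windows and POSIX systems, whereas a single string is parsed differently depending on the platform.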

However, I did a "dry" read of the script. Notes:

- At first I thought it was about pagination. But you resolved the pagination manually most of the time. Since your test data is only 40 notes, it shouldn't make a difference anyway. Are these plain notes, or do they have tags and resources attached?
- It seems like a 20% speed difference between curl and requests is in the expected range: https://stackoverflow.com/a/32899936/7410886
- It seems reasonable that using a session is faster. However, I can't see a significant speedup in my local tests. Will try further.
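For context, resolving pagination manually could look roughly like the sketch below. It assumes each page is a dict with an `items` list and a `has_more` flag and that pages are numbered from 1, as in the Joplin data API. `fetch_page` is a hypothetical callable standing in for the actual HTTP request.

```python
from typing import Callable, Dict, Iterator


def iter_all_items(fetch_page: Callable[[int], Dict]) -> Iterator[dict]:
    """Yield items from consecutive pages until the server reports no more."""
    page = 1
    while True:
        response = fetch_page(page)
        yield from response["items"]
        # Stop when the flag is false or missing.
        if not response.get("has_more", False):
            break
        page += 1
```

Each page costs one HTTP round trip, which is why pagination can dominate the run time for large collections but should be negligible for 40 notes.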

See: - #15 - https://requests.readthedocs.io/en/latest/user/advanced/#session-objects

TinoPlayStuff · 2022-06-30T10:05:17Z

Thanks for your explanation. Today I made some modifications and ... ...
now run_requests.py (with joppy using a session) is a little faster than run.py.

The main modification is that I now use shutil.copy2() to copy resource files directly from the Joplin profile folder. So the original poor performance may have come from my using your get_resource_file function in an improper way.
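The direct-copy approach might look like the sketch below. It assumes the profile folder keeps resource files in a `resources` subfolder; `copy_resource` and its parameters are hypothetical names, not code from the attached scripts.

```python
import shutil
from pathlib import Path


def copy_resource(profile_dir: str, filename: str, target_dir: str) -> Path:
    """Copy one resource file from the profile folder into the target folder."""
    # Assumed layout: <profile_dir>/resources/<filename>
    source = Path(profile_dir) / "resources" / filename
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # copy2 copies the file content and also tries to preserve metadata
    # such as the modification time.
    return Path(shutil.copy2(source, target / filename))
```

This bypasses HTTP entirely, so it only works when the script runs on the same machine as the Joplin profile.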

FYI, I ran these scripts on a Windows 11 machine with Python 3.7. Without a session, it takes more than one second to process one note and its related resource files.

Thanks again for your convenient API.

marph91 added a commit that referenced this issue Jun 30, 2022

use a session for speedup

4b1dc34

See: - #15 - https://requests.readthedocs.io/en/latest/user/advanced/#session-objects

marph91 mentioned this issue Jun 30, 2022

use a session for speedup #16

Merged

TinoPlayStuff closed this as completed Jun 30, 2022
