Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

non_ascii_to_octal: Return the input string if it only contains printable ASCII characters #3199

Merged
merged 1 commit into from
Apr 25, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented Apr 24, 2024

Description of proposed changes

The non_ascii_to_octal function was added in #2584 to convert any non-ASCII characters in a string to the corresponding octal codes. It's applied to all argument strings and texts passed in Figure.text. In most cases, we don't use non-ASCII characters, so applying this function is not necessary.

This PR improve the non_ascii_to_octal function so that it simply return the input string if the input string doesn't contain any non-ASCII characters.

Here is the benchmark.

Main branch

In [1]: from pygmt.helpers import non_ascii_to_octal

In [2]: argstr = "".join([chr(c) for c in range(32, 127)])

In [3]: argstr
Out[3]: ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'

In [4]: %timeit non_ascii_to_octal(argstr)
243 µs ± 1.08 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [5]: %timeit non_ascii_to_octal(argstr + "α")
244 µs ± 1.03 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

This PR

In [1]: from pygmt.helpers import non_ascii_to_octal

In [2]: argstr = "".join([chr(c) for c in range(32, 127)])

In [3]: argstr
Out[3]: ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~'

In [4]: %timeit non_ascii_to_octal(argstr)
5.82 µs ± 8.87 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [5]: %timeit non_ascii_to_octal(argstr + "α")
240 µs ± 233 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

For strings that don't contain any non-ASCII characters, this PR is much faster (5.8 µs vs 243 µs).

@seisman seisman added the enhancement Improving an existing feature label Apr 24, 2024
@seisman seisman added this to the 0.12.0 milestone Apr 24, 2024
@seisman seisman added the run/benchmark Trigger the benchmark workflow in PRs label Apr 24, 2024
Copy link

codspeed-hq bot commented Apr 24, 2024

CodSpeed Performance Report

Merging #3199 will improve performances by 82.47%

Comparing non_ascii_to_octal/faster (1eca4e4) with main (99bc5d3)

Summary

⚡ 43 improvements
✅ 56 untouched benchmarks

Benchmarks breakdown

Benchmark main non_ascii_to_octal/faster Change
test_basemap 91.5 ms 72.3 ms +26.65%
test_binstats_no_outgrid 166.2 ms 128 ms +29.8%
test_colorbar 88.1 ms 74.7 ms +17.99%
test_config_format_date_map 266.4 ms 239.2 ms +11.35%
test_dimfilter_no_outgrid 79.8 ms 46.3 ms +72.14%
test_figure_repr 251 ms 176.9 ms +41.91%
test_grd2cpt 237.2 ms 210.6 ms +12.67%
test_grdclip_no_outgrid 72.5 ms 45.5 ms +59.27%
test_grdcontour_labels 133.1 ms 92.5 ms +43.86%
test_grdcut_dataarray_in_dataarray_out 94.6 ms 74 ms +27.85%
test_grdfill_dataarray_out 59.3 ms 45.8 ms +29.41%
test_grdfilter_dataarray_in_dataarray_out 79.7 ms 46.2 ms +72.63%
test_grdgradient_no_outgrid 65.9 ms 45.8 ms +44.1%
test_compute_bins_no_outfile 63.9 ms 43.7 ms +46.35%
test_equalize_grid_no_outgrid 65.6 ms 45.4 ms +44.31%
test_grdinfo 41.7 ms 28.3 ms +47.51%
test_grdproject_no_outgrid[+proj=merc +ellps=WGS84 +units=m +width=10] 82.5 ms 55.7 ms +48.21%
test_grdproject_no_outgrid[EPSG:3395 +width=10] 75.6 ms 48.8 ms +54.87%
test_grdproject_no_outgrid[M10c] 72.6 ms 45.8 ms +58.42%
test_grdsample_dataarray_out 78.6 ms 54.6 ms +44%
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

@seisman seisman added needs review This PR has higher priority and needs review. and removed run/benchmark Trigger the benchmark workflow in PRs labels Apr 24, 2024
@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Apr 24, 2024
@seisman seisman merged commit bc673bc into main Apr 25, 2024
26 of 28 checks passed
@seisman seisman deleted the non_ascii_to_octal/faster branch April 25, 2024 00:21
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label Apr 25, 2024
seisman added a commit to seisman/pygmt that referenced this pull request Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants