-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UP031 may be undesirable in raw strings and regular expressions #14555
Comments
Thanks for bringing this up. This is an interesting case. I agree that the need for escaping the
To me it's not clear how escaping I'm interested in hearing more opinions. |
I'm not convinced the f-string is better than printf-style formatting in this particular case, but there is also the option of using template strings. The example snippet could be rewritten as import re
from string import Template
KEYWORD_LENGTH = 8
re.compile(Template("^[A-Z0-9_-]{0,$N}$$").substitute(N=KEYWORD_LENGTH)) That being said, template strings are more convenient if the string contains many literal |
No, the latter case doesn't require escaping at all, per the original description where I'd argue that users of raw strings are making a pretty clear statement that they would prefer to avoid having to use escapes. Although perhaps I'd also argue that f-strings should only be used when they subjectively look prettier than the alternative (and that all use cases for PEP 701 look bad) and that therefore UP031 is too unreliable due to the inability to control the heuristics used and should simply be disabled in project configs. So maybe it doesn't matter. |
My point here was mainly that I don't see a difference between converting a normal string containing curly braces to an f-string to converting a raw string containing curly braces to an f-string. "a { len: %d }" % LENGTH
r"a { len: %d }" % LENGTH For both cases, it's necessary to escape "a { len: {{LENGTH}} }"
r"a { len: {{LENGTH}} }"
Yeah, that's a hard metric to get right for all users, but I'm all ears if someone has suggestions. I'd suggest that users feel free to use noqa comments if they feel that it arguably looks worse in the few one-off cases where this happens. |
IME, complex regex patterns with literal curly braces are where UP031 can cause the most trouble, e.g. this code snippet... fields_m = re.finditer(
r'''(?x)
(?P<key>%s)\s*:\s*function\s*\((?P<args>(?:%s|,)*)\){(?P<code>[^}]+)}
''' % (_FUNC_NAME_RE, _NAME_RE),
fields) ...would become... fields_m = re.finditer(
rf'''(?x)
(?P<key>{_FUNC_NAME_RE})\s*:\s*function\s*\((?P<args>(?:{_NAME_RE}|,)*)\){{(?P<code>[^}}]+)}}
''',
fields) ...making it significantly harder to read and maintain (IMO) |
Here's a real life example taken from astropy
This code isn't flagged for rule UP031 under ruff 0.7
but running the latest ruff 0.8.0:
On the other hand, this is the form of that string that I need to write in order to satisfy the newly expended rule UP031
rf"^[A-Z0-9_-]{{0,{KEYWORD_LENGTH}}}$"
note that I needed to escape raw
{
and}
to{{
and}}
respectively. I don't think this makes the code more readable, and may actually be counterproductive. Admittedly this is a small example were it doesn't hurt too much but there could legitimately be much more involved cases in the wild.Would it be possible to change this rule so that it ignores raw strings altogether ?
The text was updated successfully, but these errors were encountered: