Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file based filter #99

Open
wants to merge 45 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
Show all changes
45 commits
Select commit Hold shift + click to select a range
c13cf11
file based filter
xinzhengzhang Dec 23, 2022
0ea3fd6
Unify --file parsing logic
cpsauer Dec 24, 2022
4244747
file based filter: deal with header lib
xinzhengzhang Mar 18, 2023
0befc09
End loop if header is found & Add max attempts limit
xinzhengzhang Mar 3, 2023
92e83c1
Improve my language for --file docstring
cpsauer Mar 18, 2023
bf0405e
rm dupe itertools import
cpsauer Mar 18, 2023
9d7f9ab
use variable above
xinzhengzhang Mar 27, 2023
9903410
check file if resolved in a smaller scope
xinzhengzhang Mar 27, 2023
d7e1e31
Improve --file editor wrapper message
cpsauer Apr 24, 2023
d6f60cb
Fix no output error message and spacing.
cpsauer Apr 24, 2023
2b69f05
Fix small old typo
cpsauer May 2, 2023
242d601
Prevent template substitution in comment
cpsauer May 2, 2023
e17428f
ws
cpsauer May 2, 2023
5be98b1
Further improve --file editor wrapper message
cpsauer May 2, 2023
2e980ab
Alphabetize threading import
cpsauer May 2, 2023
8d5f2fe
Document interaction of --file and exclude_* template arguments
cpsauer May 2, 2023
e288d49
Entangled cleanups around thread structure
cpsauer May 2, 2023
4a251e5
Reduce unnecessary work with in header finding with --file flag
cpsauer May 2, 2023
0d86038
improve comments in _get_command
cpsauer May 2, 2023
da95ea6
Mark problems, fix typos, fix depleted iterator bug
cpsauer May 2, 2023
2a10207
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 2, 2023
f4fa941
Fix aquery inputs() statement not working for headers
cpsauer May 3, 2023
2e10a53
Add a few more WIP notes
cpsauer May 3, 2023
0851bca
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 3, 2023
0f120b6
Fix typo
cpsauer May 3, 2023
9d23866
Add note about allpaths issue
cpsauer May 3, 2023
2d09c75
Refine and note discussion points in _filter_through_headers
cpsauer May 4, 2023
0722615
Improve TODOs
cpsauer May 4, 2023
14e707e
Enrich todos and notes after another read through of the bazel [ac]qu…
cpsauer May 4, 2023
bd7bde1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 4, 2023
89f583a
refresh.template.py: Estimate the number of actions that can be run b…
xinzhengzhang May 5, 2023
ef33509
refresh.template.py: quoting target_statement
xinzhengzhang May 5, 2023
f8b046e
refersh.template.py: running a preliminary query to get specified fi…
xinzhengzhang May 5, 2023
9ce1716
refresh.template.py: simplify/unify logic around target_statement_can…
xinzhengzhang May 5, 2023
40b9131
refresh.template.py: changed the _convert_compile_commands to return …
xinzhengzhang May 5, 2023
6a6777a
normalize file path if the --file is a subpath of the cwd
xinzhengzhang May 5, 2023
0453be4
refresh.template.py: for --file, don't continue traversing targets af…
xinzhengzhang May 5, 2023
97644fd
Revert "refresh.template.py: for --file, don't continue traversing ta…
xinzhengzhang May 6, 2023
e8a986b
Revert "normalize file path if the --file is a subpath of the cwd"
xinzhengzhang May 6, 2023
e67c0bd
Revert "refresh.template.py: changed the _convert_compile_commands to…
xinzhengzhang May 6, 2023
b785c3e
Revert "refresh.template.py: simplify/unify logic around target_state…
xinzhengzhang May 6, 2023
a152055
Revert "refersh.template.py: running a preliminary query to get spec…
xinzhengzhang May 6, 2023
1d8f62e
Revert "refresh.template.py: quoting target_statement"
xinzhengzhang May 6, 2023
33b2310
Revert "refresh.template.py: Estimate the number of actions that can …
xinzhengzhang May 6, 2023
84822ba
Add link to deduped aspects cquery/aquery aspects issue
cpsauer May 12, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
refresh.template.py: for --file, don't continue traversing targets af…
…ter the first command has been found
  • Loading branch information
xinzhengzhang committed May 5, 2023
commit 0453be435112dfec20ff549273ab69347c28df4b
178 changes: 92 additions & 86 deletions refresh.template.py
Original file line number Diff line number Diff line change
Expand Up @@ -942,95 +942,104 @@ def _get_compile_commands_for_aquery(aquery_target_statement: str, additional_aq
return _convert_compile_commands(parsed_aquery_output, focused_on_file)


def _get_commands(target: str, flags: str):
"""Return compile_commands.json entries for a given target and flags, gracefully tolerating errors."""
log_info(f">>> Analyzing commands used in {target}")

# Parse the --file= flag, if any, passing along all other arguments to aquery
additional_flags = shlex.split(flags) + [arg for arg in sys.argv[1:] if not arg.startswith('--file=')]
file_flags = [arg[len('--file='):] for arg in sys.argv[1:] if arg.startswith('--file=')]
if len(file_flags) > 1:
log_error(">>> At most one --file flag is supported.")
sys.exit(1)
if any(arg.startswith('--file') for arg in additional_flags):
log_error(">>> Only the --file=<file_target> form is supported.")
sys.exit(1)
def _get_commands(pairs: list):
"""Return compile_commands.json entries for a given list of target and flags, gracefully tolerating errors."""

all_compile_commands = []

for target, flags in pairs:
log_info(f">>> Analyzing commands used in {target}")
# Parse the --file= flag, if any, passing along all other arguments to aquery
additional_flags = shlex.split(flags) + [arg for arg in sys.argv[1:] if not arg.startswith('--file=')]
file_flags = [arg[len('--file='):] for arg in sys.argv[1:] if arg.startswith('--file=')]
if len(file_flags) > 1:
log_error(">>> At most one --file flag is supported.")
sys.exit(1)
if any(arg.startswith('--file') for arg in additional_flags):
log_error(">>> Only the --file=<file_target> form is supported.")
sys.exit(1)


# Screen the remaining flags for obvious issues to help people debug.

# Detect anything that looks like a build target in the flags, and issue a warning.
# Note that positional arguments after -- are all interpreted as target patterns.
# And that we have to look for targets. Checking for a - prefix is not enough. Consider the case of `-c opt`, leading to a false positive.
if ('--' in additional_flags
or any(re.match(r'-?(@|:|//)', f) for f in additional_flags)):
log_warning(""">>> The flags you passed seem to contain targets.
Try adding them as targets in your refresh_compile_commands rather than flags.
[Specifying targets at runtime isn't supported yet, and in a moment, Bazel will likely fail to parse without our help. If you need to be able to specify targets at runtime, and can't easily just add them to your refresh_compile_commands, please open an issue or file a PR. You may also want to refer to https://github.com/hedronvision/bazel-compile-commands-extractor/issues/62.]""")

# Quick (imperfect) effort at detecting flags in the targets.
# Can't detect flags starting with -, because they could be subtraction patterns.
if any(target.startswith('--') for target in shlex.split(target)):
log_warning(""">>> The target you specified seems to contain flags.
Try adding them as flags in your refresh_compile_commands rather than targets.
In a moment, Bazel will likely fail to parse.""")


# Then, actually query Bazel's compile actions for that configured target
target_statement_candidates = []
file_path = None

# Screen the remaining flags for obvious issues to help people debug.

# Detect anything that looks like a build target in the flags, and issue a warning.
# Note that positional arguments after -- are all interpreted as target patterns.
# And that we have to look for targets. Checking for a - prefix is not enough. Consider the case of `-c opt`, leading to a false positive.
if ('--' in additional_flags
or any(re.match(r'-?(@|:|//)', f) for f in additional_flags)):
log_warning(""">>> The flags you passed seem to contain targets.
Try adding them as targets in your refresh_compile_commands rather than flags.
[Specifying targets at runtime isn't supported yet, and in a moment, Bazel will likely fail to parse without our help. If you need to be able to specify targets at runtime, and can't easily just add them to your refresh_compile_commands, please open an issue or file a PR. You may also want to refer to https://github.com/hedronvision/bazel-compile-commands-extractor/issues/62.]""")

# Quick (imperfect) effort at detecting flags in the targets.
# Can't detect flags starting with -, because they could be subtraction patterns.
if any(target.startswith('--') for target in shlex.split(target)):
log_warning(""">>> The target you specified seems to contain flags.
Try adding them as flags in your refresh_compile_commands rather than targets.
In a moment, Bazel will likely fail to parse.""")


# Then, actually query Bazel's compile actions for that configured target
target_statement_candidates = []
file_path = None
compile_commands = []

if file_flags:
file_path = file_flags[0]
rel_path = os.path.relpath(file_path, os.getcwd())
if not rel_path.startswith(".."):
log_info(f">>> Detected file path {file_path} is relative path changed to {rel_path}")
file_path = rel_path

target_statement = f"deps('{target}')"
if file_path.endswith(_get_files.source_extensions):
target_statement_candidates.append(f"inputs('{re.escape(file_path)}', {target_statement})")
if file_flags:
file_path = file_flags[0]
rel_path = os.path.relpath(file_path, os.getcwd())
if not rel_path.startswith(".."):
log_info(f">>> Detected file path {file_path} is relative path changed to {rel_path}")
file_path = rel_path

target_statement = f"deps('{target}')"
if file_path.endswith(_get_files.source_extensions):
target_statement_candidates.append(f"inputs('{re.escape(file_path)}', {target_statement})")
else:
fname = os.path.basename(file_path)
label_candidates = subprocess.check_output(['bazel', 'query', f"filter('{fname}$', {target_statement})"], stderr = subprocess.PIPE, text = True).split()
# TODO compatible with windows file path
file_candidates = list(filter(lambda label: file_path in label.replace(':', '/'), label_candidates))
file_statement = '|'.join(file_candidates) if len(file_candidates) > 0 else fname

header_target_statement = f"let v = {target_statement} in attr(hdrs, '{file_statement}', $v) + attr(srcs, '{file_statement}', $v)" # Bazel does not list headers as direct inputs, but rather hides them behind "middlemen", necessitating a query like this.
target_statement_candidates.extend([
header_target_statement,
f"allpaths({target}, {header_target_statement})", # Ordering is ideal, breadth-first from the deepest dependency, despite the docs. TODO (1) There's a bazel bug that produces extra actions, not on the path but downstream, so we probably want to pass --noinclude_aspects per https://github.com/bazelbuild/bazel/issues/18289 to eliminate them (at the cost of some valid aspects). (2) We might want to benchmark with --infer_universe_scope (if supported) and --universe-scope=target with query allrdeps({header_target_statement}, <maybe some limited depth>) or rdeps, checking speed but also ordering (the docs indicate it is likely to be lost, which is a problem) and for inclusion of the header target. We'd guess it'll have the same aspects bug as allpaths. (3) We probably also also want to *just* run this query, not the whole list, since it captures the former and is therefore unlikely to add much latency, since a given header is probabably either used internally to the target (find on first match) for header-only (must traverse all paths in all targets until you get a match) for all top-level targets, and since we can separate out the last, see below.
target_statement,
])
else:
fname = os.path.basename(file_path)
label_candidates = subprocess.check_output(['bazel', 'query', f"filter('{fname}$', {target_statement})"], stderr = subprocess.PIPE, text = True).split()
# TODO compatible with windows file path
file_candidates = list(filter(lambda label: file_path in label.replace(':', '/'), label_candidates))
file_statement = '|'.join(file_candidates) if len(file_candidates) > 0 else fname

header_target_statement = f"let v = {target_statement} in attr(hdrs, '{file_statement}', $v) + attr(srcs, '{file_statement}', $v)" # Bazel does not list headers as direct inputs, but rather hides them behind "middlemen", necessitating a query like this.
target_statement_candidates.extend([
header_target_statement,
f"allpaths({target}, {header_target_statement})", # Ordering is ideal, breadth-first from the deepest dependency, despite the docs. TODO (1) There's a bazel bug that produces extra actions, not on the path but downstream, so we probably want to pass --noinclude_aspects per https://github.com/bazelbuild/bazel/issues/18289 to eliminate them (at the cost of some valid aspects). (2) We might want to benchmark with --infer_universe_scope (if supported) and --universe-scope=target with query allrdeps({header_target_statement}, <maybe some limited depth>) or rdeps, checking speed but also ordering (the docs indicate it is likely to be lost, which is a problem) and for inclusion of the header target. We'd guess it'll have the same aspects bug as allpaths. (3) We probably also also want to *just* run this query, not the whole list, since it captures the former and is therefore unlikely to add much latency, since a given header is probabably either used internally to the target (find on first match) for header-only (must traverse all paths in all targets until you get a match) for all top-level targets, and since we can separate out the last, see below.
target_statement,
])
else:
if {exclude_external_sources}:
# For efficiency, have bazel filter out external targets (and therefore actions) before they even get turned into actions or serialized and sent to us. Note: this is a different mechanism than is used for excluding just external headers.
target_statement_candidates.append(f"filter('^(//|@//)',{target_statement})")

found = False
for target_statement in target_statement_candidates:
commands = _get_compile_commands_for_aquery(target_statement, additional_flags, file_path)
compile_commands.extend(commands) # If we did the work to generate a command, we'll update it, whether it's for the requested file or not.
if {exclude_external_sources}:
# For efficiency, have bazel filter out external targets (and therefore actions) before they even get turned into actions or serialized and sent to us. Note: this is a different mechanism than is used for excluding just external headers.
target_statement_candidates.append(f"filter('^(//|@//)',{target_statement})")

found = False
compile_commands = []
for target_statement in target_statement_candidates:
commands = _get_compile_commands_for_aquery(target_statement, additional_flags, file_path)
compile_commands.extend(commands) # If we did the work to generate a command, we'll update it, whether it's for the requested file or not.
if file_flags:
if any(command.file.endswith(file_path) for command in commands):
found = True
break
log_info(f""">>> Couldn't quickly find a compile command for {file_path} in {target} under {target_statement}
Continuing gracefully...""")

all_compile_commands.extend(compile_commands)

if file_flags:
if any(command.file.endswith(file_path) for command in commands):
found = True
if found:
log_success(f">>> Finished extracting commands for {target} with --file {file_path}")
break
log_info(f""">>> Couldn't quickly find a compile command for {file_path} in {target} under {target_statement}
Continuing gracefully...""")

if file_flags and not found:
log_warning(f""">>> Couldn't quickly find a compile command for {file_path} in {target}
Continuing gracefully...""")
else:
log_warning(f""">>> Couldn't quickly find a compile command for {file_path} in {target}
Continuing gracefully...""")

if not compile_commands:
log_warning(f""">>> Bazel lists no applicable compile commands for {target}
If this is a header-only library, please instead specify a test or binary target that compiles it (search "header-only" in README.md).
Continuing gracefully...""")
if not compile_commands:
log_warning(f""">>> Bazel lists no applicable compile commands for {target}
If this is a header-only library, please instead specify a test or binary target that compiles it (search "header-only" in README.md).
Continuing gracefully...""")

log_success(f">>> Finished extracting commands for {target}")
return compile_commands
log_success(f">>> Finished extracting commands for {target}")
return all_compile_commands


def _ensure_external_workspaces_link_exists():
Expand Down Expand Up @@ -1156,16 +1165,13 @@ def _ensure_cwd_is_workspace_root():
_ensure_gitignore_entries_exist()
_ensure_external_workspaces_link_exists()

# TODO for --file, don't continue traversing targets after the first command has been found. Probably push this looping and template expansion inside of _get_commands().
target_flag_pairs = [
# Begin: template filled by Bazel
{target_flag_pairs}
# End: template filled by Bazel
]

compile_command_entries = []
for (target, flags) in target_flag_pairs:
compile_command_entries.extend(_get_commands(target, flags))
compile_command_entries = _get_commands(target_flag_pairs)

if not compile_command_entries:
log_error(">>> Not writing to compile_commands.json, since no commands were extracted.")
Expand Down