-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sourcery refactored main branch #1
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to GitHub API limits, only the first 60 comments can be shown.
@@ -27,7 +27,6 @@ | |||
def encode_question(question, api_name): | |||
"""Encode multiple prompt instructions into a single string.""" | |||
|
|||
prompts = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function encode_question
refactored with the following changes:
- Merge consecutive list appends into a single extend (
merge-list-appends-into-extend
) - Move assignment closer to its usage within a block (
move-assign-in-block
) - Merge extend into list declaration (
merge-list-extend
) - Unwrap a constant iterable constructor (
unwrap-iterable-construction
) - Inline variable that is immediately returned (
inline-immediately-returned-variable
)
for idx, line in enumerate(f): | ||
for line in f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 133-183
refactored with the following changes:
- Remove unnecessary calls to
enumerate
when the index is not used [×2] (remove-unused-enumerate
) - Replace f-string with no interpolated values with string (
remove-redundant-fstring
)
@@ -26,7 +26,6 @@ | |||
def encode_question(question, api_name): | |||
"""Encode multiple prompt instructions into a single string.""" | |||
|
|||
prompts = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function encode_question
refactored with the following changes:
- Merge consecutive list appends into a single extend (
merge-list-appends-into-extend
) - Move assignment closer to its usage within a block (
move-assign-in-block
) - Merge extend into list declaration (
merge-list-extend
) - Unwrap a constant iterable constructor (
unwrap-iterable-construction
) - Inline variable that is immediately returned (
inline-immediately-returned-variable
)
result = get_response((question, question_id, api_name, model, retrieved_doc), api_key) | ||
return result | ||
return get_response( | ||
(question, question_id, api_name, model, retrieved_doc), api_key | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function process_entry
refactored with the following changes:
- Inline variable that is immediately returned (
inline-immediately-returned-variable
)
if os.path.exists(args.retriever + '_dataset_index.json'): | ||
if os.path.exists(f'{args.retriever}_dataset_index.json'): | ||
print('data index already saved') | ||
os.environ["OPENAI_API_KEY"] = args.api_key | ||
index = retriever.load_from_disk(args.retriever + '_dataset_index.json') | ||
index = retriever.load_from_disk(f'{args.retriever}_dataset_index.json') | ||
else: | ||
print('data index being created') | ||
os.environ["OPENAI_API_KEY"] = args.api_key | ||
documents = JSONLReader().load_data(args.api_dataset) | ||
index = retriever.from_documents(documents) | ||
retriever.save_to_disk(index, args.retriever + '_dataset_index.json') | ||
retriever.save_to_disk(index, f'{args.retriever}_dataset_index.json') | ||
elif args.retriever == "bm25": | ||
from rank_bm25 import BM25Okapi | ||
corpus = [] | ||
with open(args.api_dataset, 'r') as f: | ||
for line in f: | ||
corpus.append(json.loads(line)) | ||
corpus.extend(json.loads(line) for line in f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 120-154
refactored with the following changes:
- Use f-string instead of string concatenation [×3] (
use-fstring-for-concatenation
) - Replace a for append loop with list extend (
for-append-to-extend
) - Remove unnecessary calls to
enumerate
when the index is not used [×2] (remove-unused-enumerate
)
for line in f: | ||
api_database.append(json.loads(line)) | ||
|
||
api_database.extend(json.loads(line) for line in f) | ||
# Read the question answer pair datasest | ||
qa_pairs = [] | ||
with open(args.apibench, 'r') as f: | ||
for line in f: | ||
qa_pairs.append(json.loads(line)["api_data"]) | ||
|
||
qa_pairs.extend(json.loads(line)["api_data"] for line in f) | ||
# Read the language model response datasest | ||
llm_responses = [] | ||
with open(args.llm_responses, 'r') as f: | ||
for line in f: | ||
llm_responses.append(json.loads(line)) | ||
|
||
llm_responses.extend(json.loads(line) for line in f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function parse_dataset
refactored with the following changes:
- Replace a for append loop with list extend [×3] (
for-append-to-extend
)
if ":" not in output: | ||
start = 0 | ||
else: | ||
start = output.index(":") | ||
if ")" not in output: | ||
end = -2 | ||
else: | ||
end = output.rindex(")") | ||
start = 0 if ":" not in output else output.index(":") | ||
end = -2 if ")" not in output else output.rindex(")") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function main
refactored with the following changes:
- Replace if statement with if expression [×2] (
assign-if-exp
) - Remove redundant pass statement (
remove-redundant-pass
)
node_stack = [] | ||
sub_tree_sexp_list = [] | ||
depth = 1 | ||
text = root_node.text | ||
node_stack.append([root_node, depth]) | ||
while len(node_stack) != 0: | ||
node_stack = [[root_node, depth]] | ||
while node_stack: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function get_all_sub_trees
refactored with the following changes:
- Move assignment closer to its usage within a block (
move-assign-in-block
) - Simplify sequence length comparison (
simplify-len-comparison
) - Merge append into list declaration (
merge-list-append
)
candidate_tree = parser.parse(bytes(candidate, "utf8")).root_node | ||
return candidate_tree | ||
return parser.parse(bytes(candidate, "utf8")).root_node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function ast_parse
refactored with the following changes:
- Inline variable that is immediately returned (
inline-immediately-returned-variable
)
args_list = [] | ||
for child in node.children[0].children[0].children[1].children: | ||
if "repo_or_dir" in child.text.decode() or "model" in child.text.decode(): | ||
args_list.append(child.children[2].text) | ||
return args_list | ||
return [ | ||
child.children[2].text | ||
for child in node.children[0].children[0].children[1].children | ||
if "repo_or_dir" in child.text.decode() | ||
or "model" in child.text.decode() | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function get_args
refactored with the following changes:
- Convert for loop into list comprehension (
list-comprehension
) - Inline variable that is immediately returned (
inline-immediately-returned-variable
)
ast_match = True | ||
for arg in args_list: | ||
if arg.decode().lstrip("'").rstrip("'") not in candidate_tree.text.decode(): | ||
ast_match = False | ||
break | ||
ast_match = all( | ||
arg.decode().lstrip("'").rstrip("'") | ||
in candidate_tree.text.decode() | ||
for arg in args_list | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function ast_check
refactored with the following changes:
- Use any() instead of for loop (
use-any
) - Invert any/all to simplify comparisons (
invert-any-all
)
for line in f: | ||
api_database.append(json.loads(line)) | ||
|
||
api_database.extend(json.loads(line) for line in f) | ||
# Read the question answer pair dataset | ||
qa_pairs = [] | ||
with open(args.apibench, "r") as f: | ||
for line in f: | ||
qa_pairs.append(json.loads(line)["api_data"]) | ||
|
||
qa_pairs.extend(json.loads(line)["api_data"] for line in f) | ||
# Read the language model response dataset | ||
llm_responses = [] | ||
with open(args.llm_responses, "r") as f: | ||
for line in f: | ||
llm_responses.append(json.loads(line)) | ||
|
||
llm_responses.extend(json.loads(line) for line in f) | ||
# Parse all APIs to AST trees | ||
ast_database = [] | ||
with concurrent.futures.ThreadPoolExecutor() as executor: | ||
ast_trees = executor.map(ast_parse, (data["api_call"] for data in api_database)) | ||
for ast_tree in ast_trees: | ||
ast_database.append(ast_tree) | ||
|
||
ast_database.extend(iter(ast_trees)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function parse_dataset
refactored with the following changes:
- Replace a for append loop with list extend [×4] (
for-append-to-extend
) - Simplify generator expression (
simplify-generator
)
else: | ||
output = output[1].split("api_provider")[0] | ||
if ":" not in output: | ||
start = 0 | ||
else: | ||
start = output.index(":") | ||
if ")" not in output: | ||
end = -2 | ||
else: | ||
end = output.rindex(")") | ||
api_call = output[start + 2 : end + 1] | ||
output = output[1].split("api_provider")[0] | ||
start = 0 if ":" not in output else output.index(":") | ||
end = -2 if ")" not in output else output.rindex(")") | ||
api_call = output[start + 2 : end + 1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function process_response
refactored with the following changes:
- Remove unnecessary else after guard condition (
remove-unnecessary-else
) - Replace if statement with if expression [×2] (
assign-if-exp
)
closest_ref_len = min( | ||
ref_lens, key=lambda ref_len: (abs(ref_len - hyp_len), ref_len) | ||
) | ||
return closest_ref_len | ||
return min(ref_lens, key=lambda ref_len: (abs(ref_len - hyp_len), ref_len)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function closest_ref_length
refactored with the following changes:
- Inline variable that is immediately returned (
inline-immediately-returned-variable
)
_msg = str( | ||
"\nThe hypothesis contains 0 counts of {}-gram overlaps.\n" | ||
"Therefore the BLEU score evaluates to 0, independently of\n" | ||
"how many N-gram overlaps of lower order it contains.\n" | ||
"Consider using lower n-gram order or use " | ||
"SmoothingFunction()" | ||
).format(i + 1) | ||
_msg = f"\nThe hypothesis contains 0 counts of {i + 1}-gram overlaps.\nTherefore the BLEU score evaluates to 0, independently of\nhow many N-gram overlaps of lower order it contains.\nConsider using lower n-gram order or use SmoothingFunction()" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function SmoothingFunction.method0
refactored with the following changes:
- Remove unnecessary casts to int, str, float or bool (
remove-unnecessary-cast
) - Replace call to format with f-string (
use-fstring-for-formatting
)
m = {} | ||
# Requires an precision value for an addition ngram order. | ||
p_n_plus1 = p_n + [modified_precision(references, hypothesis, 5)] | ||
m[-1] = p_n[0] + 1 | ||
m = {-1: p_n[0] + 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function SmoothingFunction.method5
refactored with the following changes:
- Move assignment closer to its usage within a block (
move-assign-in-block
) - Merge dictionary assignment with declaration (
merge-dict-assign
)
if i in [0, 1]: # Skips the first 2 orders of ngrams. | ||
if i in [0, 1]: | ||
continue | ||
else: | ||
pi0 = 0 if p_n[i - 2] == 0 else p_n[i - 1] ** 2 / p_n[i - 2] | ||
# No. of ngrams in translation that matches the reference. | ||
m = p_i.numerator | ||
# No. of ngrams in translation. | ||
l = sum(1 for _ in ngrams(hypothesis, i + 1)) | ||
# Calculates the interpolated precision. | ||
p_n[i] = (m + self.alpha * pi0) / (l + self.alpha) | ||
pi0 = 0 if p_n[i - 2] == 0 else p_n[i - 1] ** 2 / p_n[i - 2] | ||
# No. of ngrams in translation that matches the reference. | ||
m = p_i.numerator | ||
# No. of ngrams in translation. | ||
l = sum(1 for _ in ngrams(hypothesis, i + 1)) | ||
# Calculates the interpolated precision. | ||
p_n[i] = (m + self.alpha * pi0) / (l + self.alpha) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function SmoothingFunction.method6
refactored with the following changes:
- Remove unnecessary else after guard condition (
remove-unnecessary-else
)
This removes the following comments ( why? ):
# Skips the first 2 orders of ngrams.
do_first_statement=['for_in_clause'] | ||
def_statement=['default_parameter'] | ||
states=states.copy() | ||
states=states.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function DFG_python
refactored with the following changes:
- Move assignments closer to their usage (
move-assign
) - Hoist repeated code outside conditional statement (
hoist-statement-from-if
) - Simplify sequence length comparison [×4] (
simplify-len-comparison
) - Replace unused for index with underscore [×2] (
for-index-underscore
)
@@ -199,7 +198,6 @@ def DFG_java(root_node,index_to_code,states): | |||
for_statement=['for_statement'] | |||
enhanced_for_statement=['enhanced_for_statement'] | |||
while_statement=['while_statement'] | |||
do_first_statement=[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function DFG_java
refactored with the following changes:
- Move assignments closer to their usage (
move-assign
) - Hoist repeated code outside conditional statement (
hoist-statement-from-if
) - Replace unused for index with underscore [×2] (
for-index-underscore
)
if s.startswith('/'): | ||
return " " # note: a space and not an empty string | ||
else: | ||
return s | ||
return " " if s.startswith('/') else s | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function remove_comments_and_docstrings
refactored with the following changes:
- Replace if statement with if expression (
assign-if-exp
)
This removes the following comments ( why? ):
# note: a space and not an empty string
else: | ||
code_tokens=[] | ||
for child in root_node.children: | ||
code_tokens+=tree_to_token_index(child) | ||
return code_tokens | ||
code_tokens=[] | ||
for child in root_node.children: | ||
code_tokens+=tree_to_token_index(child) | ||
return code_tokens |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function tree_to_token_index
refactored with the following changes:
- Remove unnecessary else after guard condition (
remove-unnecessary-else
)
x = 1; pass; del x | ||
x = 1 | ||
del x | ||
def foo(): | ||
# verify statements that end with semi-colons | ||
x = 1; pass; del x; | ||
x = 1 | ||
del x; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testSimpleStmt
refactored with the following changes:
- Remove redundant pass statement [×2] (
remove-redundant-pass
)
@@ -394,7 +397,6 @@ def testContinueStmt(self): | |||
msg = "ok" | |||
try: | |||
continue | |||
msg = "continue failed to continue inside try" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testContinueStmt
refactored with the following changes:
- Remove unreachable code (
remove-unreachable-code
)
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite] | ||
if 1: pass | ||
if 1: pass | ||
else: pass | ||
if 0: pass | ||
elif 0: pass | ||
if 0: pass | ||
elif 0: pass | ||
elif 0: pass | ||
elif 0: pass | ||
else: pass | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testIf
refactored with the following changes:
- Remove redundant conditional [×8] (
remove-redundant-if
) - Remove redundant pass statement (
remove-redundant-pass
)
This removes the following comments ( why? ):
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
else: pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testWhile
refactored with the following changes:
- Remove redundant pass statement (
remove-redundant-pass
)
x = ''; y = ""; self.assert_(len(x) == 0 and x == y) | ||
x = '\''; y = "'"; self.assert_(len(x) == 1 and x == y and ord(x) == 39) | ||
x = '"'; y = "\""; self.assert_(len(x) == 1 and x == y and ord(x) == 34) | ||
x = '' | ||
y = "" | ||
self.assert_(not x and x == y) | ||
x = '\'' | ||
y = "'" | ||
self.assert_(len(x) == 1 and x == y and ord(x) == 39) | ||
x = '"' | ||
y = "\"" | ||
self.assert_(len(x) == 1 and x == y and ord(x) == 34) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testStringLiterals
refactored with the following changes:
- Simplify comparison to string length (
simplify-str-len-comparison
) - Replaces an empty collection equality with a boolean operation (
simplify-empty-collection-comparison
)
x = 1; pass; del x | ||
x = 1 | ||
del x | ||
def foo(): | ||
# verify statements that end with semi-colons | ||
x = 1; pass; del x; | ||
x = 1 | ||
del x; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testSimpleStmt
refactored with the following changes:
- Remove redundant pass statement [×2] (
remove-redundant-pass
)
@@ -395,7 +398,6 @@ def testContinueStmt(self): | |||
msg = "ok" | |||
try: | |||
continue | |||
msg = "continue failed to continue inside try" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testContinueStmt
refactored with the following changes:
- Remove unreachable code (
remove-unreachable-code
)
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite] | ||
if 1: pass | ||
if 1: pass | ||
else: pass | ||
if 0: pass | ||
elif 0: pass | ||
if 0: pass | ||
elif 0: pass | ||
elif 0: pass | ||
elif 0: pass | ||
else: pass | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testIf
refactored with the following changes:
- Remove redundant conditional [×8] (
remove-redundant-if
) - Remove redundant pass statement (
remove-redundant-pass
)
This removes the following comments ( why? ):
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
else: pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function GrammarTests.testWhile
refactored with the following changes:
- Remove redundant pass statement (
remove-redundant-pass
)
Hello @sourcery-ai[bot]! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:
|
Branch
main
refactored by Sourcery.If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.
See our documentation here.
Run Sourcery locally
Reduce the feedback loop during development by using the Sourcery editor plugin:
Review changes via command line
To manually merge these changes, make sure you're on the
main
branch, then run:Help us improve this pull request!