Sourcery refactored main branch #1

Open
wants to merge 1 commit into
base: main

@sourcery-ai sourcery-ai bot commented Nov 10, 2023

Branch main refactored by Sourcery.

If you're happy with these changes, merge this Pull Request using the Squash and merge strategy.

See our documentation here.

Run Sourcery locally

Reduce the feedback loop during development by using the Sourcery editor plugin:

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

git fetch origin sourcery/main
git merge --ff-only FETCH_HEAD
git reset HEAD^

Help us improve this pull request!

@sourcery-ai sourcery-ai bot left a comment

Due to GitHub API limits, only the first 60 comments can be shown.

@@ -27,7 +27,6 @@
def encode_question(question, api_name):
"""Encode multiple prompt instructions into a single string."""

prompts = []

Function encode_question refactored with the following changes:

Comment on lines -133 to +136
-for idx, line in enumerate(f):
+for line in f:

Lines 133-183 refactored with the following changes:

@@ -26,7 +26,6 @@
def encode_question(question, api_name):
"""Encode multiple prompt instructions into a single string."""

prompts = []

Function encode_question refactored with the following changes:

Comment on lines -91 to +96
-result = get_response((question, question_id, api_name, model, retrieved_doc), api_key)
-return result
+return get_response(
+    (question, question_id, api_name, model, retrieved_doc), api_key
+)

Function process_entry refactored with the following changes:

Comment on lines -120 to +138
-    if os.path.exists(args.retriever + '_dataset_index.json'):
+    if os.path.exists(f'{args.retriever}_dataset_index.json'):
         print('data index already saved')
         os.environ["OPENAI_API_KEY"] = args.api_key
-        index = retriever.load_from_disk(args.retriever + '_dataset_index.json')
+        index = retriever.load_from_disk(f'{args.retriever}_dataset_index.json')
     else:
         print('data index being created')
         os.environ["OPENAI_API_KEY"] = args.api_key
         documents = JSONLReader().load_data(args.api_dataset)
         index = retriever.from_documents(documents)
-        retriever.save_to_disk(index, args.retriever + '_dataset_index.json')
+        retriever.save_to_disk(index, f'{args.retriever}_dataset_index.json')
 elif args.retriever == "bm25":
     from rank_bm25 import BM25Okapi
     corpus = []
     with open(args.api_dataset, 'r') as f:
-        for line in f:
-            corpus.append(json.loads(line))
+        corpus.extend(json.loads(line) for line in f)

Lines 120-154 refactored with the following changes:
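
Both rewrites in the hunk above look behaviour-preserving. A minimal sketch, assuming a small in-memory JSONL sample in place of `args.api_dataset` and a made-up `retriever_name` in place of `args.retriever`:

```python
import io
import json

# Hypothetical JSONL content standing in for the file at args.api_dataset.
jsonl = '{"api": "a"}\n{"api": "b"}\n'

# Original style: append inside an explicit loop.
corpus_loop = []
for line in io.StringIO(jsonl):
    corpus_loop.append(json.loads(line))

# Refactored style: extend the list from a generator expression.
corpus_extend = []
corpus_extend.extend(json.loads(line) for line in io.StringIO(jsonl))

# Hypothetical retriever name; concatenation and the f-string build the same path.
retriever_name = "example"
path_concat = retriever_name + '_dataset_index.json'
path_fstring = f'{retriever_name}_dataset_index.json'
```

Because `corpus` starts out empty in the diff, a plain list comprehension (`corpus = [json.loads(line) for line in f]`) would probably read even more directly, though that goes beyond what this PR changes.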

Comment on lines -87 to +93
-    for line in f:
-        api_database.append(json.loads(line))
-
+    api_database.extend(json.loads(line) for line in f)
 # Read the question answer pair datasest
 qa_pairs = []
 with open(args.apibench, 'r') as f:
-    for line in f:
-        qa_pairs.append(json.loads(line)["api_data"])
-
+    qa_pairs.extend(json.loads(line)["api_data"] for line in f)
 # Read the language model response datasest
 llm_responses = []
 with open(args.llm_responses, 'r') as f:
-    for line in f:
-        llm_responses.append(json.loads(line))
-
+    llm_responses.extend(json.loads(line) for line in f)

Function parse_dataset refactored with the following changes:

Comment on lines -133 to +126
-if ":" not in output:
-    start = 0
-else:
-    start = output.index(":")
-if ")" not in output:
-    end = -2
-else:
-    end = output.rindex(")")
+start = 0 if ":" not in output else output.index(":")
+end = -2 if ")" not in output else output.rindex(")")

Function main refactored with the following changes:

Comment on lines -23 to +27
-node_stack = []
 sub_tree_sexp_list = []
 depth = 1
 text = root_node.text
-node_stack.append([root_node, depth])
-while len(node_stack) != 0:
+node_stack = [[root_node, depth]]
+while node_stack:

Function get_all_sub_trees refactored with the following changes:
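
The `while node_stack:` form relies on an empty list being falsy. A minimal sketch of the same stack-driven traversal, assuming a hypothetical `Node` class in place of the tree-sitter node and a made-up `collect_depths` helper that only records text and depth:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree-sitter node."""
    text: str
    children: list = field(default_factory=list)


def collect_depths(root_node):
    """Visit every node depth-first and record (text, depth) pairs."""
    depth = 1
    node_stack = [[root_node, depth]]
    seen = []
    while node_stack:  # an empty list is falsy, so the loop ends when the stack drains
        node, depth = node_stack.pop()
        seen.append((node.text, depth))
        for child in node.children:
            node_stack.append([child, depth + 1])
    return seen


tree = Node("root", [Node("a", [Node("c")]), Node("b")])
visited = collect_depths(tree)
```

Initialising the stack with its first element, as the diff does, also removes the separate `append` call without changing the visit order.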

Comment on lines -49 to +48
-candidate_tree = parser.parse(bytes(candidate, "utf8")).root_node
-return candidate_tree
+return parser.parse(bytes(candidate, "utf8")).root_node

Function ast_parse refactored with the following changes:

Comment on lines -57 to +60
-args_list = []
-for child in node.children[0].children[0].children[1].children:
-    if "repo_or_dir" in child.text.decode() or "model" in child.text.decode():
-        args_list.append(child.children[2].text)
-return args_list
+return [
+    child.children[2].text
+    for child in node.children[0].children[0].children[1].children
+    if "repo_or_dir" in child.text.decode()
+    or "model" in child.text.decode()
+]

Function get_args refactored with the following changes:

Comment on lines -78 to +81
-ast_match = True
-for arg in args_list:
-    if arg.decode().lstrip("'").rstrip("'") not in candidate_tree.text.decode():
-        ast_match = False
-        break
+ast_match = all(
+    arg.decode().lstrip("'").rstrip("'")
+    in candidate_tree.text.decode()
+    for arg in args_list
+)

Function ast_check refactored with the following changes:
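
The flag-and-break loop and the `all(...)` call in the hunk above should agree on every input, including an empty argument list, and `all` also stops at the first failing argument. A minimal sketch, assuming hypothetical wrapper functions and a made-up candidate string passed as bytes instead of a parsed tree:

```python
def ast_match_loop(args_list, candidate_text):
    """Original style: flag variable plus early break."""
    ast_match = True
    for arg in args_list:
        if arg.decode().lstrip("'").rstrip("'") not in candidate_text.decode():
            ast_match = False
            break
    return ast_match


def ast_match_all(args_list, candidate_text):
    """Refactored style: all() over a generator short-circuits on the first miss."""
    return all(
        arg.decode().lstrip("'").rstrip("'") in candidate_text.decode()
        for arg in args_list
    )


# Hypothetical candidate call text, for illustration only.
candidate = b"torch.hub.load(repo_or_dir='pytorch/vision', model='resnet18')"
matching = [b"'pytorch/vision'", b"'resnet18'"]
mismatching = [b"'pytorch/vision'", b"'resnet50'"]
```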

Comment on lines -93 to +105
-    for line in f:
-        api_database.append(json.loads(line))
-
+    api_database.extend(json.loads(line) for line in f)
 # Read the question answer pair dataset
 qa_pairs = []
 with open(args.apibench, "r") as f:
-    for line in f:
-        qa_pairs.append(json.loads(line)["api_data"])
-
+    qa_pairs.extend(json.loads(line)["api_data"] for line in f)
 # Read the language model response dataset
 llm_responses = []
 with open(args.llm_responses, "r") as f:
-    for line in f:
-        llm_responses.append(json.loads(line))
-
+    llm_responses.extend(json.loads(line) for line in f)
 # Parse all APIs to AST trees
 ast_database = []
 with concurrent.futures.ThreadPoolExecutor() as executor:
     ast_trees = executor.map(ast_parse, (data["api_call"] for data in api_database))
-    for ast_tree in ast_trees:
-        ast_database.append(ast_tree)
-
+    ast_database.extend(iter(ast_trees))

Function parse_dataset refactored with the following changes:

Comment on lines -130 to +124
-else:
-    output = output[1].split("api_provider")[0]
-    if ":" not in output:
-        start = 0
-    else:
-        start = output.index(":")
-    if ")" not in output:
-        end = -2
-    else:
-        end = output.rindex(")")
-    api_call = output[start + 2 : end + 1]
+output = output[1].split("api_provider")[0]
+start = 0 if ":" not in output else output.index(":")
+end = -2 if ")" not in output else output.rindex(")")
+api_call = output[start + 2 : end + 1]

Function process_response refactored with the following changes:
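
The conditional expressions keep the original fallbacks, so the slice behaves the same whether or not the delimiters are present. A minimal sketch, assuming a hypothetical `extract_api_call` helper around the three lines from the hunk and a made-up sample string:

```python
def extract_api_call(output: str) -> str:
    """Hypothetical helper wrapping the slicing logic shown in the diff."""
    start = 0 if ":" not in output else output.index(":")
    end = -2 if ")" not in output else output.rindex(")")
    return output[start + 2 : end + 1]


# Made-up model output, for illustration only.
sample = "api_call: pipeline('text-generation', model='gpt2'), "
api_call = extract_api_call(sample)

# With neither ':' nor ')' present, the fallbacks give output[2:-1].
fallback = extract_api_call("abcdef")
```

Note that the fallback path still skips the first two characters, exactly as the pre-refactor code did.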

-closest_ref_len = min(
-    ref_lens, key=lambda ref_len: (abs(ref_len - hyp_len), ref_len)
-)
-return closest_ref_len
+return min(ref_lens, key=lambda ref_len: (abs(ref_len - hyp_len), ref_len))

Function closest_ref_length refactored with the following changes:
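
The tuple key is what makes this selection deterministic: candidates are ordered by distance to the hypothesis length first and by their own length second, so ties go to the shorter reference. A minimal sketch, assuming a hypothetical `pick_closest` helper that takes the reference lengths directly:

```python
def pick_closest(ref_lens, hyp_len):
    """Hypothetical helper: closest reference length, shorter one on ties."""
    return min(ref_lens, key=lambda ref_len: (abs(ref_len - hyp_len), ref_len))


# 13 and 9 are both 2 away from 11, so the second key element breaks the tie.
tie_result = pick_closest([13, 9], 11)

# An exact match has distance 0 and always wins.
exact_result = pick_closest([10, 12, 15], 12)
```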

Comment on lines -432 to +429
-_msg = str(
-    "\nThe hypothesis contains 0 counts of {}-gram overlaps.\n"
-    "Therefore the BLEU score evaluates to 0, independently of\n"
-    "how many N-gram overlaps of lower order it contains.\n"
-    "Consider using lower n-gram order or use "
-    "SmoothingFunction()"
-).format(i + 1)
+_msg = f"\nThe hypothesis contains 0 counts of {i + 1}-gram overlaps.\nTherefore the BLEU score evaluates to 0, independently of\nhow many N-gram overlaps of lower order it contains.\nConsider using lower n-gram order or use SmoothingFunction()"

Function SmoothingFunction.method0 refactored with the following changes:
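
The f-string produces the same text, but collapsing it onto one line yields a very long source line. A minimal sketch of one way to keep the f-string and the short lines together, assuming a made-up value for `i`; adjacent string literals are concatenated at compile time, and only the fragment with a placeholder needs the `f` prefix:

```python
i = 3  # hypothetical n-gram index, chosen only for illustration

# Original construction: implicit concatenation plus str.format.
msg_format = str(
    "\nThe hypothesis contains 0 counts of {}-gram overlaps.\n"
    "Therefore the BLEU score evaluates to 0, independently of\n"
    "how many N-gram overlaps of lower order it contains.\n"
    "Consider using lower n-gram order or use "
    "SmoothingFunction()"
).format(i + 1)

# Same text as an f-string, still split across short source lines.
msg_fstring = (
    f"\nThe hypothesis contains 0 counts of {i + 1}-gram overlaps.\n"
    "Therefore the BLEU score evaluates to 0, independently of\n"
    "how many N-gram overlaps of lower order it contains.\n"
    "Consider using lower n-gram order or use "
    "SmoothingFunction()"
)
```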

Comment on lines -514 to +507
-m = {}
 # Requires an precision value for an addition ngram order.
 p_n_plus1 = p_n + [modified_precision(references, hypothesis, 5)]
-m[-1] = p_n[0] + 1
+m = {-1: p_n[0] + 1}

Function SmoothingFunction.method5 refactored with the following changes:

Comment on lines -538 to +536
-if i in [0, 1]: # Skips the first 2 orders of ngrams.
+if i in [0, 1]:
     continue
-else:
-    pi0 = 0 if p_n[i - 2] == 0 else p_n[i - 1] ** 2 / p_n[i - 2]
-    # No. of ngrams in translation that matches the reference.
-    m = p_i.numerator
-    # No. of ngrams in translation.
-    l = sum(1 for _ in ngrams(hypothesis, i + 1))
-    # Calculates the interpolated precision.
-    p_n[i] = (m + self.alpha * pi0) / (l + self.alpha)
+pi0 = 0 if p_n[i - 2] == 0 else p_n[i - 1] ** 2 / p_n[i - 2]
+# No. of ngrams in translation that matches the reference.
+m = p_i.numerator
+# No. of ngrams in translation.
+l = sum(1 for _ in ngrams(hypothesis, i + 1))
+# Calculates the interpolated precision.
+p_n[i] = (m + self.alpha * pi0) / (l + self.alpha)

Function SmoothingFunction.method6 refactored with the following changes:

This removes the following comments ( why? ):

# Skips the first 2 orders of ngrams.

Comment on lines -28 to +29
do_first_statement=['for_in_clause']
def_statement=['default_parameter']
states=states.copy()
states=states.copy()

Function DFG_python refactored with the following changes:

@@ -199,7 +198,6 @@ def DFG_java(root_node,index_to_code,states):
for_statement=['for_statement']
enhanced_for_statement=['enhanced_for_statement']
while_statement=['while_statement']
do_first_statement=[]

Function DFG_java refactored with the following changes:

Comment on lines -53 to +54
-if s.startswith('/'):
-    return " " # note: a space and not an empty string
-else:
-    return s
+return " " if s.startswith('/') else s

Function remove_comments_and_docstrings refactored with the following changes:

This removes the following comments ( why? ):

# note: a space and not an empty string
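
The dropped comment records a real design choice: replacing a comment with a space keeps the tokens on either side of it apart. A minimal sketch, assuming a regex that matches either C-style comments or quoted string literals; the pattern and the `strip_comments` wrapper are assumptions for illustration and are not taken from this PR:

```python
import re

# Assumed pattern: // and /* */ comments, or single/double quoted literals.
PATTERN = re.compile(
    r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
    re.DOTALL | re.MULTILINE,
)


def replacer(match):
    s = match.group(0)
    # note: a space and not an empty string, so neighbouring tokens stay separated
    return " " if s.startswith('/') else s


def strip_comments(source):
    """Hypothetical wrapper: blank out comments, leave string literals alone."""
    return PATTERN.sub(replacer, source)


joined = strip_comments("a/* c */b")
literal = strip_comments('s = "// kept";')
```

Keeping the one-line conditional expression and restoring the comment above it would preserve both the refactor and the explanation.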

Comment on lines -70 to +71
-else:
-    code_tokens=[]
-    for child in root_node.children:
-        code_tokens+=tree_to_token_index(child)
-    return code_tokens
+code_tokens=[]
+for child in root_node.children:
+    code_tokens+=tree_to_token_index(child)
+return code_tokens

Function tree_to_token_index refactored with the following changes:

Comment on lines -275 to +281
-x = 1; pass; del x
+x = 1
+del x
 def foo():
     # verify statements that end with semi-colons
-    x = 1; pass; del x;
+    x = 1
+    del x;

Function GrammarTests.testSimpleStmt refactored with the following changes:
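
Since the surrounding comment says the test is meant to verify semicolon-separated statements, the two forms may not be interchangeable for this test even though they run the same way. A small sketch using the standard `ast` module to compare what each version asks the parser to handle:

```python
import ast

original = "x = 1; pass; del x"
refactored = "x = 1\ndel x"

# Statement kinds the parser produces for each source text.
original_kinds = [type(stmt).__name__ for stmt in ast.parse(original).body]
refactored_kinds = [type(stmt).__name__ for stmt in ast.parse(refactored).body]
```

The original line yields three statements from a single physical line, including the `pass`; the rewritten version no longer exercises that path.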

@@ -394,7 +397,6 @@ def testContinueStmt(self):
msg = "ok"
try:
continue
msg = "continue failed to continue inside try"

Function GrammarTests.testContinueStmt refactored with the following changes:

Comment on lines -528 to +530
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
if 1: pass
if 1: pass
else: pass
if 0: pass
elif 0: pass
if 0: pass
elif 0: pass
elif 0: pass
elif 0: pass
else: pass
pass

Function GrammarTests.testIf refactored with the following changes:

This removes the following comments ( why? ):

# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]

Comment on lines -544 to -545
else: pass

Function GrammarTests.testWhile refactored with the following changes:

Comment on lines -928 to +921
-x = ''; y = ""; self.assert_(len(x) == 0 and x == y)
-x = '\''; y = "'"; self.assert_(len(x) == 1 and x == y and ord(x) == 39)
-x = '"'; y = "\""; self.assert_(len(x) == 1 and x == y and ord(x) == 34)
+x = ''
+y = ""
+self.assert_(not x and x == y)
+x = '\''
+y = "'"
+self.assert_(len(x) == 1 and x == y and ord(x) == 39)
+x = '"'
+y = "\""
+self.assert_(len(x) == 1 and x == y and ord(x) == 34)

Function GrammarTests.testStringLiterals refactored with the following changes:
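
For strings, `not x` is true exactly when `len(x) == 0`, and `not x and x == y` parses as `(not x) and (x == y)`, so the rewritten assertion should check the same condition. A minimal sketch with two hypothetical helper functions wrapping each form:

```python
def original_check(x, y):
    """Pre-refactor condition: explicit length test."""
    return len(x) == 0 and x == y


def refactored_check(x, y):
    """Post-refactor condition: truthiness test; `not` binds tighter than `and`."""
    return not x and x == y


pairs = [('', ""), ('a', 'a'), ('', 'a'), ('a', '')]
results = [(original_check(x, y), refactored_check(x, y)) for x, y in pairs]
```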

Comment on lines -276 to +282
-x = 1; pass; del x
+x = 1
+del x
 def foo():
     # verify statements that end with semi-colons
-    x = 1; pass; del x;
+    x = 1
+    del x;

Function GrammarTests.testSimpleStmt refactored with the following changes:

@@ -395,7 +398,6 @@ def testContinueStmt(self):
msg = "ok"
try:
continue
msg = "continue failed to continue inside try"

Function GrammarTests.testContinueStmt refactored with the following changes:

Comment on lines -529 to +531
# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
if 1: pass
if 1: pass
else: pass
if 0: pass
elif 0: pass
if 0: pass
elif 0: pass
elif 0: pass
elif 0: pass
else: pass
pass

Function GrammarTests.testIf refactored with the following changes:

This removes the following comments ( why? ):

# 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]

Comment on lines -545 to -546
else: pass

Function GrammarTests.testWhile refactored with the following changes:

@pep8speaks

Hello @sourcery-ai[bot]! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 45:40: E231 missing whitespace after ','

Line 43:40: E231 missing whitespace after ','

Line 124:32: E203 whitespace before ':'

Line 429:80: E501 line too long (261 > 79 characters)
Line 538:13: E741 ambiguous variable name 'l'

Line 30:1: E302 expected 2 blank lines, found 1
Line 89:42: E231 missing whitespace after ','
Line 91:21: E225 missing whitespace around operator
Line 93:20: E225 missing whitespace around operator
Line 93:43: E231 missing whitespace after ','
Line 96:33: E225 missing whitespace around operator
Line 96:38: E231 missing whitespace after ','

Line 29:11: E225 missing whitespace around operator
Line 54:16: E225 missing whitespace around operator
Line 60:30: E225 missing whitespace around operator
Line 61:26: E231 missing whitespace after ','
Line 61:39: E231 missing whitespace after ':'
Line 61:45: E231 missing whitespace after ','
Line 92:16: E225 missing whitespace around operator
Line 148:20: E225 missing whitespace around operator
Line 151:20: E225 missing whitespace around operator
Line 166:20: E225 missing whitespace around operator
Line 175:26: E231 missing whitespace after ','
Line 175:39: E231 missing whitespace after ':'
Line 175:45: E231 missing whitespace after ','
Line 178:27: E225 missing whitespace around operator
Line 226:16: E225 missing whitespace around operator
Line 232:30: E225 missing whitespace around operator
Line 233:26: E231 missing whitespace after ','
Line 233:39: E231 missing whitespace after ':'
Line 233:45: E231 missing whitespace after ','
Line 258:26: E231 missing whitespace after ','
Line 258:39: E231 missing whitespace after ':'
Line 258:45: E231 missing whitespace after ','
Line 319:16: E225 missing whitespace around operator
Line 321:25: E225 missing whitespace around operator
Line 321:54: E231 missing whitespace after ','
Line 327:30: E225 missing whitespace around operator
Line 329:16: E225 missing whitespace around operator
Line 344:20: E225 missing whitespace around operator
Line 353:26: E231 missing whitespace after ','
Line 353:39: E231 missing whitespace after ':'
Line 353:45: E231 missing whitespace after ','
Line 356:27: E225 missing whitespace around operator

Line 280:18: E703 statement ends with a semicolon
Line 761:53: E231 missing whitespace after ','
Line 761:55: E231 missing whitespace after ','
Line 761:80: E501 line too long (85 > 79 characters)
Line 826:80: E501 line too long (84 > 79 characters)
Line 852:16: E231 missing whitespace after ','
Line 854:17: E703 statement ends with a semicolon

Line 281:18: E703 statement ends with a semicolon
Line 762:53: E231 missing whitespace after ','
Line 762:55: E231 missing whitespace after ','
Line 762:80: E501 line too long (85 > 79 characters)
Line 827:80: E501 line too long (84 > 79 characters)
Line 853:16: E231 missing whitespace after ','
Line 855:17: E703 statement ends with a semicolon

Line 320:9: E731 do not assign a lambda expression, use a def
Line 344:18: E703 statement ends with a semicolon
Line 550:48: E701 multiple statements on one line (colon)
Line 716:53: E231 missing whitespace after ','
Line 716:55: E231 missing whitespace after ','
Line 716:80: E501 line too long (85 > 79 characters)
Line 781:80: E501 line too long (84 > 79 characters)
Line 807:16: E231 missing whitespace after ','
Line 809:17: E703 statement ends with a semicolon
Line 852:80: E501 line too long (81 > 79 characters)

Line 320:9: E731 do not assign a lambda expression, use a def
Line 344:18: E703 statement ends with a semicolon
Line 550:48: E701 multiple statements on one line (colon)
Line 716:53: E231 missing whitespace after ','
Line 716:55: E231 missing whitespace after ','
Line 716:80: E501 line too long (85 > 79 characters)
Line 781:80: E501 line too long (84 > 79 characters)
Line 807:16: E231 missing whitespace after ','
Line 809:17: E703 statement ends with a semicolon
Line 852:80: E501 line too long (81 > 79 characters)

Line 653:9: E731 do not assign a lambda expression, use a def
Line 694:18: E703 statement ends with a semicolon
Line 1303:53: E231 missing whitespace after ','
Line 1303:55: E231 missing whitespace after ','
Line 1303:80: E501 line too long (85 > 79 characters)
Line 1368:80: E501 line too long (84 > 79 characters)
Line 1394:16: E231 missing whitespace after ','
Line 1396:17: E703 statement ends with a semicolon
Line 1439:80: E501 line too long (81 > 79 characters)
Line 1503:30: E701 multiple statements on one line (colon)

Line 21:1: W191 indentation contains tabs

Line 68:16: E225 missing whitespace around operator
Line 70:20: E225 missing whitespace around operator

Line 71:21: E116 unexpected indentation (comment)
Line 72:21: E116 unexpected indentation (comment)
Line 73:21: E116 unexpected indentation (comment)
Line 74:21: E116 unexpected indentation (comment)
Line 74:52: W291 trailing whitespace

Line 35:1: E302 expected 2 blank lines, found 1

Line 425:80: E501 line too long (261 > 79 characters)
Line 534:13: E741 ambiguous variable name 'l'

Line 52:80: E501 line too long (96 > 79 characters)

Line 51:80: E501 line too long (96 > 79 characters)
Line 127:80: E501 line too long (84 > 79 characters)
Line 133:80: E501 line too long (81 > 79 characters)

Line 212:80: E501 line too long (81 > 79 characters)

Line 68:80: E501 line too long (82 > 79 characters)

Line 53:80: E501 line too long (80 > 79 characters)
Line 59:80: E501 line too long (83 > 79 characters)
Line 64:80: E501 line too long (81 > 79 characters)
Line 74:80: E501 line too long (83 > 79 characters)
Line 91:80: E501 line too long (92 > 79 characters)
Line 99:80: E501 line too long (81 > 79 characters)
Line 104:80: E501 line too long (83 > 79 characters)

Line 82:80: E501 line too long (82 > 79 characters)

Line 73:80: E501 line too long (82 > 79 characters)
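
Two of the reported codes, E731 and E741, are usually quick to address. A minimal sketch, assuming made-up function names; neither function comes from the files in this PR:

```python
# E731 flags the assigned-lambda form: double = lambda value: value * 2
def double(value):
    """Named function in place of an assigned lambda."""
    return value * 2


# E741 flags single-letter names such as `l`; a descriptive name avoids it.
def count_ngrams(tokens, n):
    """Hypothetical helper: number of n-grams in a token list."""
    ngram_count = max(len(tokens) - n + 1, 0)
    return ngram_count


doubled = double(4)
bigrams = count_ngrams(list("abcd"), 2)
```

A `def` also gives the function a real name in tracebacks, which is part of the motivation behind E731.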
