use evals to evaluate LLAMA #1

lucemia · 2023-03-15T02:56:21Z

No description provided.

lucemia · 2023-03-15T06:09:58Z

evals-llama/evals/elsuite/basic/match.py

Lines 28 to 38 in a7fe8e0

 def eval_sample(self, sample: Any, *_): 

 prompt = sample["input"] 

 if self.num_few_shot > 0: 

 assert is_chat_prompt(sample["input"]), "few shot requires chat prompt" 

 prompt = sample["input"][:-1] 

 for s in self.few_shot[: self.num_few_shot]: 

 prompt += s["sample"] 

 prompt += sample["input"][-1:] 

 return evals.check_sampled_text(self.model_spec, prompt, expected=sample["ideal"])

lucemia · 2023-03-15T06:14:14Z

> /Users/chienhsundavidchen/repo/evals-llama/evals/cli/oaieval.py(204)run()
-> result = eval.run(recorder)
(Pdb) eval
<evals.elsuite.basic.match.Match object at 0x13043fbe0>
(Pdb) eval.run
<bound method Match.run of <evals.elsuite.basic.match.Match object at 0x13043fbe0>>
(Pdb) eval_spec
EvalSpec(cls='evals.elsuite.basic.match:Match', args={'samples_jsonl': 'test_match/samples.jsonl'}, key='test-match.s1.simple-v0', group='test-basic')
(Pdb) args.eval
args = Namespace(model='gpt-3.5-turbo', eval='test-match', embedding_model='', ranking_model='', extra_eval_params='', max_samples=None, cache=True, visible=None, seed=20220722, user='', record_path=None, log_to_file=None, debug=False, local_run=True, dry_run=False, dry_run_logging=True)

lucemia · 2023-03-15T06:15:09Z

sample
{'input': [{'role': 'system', 'content': 'Complete the phrase as concisely as possible.'}, {'role': 'user', 'content': 'OpenAI was founded in 20'}], 'ideal': '15'}
(Pdb) self.model_spec
ModelSpec(name='gpt-3.5-turbo', model='gpt-3.5-turbo', is_chat=True, encoding=None, organization=None, api_key=None, extra_options={}, headers={}, strip_completion=True, n_ctx=4096, format=None, key=None, group=None)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

use evals to evaluate LLAMA #1

use evals to evaluate LLAMA #1

lucemia commented Mar 15, 2023

lucemia commented Mar 15, 2023 •

edited

Loading

lucemia commented Mar 15, 2023

lucemia commented Mar 15, 2023

use evals to evaluate LLAMA #1

use evals to evaluate LLAMA #1

Comments

lucemia commented Mar 15, 2023

lucemia commented Mar 15, 2023 • edited Loading

lucemia commented Mar 15, 2023

lucemia commented Mar 15, 2023

lucemia commented Mar 15, 2023 •

edited

Loading