Gorilla

Get Started

Getting GPT-3.5-turbo, GPT-4 and Claude Responses (0-Shot)

To get LLM responses for the API calls, use the following command:

python get_llm_responses.py --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub
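
To repeat this run for several models or API sets, the command can be assembled programmatically. The sketch below is only an illustration: `build_command` is a hypothetical helper, not part of the repository, and the path and file-name patterns are inferred from the single torchhub example above, so they are assumptions for any other `--api_name` value. Only the flags shown in the command above are used.

```python
import shlex


def build_command(model, api_name, api_key):
    """Build the argument list for one zero-shot run of get_llm_responses.py.

    Hypothetical helper: the naming patterns below are inferred from the
    torchhub example in this README and may not hold for other API sets.
    """
    question_data = f"eval-data/questions/{api_name}/questions_{api_name}_0_shot.jsonl"
    output_file = f"{model}_{api_name}_0_shot.jsonl"
    return [
        "python", "get_llm_responses.py",
        "--model", model,
        "--api_key", api_key,
        "--output_file", output_file,
        "--question_data", question_data,
        "--api_name", api_name,
    ]


# Print the command instead of running it, so it can be inspected first.
# "<your-api-key>" is a placeholder, not a real key.
command = build_command("gpt-3.5-turbo", "torchhub", "<your-api-key>")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` from the `eval` directory would execute it. Building a list rather than a single string avoids shell quoting problems in the key or file names.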

Getting Responses with Retrievers (`bm25` or `gpt`)

python get_llm_responses_retriever.py --retriever bm25 --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data eval-data/questions/torchhub/questions_torchhub_0_shot.jsonl --api_name torchhub --api_dataset ../data/api/torchhub_api.jsonl
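
For readers unfamiliar with the `bm25` option, the sketch below shows what a BM25 retriever does in general: it ranks candidate API documents by term overlap with the question, weighted by term rarity and document length. This is a minimal, self-contained version of standard Okapi BM25 (with the common `+1` smoothing inside the logarithm), not the code in `retrievers`, which may tokenize and score differently. The example documents and the parameter defaults `k1=1.5`, `b=0.75` are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real retriever would likely do more.
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    if n == 0:
        return []
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))
    query_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(query, documents, top_k=1):
    """Return the top_k documents ranked by BM25 score."""
    scores = bm25_scores(query, documents)
    ranked = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    return [documents[i] for i in ranked[:top_k]]


# Hypothetical API descriptions, made up for this example.
api_docs = [
    "image classification model resnet classify images",
    "speech recognition model transcribe audio",
    "text translation model translate sentences",
]
print(retrieve("classify images", api_docs))
```

In a retrieval-augmented run, the retrieved document would be added to the prompt alongside the question before the model is queried.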

Evaluate the Response with AST tree matching

After the LLM responses are generated, we can evaluate them against our dataset:

cd eval-scripts
python ast_eval_th.py --api_dataset ../../data/api/torchhub_api.jsonl --apibench ../../data/apibench/torchhub_eval.json --llm_responses ../eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl
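
The idea behind AST matching can be illustrated with Python's built-in `ast` module: parse the generated code, collect the function calls it contains, and check whether a reference call with the expected arguments is among them. This is a simplified sketch of the general technique and not the logic of `ast_eval_th.py`, which may use a different parser and matching rules. The helper names and the example response are assumptions made for illustration; `ast.unparse` requires Python 3.9 or later.

```python
import ast


def dotted_name(node):
    """Turn an expression such as torch.hub.load into 'torch.hub.load'."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None  # e.g. a call on a subscript or on another call's result


def extract_calls(code):
    """Return (function_name, [argument_source, ...]) for each call in code."""
    calls = []
    for node in ast.walk(ast.parse(code)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name is None:
            continue
        args = [ast.unparse(a) for a in node.args]
        args += [f"{kw.arg}={ast.unparse(kw.value)}"
                 for kw in node.keywords if kw.arg is not None]
        calls.append((name, args))
    return calls


def matches_reference(response_code, ref_name, ref_args):
    """True if some call in the response has the reference name and
    contains every reference argument."""
    try:
        calls = extract_calls(response_code)
    except SyntaxError:
        return False  # unparseable responses count as a mismatch
    return any(name == ref_name and all(a in args for a in ref_args)
               for name, args in calls)


# Hypothetical model response, written for this example.
response = (
    "import torch\n"
    'model = torch.hub.load("pytorch/vision", "resnet50", pretrained=True)\n'
    "model.eval()\n"
)
print(matches_reference(response, "torch.hub.load",
                        ["'pytorch/vision'", "'resnet50'"]))
```

Comparing parsed calls instead of raw strings makes the check insensitive to formatting differences such as quote style, whitespace, or extra keyword arguments.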
