Inquiries on Experiment Reproduction #2

LeoYML · 2024-05-20T09:11:52Z

This is an interesting and outstanding piece of work.
How can I reproduce the experiments detailed in the README?
Additionally, how does the performance compare when using GPT-4?

holarissun · 2024-05-24T15:43:40Z

Thanks for your interest in our work!

The experiments were run with the GPT-3.5 API, combining different prompt prefixes or postfixes with the queries from different datasets (e.g., the GSM8K dataset).
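
As a rough illustration of that setup, here is a minimal sketch of how prefixes and postfixes can be combined with a query. The prompt fragments, `build_prompts`, `run_experiment`, and `ask_model` are all hypothetical names for illustration, not the code from this repo; `ask_model` stands in for whatever wrapper you use to call the GPT-3.5 API.

```python
from itertools import product

# Hypothetical prompt fragments for illustration only;
# the strategies actually evaluated are the ones listed in the README.
PREFIXES = ["", "You are a careful math tutor. "]
POSTFIXES = ["", " Let's think step by step."]


def build_prompts(query, prefixes=PREFIXES, postfixes=POSTFIXES):
    """Return every prefix + query + postfix combination for one query."""
    return [f"{pre}{query}{post}" for pre, post in product(prefixes, postfixes)]


def run_experiment(queries, ask_model):
    """Send each prompt variant to the model and collect the raw answers.

    `ask_model` is an assumed callable (str -> str), e.g. a thin wrapper
    around a chat-completion API call.
    """
    results = []
    for query in queries:
        for prompt in build_prompts(query):
            results.append(
                {"query": query, "prompt": prompt, "answer": ask_model(prompt)}
            )
    return results
```

To try GPT-4 instead, you would only need to swap the model behind `ask_model`; the prompt construction stays the same.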

Work on "prompt optimization" has moved on since the very early stages, when people tried to find "magic words" to use as prompting strategies:

According to a later paper by Google (https://arxiv.org/abs/2309.03409), the optimal prompt can differ across LLMs.
The take-away here is that the performance of a prompting strategy is LLM-dependent.

This ICLR'24 paper (https://arxiv.org/pdf/2309.06553) introduces a systematic way of discovering the optimal prompt for each query.
The take-away here is that the performance of a prompting strategy is query-dependent.

I hope this helps :)

Best,
Hao

LeoYML · 2024-05-24T17:32:58Z

Thank you very much for your quick and clear response.

holarissun · 2024-08-23T21:53:56Z

Closing this inactive issue after 3 months :)

holarissun closed this as completed Aug 23, 2024
