Large language models (LLMs) are increasingly gaining relevance in every-day life, due to their apparent ability in solving tasks that demand intricate linguistic comprehension. Recent studies state that one of the key points that impact their outcome is the quality of the prompt used to interact with them. This work proposes a grammar-based evolutionary approach for exploring the prompt space of LLMs, driven by a fitness function that aims at optimizing the performance on a given task. We tested our technique by steering two state-of-the-art models through evolved prompts, and by comparing the performance they obtain on 8 benchmark tasks with that obtained when using other baseline prompts on the same tasks, showing that in most cases our prompts yield better results. Further, we defined a constrained mutation operator that limits the changes to specific grammar non-terminals, allowing to study and highlight the elements in the prompt that mostly affect the output of the LLM. Finally, a thorough discussion points out some issues that limit the relevance of the emerging prompt engineering discipline, given the existence of many effective prompt structures and the possible diversity that can be observed in the LLM output given the same input to the model.
(2024). Exploring the Prompt Space of Large Language Models through Evolutionary Sampling . Retrieved from https://hdl.handle.net/10446/275075
Exploring the Prompt Space of Large Language Models through Evolutionary Sampling
Saletta, Martina;
2024-01-01
Abstract
Large language models (LLMs) are increasingly gaining relevance in every-day life, due to their apparent ability in solving tasks that demand intricate linguistic comprehension. Recent studies state that one of the key points that impact their outcome is the quality of the prompt used to interact with them. This work proposes a grammar-based evolutionary approach for exploring the prompt space of LLMs, driven by a fitness function that aims at optimizing the performance on a given task. We tested our technique by steering two state-of-the-art models through evolved prompts, and by comparing the performance they obtain on 8 benchmark tasks with that obtained when using other baseline prompts on the same tasks, showing that in most cases our prompts yield better results. Further, we defined a constrained mutation operator that limits the changes to specific grammar non-terminals, allowing to study and highlight the elements in the prompt that mostly affect the output of the LLM. Finally, a thorough discussion points out some issues that limit the relevance of the emerging prompt engineering discipline, given the existence of many effective prompt structures and the possible diversity that can be observed in the LLM output given the same input to the model.File | Dimensione del file | Formato | |
---|---|---|---|
3638529.3654049.pdf
accesso aperto
Versione:
publisher's version - versione editoriale
Licenza:
Creative commons
Dimensione del file
831.36 kB
Formato
Adobe PDF
|
831.36 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
Aisberg ©2008 Servizi bibliotecari, Università degli studi di Bergamo | Terms of use/Condizioni di utilizzo