| Dataset | 🏆 Leaderboard TBD | 📖 NeurIPS 2024 Paper |
IaC-Eval is a comprehensive framework for quantitatively evaluating the capabilities of large language models in IaC code generation. Infrastructure-as-Code (IaC) is an important component of cloud computing that allows cloud infrastructure to be defined in high-level programs. Our framework currently targets Terraform; we leave integration of other IaC tools as future work.
IaC-Eval also provides the first human-curated, challenging Infrastructure-as-Code (IaC) dataset: 458 questions ranging from simple to difficult across various cloud services (targeting AWS for now). The dataset can be found in our HuggingFace repository.
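For readers unfamiliar with IaC, the sketch below illustrates the kind of Terraform program the benchmark asks models to generate: a single AWS resource declared in HCL, checked locally with the Terraform CLI. This is not an example from the dataset; the bucket name and region are illustrative only.

```bash
# Minimal illustration of IaC: declare an AWS S3 bucket in Terraform (HCL),
# then statically check the configuration. Names and region are illustrative.
cat > main.tf <<'EOF'
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "example" {
  bucket = "iac-eval-demo-bucket" # hypothetical bucket name
}
EOF

terraform init -backend=false  # fetch the AWS provider without configuring state
terraform validate             # check that the configuration is well-formed
```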
- Install Terraform (also install the AWS CLI and set up credentials).
- Install OPA (make sure to add `opa` to your PATH).
- Obtain the following LLM model inference API keys as appropriate, depending on which of our currently supported models you want to evaluate (the sketch after this list shows one way to check the tools and export the keys):
  - OpenAI API token: for GPT-3.5-Turbo and GPT-4
  - Google API token: for Gemini-1.0-Pro
  - Replicate API token: for CodeLlama and WizardCoder variants

Our evaluation of MagiCoder was performed on a manually deployed AWS SageMaker inference endpoint. If that is of interest, see evaluation/README.md for details on our setup script.
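The commands below are one way to sanity-check the prerequisites and expose the API keys as environment variables. The variable names follow each provider's common convention and are assumptions here; consult evaluation/README.md for the names the pipeline actually reads.

```bash
# Verify the external tools are installed and on PATH.
terraform -version
aws sts get-caller-identity   # confirms AWS CLI credentials are configured
opa version

# Export whichever inference API keys you need (variable names are assumptions,
# based on each provider's usual convention):
export OPENAI_API_KEY="sk-..."        # GPT-3.5-Turbo / GPT-4
export GOOGLE_API_KEY="..."           # Gemini-1.0-Pro
export REPLICATE_API_TOKEN="r8_..."   # CodeLlama / WizardCoder variants
```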
To access and use the evaluation pipeline, you need to check out the appropriate branch of this repository and set up the environment. Follow these steps:
- Ensure you have the `main` branch of the project checked out.
- Install the Conda environment by running: `conda env create -f environment.yml`
- Activate the newly created Conda environment named `iac-eval`: `conda activate iac-eval`
  Note: before `conda activate` you might need to run `conda init SHELL_NAME` for your preferred shell (e.g., `conda init bash`). If you run into problems initializing the shell session, try referring to this GitHub issue for a fix.
- (Optional) Preconfigure the retriever database (if you would like to use the RAG strategy): refer to the instructions in `retriever/README.md`.
- See the instructions in `evaluation/README.md` for details on how to use the main pipeline, `eval.py`, and other scripts.
Note: you can run `./setup.sh` to check whether you have Terraform and OPA installed; it will also create and activate the necessary Conda environment. The shell script assumes you are using `bash`; change `#!/bin/SHELL` to your preferred shell in the script.
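Putting the steps together, a typical first-time setup in a bash shell might look like the following. The `eval.py` invocation is left schematic because its arguments are documented in evaluation/README.md.

```bash
# End-to-end environment setup (assumes conda, Terraform, and OPA are installed).
conda init bash                       # one-time shell initialization, if needed
conda env create -f environment.yml   # create the iac-eval environment
conda activate iac-eval

# Alternatively, let the helper script check Terraform/OPA and set up the environment:
./setup.sh

# Then run the main pipeline as documented in evaluation/README.md:
# python evaluation/eval.py ...
```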
We welcome all forms of contribution! IaC-Eval aims to quantitatively and comprehensively evaluate the IaC code generation capabilities of large language models. If you find bugs or have ideas, please share them via GitHub Issues. This includes contributions to IaC-Eval's dataset, whose format can be found in its HuggingFace repository.