This repository is a playground for everything related to retrieval-augmented generation (RAG). It contains scripts, notebooks, and other resources for experimenting with RAG and its applications.
To get started, you need to have Poetry installed. You can install it by running the following command in a shell:
pip install poetry
When the installation is finished, run the following command in the shell in the root folder of this repository to install the dependencies and create a virtual environment for the project.
poetry install
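After that, enter the Poetry environment by running the poetry shell command. Note that Poetry 2.0 and later no longer bundle this command; there it is provided by the separate poetry-plugin-shell plugin.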
poetry shell
If everything went well, you should see the (rag-playground) prefix in your shell prompt, indicating that you are in the Poetry environment. You can now run the scripts and notebooks in this repository.
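If the prompt prefix is hidden or customized, the interpreter itself can confirm the environment. The following is a minimal sketch, not part of this repository; the function name `in_virtualenv` is an assumption made for illustration. It relies on the standard-library behavior that `sys.prefix` differs from `sys.base_prefix` inside a virtual environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Running this with the Python interpreter from the activated shell should report that a virtual environment is active and show a prefix that points into the environment Poetry created.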
The repository is structured as follows:
bin
: Includes scripts that can be run from the command line; to download data, for example.

data
: Includes data files.

notebooks
: Includes Jupyter notebooks for experiments, sample applications, etc.

secrets
: Includes secrets, such as API keys, that should not be shared publicly.

src
: Includes reusable Python code that can be used in notebooks or other places.

tests
: Includes tests for the scripts and notebooks.

pyproject.toml
: The Poetry configuration file that includes the list of dependencies.

LICENSE
: The license file.

README.md
: This file.
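Because notebooks and scripts live in different folders, it can help to resolve paths relative to the repository root rather than the current working directory. The sketch below is one possible approach under that assumption: `find_repo_root` and `repo_dirs` are hypothetical helpers, not code from `src`, and the root is taken to be the nearest folder containing `pyproject.toml`. The demo builds a throwaway folder so the example runs anywhere.

```python
import tempfile
from pathlib import Path
from typing import Dict, Optional

# Folder names taken from the repository layout described above.
TOP_LEVEL_DIRS = ("bin", "data", "notebooks", "secrets", "src", "tests")


def find_repo_root(start: Optional[Path] = None) -> Path:
    """Walk upwards from `start` to the first folder containing pyproject.toml."""
    current = (start or Path.cwd()).resolve()
    for candidate in (current, *current.parents):
        if (candidate / "pyproject.toml").is_file():
            return candidate
    raise FileNotFoundError(f"no pyproject.toml found at or above {current}")


def repo_dirs(root: Path) -> Dict[str, Path]:
    """Map each top-level folder name to its path under `root`."""
    return {name: root / name for name in TOP_LEVEL_DIRS}


if __name__ == "__main__":
    # Demonstrate on a throwaway folder so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "pyproject.toml").touch()
        (root / "notebooks").mkdir()
        found = find_repo_root(root / "notebooks")
        print("root found from notebooks folder:", found == root)
        print("known folders:", sorted(repo_dirs(found)))
```

In a notebook, calling `find_repo_root()` without arguments would start the search from the current working directory, so `repo_dirs(find_repo_root())["data"]` would point at the data folder regardless of where the notebook was launched.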
| Index | Notebook | Description |
|---|---|---|
| 1 | embedding_and_indexing_documents | This notebook demonstrates how to embed and index documents. |
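
To give a rough idea of what embedding and indexing involve, here is a toy sketch that uses only the standard library. It is not the code from the notebook, which is not reproduced here. The hashed bag-of-words `embed` function, the `InMemoryIndex` class, and the dimension of 64 are all assumptions chosen for illustration; a real setup would typically swap in a learned embedding model and a vector store.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 64  # arbitrary toy dimensionality (assumption)


def embed(text: str, dim: int = DIM) -> List[float]:
    """Hash each token into one of `dim` buckets and L2-normalize the counts."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Use a stable hash; Python's built-in hash() varies between runs.
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class InMemoryIndex:
    """Store (document, vector) pairs and search them by cosine similarity."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, List[float]]] = []

    def add(self, doc: str) -> None:
        self._items.append((doc, embed(doc)))

    def search(self, query: str, k: int = 1) -> List[Tuple[float, str]]:
        q = embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, v)), doc) for doc, v in self._items]
        return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    index = InMemoryIndex()
    index.add("poetry manages python dependencies")
    index.add("retrieval augmented generation combines search with a language model")
    for score, doc in index.search("how does retrieval augmented generation work", k=2):
        print(f"{score:.3f}  {doc}")
```

The two steps map onto the notebook's title: `embed` turns text into a fixed-length vector, and the index stores those vectors so that a query can be matched against them by similarity.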
| Index | Title | Authors | Year | Link |
|---|---|---|---|---|
| 1 | Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks | Lewis et al. | 2020 | arXiv |
Most files in this repository are licensed under the MIT License - see the LICENSE file for details.